Posted by admin on March 6th, 2010


Could The New Google Spider Be Causing Issues With Websites?

Around the time Google announced “Big Daddy,” there was a new Googlebot wandering the web. Since then I’ve heard stories from clients of websites and servers going down and of previously unindexed content getting indexed.

I did some digging into this, and you’d be surprised at what I found out.

First, let’s look at the timeline of events:

In late September some astute spider watchers over at Webmasterworld spotted unusual Googlebot activity. In fact, it was in this thread: http://www.webmasterworld.com/forum3/25897-9-10.htm that the bot was first reported. It confused some posters, who thought that perhaps this could be regular users masquerading as the famed bot.
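One way to tell a real Googlebot from a visitor merely borrowing its name is a reverse DNS lookup followed by a forward lookup, which is the check Google itself later recommended. The sketch below is only an illustration: the function names are made up for this post, and the sample IP and host name in the usage note are illustrative data, not taken from the logs discussed here.

```python
import socket
from typing import Callable, List


def default_reverse(ip: str) -> str:
    """Reverse DNS: IP address -> primary host name."""
    return socket.gethostbyaddr(ip)[0]


def default_forward(host: str) -> List[str]:
    """Forward DNS: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(
    ip: str,
    reverse: Callable[[str], str] = default_reverse,
    forward: Callable[[str], List[str]] = default_forward,
) -> bool:
    """Return True only if the IP maps to a Google host name and back again."""
    try:
        host = reverse(ip).lower().rstrip(".")
    except OSError:
        return False  # no reverse record at all
    if not host.endswith((".googlebot.com", ".google.com")):
        return False  # the name is not under a Google crawler domain
    try:
        return ip in forward(host)  # the name must resolve back to the same IP
    except OSError:
        return False
```

Calling `is_genuine_googlebot("66.249.66.1")` performs live DNS lookups. The `reverse` and `forward` parameters exist so the check can be exercised with stand-in resolvers when no network is available.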

Early on it also appeared that the new bot wasn’t obeying the robots.txt file. This is the protocol which allows or denies crawling of parts of a website.
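For readers who have not worked with robots.txt, here is a minimal sketch of how a well-behaved crawler is expected to interpret it, using Python's standard `urllib.robotparser` module. The rules and the example.com URLs are invented for illustration and are not from any site mentioned here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# and every other crawler is denied entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("Googlebot", "http://example.com/index.html"))
print(parser.can_fetch("Googlebot", "http://example.com/private/page.html"))
print(parser.can_fetch("SomeOtherBot", "http://example.com/index.html"))
```

A crawler that ignored these rules would show up in the logs requesting paths under `/private/`, which is the kind of behavior the early reports described.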

Speculation grew about what the new crawler was until Matt Cutts mentioned a new Google test data center: http://www.mattcutts.com/blog/good-magazines/#note-5293. For those that don’t know, Matt Cutts is a senior engineer with Google and one of the few Google employees talking to us “regular folk.” This mention happened in November.

There wasn’t much mention of Big Daddy until early January of this year, when Matt again blogged about it, asking for feedback: http://www.mattcutts.com/blog/bigdaddy/

Much feedback was given on the accuracy of the results. There were also those who asked if the Mozilla Googlebot (known as “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)” in your visitor logs) and Big Daddy were connected, but no answer was given.
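If you want to see how much of your own traffic comes from the Mozilla Googlebot, a rough sketch like the following can tally it from an access log. It assumes the common "combined" log format, where the user agent is the last quoted field; the sample lines, IP addresses and function names are made up for illustration.

```python
import re
from collections import Counter

# The user agent is assumed to be the last double-quoted field on the line.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def classify_user_agent(ua: str) -> str:
    """Separate the Mozilla-style Googlebot from the older plain Googlebot."""
    if "Googlebot" not in ua:
        return "other"
    if ua.startswith("Mozilla/"):
        return "mozilla-googlebot"
    return "classic-googlebot"


def count_crawlers(log_lines) -> Counter:
    """Count log lines per crawler class."""
    counts = Counter()
    for line in log_lines:
        match = _LAST_QUOTED.search(line)
        if match:
            counts[classify_user_agent(match.group(1))] += 1
    return counts


# Made-up sample lines using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Jan/2006:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [10/Jan/2006:10:00:05 +0000] "GET /a HTTP/1.0" 200 128 "-" '
    '"Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.12 - - [10/Jan/2006:10:00:09 +0000] "GET /b HTTP/1.1" 200 256 "-" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/1.5"',
]

print(count_crawlers(SAMPLE_LOG))
```

Keep in mind that a user agent string is trivial to fake, so counts like these show what visitors claim to be, not what they are.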

Now I’m going to offer some of my own speculation:

I do in fact think the two are connected. In fact, I think this new crawler will eventually replace the old crawlers, just as Big Daddy will replace the existing data infrastructure. http://www.textbooklinkbrokers.com/bwood/notes/310_0_1_0_C/

Why is this important?

Based on my observations, this crawler may be able to do so much more than the old crawler.

For one, it emulates a newer browser. The old bot was based on the Lynx text-based browser. While I’m sure Google added features as time went on, the basic Lynx browser is just that: basic.

Which explains why Google couldn’t deal with things like JavaScript, CSS and Flash.

However, with the new spider, built on the Mozilla engine, there are so many possibilities.

Just look at what your Mozilla or Firefox browser can do itself: render CSS, read and execute JavaScript and other scripting languages, even emulate other browsers.

But that’s not all.

I’ve talked to a few of my clients and their sites are getting hammered by this new spider. It has gotten so bad that some of their servers have gone down because of the amount of traffic from this one spider!

On the plus side, I have clients who went from a few hundred thousand indexed pages to over 10 million in just a few weeks! Literally, since December 2005 there’s been a 3500% increase in indexed pages over an 8-week period! Just so you know, this is also the client’s site that went down because of the huge amount of crawling activity.
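To put that percentage in perspective, a 3500% increase means the new total is 36 times the starting figure, not 35 times. The quick check below uses a hypothetical starting count of 280,000 pages, chosen only because it is "a few hundred thousand"; the client's real numbers are not given here.

```python
def percent_increase(old: int, new: int) -> float:
    """Percentage growth from old to new."""
    return (new - old) / old * 100


def apply_percent_increase(old: int, percent: float) -> float:
    """Value reached after growing `old` by `percent` percent."""
    return old * (1 + percent / 100)


start_pages = 280_000  # hypothetical starting count, not the client's real figure

print(apply_percent_increase(start_pages, 3500))  # 10080000.0
print(percent_increase(start_pages, 10_080_000))  # 3500.0
```

So a site starting at roughly 280,000 indexed pages would land just over 10 million, which is consistent with the figures above.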

But that’s still not all.

I have another client which uses IP recognition to serve content based on a visitor’s geographic location. If you live in the US you get American content and pricing; if you live in the UK you get UK content and pricing. As you may imagine, the UK, US, Canadian and Australian content is all very alike. In fact, about the only thing noticeably different is the pricing.

This is my concern: if the duplicate content gets indexed by Google, what will they do? There’s a good chance that the site would be penalized or even banned for violating the webmaster quality guidelines set forth by Google here: http://www.google.com/webmasters/guidelines.html#value

This is why we implemented IP recognition, so that Googlebot, which crawls from US IP addresses, only sees one version of the site.
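In rough terms, IP recognition of this kind works like the sketch below. This is not the client's actual implementation: the network ranges are reserved documentation addresses standing in for a real geolocation database, and the region codes, function names and page paths are invented for illustration.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), standing in for a
# real IP-to-country database.
REGION_NETWORKS = {
    "UK": [ipaddress.ip_network("198.51.100.0/24")],
    "AU": [ipaddress.ip_network("203.0.113.0/24")],
    "CA": [ipaddress.ip_network("192.0.2.0/24")],
}
DEFAULT_REGION = "US"  # anything unrecognized gets the US version


def region_for_ip(ip: str) -> str:
    """Map a visitor IP address to a region code."""
    addr = ipaddress.ip_address(ip)
    for region, networks in REGION_NETWORKS.items():
        if any(addr in net for net in networks):
            return region
    return DEFAULT_REGION


def pricing_page(ip: str) -> str:
    """Pick the (hypothetical) pricing page path for this visitor."""
    return f"/pricing/{region_for_ip(ip).lower()}.html"


print(pricing_page("198.51.100.7"))  # /pricing/uk.html
print(pricing_page("8.8.8.8"))       # /pricing/us.html
```

With a setup like this, a crawler that only ever arrives from US addresses falls through to the default and sees the US version alone.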

However, a check of the server logs shows that this new Googlebot has been visiting not only the US content but also the content of the other sections of the site. Obviously, I wanted to verify that the IP recognition was working. It is. This leads me to wonder, then: can this browser spoof its location and/or use a proxy?

Imagine that the browser is smart enough to do some of its own testing by viewing the site from multiple IP addresses. If that’s the case, then those who cloak sites are going to have problems.
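To make the idea concrete, here is a purely speculative sketch of how a crawler could notice that a site shows different content to different IP addresses. Nothing here is known about how Google actually works. The `fetch_as` callable is a hypothetical stand-in for "request this URL as if coming from this address", and the fake site in the usage example is invented.

```python
import hashlib
from typing import Callable, Dict, Iterable, List


def fingerprint(body: str) -> str:
    """Short, comparable summary of a page body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def versions_seen(
    url: str,
    vantage_ips: Iterable[str],
    fetch_as: Callable[[str, str], str],
) -> Dict[str, List[str]]:
    """Group vantage IPs by the fingerprint of the page each one received."""
    groups: Dict[str, List[str]] = {}
    for ip in vantage_ips:
        groups.setdefault(fingerprint(fetch_as(url, ip)), []).append(ip)
    return groups


def serves_different_content(url, vantage_ips, fetch_as) -> bool:
    """True if at least two vantage points received different page bodies."""
    return len(versions_seen(url, vantage_ips, fetch_as)) > 1


# Hypothetical site that shows a UK price to one address range only.
def fake_fetch(url: str, ip: str) -> str:
    return "Price: 10 GBP" if ip.startswith("198.51.100.") else "Price: 15 USD"


print(serves_different_content(
    "http://example.com/pricing", ["192.0.2.1", "198.51.100.7"], fake_fetch))
```

A comparison like this cannot tell harmless regional pricing from deliberate cloaking, which is exactly why it would be a worry for sites doing either.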

In any case, from the limited observations I’ve made, this new Google, both the data center and the spider, is going to change the way we do things.
