Mike Grehan says...

Random musings about search marketing, flying around the planet, networking and people watching.

Sunday, December 04, 2005

The sandbox: About as real as PageRank is vital.

I've been looking through some posts on this whole sandbox issue, because I know that it will get yet another mention in the organic listings session at SES Chicago.

I find it mind-numbing just reading around the forum threads to see what people are saying about this non-issue. By that, I mean a non-issue to those brands, products and services that are loved and cherished by their audience and search engines alike.

I wrote a piece over at ClickZ some time ago trying to explain in simple terms what is usually confused as "the sandbox" - and that's the fact that your website is crap and nobody is interested in it.

Because, you see, if you do have anything that any search engine end user wants to see or know about - the search engine better damn well make it available, or the end user will simply go somewhere else and look for it.

Now, before anyone takes umbrage at what I'm saying, let's have a sweet moment of honesty. Take a look at your web site. Yes, I know that you're proud of it because you had something to do with it, or you built it, or whatever. But try to remove yourself from these emotional feelings about your work, or the product of your own management, and see if you can come up with answers to these short questions:

What makes this offering any different to anything else online?

What is it on this site which would, literally, compel people to talk about it and link to it?

Why do I honestly believe this site deserves to be in the top ten rankings at search engines?

Now. Go to the competitor's web site which is ranking in the top ten (in the case where you're not) and ask the same questions. Now compare the answers (providing you've been honest) and you may find the answer to what you believe to be the "sandbox".

And I guarantee it's a business solution you need and nothing at all to do with any magic code or indexing issues.

Let me pose another question here. Go around the forums as I did (if you can endure a lot of pap with the frequent little gem of intelligence - the usual constitution of forum posts) and tell me this: How many of the posts said something like "I'm the webmaster of the [place hard-worked-for, well-marketed and recognised brand name here] and I'm in something called the sandbox at Google"?

Couldn't find one, huh?

No, neither can I. But I can sure find a lot of people whinging about this whole (non) issue who think that because they have a web site - repeat, web site, not good business model - it should automatically do well because the pages have the keywords on, and everything.

Here's a thing: linkmaster Ken McGaffin discovered a while back, when he was doing some research, that the Financial Times had hidden links on its home page.

Now, according to webmaster guidelines, that's a bit of a no-no at Google. But, hey, guess what... No ban, no problem, no big deal.


Because if I do a search for just those two important letters "ft", if I don't find the Financial Times right up there - I'll go to Yahoo! to find it.

Let's assume for a moment that the Financial Times launches a brand new mini site to promote its free pocket guide to the fastest growing European companies. Yes, they bought the domain two months ago and the site is coming straight out of the blue sky... Do you think that they'll even notice this so-called "sandbox" thingy, whatever?

Let me have a quick stab at what I believe is the only slow-down process for a new site. In particular, a site which is attached to an already successful brand, i.e. a site which, like a new calf, is born with legs to stand on in minutes anyway.

Let's talk about how search engines use caching to reduce overheads. Interrogating the inverted index for every single query can use a lot of overhead, efficient as the inverted index method is. Think about how much easier it would be if answers to certain popular queries were cached. And, in fact, what if a search engine could use a predictive method of prefetching query results according to the time of day and user behaviour analysis?

To do that you'd need a tiered index. And you'd need a lot of user behaviour analysis. I've written about this before. And I'm about to embark on a white paper (as I still struggle with this damned third edition of my book) on methods of reducing the overhead and quickening response time for queries by having different tiers of indexing and different levels of caching at search engines.
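To make the idea concrete, here is a minimal sketch in Python of a result cache sitting in front of an inverted index. It is only an illustration of the general principle, not a description of how any real search engine does it. The class names (`InvertedIndex`, `CachedSearch`), the `promote_after` threshold and the simple "count the queries" popularity rule are all assumptions made up for this example, and prefetching by time of day is left out entirely.

```python
from collections import Counter, defaultdict


class InvertedIndex:
    """Toy inverted index: term -> set of document ids (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.lookups = 0  # how many times the full index was interrogated

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        self.lookups += 1
        terms = query.lower().split()
        if not terms:
            return []
        # Intersect the posting lists of every query term.
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)


class CachedSearch:
    """Serve popular queries from a cache tier; fall back to the index."""

    def __init__(self, index, promote_after=3):
        self.index = index
        self.promote_after = promote_after  # hypothetical popularity threshold
        self.query_counts = Counter()       # crude stand-in for behaviour analysis
        self.cache = {}                     # the "popular query" tier

    def search(self, query):
        key = " ".join(query.lower().split())  # normalise case and spacing
        self.query_counts[key] += 1
        if key in self.cache:
            return self.cache[key]          # answered without touching the index
        results = self.index.search(key)
        if self.query_counts[key] >= self.promote_after:
            self.cache[key] = results       # query is now popular enough to cache
        return results
```

With this sketch, a query only reaches the cache tier after it has been asked `promote_after` times, and from then on the full index is not consulted for it. A brand new page would not show up for a cached query until that cache entry is rebuilt, which this toy version never does.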

How long would it take to get into the "tried and tested results for a popular query" cache?

I'd like to tell more right now. But as I'm still an amateur blogger and there's an Italian restaurant beckoning...


  • At 6:11 PM, Gerald said…

    > as I still struggle with this damned third edition of my book

    great, you mention it. i did not have the courage to ask for it ;-) any ideas when it will appear?

  • At 8:33 PM, MikeG said…


    I've had to sign some NDAs in order to get some advance information to use. And I've also been delayed waiting for information to use so that I don't have to republish it so soon after.

    Early part of the new year and it should be good to go.



  • At 12:23 PM, Brian Turner said…

    Ah, Mike - looking at what people call the sandbox now is to look at a process that has become very complex - and on forums, a word much abused.

    A few years back, you could apparently drop a few hundred thousand links for semi-competitive keywords, and expect to rank well for them after a month or so.

    This was when text link advertising really began to explode, and it seems that Google tweaked something to reduce the impact of links too quickly.

    It was something only volume link builders (people who deal with links in the tens and hundreds of thousands) really noticed at first - content SEOs saw nothing, of course.

    Here's a short history of how the issue was originally covered:

    Google Sandboxing - an early history

    It seems to have begun as a 3 month delay on link volume - I posted an example of tracked rankings on Google to try and illustrate what we meant by sandboxing then:

    It seems Google saw from forums that something they were doing was limiting SEO efforts to manipulate their rankings - and that this was good for Google. So they expanded the concept.

    I then posted an example of how 2 domains - receiving the same link building work - were ranking very differently. It seemed that the age of the domain had become a key factor in the links being discounted, with older domains being allowed more freedom with links.

    Google Sandbox: Age of site and Allegra

    This was before Google's big patent on ranking factors was published, which described use of historical data for ranking purposes (among other things):

    Information retrieval based on historical data

    I stopped trying to cover the issue then, as "sandboxing" had already become a dirty word on forums - the debates were pointless.

    Since then Google appears to have developed the sandbox further across a more complicated range of factors. At its heart is the prevention of manipulation of rankings in Google.

    Now, of course, sandboxing has become a euphemism for "crap site, can't rank", and public discussions on sandboxing seem to pit misperception against misperception, so they are unlikely to reach useful conclusions. If some SEOs don't see an effect - that's great for them.

    An interesting suggestion of how Google could apply historical data for counting/discounting links, was covered at Threadwatch a while back:
    Google and the Golden Ratio.

    My first impression of the claim for using phi was "tinhat conspiracy". But after some thought, it would have a brilliant simplicity that could certainly appeal to Google.

    Rand Fishkin has written about the Sandbox a lot as well:
    2005 Analysis of Google's Sandbox

    and even has a Sandbox detection tool on his site - though I personally think the actual sandboxing process has grown too complex to make easy judgements on these days.

    Overall, I hope there's some useful background reading there for you if you wish to cut through the invective of forums. And I had to register a blogger account just to make this reply, so well done on that. :)

    Anyway, have a great Christmas, and hope the family is recovering well.

