Experiences on the Front Lines of User Interfaces and Web Development

Google's ajax search and how they keep their referer accurate

In recent months, Google rolled out a new search interface for modern browsers that does "instant" search as you type.  This all happens via ajax without reloading the page.  Why does any of that matter?  Well, I was curious how Google handled sending a decent HTTP Referer to websites that rely on search URLs that have the search term in them.

In Chrome 8, if you type this into your address bar:

Then, when you type a search term, the URL gets morphed into something like this:

Here is the important stuff of the above URL; everything else is tracking and other superfluous parameters:

Notice that the search term follows a "#" rather than a "?".  The "#" starts the URL fragment (the anchor), which JavaScript can change without reloading the page.  And, according to the HTTP RFC (RFC 2616, section 14.36), the referer must not include the fragment.
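To make the fragment rule concrete, here is a minimal sketch of what ends up in the referer for a given page URL. The helper name refererFor is made up for this example, the "#q=" form is an assumed stand-in for the Ajax search URL, and the code assumes a runtime with the standard URL class (modern browsers or Node.js). Real browsers apply more rules than this.

```javascript
// Hypothetical helper: approximates the Referer value sent for a page URL.
function refererFor(pageUrl) {
  const url = new URL(pageUrl);
  url.hash = ""; // drop the fragment ("#..."); it is never sent to the server
  return url.href;
}

// Search term in the fragment (assumed "#q=" form): it is lost.
console.log(refererFor("http://www.google.com/#q=nick+burwell"));
// → http://www.google.com/

// Search term in the query string: it survives.
console.log(refererFor("http://www.google.com/search?q=nick+burwell"));
// → http://www.google.com/search?q=nick+burwell
```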

So did Google break web analytics that rely on seeing a referring URL with the search query in it (e.g. http://www.google.com/search?q=nick+burwell)?  The short answer is no, they didn't break it.  Instead, they have some clever hacks in place.  Below is the longer answer, if you care to know how.
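For context, this is roughly what an analytics script on the receiving site might do with the referrer. It is only a sketch: searchTermFromReferrer is a hypothetical name, and it assumes the search term arrives in a "q" query parameter, as in the example URL above.

```javascript
// Hypothetical analytics helper: pulls the search term out of a referrer URL.
function searchTermFromReferrer(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return null; // empty or malformed referrer
  }
  // searchParams decodes "+" and %XX escapes; get() returns null if "q" is absent
  return url.searchParams.get("q");
}

console.log(searchTermFromReferrer("http://www.google.com/search?q=nick+burwell"));
// → nick burwell
console.log(searchTermFromReferrer("http://www.google.com/"));
// → null
```

The second call is the case to worry about: if only the fragment-less URL arrived as the referrer, there would be nothing left to extract.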

First, they do some tricks so that the returned HTML is reasonable.  The search results show the final URL as the "href" so that when you view source or hover over the title, the link looks correct:
<a href="http://www.nickburwell.com" class="l vst noline" onmousedown="return rwt(this,'','','','1','AFQjCNGNS6_ln-c98gFq3JnZ0DRXnZdvBA','zSMLp00vqdRWdHmlnka0Lg','0CBMQFjAA')"><em>Nick Burwell</em> Designs</a>

But on mousedown, some JavaScript runs that morphs the link into this:
<a href="/url?sa=t&amp;source=web&amp;cd=1&amp;ved=0CBMQFjAA&amp;url=http%3A%2F%2Fwww.nickburwell.com%2F&amp;ei=SrpATejsLpHAsAPPtpzoCg&amp;usg=AFQjCNGNS6_ln-c98gFq3JnZ0DRXnZdvBA&amp;sig2=zSMLp00vqdRWdHmlnka0Lg" class="l vst noline" onmousedown="return rwt(this,'','','','1','AFQjCNGNS6_ln-c98gFq3JnZ0DRXnZdvBA','zSMLp00vqdRWdHmlnka0Lg','0CBMQFjAA')"><em>Nick Burwell</em> Designs</a>
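The general idea of that rewrite can be sketched as follows. This is not Google's actual rwt() function, whose source isn't shown here. buildRedirectHref and rewriteLink are hypothetical names, the parameter names are simply copied from the rewritten link above, and the session-specific "ei" parameter is left out.

```javascript
// Hypothetical stand-in for the rewrite step; not Google's actual rwt().
function buildRedirectHref(targetUrl, tokens) {
  const params = new URLSearchParams();
  params.set("sa", "t");
  params.set("source", "web");
  params.set("cd", tokens.cd);     // position of the result on the page
  params.set("ved", tokens.ved);
  params.set("url", targetUrl);    // percent-encoded by URLSearchParams
  params.set("usg", tokens.usg);
  params.set("sig2", tokens.sig2);
  return "/url?" + params.toString();
}

// Swap the link's href just before the click is followed.
function rewriteLink(link, tokens) {
  if (link.rewritten) return true; // already rewritten by an earlier mousedown
  link.href = buildRedirectHref(link.href, tokens);
  link.rewritten = true;
  return true; // let the browser carry on with its default action
}
```

In a page, this would be wired up with something like link.addEventListener("mousedown", function () { rewriteLink(link, tokens); }). Doing the swap on mousedown rather than click means the href is already changed by the time the browser follows the link.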

That intermediate page above returns the following HTML:
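<script>
var a = parent, b = parent.google, c = location;
if ( a != window && b ) {  // loaded in a frame of the search page
  if ( b.r ) {
    b.r = 0;
    a.location.href = "http://www.nickburwell.com/";
  }
} else {  // loaded as the top-level page: replace this history entry
  c.replace( "http://www.nickburwell.com/" );
}
</script>
<META http-equiv="refresh" content="0;URL='http://www.nickburwell.com/'">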

This effectively has the browser load this page, then do a JavaScript redirect to the final page. The result is that this blank intermediate page, whose URL carries its parameters in a normal query string rather than a fragment, shows up as the referrer when nickburwell.com is loaded.  It's smart.
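If you wanted to serve a similar interstitial yourself, a server-side helper could look something like the sketch below. interstitialHtml and escapeHtmlAttr are hypothetical names, and this is my own simplified version (a plain location.replace plus a meta refresh fallback), not a copy of what Google serves.

```javascript
// Escape a string for use inside an HTML attribute value.
function escapeHtmlAttr(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: builds the HTML of a redirect interstitial.
function interstitialHtml(target) {
  // JSON.stringify yields a valid JS string literal; escaping "<" keeps a
  // "</script>" inside the URL from closing the script element early.
  const jsLiteral = JSON.stringify(target).replace(/</g, "\\u003c");
  return [
    "<script>",
    "location.replace(" + jsLiteral + ");",
    "</script>",
    '<meta http-equiv="refresh" content="0;URL=\'' + escapeHtmlAttr(target) + "'\">",
  ].join("\n");
}

console.log(interstitialHtml("http://www.nickburwell.com/"));
```

Because the destination is requested from this page, the interstitial's own URL is what the destination sees as the referrer.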

Also, because the page uses location.replace() rather than assigning to location, the browser replaces the current entry in its history instead of adding a new one.  So when users hit the "back" button, they go back to the ajax search page, rather than seeing a blank white page that has the redirect code on it.
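The difference is easy to see with a toy model of a tab's history. This is only an illustration of the behavior described above, not how browsers implement it. SessionHistory and backAfterRedirect are made-up names, and the page names are placeholders.

```javascript
// Toy model of a tab's session history; not how browsers implement it.
class SessionHistory {
  constructor(startUrl) {
    this.entries = [startUrl];
    this.index = 0;
  }
  // Like location = url or location.assign(url): push a new entry.
  assign(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index += 1;
  }
  // Like location.replace(url): overwrite the current entry.
  replace(url) {
    this.entries[this.index] = url;
  }
  back() {
    if (this.index > 0) this.index -= 1;
    return this.entries[this.index];
  }
}

function backAfterRedirect(method) {
  const tab = new SessionHistory("search");
  tab.assign("interstitial");  // user clicks a search result
  tab[method]("destination");  // the interstitial redirects
  return tab.back();           // user presses "back"
}

console.log(backAfterRedirect("replace")); // → search
console.log(backAfterRedirect("assign"));  // → interstitial
```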