Blocking Google Web Accelerator From Stealing Your Visitors' Private Information

As much as I generally love Google and use it regularly, its new "Web Accelerator" sure seems to go against the company's in-house slogan, "Don't Be Evil."

This new utility from Google is supposed to "pre-fetch" links on the pages you're viewing: while you're reading one thing, it's already downloading the links you're most likely to click next, so the perceived speed of the internet increases for you. (It seems to skip divs whose ids are "nav" and "navbar", which may be another way to protect your pages.)

This could be a useful type of tool in the future, but Google's gotten it all wrong.

Firstly, since you're downloading and installing this application on your own computer, it could simply cache the pre-fetched pages on your own hard drive, where no one else would see them. But that's not what Google does. Instead they're keeping a copy of EVERYTHING you view on their own servers, potentially violating privacy policies and trade secrets. They claim HTTPS connections are excluded, but some people have reported that when they try to view their bank account statement using the GWA they're seeing other people's account info - oops! (Although anyone who views truly sensitive information using a programme that they KNOW copies what they're viewing over to Google is just asking for trouble.)

If that wasn't bad enough, the GWA is also "clicking on" just about every link in a web page, including those with "GET" variables that DO things, like say, delete message board posts.

Makers of web applications are furious that the GWA wipes out entire forums, PIMs, and other precious data because it just loads pages willy-nilly and ignores JavaScript prompts asking "Are you SURE you really want to DELETE all of your PRECIOUS DATA?". Some reports suggest it goes as far as selecting the "Yes" answer on these types of pop-ups, but that is as yet unconfirmed.

In any event, if you make a web app or service that allows people to log in to view private data that can be deleted, you probably want to protect your users from Google's newest atrocity.

Here's a handy way to stop Google from caching your web pages. This is only really useful if you're displaying private info to your users, but I'm passing it along anyway for anyone who doesn't want Google keeping a copy of pages that a search engine spider ordinarily would never see because it couldn't log in. Since the GWA sees everything your user sees, it is effectively "logged in" too (heck, it even knows everyone's user names and passwords).

Create a .htaccess file with these contents and put it in your public_html web root directory (i.e. what people see when they go to http://yourdomain.com):

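    RewriteEngine On
    RewriteBase /

    # Never serve the .htaccess file itself
    RewriteRule ^\.htaccess$ - [F]

    # Block Google's Web Accelerator: it sends an "X-moz: prefetch"
    # request header, which mod_rewrite reads as %{HTTP:X-moz}
    RewriteCond %{HTTP:X-moz} ^prefetch$
    RewriteRule ^.*$ - [F]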

I've seen some people post that you could just block the GWA IP addresses, but those are likely to change. This way you can just return [F] (403 Forbidden) for any request coming from any utility that announces a prefetch with the "X-moz: prefetch" request header. Note that this is a separate header, not part of the user agent string.
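The header check only stops utilities that admit they're prefetching. As a complementary rough sketch, you could also have Apache refuse the dangerous "GET" links outright, so that nothing that merely follows a link can delete anything. This assumes your app marks deletions with a query parameter like action=delete (a made-up name, so substitute whatever yours uses) and that the real delete button submits a form via POST. If your delete links really are plain GET links, this rule will block them for everyone until you change them to forms.

```apache
RewriteEngine On

# Hypothetical example: "action=delete" is an assumed parameter name.
# Refuse any request that asks for a delete but is not a POST,
# which covers links followed by prefetchers and spiders.
RewriteCond %{REQUEST_METHOD} !^POST$
RewriteCond %{QUERY_STRING} (^|&)action=delete(&|$)
RewriteRule ^ - [F]
```

The two conditions are ANDed together, so ordinary page views and POSTed form submissions pass through untouched.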

For a lot more on this new bit of spyware from Google:

37 Signals