Google Search Results Poisoning

Tue, 16 Jan 2007 09:31:13 GMT
by pdp

When GNUCITIZEN was down for almost a week in December last year, I experienced something that I have kept quiet about until now. I wanted to investigate it myself before sharing it with the rest of the world. So, here is the discovery; it is up to you to decide whether it is something or absolutely nothing.

![Search Results Poisoning Screen JPG](/files/2007/01/search-results-poisoning-screen.jpg "Search Results Poisoning Screen JPG")

Because GNUCITIZEN was down for almost a week, I was concerned that Google and other search engines would index the WordPress default error page, since that was what the site was showing while there was no database connectivity. In the days after the site came back online, I queried Google for "GNUCITIZEN". I was expecting to see the GNUCITIZEN front page, AttackAPI, the backdooring articles and some Full-Disclosure and Bugtraq posts. To my surprise, the search results were quite different. The GNUCITIZEN front page was still there, holding the number one spot, but the rest was all gone. All the other links pointed to websites I had never seen before.

Usually, I don't care much about Google. GNUCITIZEN is about quality of content, not about SEO (Search Engine Optimization). However, I was quite curious to know what had happened. I opened some of the links on the search result page and they all seemed to be fine. I tried Technorati and other blog search engines to check whether these sites were using some unknown black-hat SEO technique, but found none. I then examined the search result page itself, and a pattern started to emerge: all of the indexed sites were showing parts of the notorious WordPress default error page that is presented when there is no database connectivity.

Although I cannot verify what happened, one thing seems quite obvious: Google search results can be poisoned. I am sure that other search and aggregation engines that use one type of algorithm or another for sorting and rating content are affected by similar issues.

I believe that Google indexed the error page from my site and dynamically linked it to other sites that had similar problems. These sites appeared on the query result page because their copy of the WordPress error had been indexed for quite some time. The Google Bot does not understand whether it crawls errors or valid pages. It is all content! Here is a simple scenario showing how this can be abused.

Mike, the SEO expert, decides to make a small fortune. Mike sets up a small network of splogs (spam blogs). Each splog is equipped with a bunch of PPC (Pay per Click) ads. Once the Google Bot arrives at one of Mike's splogs, a mod_rewrite directive matches the user agent and sends the notorious WordPress error page (other types of error pages are possible too). The Google Bot will associate Mike's splogs with pages that contain the above-mentioned WordPress failure. This means that if your website happens to display the WordPress no-database-connectivity page when the Google Bot crawls it, users who try to reach you through Google's search page will get a poisoned result set. Mike has successfully hijacked your keywords.
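To make the user-agent matching step concrete, here is a minimal sketch of the decision Mike's server would make. It is not a mod_rewrite rule; it uses a plain Python function in its place. The names `CRAWLER_PATTERN`, `ERROR_PAGE`, `NORMAL_PAGE` and `select_page` are hypothetical, the page bodies are placeholders, and the assumption that the crawler identifies itself with "Googlebot" in its User-Agent header is mine, not something verified in this article.

```python
import re

# Hypothetical pattern: assumes the crawler announces itself as "Googlebot"
# somewhere in the User-Agent header.
CRAWLER_PATTERN = re.compile(r"googlebot", re.IGNORECASE)

# Placeholder bodies, for illustration only.
ERROR_PAGE = "<h1>Error establishing a database connection</h1>"
NORMAL_PAGE = "<h1>Mike's splog</h1><p>PPC ads go here.</p>"


def select_page(user_agent: str) -> str:
    """Return the body a cloaking server would send for this User-Agent."""
    if CRAWLER_PATTERN.search(user_agent or ""):
        # The crawler sees the error page and indexes it as content.
        return ERROR_PAGE
    # Every other visitor sees the splog with its ads.
    return NORMAL_PAGE


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    human = "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"
    print(select_page(bot) == ERROR_PAGE)
    print(select_page(human) == NORMAL_PAGE)
```

The point of the sketch is only that the crawler and the human visitor can be handed different content by the same URL, so what gets indexed need not be what visitors see.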

As I said before, this is not verified. I am just the messenger, and this article outlines my experience as a user and my observations as a security researcher. If you believe that my interpretation is wrong, please leave a comment.

I cannot think of a prevention mechanism for this attack vector since I am not familiar with the Google Bot's internal logic. Anyway, don't trust your keywords! They all belong to us!

Archived Comments

d-y
Maybe it is a part of some protection against content stealing?
Joe
Google gives penalties to pages/sites offering duplicate content, it places these sites in the supplementals. Since all these error messages are the same, it should've put your site in the supplementals. As you kept your rankings, that must mean google was using the cached version of your pages. Still it is kind of weird, maybe google was using a mix of your cached keywords and the new error message. If you are right about all this - it would be pretty easy for spammers to steal keywords and make huge profits.
pdp
Joe, you can say that again. I hope it is not what it seems to be.
Jason Duke
I think you're wrong :) I work as an SEO and overall am pretty good at it. From the example you gave, grepblogs.com within the screenshot, I think it is nothing more than a site that mentioned a post(s) from here and, whilst being crawled, also had a DB connection issue. They deserved to rank due to the inbound links and domain status the site / pages had obtained, and were relevant to the search query (I presume "gnucitizen"). There is definitely no poisoning going on, judging from the limited information in your post, but Google poisoning is definitely possible, happens, and is manageable by many with the knowledge of how G works in practice. If you want to discuss it more, feel free to get in touch.
pdp
I am not much into SEO, so I didn't know what was really going on at the time when I had this particular problem. Thanks for the clarification. I appreciate it. Although SEO is considered a taboo topic in some security circles, I can clearly see its importance to web application security. I am particularly interested in shaping web traffic in terms of fiddling with RSS feeds, spawning related blogs, etc. IMHO, if attackers know how to use and abuse search engines, the damage they can cause is greater than what we can imagine. I am most definitely interested in continuing this topic.
Jason Duke
You're welcome mate. I'll happily look more into the specifics if you can share more information, but as to your point about SEO being relevant to security aspects, you are entirely right. I won't go into specifics here, but there is much crossover from SEO -> security in being able to rank, as well as from security -> SEO in being able to propagate. I'll stop now, but I am sure everyone reading this (whether an experienced SEO or security professional) understands what I am referring to.
Adria
Why is SEO a taboo in some security circles? Is it just because they're opposed to marketing in general? BTW, I agree with Jason that there's nothing wrong with the search results you showed. The ordinary behavior for any search is that Google will show, at most, two search results from any one website. If people want to see more results from a particular site, that's why they have the "More results..." link. However, his other comments are more a sales pitch than anything else, because his fear-mongering about "poisoning" Google search results is just nonsense. Get in touch with him, indeed. I would also challenge the assertion that there is any significant connection between security and SEO. (I would love to have more of a discussion of this, re: your mention of web app security.) SEO is marketing, plain and simple, and any attempt to make it seem more technical or geeky or mysterious is just self-aggrandizement. Beyond a basic knowledge of how the internet works, how websites are built, etc. (which I would be embarrassed to say was technical to someone who, say, actually writes code), it's just a marketing exercise.
Jason Duke
Reflecting back on this, it's pretty obvious I was wrong after all!
pdp
what do u mean Jason?
Mike Jennings
Olympics-related search results on Google have been poisoned this weekend, redirecting to "Qooglesearch.com" or bogus sites that trigger Firefox's malware warnings.