Slashdot’s best: defrauding advertising clicks

A poster with the cute name AssFace posted the following on Slashdot. The discussion relates to the recent News.com article on Google battling click-through fraud. I am getting a bunch of 503 errors from Slashdot.org right now, so I figured I’d save a local copy for myself, as the technique described is quite interesting for anyone involved in Web programming.

[AssFace replies to a comment, suggesting automating the process would be far easier than hiring cheap labor]

With Google it is not as easy as some other companies out there.

Google’s code is placed on the site as a javascript include that then gets rendered to the screen at runtime when a browser executes it.
That means if you have a script hit the page and get the source for it, all you get is the javascript include.
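
To make that point concrete, here is a minimal sketch of what a script sees when it only reads the page source. The page markup and the `ads.example.com` URL are made up for illustration and are not Google's actual include; the idea is just that a static scan finds a script include and no ad links at all.

```python
from html.parser import HTMLParser

# Hypothetical page source: the ad slot is nothing but a script include.
PAGE_SOURCE = """
<html><body>
<h1>My site</h1>
<script type="text/javascript" src="https://ads.example.com/show_ads.js"></script>
</body></html>
"""


class TagCollector(HTMLParser):
    """Collects link targets and external script includes from static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.script_srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "script" and "src" in attrs:
            self.script_srcs.append(attrs["src"])


def scan_source(source):
    """Return (links, script includes) found without executing any script."""
    collector = TagCollector()
    collector.feed(source)
    return collector.links, collector.script_srcs


links, script_srcs = scan_source(PAGE_SOURCE)
print("links:", links)
print("script includes:", script_srcs)
```

The ad links only exist after a browser runs the include, which is why a source-only scan comes back empty.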

If you write a page that, onClick, lets you view the content of the Google IFrame (the javascript include dumps out an IFrame that then fills with a page off of Google), you will then see more of the code.
They have several layers of javascript and none of the pages render out links directly, so it is hard to scrape them with a bot, since a bot only sees the source.

You could load up the pages individually (outside of the iframe) and take a look at them, but it doesn’t always work, and when you load that page it also sends a reference back to Google showing the site/location/name of the page you are loading it from.
So if you have a site ballsweat.com that has Google Ads on it so that you can look to see what the ads look like, as you start messing around with it to get a better idea, they will see that it is no longer showing up on the site and instead showing up on your hard drive (or if you like you can put it on your server and then they can read your code that you are using).

That alone will tip them that you are looking into it - but then you could claim that it was someone else and not you (assuming it was on a drive), but then that could also mean that you just use someone else’s site to test.

So anyway, back to getting the data: you would have to load up the source, and then parse the javascript and execute it to build the page the same way a browser does (hopefully there are objects in Windows that let you simulate this and then dump the post-rendered contents into a variable which you can scan - don’t know about that).
OCR is out of the question since that is not going to get you the proper link (the links are listed, but the payment only goes out if you click on the link, which first routes through a Google site so it can register the click and track the stats, and then redirects you to the advertised site). When you mouse over it, it shows the regular site link, but that is done via javascript.

Then you run into the issue that Google would have to be foolish to just let a single IP crunch through a ton of ads every day.
So then you have to worry about spoofing - in this case it could arguably be blind spoofing. Loading web pages would actually work with blind spoofing (say I am computer A, and I want to tell server B that computer C is connecting to it, and that it should send the page data there). The problem, again, is that the server is only going to send raw HTML/javascript source down that connection, and it is then going to drop off of that machine.
So the site (Google in this case since you loaded a page and then “clicked” a link) registers the hit, but the page never gets rendered, so the Google page is never displayed and the redirect never happens - one could assume that Google is aware of this and wouldn’t count that as a hit since the other page never gets loaded.

So even if you could get past all of that (heh, feels like shades of Ocean’s Eleven), then there is the issue that Google (technically it isn’t Google, but a series of companies that they farm out the AdWords content to - learned that from an investment bank friend that sat in on the IPO workings - yay) monitors this shit and looks for anomalies.
So while you were getting 200 hits and 2 clicks every day for a month, if you all of a sudden are getting 2000 hits a day and 200 clicks, they are going to investigate your site.
If nothing has changed to show that there should be new interest in your site (new ad placement, new content, etc) and they can do searches and see that there aren’t any new sites pointing to you - then all signs point to you cheating.
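
As a rough illustration of the kind of check described here, the sketch below compares one day against a baseline using the numbers from the example above (200 hits and 2 clicks versus 2000 hits and 200 clicks). The function names and the `max_ratio` threshold of 3 are my own assumptions; nothing is publicly known here about how the real monitoring works.

```python
def click_through_rate(hits, clicks):
    """Fraction of page hits that resulted in an ad click."""
    return clicks / hits if hits else 0.0


def looks_anomalous(baseline, today, max_ratio=3.0):
    """Flag a day whose traffic or click-through rate jumped versus baseline.

    baseline and today are (hits, clicks) tuples. max_ratio is an
    arbitrary illustrative threshold, not a known production value.
    """
    base_hits, _ = baseline
    hits, _ = today
    base_ctr = click_through_rate(*baseline)
    ctr = click_through_rate(*today)
    # With an empty baseline, any activity at all counts as a jump.
    traffic_jump = hits / base_hits > max_ratio if base_hits else hits > 0
    ctr_jump = ctr / base_ctr > max_ratio if base_ctr else ctr > 0
    return traffic_jump or ctr_jump


usual_day = (200, 2)        # 1% click-through rate
suspect_day = (2000, 200)   # 10x the traffic and a 10% click-through rate
print(looks_anomalous(usual_day, suspect_day))
print(looks_anomalous(usual_day, (220, 2)))
```

In the example, both the traffic and the click-through rate jump tenfold at once, so either test alone would be enough to flag the site for a closer look.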

And then on top of all of that, we can show that a Gaussian distribution is mathematically inappropriate in this instance and something more along the lines of the same Brownian motion entropy watch that is done on stocks (which also involve humans moving in and out of interest of an item) would be more appropriate with the scale distorted to take into account the lack of the negative side and also scaling necessities (essentially taking into account that huge spikes on sites with no changes in traffic can only go so high before looking bad - and that “so high” is not so great).
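
One loose reading of that idea, and it is only my interpretation rather than the poster's stated method, is to treat daily hit counts the way stock prices are often treated: work with day-over-day log changes instead of raw counts, which also respects the fact that counts cannot go negative. The sketch below scores today's log change against the spread of past log changes; the function names and the sample numbers are hypothetical, and it assumes every count is positive.

```python
import math
import statistics


def log_changes(series):
    """Day-over-day log changes of a series of positive counts."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]


def spike_score(history, today):
    """How many standard deviations today's log change sits from the norm.

    history needs at least three positive daily counts so that a sample
    standard deviation of the log changes can be computed.
    """
    changes = log_changes(history)
    mu = statistics.mean(changes)
    sigma = statistics.stdev(changes)
    if sigma == 0:
        raise ValueError("history shows no variation; score is undefined")
    latest = math.log(today / history[-1])
    return (latest - mu) / sigma


history = [200, 210, 195, 205, 200, 208]  # made-up daily hit counts
print(spike_score(history, 204))    # an ordinary day
print(spike_score(history, 2000))   # a tenfold jump
```

On a quiet site the past log changes are small, so even a modest jump scores as extreme, which matches the remark that the “so high” is not so great.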

All of that said, if you could pull all of that off, then you would be able to slowly increase the fake traffic/hits on your site so that you could make more money - but you would want to be very careful about the content on there first.

On top of all of that, I think it should be fairly clear that I have thought about all of this before. I would argue that there are other pay-per-click sites out there that are much easier to exploit.
Also, this approach is the wrong way of going about it and I could list a few other ways that are technically better and safer, or others that are riskier but would generate more money and distribute the initial source of fault away from yourself - but hey - I won’t! heh.
Since they actually work, I don’t want to be the one responsible for starting some new scam via public posting of the workings.

Morally I debated trying to do it, but the forces of awesome have won out so far.
Also, I would love a job at Google and I don’t suspect that hacking their system is going to win me too many favors.

Posted in Technology at July 20th, 2004.
