As email spam detection has almost completely eliminated unwanted messages from our inboxes, spammers have been seeking out new avenues for luring unsuspecting users to their websites. When blogs became popular, spammers quickly seized on posting unrelated messages into the comments section, where they would attract the notice of both the blog writer and other readers.
The introduction of a captcha (the malformed words you are sometimes asked to decipher when filling out forms for sites) usually blocks the automated engines that post this data, but some organizations have taken to employing cheap labor (unemployed people working from home or overseas) to manually post data to bypass this protection.
Nonetheless, these messages are often blatant and use poor English, making them easy to spot. However, I’ve seen two cases on my blog recently that raised the bar: well-written text that didn’t clearly hawk a product or service, trying to worm its way into the comments section. WordPress requires me to approve any comment before it goes up, and each of these had me puzzled at first as to whether it was a legitimate comment or spam.
The first one claimed to highlight a problem with my site:
I tried viewing your site in my new iphone 4 and the structure does not seem to be correct. Might wanna check it out on WAP as well as it seems most smartphone layouts are not working with your website.
At first glance, it just sounds like someone trying to helpfully point out a problem. However, a few things immediately struck me as odd. First, they reference a “new iphone 4”. An iPhone 4 is not new. We have been on the iPhone 4S for almost nine months. Second, WAP is a much older mobile phone web technology from back when we all had small clamshell phones, and it (thankfully) hasn’t been used for anything serious in years. Also, I look at the site frequently on my iPhone and I know it looks fine.
Suspicious, I decided to look up the phrase “my new iphone 4 and the structure does not seem to be correct” in Google, and sure enough, I got 128,000 hits for this exact same comment on other blogs. Presumably, these were people who didn’t recognize it as junk and let it through. I’m not 100% sure what their angle was. The comment doesn’t clearly hawk any product, but the username had an associated AOL website that I am guessing they were trying to lure people to.
I received a second one today that was even more devious. On my post about installing carbon monoxide detectors, a person named “Laureen Caleb” wrote:
Smoke Detectors are very important on our homes. The best type of smoke detector are those photoelectric smoke detectors because they do not emit radiation unlike ionization type smoke detectors. *.,,: Kind regards […link to “health and wellbeing website” redacted…]
This is particularly well targeted, since the content is actually related to the post. However, I never discussed what type of detector I used, so why were they sending me this?
Suspicious again, I used my experience in geolocating IP addresses to find out where this user was coming from. WordPress identified their IP address as 110.93.89.27, and running a traceroute to it led me to a path that took over 230 milliseconds to traverse, which strongly suggested they were outside of the US. A look-up in MaxMind’s geolocation database placed it in the Philippines.
The link takes the user back to a healthcare website for researching various ailments, so I am guessing that this is a well-thought-out campaign to drive traffic: identify interesting medical concerns, go find blogs that touch on those topics, and post links back to the site, using cheap labor in the Philippines to spread them far and wide.
Spam like this is hard to detect. These are well-written messages that do not have the obvious typographical errors that show them to be fakes, and their marketing messages are hidden. In fact, it’s not really spam at all. It’s more akin to telemarketing, where a real person is on the other end.
If I have to think about whether these comments are legitimate or not, I don’t foresee a future where anti-spam guards are going to be able to detect them. This is going to be a major nuisance.
I think you have actually spelled out how WordPress (or some other blog hosting company) could block these types of comments in the future. Even with cheap overseas labor, it is probably not worth the money to craft a unique comment for each blog the spammer wants to post to. So WP could simply flag multiple instances of identical comments as suspicious. Also, these comments ultimately only help if they end up driving traffic to another website; WP could block or automatically flag comments that link to websites associated with spam marketing (or whose usernames are associated with such websites).
I have some other ideas about this, but I have to go post this identical comment to a few hundred other blogs first …
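The link-based check suggested in this comment could look roughly like the sketch below. It is only an illustration, not anything WordPress is known to do: the blocklist contents, the function names, and the idea of passing in the commenter's profile URL as author_url are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklist of domains associated with spam marketing.
SPAM_DOMAINS = {"spam-example.test"}

# Deliberately simple URL matcher; real comment text would need more care.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def link_domains(text, author_url=""):
    """Collect the host names linked from a comment body and the author's profile URL."""
    urls = URL_RE.findall(text)
    if author_url:
        # Profile URLs are often entered without a scheme, so add one for parsing.
        urls.append(author_url if "://" in author_url else "http://" + author_url)
    domains = set()
    for url in urls:
        host = urlparse(url).hostname  # already lowercased by urlparse
        if not host:
            continue
        host = host.rstrip(".")
        if host.startswith("www."):
            host = host[4:]
        domains.add(host)
    return domains


def is_flagged(text, author_url="", blocklist=SPAM_DOMAINS):
    """Return True if any linked host is a blocklisted domain or a subdomain of one."""
    for host in link_domains(text, author_url):
        if any(host == d or host.endswith("." + d) for d in blocklist):
            return True
    return False
```

A flagged comment would presumably still go to the moderation queue rather than being deleted outright, since a blocklist like this will never be complete or error-free.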
Hmm… let’s see… Geolocating your IP address places you in New York with a RoadRunner connection, and that would match up with your Columbia email address… *probably* you are a real person with a real comment 🙂
Nice idea. It would be pretty easy to implement a hashing algorithm to ease the lookups. It might get fooled by trivial comments like “Way to go!” but these could be filtered out pretty easily. For sites hosted on wordpress.com, it would definitely work well.
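Here is a minimal sketch of that hashing idea, under a few assumptions of my own: comments are normalized (lowercased, punctuation and extra whitespace dropped) before hashing, anything shorter than five words is treated as too trivial to judge, and a comment becomes suspicious once the same fingerprint shows up on more than a handful of different blogs. The names and thresholds are hypothetical, and the in-memory dictionary stands in for whatever shared store a host like wordpress.com would actually use.

```python
import hashlib
import re
from collections import defaultdict

# Assumed cutoff: shorter comments ("Way to go!") are too generic to judge.
MIN_WORDS = 5


def normalize(text):
    """Lowercase and collapse everything except ASCII letters and digits to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fingerprint(text):
    """Return a SHA-256 hex digest of the normalized comment, or None if it is trivial."""
    norm = normalize(text)
    if len(norm.split()) < MIN_WORDS:
        return None
    return hashlib.sha256(norm.encode("utf-8")).hexdigest()


class DuplicateCommentIndex:
    """Tracks which blogs each comment fingerprint has been seen on."""

    def __init__(self, max_blogs=3):
        self.max_blogs = max_blogs  # assumed threshold before flagging
        self.seen = defaultdict(set)

    def is_suspicious(self, blog_id, text):
        """Record the comment and report whether it has appeared on too many blogs."""
        fp = fingerprint(text)
        if fp is None:
            return False  # trivial comments are never flagged by this check
        self.seen[fp].add(blog_id)
        return len(self.seen[fp]) > self.max_blogs
```

Exact hashing only catches copies that are identical after normalization, so a spammer who swaps a word or two per blog would slip past it; catching those would take some form of near-duplicate matching, which this sketch does not attempt.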