Jeremy Wagstaff of the Wall Street Journal writes about blog spam, CAPTCHA and bots in this post. It is clearly a relevant issue; even my own blog gets comment-spammed occasionally :-( On a broader scale, this could make authentic comments in the blogosphere less visible and blur interesting conversations. However, I do believe this co-evolutionary battle will be won by the real, authentic people who drive the interesting conversations throughout the blogosphere. It is the relevance of comments by real users that prevails.
"Spammers send millions of emails every day, just by pressing a button. The same is true of comment spam. And yet my Indian spamming friend Mr. Kumar isn't a bot. It might not be his name, but I'm pretty sure he's a real person. He's part of the world of sweatshop spam, and a sobering reflection of just how bad the spam wars have gotten.
How do I know he's real? Well, first we need to look at the ongoing battle against comment spam, a problem that could be as big as email spam. As much as 95% of comments posted on blogs are spam, according to Akismet, a company that filters comment spam using a basket of methods similar to those used by email providers to filter email spam. The problem is bad enough for other filtering methods to appear: forcing people who want to comment on a blog or site to register first, for example, or allowing the owner of a site to approve comments before they appear. Or making the would-be commenter complete a test called "Completely Automated Public Turing test to tell Computers and Humans Apart," or CAPTCHA. If you have ever come across a box where you have to type in the letters or numbers you see in a distorted image, you'll know what I'm talking about. CAPTCHA has become popular, not just on blogs but on many Web sites that are attractive to spammers. Most humans, unless they have visual or reading difficulties, can distinguish the characters even if they're behind a maze of lines, or at odd angles, or in different colors. Bots cannot.
Or can they? It is possible to decode some of the images by running them through an optical character recognition program -- the same sort of software used to convert a scanned document to text. Enterprising individuals like a Chinese programmer who goes by the name Wangrun have developed software to decode different CAPTCHA systems. Depending on the complexity of the CAPTCHA image, Anhui province-based Mr. Wangrun charges between $500 and $5,000 per decoder. He declines to say what his customers use the decoders for, but says he has "very many" of them."