May 04, 2008

reCAPTCHA

With spam bots and other automated programs generating ever more spam on the Internet, there is a growing number of occasions when a web site needs a CAPTCHA.

A CAPTCHA (Completely Automated Public Turing Test To Tell Computers and Humans Apart) is a program that can tell whether a user is a human or a computer, and has a variety of uses including:

  • Preventing comment spam in blogs
  • Protecting web site registration
  • Protecting online poll integrity
  • Preventing rapid dictionary attacks
  • Excluding search engine bots from accessing certain pages
  • Protecting systems vulnerable to e-mail spam

Most CAPTCHAs are images of distorted text, frequently seen at the bottom of web registration forms, and look something like this:

CAPTCHA example

Some of the original inventors of the CAPTCHA at Carnegie Mellon University have found a way to harness some of the time and effort that people spend answering CAPTCHA challenges, turning it into a distributed work system.

This system, called reCAPTCHA, works by including both a "solved" element and an "unrecognized" element (an image that was not successfully recognized via OCR) in each challenge. The respondent answers both: roughly half of his or her effort validates the challenge, while the other half is captured as work.
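The two-element idea described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the actual reCAPTCHA implementation: the names ToyReCaptcha, Challenge and votes_needed are hypothetical, and the rule that an unrecognized word becomes "solved" once enough validated respondents agree on it is an assumption made for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    """One challenge pairs a known image with an unknown one (hypothetical type)."""
    control_id: str  # image whose text is already known ("solved")
    unknown_id: str  # image that OCR could not read ("unrecognized")


class ToyReCaptcha:
    """Toy model only: validate on the solved word, collect votes on the other."""

    def __init__(self, solved, votes_needed=3):
        self.solved = dict(solved)          # image id -> known text
        self.votes_needed = votes_needed    # assumed agreement threshold
        self.votes = defaultdict(Counter)   # image id -> tally of guesses

    def check(self, challenge, control_answer, unknown_answer):
        """Return True if the respondent passes; record their unknown-word guess."""
        expected = self.solved[challenge.control_id]
        if control_answer.strip().lower() != expected.lower():
            return False  # failed the known half, so the guess is discarded

        guess = unknown_answer.strip().lower()
        if guess:
            tally = self.votes[challenge.unknown_id]
            tally[guess] += 1
            # Assumption: enough matching guesses promote the word to "solved".
            if tally[guess] >= self.votes_needed:
                self.solved[challenge.unknown_id] = guess
        return True


service = ToyReCaptcha({"img-001": "morning"}, votes_needed=2)
challenge = Challenge(control_id="img-001", unknown_id="img-002")
service.check(challenge, "morning", "overlooks")
service.check(challenge, "Morning", "overlooks")
print(service.solved.get("img-002"))  # prints "overlooks"
```

Only the answer to the known element decides whether the respondent passes; the answer to the unknown element is the captured work, and it is trusted here only because the same respondent got the known element right.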

If you need a CAPTCHA service for your web site, then the CMU reCAPTCHA service is a nice way to provide that functionality while letting your users give back a little to education in the process: by answering challenges, they help to digitize books and so archive human knowledge. There is an ASP.NET library for reCAPTCHA here, and library modules for other programming languages and application plug-ins are available here.

Entry categories: Web Services
Posted by Jorgen Thelin at May 4, 2008 04:40 PM