reCAPTCHA - neat idea
Dec. 5th, 2007 11:50 am
A CAPTCHA is a test that is supposed to tell whether a form is being filled out by a human or a bot. Typically it presents a distorted image and asks you to type in the original word; other variants ask you to answer a question or solve a simple math problem. CAPTCHAs are often used to try to stop spam bots.
CMU has come up with an interesting twist on the technique. reCAPTCHA presents words from scanned books that failed OCR, not made-up images. The puzzle answers supplied by humans are then used to improve these digitized books so they can be searched more reliably, or read aloud more accurately by computer to blind people. Currently reCAPTCHA is working with The Internet Archive, which I consider a worthy endeavor.
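To make the idea concrete, here is a minimal sketch of how answers from many people could be combined into a transcription. It assumes each challenge pairs the unreadable word with a control word whose answer is already known, and that a guess for the unreadable word only counts when the control word was typed correctly. The function names, the vote threshold, and the sample data are all hypothetical; this is an illustration, not reCAPTCHA's actual code.

```python
from collections import Counter

def record_answer(votes, control_expected, control_typed, unknown_typed):
    """Count a guess for the unknown word only if the control word was typed correctly."""
    passed = control_typed.strip().lower() == control_expected.strip().lower()
    if passed:
        votes[unknown_typed.strip().lower()] += 1
    return passed

def accepted_transcription(votes, min_votes=3):
    """Return the leading guess once it has at least min_votes, else None."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None

# Hypothetical submissions: (known control word, what was typed for it, guess for the unknown word)
votes = Counter()
submissions = [
    ("morning", "morning", "upon"),
    ("morning", "mourning", "open"),  # control word wrong, so this guess is ignored
    ("morning", "Morning", "upon"),
    ("morning", "morning", "upon"),
]
for expected, typed, guess in submissions:
    record_answer(votes, expected, typed, guess)

print(accepted_transcription(votes))
```

With the sample data above, three people who passed the control word agree on the same guess, so that guess is accepted; with fewer agreeing votes the function returns None and the word stays unresolved.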
CAPTCHAs themselves have some issues, such as reducing accessibility. But if you're going to use this technology, reCAPTCHA seems to be a nice way to go about it.
no subject
Date: 2007-12-05 05:07 pm (UTC)
If it failed OCR, how do you know what the right answer is?
no subject
Date: 2007-12-05 05:12 pm (UTC)
no subject
Date: 2007-12-05 05:14 pm (UTC)
no subject
Date: 2007-12-06 12:00 pm (UTC)
no subject
Date: 2007-12-05 05:49 pm (UTC)
I would be worried that presenting the words out of context will make it more difficult even for humans to tell what word it should actually be. When the word is unclear, seeing it in the proper context can go a long way toward figuring out what was meant. But most words that were unclear for OCR will probably still be easy for people to get, even out of context.
no subject
Date: 2007-12-05 05:58 pm (UTC)
These two systems could tie together unknowingly. Somewhere out there, someone trying to view porn may be feeding a captcha solution to a spam engine which in turn is feeding that solution back to a recaptcha system that is using it to digitize books.
The Internet is so much greater than the sum of its parts.
no subject
Date: 2007-12-05 06:31 pm (UTC)
I love it! This image is going to be bouncing around in my head all day.
no subject
Date: 2007-12-05 05:59 pm (UTC)
no subject
Date: 2007-12-06 04:52 am (UTC)
:(
no subject
Date: 2007-12-06 11:59 am (UTC)
If you don't wish to do that, I understand the choice.
no subject
Date: 2007-12-06 02:12 pm (UTC)
it's a preposterous solution to a problem created by technology and lawyers.
no subject
Date: 2007-12-06 02:13 pm (UTC)
no subject
Date: 2007-12-06 02:47 pm (UTC)