Author Topic: CAPTCHA (Read 899 times)

kpier883 · « **on:** May 27, 2009, 08:10:57 AM »

Recently there was a thread in which a comment was made about those goofy-looking words that you have to type back into a web page to prove that you are a human rather than a computer when authenticating. Each one is known as a CAPTCHA, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart. I couldn't find that thread to add the comment there, so I started a new thread.

Anyway, the next time you have to decipher one of these words, you can feel good about yourself because you may actually be doing something worthwhile as you decipher the word. I read an interesting article in ComputerWorld, May 11, 2009 edition. One of the guys who invented the CAPTCHA technology, Luis von Ahn, came up with a way to put these decipherings to good use. It seems that there is an effort underway to convert the printed archives of the New York Times to digital format, making them available to many more people. One of the problems they encountered is that, due to typefaces, smudges, etc., some of the words cannot be recognized by the Optical Character Recognition program, which leaves the articles incomplete. By using these hard-to-read words as CAPTCHA challenges, people are actually deciphering the words. From the article by Gary Anthes, ComputerWorld, May 11, 2009:

"(Luis von Ahn) from Carnegie Mellon ... figured out how to harness the energy that millions of people collectively spend on this security measure every day. Words that the Times' optical character recognition software can't read are sent to a free CAPTCHA engine used by various Web services. Users are now deciphering 35 million words a day as they process these CAPTCHAs. Von Ahn says the job, which would have taken years with human editors, will be finished in just a few months."

So, good on ya, for figuring out things that the computer can't!

mlinder · « **Reply #1 on:** May 27, 2009, 08:13:58 AM »

That doesn't make any sense... If the computers can't read them in the first place to make the CAPTCHA image out of the words, how do they know that what we are typing is correct?

kpier883 · « **Reply #2 on:** May 27, 2009, 08:16:12 AM »

That is a good point. I imagine that there is a consensus after the first few times that it is responded to. That is a detail that is not mentioned in the article, but I like the way you are thinking....

Bob Wessner · « **Reply #3 on:** May 27, 2009, 08:19:41 AM »

When the computer/software develops the CAPTCHA challenge you have to match, it already knows the text string you have to match when you key-in the response. It then just compares what you type to the text string it used to develop the graphic CAPTCHA.
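A minimal sketch of that ordinary flow, for illustration only: the function names `new_challenge` and `check_response` are made up here, and the step that renders the string into a distorted image is left out. It assumes the server keeps the generated string and simply compares it with what the user typed.

```python
import hmac
import secrets
import string


def new_challenge(length=6):
    """Pick the random text the server will later render as a distorted image."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def check_response(expected, typed):
    """Compare the typed response with the stored string, ignoring case and outer spaces."""
    return hmac.compare_digest(
        expected.upper().encode(), typed.strip().upper().encode()
    )


# The server remembers `answer` (for example in the session) while the user sees only the image.
answer = new_challenge()
print(check_response(answer, answer.lower()))
```

The point is that the server starts from known text, which is exactly what is missing when the word comes from a scan the OCR could not read.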

mlinder · « **Reply #4 on:** May 27, 2009, 08:20:28 AM »

There are more problems with this than just it trying to see if we are correct. The programs that make these images take actual text, turn it into an image, and distort the image as well as add graphical artifacts. It needs to start OUT as a set of digital ASCII characters. I mean, the logistics of this are pretty screwed. There isn't a starting point. It's a feedback loop of *mostly wrong*...

mlinder · « **Reply #5 on:** May 27, 2009, 08:22:33 AM »

Quote from: Bob Wessner on May 27, 2009, 08:19:41 AM

When the computer/software develops the CAPTCHA challenge you have to match, it already knows the text string you have to match when you key-in the response. It then just compares what you type to the text string it used to develop the graphic CAPTCHA.

Bob, that would work if the computer already knew the text string. In this case, the recognition software is failing. This program was designed to get people to read and type what the recognition software couldn't recognize. Again, if the recognition software knows what the text line is, then this program is moot; it already knows the text line.

kpier883 · « **Reply #6 on:** May 27, 2009, 08:42:00 AM »

Here is an article on the process at the site for RECAPTCHA:

http://recaptcha.net/digitizing.html

Inigo Montoya · « **Reply #7 on:** May 27, 2009, 08:50:24 AM »

Problem is, there are bots capable of cracking CAPTCHAs, so they are already becoming worthless.

Bob Wessner · « **Reply #8 on:** May 27, 2009, 08:51:52 AM »

Hmm? Not sure what the submission test they provide accomplishes. I typed the first couple as I read them and got a "correct" response. I then deliberately mistyped three sets and still received 3 additional "correct" responses.

kpier883 · « **Reply #9 on:** May 27, 2009, 09:00:42 AM »

From the site:

But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
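The scheme described in that passage can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual reCAPTCHA code: the class name `PairedWordChecker`, the image ids, and the thresholds `MIN_VOTES` and `MIN_AGREEMENT` are all made up, since the site does not say how many people see each word or how agreement is measured.

```python
from collections import Counter, defaultdict

# Assumed thresholds; the site does not state the real values.
MIN_VOTES = 3        # readers needed before a word can be decided
MIN_AGREEMENT = 0.6  # share of votes the most common answer must reach


class PairedWordChecker:
    def __init__(self, known_answers):
        # Control words: image id -> text that is already known.
        self.known_answers = known_answers
        # Guesses collected for unread words: image id -> Counter of answers.
        self.votes = defaultdict(Counter)

    def submit(self, control_id, control_guess, unknown_id, unknown_guess):
        """Pass or fail the user on the control word; record the other guess."""
        expected = self.known_answers[control_id]
        if control_guess.strip().lower() != expected.lower():
            return False  # control word wrong, so the other guess is discarded
        self.votes[unknown_id][unknown_guess.strip().lower()] += 1
        return True

    def resolved(self, unknown_id):
        """Return the agreed text for an unread word, or None if undecided."""
        counts = self.votes[unknown_id]
        total = sum(counts.values())
        if total < MIN_VOTES:
            return None
        guess, n = counts.most_common(1)[0]
        return guess if n / total >= MIN_AGREEMENT else None


checker = PairedWordChecker({"c1": "morning"})
checker.submit("c1", "morning", "u1", "overlooks")
checker.submit("c1", "Morning", "u1", "overlooks")
checker.submit("c1", "morning", "u1", "overbooks")
print(checker.resolved("u1"))
```

In this sketch only the control word decides whether the user passes, so a wrong guess for the unread word still gets through and is simply outvoted later by other readers.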

mlinder · « **Reply #10 on:** May 27, 2009, 09:18:34 AM »

Quote from: kpier883 on May 27, 2009, 09:00:42 AM

From the site:

But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.

Ah, that makes sense.

Bob Wessner · « **Reply #11 on:** May 27, 2009, 09:36:19 AM »

Quote from: kpier883 on May 27, 2009, 09:00:42 AM

From the site:

But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.

I'm having a dense day. Still don't know why;

http://recaptcha.net/digitizing.html

said I was correct no matter which of the two word pairs I mistyped deliberately?

kpier883 · « **Reply #12 on:** May 28, 2009, 07:57:17 AM »

Quote from: Bob Wessner on May 27, 2009, 09:36:19 AM

Quote from: kpier883 on May 27, 2009, 09:00:42 AM
From the site:

But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.

I'm having a dense day. Still don't know why;

http://recaptcha.net/digitizing.html

said I was correct no matter which of the two word pairs I mistyped deliberately?

The theory sounds good, but their example page doesn't seem to work. I actually misspelled both words a couple of times (I thought) and still it said I was correct...

SOHC/4 Owners Club Forums
