CAPTCHA

CAPTCHA, an acronym for "Completely Automated Public Turing Test to Tell Computers and Humans Apart," refers to various authentication methods that validate users as humans, and not bots, by testing users with a challenge that is simple for humans but difficult for machines. 

CAPTCHAs prevent scammers and spammers from using bots to fill out web forms for malicious purposes.

The term CAPTCHA was coined in 2003 by a group of computer science researchers at Carnegie Mellon University led by Luis von Ahn and Manuel Blum.

Launched by von Ahn in 2007, reCAPTCHA v1 had a dual aim: to make the text-based CAPTCHA challenge more difficult for bots to crack, and to improve the accuracy of the optical character recognition (OCR) used at the time to digitize printed texts.

In 2009, Google acquired reCAPTCHA and began using it to digitize texts for Google Books while offering it as a service to other organizations.

According to Google, every time its CAPTCHAs are solved, that human effort helps digitize text, annotate images, and build machine learning datasets. This in turn helps preserve books, improve maps, and solve hard AI problems.

In 2014, Google released reCAPTCHA v2, which asks you to click a checkbox and sometimes also to solve a verification challenge based on image labeling, a classic computer vision problem. In this version of the CAPTCHA challenge, you're asked to select all of the images that match the clue. It's much easier to tap photos of cats or turkeys than to tediously type a line of distorted text on your phone.

How does Google know when a web user has selected all the images that fit the description? 

If the benefit for Google is that users label data for an AI model, then surely Google doesn't know in advance what every image contains. The answer is that when Google presents you with a panel of, say, six images, five of them are already labelled. You are asked to identify all of the matching images, including the one Google is looking to label. You only need to be correct on the five images Google already has labelled; your answer for the sixth, unknown image goes into the AI training dataset.
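The idea of grading on known images while collecting an answer for an unknown one can be sketched in a few lines of Python. This is only an illustration of the scheme described above, not Google's implementation: the function name `grade_challenge`, the image ids, and the return shape are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def grade_challenge(
    known_labels: Dict[str, bool],
    unknown_id: str,
    selected: Set[str],
) -> Tuple[bool, Optional[bool]]:
    """Grade one image challenge (hypothetical sketch).

    known_labels maps an image id to True if that image matches the clue.
    unknown_id is the one image whose label is not yet known.
    selected is the set of image ids the user clicked.

    Returns (passed, vote). vote is the user's answer for the unknown
    image, or None when the user got any known image wrong.
    """
    # The user passes only if every already-labelled image is answered correctly.
    passed = all(
        (image_id in selected) == matches
        for image_id, matches in known_labels.items()
    )
    if not passed:
        # A failed attempt contributes nothing to the dataset.
        return False, None
    # The unknown image never affects pass/fail; it only yields a label vote.
    return True, unknown_id in selected


# Five known images plus one unknown image, "f".
known = {"a": True, "b": False, "c": True, "d": False, "e": True}
passed, vote = grade_challenge(known, "f", {"a", "c", "e", "f"})
```

In practice a single vote would presumably not be trusted on its own; a system like this could collect votes from many users who passed and accept a label only once enough of them agree. That aggregation step is an assumption and is left out of the sketch.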

reCAPTCHA v3, which debuted in 2018, does away with the checkbox and expands upon the AI-driven risk analysis of No CAPTCHA reCAPTCHA (v2). reCAPTCHA v3 integrates with a web page via a JavaScript API and runs in the background, scoring a user's behavior on a scale of 0.0 (likely a bot) to 1.0 (likely a human). Website owners can set automated actions to trigger when a user's score suggests they may be a bot.
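On the server side, a site owner might turn that score into an action with a small policy function like the one below. It is a minimal sketch: it assumes the site has already sent the user's token to Google's verification endpoint and parsed the JSON reply into a dict with `success`, `score`, and `action` fields. The function name, the two thresholds, and the action names "block", "challenge", and "allow" are illustrative assumptions, not part of reCAPTCHA.

```python
from typing import Any, Dict

# Hypothetical thresholds; each site would tune these for its own traffic.
BLOCK_BELOW = 0.3
CHALLENGE_BELOW = 0.7


def decide_action(verification: Dict[str, Any], expected_action: str) -> str:
    """Map a parsed verification response to a site-specific action."""
    # Reject tokens that failed verification outright.
    if not verification.get("success", False):
        return "block"
    # Reject tokens issued for a different page action than the one expected.
    if verification.get("action") != expected_action:
        return "block"
    # A missing score is treated as the most bot-like value.
    score = float(verification.get("score", 0.0))
    if score < BLOCK_BELOW:
        return "block"
    if score < CHALLENGE_BELOW:
        # Middle band: ask for extra proof, e.g. email confirmation.
        return "challenge"
    return "allow"


result = decide_action({"success": True, "score": 0.9, "action": "login"}, "login")
```

Using two thresholds rather than one leaves a middle band where the site can ask for extra verification instead of rejecting a borderline visitor outright.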
