I'm Not A Robot
02 August 2019
CAPTCHA stands for Completely Automated Public Turing Test to tell Computers and Humans Apart. It's a way to filter out bots and fraudulent automated activity from the behaviour of real people. It's an umbrella term describing several different techniques presented to the user to determine if they are human. A CAPTCHA challenge could be a random collection of letters and numbers, text obscured with background noise, a puzzle, or an audio challenge asking the user to enter letters and numbers spoken over heavy background static. All of these are termed CAPTCHA as they ask the user to demonstrate they're human and not an automated computer program.
The theory is that humans are very good at identifying distorted text, numbers and audio, while computer programs are not. If a bot can't reliably identify the displayed text or audio, the challenge is a very effective way to stop bot activity affecting your website.
The problem is that CAPTCHA in its many incarnations causes significant challenges for people with disabilities. Asking a user to decipher distorted text may mean vision-impaired people will be unable to complete it. Presenting an audio challenge may mean people with a hearing impairment will have difficulty. Reorientating a visual 3D puzzle may affect users with mobility and cognitive impairments. Disabilities are also rarely isolated: users may have a range of disabilities.
If your security check is relying on some kind of user input to determine the "humanness" of the person at the other end, it is ultimately doomed to failure.
A brief history of Google CAPTCHA
One of the most popular forms of CAPTCHA is Google's reCAPTCHA. The earliest iteration relied on the user deciphering distorted text. The user would enter the displayed text and, if it was correct, would pass the challenge, be confirmed as human and be allowed to continue doing whatever they wanted on the website. Often such a challenge was difficult for anyone to complete: the distortion was too hard to understand, and users would have to click refresh to trigger the display of a new text puzzle.
The user experience of this method was very frustrating, and for a user with disabilities the challenge was impossible to pass. The text was displayed as an image, the image had no text alternative (because that could unintentionally aid the bots) and the only alternative was the audio fallback option. The audio would announce several letters and numbers and ask the user to enter what they heard, but as the audio was combined with static it too was very hard to understand.
Whilst this method of confirming users is still used occasionally, it's been superseded by reCAPTCHA v2 from Google. This had the lofty aim of removing the puzzle challenge altogether: all the user had to do was confirm via a checkbox that they are not a robot. The checkbox is labelled correctly and has full keyboard support, and superficially looks like a great replacement. It uses various indicators to determine if the user interacting with the website is legitimate. If the user has a Gmail account and uses Google services, that's a stronger likelihood that they're real; if the user took time to complete the checkbox and scrolled around the site, those too are good indicators that the user is human.
The problem is that when one of these checks failed the puzzle challenge returned, and the user would be asked to select a combination of images containing street signs, traffic lights or shop fronts. So even though the checking was becoming more robust and could hide the challenge from many users, it still displayed the challenge if those background checks indicated "bot-like" behaviour. Anecdotally, there has been discussion that screen readers and other assistive technology trigger the puzzle challenge in the majority of instances because the behaviour of these technologies is identified as atypical, outside the ordinary patterns of users without disabilities.
Google has now introduced a new CAPTCHA process, reCAPTCHA version 3. It completely banishes the image challenge and instead returns a score indicating the likelihood of bot-like activity. This allows developers to funnel the user through further steps which are accessible, without ever showing an inaccessible puzzle challenge. This is a great outcome but, unfortunately, the solution is only workable for very large organisations with the resources to dedicate to identifying and implementing accessible alternatives to funnel users through.
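As a rough illustration of funnelling users by score, here is a minimal Python sketch. It assumes the score has already been obtained from the verification service and follows the reCAPTCHA v3 convention of a value between 0.0 (very likely a bot) and 1.0 (very likely a legitimate interaction). The route names, the thresholds and the function name are hypothetical and would need tuning against your own traffic.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical next steps; none of them shows a puzzle challenge."""
    ALLOW = "allow"
    ACCESSIBLE_STEP_UP = "accessible_step_up"  # e.g. a one-time link sent by email
    MANUAL_REVIEW = "manual_review"


def route_for_score(score: float,
                    allow_at: float = 0.7,
                    review_below: float = 0.3) -> Route:
    """Map a score (0.0 = likely bot, 1.0 = likely human) to a next step.

    The default thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= allow_at:
        return Route.ALLOW
    if score < review_below:
        return Route.MANUAL_REVIEW
    return Route.ACCESSIBLE_STEP_UP
```

The point of the design is that a low score changes the path the user takes rather than presenting a challenge, so a person whose assistive technology looks "bot-like" still has an accessible way through.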
Purposely attracting the bots
An alternative often discussed is the honeypot method, where a hidden form field is added to any screen which requires user input, for example a signup screen. The form field is hidden via CSS and acts as an enticement for bots to reveal themselves by filling it in. When the form data is submitted the website checks whether this hidden field is empty. If it has been filled, you can assume the form data was submitted by a bot and ignore the input. As users never see the hidden form field, it should never be filled in by a legitimate user, or so the theory goes.
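The server-side half of the honeypot check can be sketched in a few lines of Python. This is an assumption-laden illustration, not a specific framework's API: the field name "website" is hypothetical, and the form data is assumed to arrive as a simple mapping of field names to strings.

```python
from typing import Mapping

# Hypothetical honeypot field name; the field is hidden from people via CSS.
HONEYPOT_FIELD = "website"


def looks_like_bot(form_data: Mapping[str, str],
                   honeypot_field: str = HONEYPOT_FIELD) -> bool:
    """Return True when the hidden honeypot field came back with content."""
    value = form_data.get(honeypot_field, "")
    # A missing or whitespace-only value is treated as "left empty".
    return value.strip() != ""
```

A submission where `looks_like_bot(...)` returns True would be ignored, while one with the field empty or absent would be processed as normal.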
This approach also has its downsides. If the form field is hidden using an input element with type="hidden", the bot may be smart enough to determine the field is a honeypot, ignore it and pass the server-side check. If the field is hidden via CSS, and a user is browsing with CSS turned off (not as unlikely as it sounds), then the user will see the field and potentially complete it, causing their input to be ignored. For a high-traffic large organisation or government website, where the integrity of the user has to be confirmed and assured, you can't rely on a process that is likely to turn away legitimate users in edge cases like these.
Other alternatives include presenting the user with a basic maths question and asking for the correct answer. The security mechanism would require many thousands of question and answer combinations to ensure a bot doesn't encounter the same combination twice. Ultimately, bots are very clever pieces of autonomous software: if they're smart enough to probe and submit to many thousands of websites, then a basic mathematical problem is probably not a barrier to screen scrape and solve. The technique may also be a barrier for users with cognitive impairments.
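To show how little protection a basic maths question offers, here is a minimal Python sketch of what a bot might do with scraped question text. The question format (two whole numbers joined by +, - or *) and the function name are assumptions made for illustration only.

```python
import operator
import re
from typing import Optional

# Supported operators for this illustration only.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
# Matches the first "<number> <operator> <number>" pattern in the text.
QUESTION = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)")


def solve_maths_challenge(question_text: str) -> Optional[int]:
    """Return the answer to a simple scraped maths question, or None if none is found."""
    match = QUESTION.search(question_text)
    if match is None:
        return None
    left, symbol, right = match.groups()
    return OPERATIONS[symbol](int(left), int(right))
```

Varying the numbers does not help, because the bot solves the arithmetic rather than memorising answers, while the person with a cognitive impairment still has to work through every question.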
The problem is that all current CAPTCHA variants and alternatives where users are asked to select a number of images or reorientate a picture are very effective at stopping bots and also very effective at stopping people with disabilities from using digital services. At the moment it's a compromise between securing your system and preserving users' ability to access your site.
CAPTCHA and other alternative security mechanism vendors may discuss compliance against a Voluntary Product Accessibility Template (VPAT) or the Web Content Accessibility Guidelines (WCAG 2.0/2.1), but often this compliance is very limited: their product passes individual checks only in isolation.
Their CAPTCHA product may have keyboard support or text labels which pass individual accessibility checks, but when these are combined the product becomes practically unusable. Something can be technically accessible and still be totally unusable. The only verifiable way to test whether a product is accessible is to request evidence that people with a range of disabilities can pass the security mechanism 100% of the time. Anything less than this means it is very likely their CAPTCHA replacement product is not effective.
That being said, there are CAPTCHA replacement technologies that are beginning to show real promise. An emerging trend is in-browser checking, where the user is never exposed to any puzzle or challenge; in fact, no interaction takes place at all. All verification of the user takes place within the browser, and ultimately this improves the user experience for everyone.
Security and accessibility can coexist, but it means looking beyond what a technology vendor says and performing your own independent accessibility checks. The old ways of verifying a user by challenging them to prove they are human are outdated and exclusionary, and are no longer acceptable. It's a challenge, but one that the Centre for Inclusive Design has experience in, and we can advise on suitable alternatives.
This article was written by Ross Mullen on behalf of Centre for Inclusive Design.