Amazon Mechanical Turk
Published by Jeremy Douglass November 16th, 2005 in Researchers, Features, Software.

The Mechanical Turk (discussed here earlier) was an 18th-century chess-playing automaton - although behind the clockwork was actually a man hiding in a box. The public was fascinated by an automated approach to a mechanistic but incredibly complex problem (the rules of chess). While a hoax, the Turk was an effective one in that it seemed both an amazing accomplishment and an imaginable one. Computational chess was in fact possible - although it would not appear for another two centuries.
Like the 18th-century Mechanical Turk, the Amazon Mechanical Turk also hides human beings within the box of automated processes. This time, however, the nature of the ‘machine’ is an openly acknowledged secret. The Amazon web service takes tasks that are very difficult for contemporary computers (recognition of photo elements, for example) and turns them over to human players, who can make a game out of recognition while earning rewards.
The result could be described as a human intelligence brokerage, a marketplace for micro-tasks, or a harnessing of the collective intelligence that supposedly characterizes most developments of the “Web 2.0” era. Just as in the distributed computing projects of Distributed.net, the contributions of many users will make difficult problems tractable. Just as in collective editing projects like Wikipedia, the contribution is not computer-power but thought-power. Amazon’s Turk is less like Wikipedia and more like Project Gutenberg’s Distributed Proofreaders, however, in that here work is broken up into fine units and completed in a more atomic, unidirectional way, with tasks being finished - as opposed to Wikipedia, which in general has no concept of “done.” An even better comparison might be Google Answers (“Ask a question. Set your price. Get your answer”), in that it too is a marketplace for tasks. However, the focus of Google Answers is on expert knowledge, while Amazon’s HITs (Human Intelligence Tasks) are typically “extraordinarily difficult for computers, but simple for humans to answer.”
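As a concrete illustration of what a transaction in this marketplace looks like, here is a minimal sketch of posting a HIT through the boto3 MTurk client - an interface that considerably postdates the 2005 launch, used here only because it is well documented. The title, reward, and sample question are illustrative assumptions, not Amazon's examples.

```python
# A minimal sketch of posting a HIT via the boto3 MTurk client.
# All parameter values below are illustrative assumptions.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

# A hypothetical photo-recognition question in MTurk's QuestionForm XML schema.
question_xml = """<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Question>
    <QuestionIdentifier>photo1</QuestionIdentifier>
    <QuestionContent><Text>Does this photo contain a storefront?</Text></QuestionContent>
    <AnswerSpecification><FreeTextAnswer/></AnswerSpecification>
  </Question>
</QuestionForm>"""

hit = mturk.create_hit(
    Title="Identify photo elements",      # what workers see in the task list
    Description="Say whether each photo contains a storefront.",
    Reward="0.03",                        # payment per assignment, in US dollars
    MaxAssignments=3,                     # ask three workers, to cross-check answers
    LifetimeInSeconds=86400,              # the task stays listed for one day
    AssignmentDurationInSeconds=300,      # a worker gets five minutes to answer
    Question=question_xml,
)
print("HIT created:", hit["HIT"]["HITId"])
```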
Slashdot readers have already linked the development (although it isn’t clear how) to Luis von Ahn and his thesis on “Human Computation.” Von Ahn is one of the principal investigators for the CAPTCHA Project, the “Completely Automated Public Turing Test to Tell Computers and Humans Apart,” which most internet users are now familiar with in the form of distorted text at the gateways of account-creation systems such as Hotmail. While CAPTCHA decoder projects like PWNtcha have attempted to respond with improved recognition software, the Amazon Mechanical Turk puts a context and a price tag on the human intelligence barrier between an automated process and its resolution.
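To make the distorted-text barrier concrete, here is a toy sketch of CAPTCHA-style image generation in Python with Pillow. It illustrates the general jitter-and-clutter technique only - it is not the CAPTCHA Project's actual generator, and it would not survive serious recognition software.

```python
# A toy CAPTCHA generator: random letters, positional jitter, and line
# clutter to frustrate naive character segmentation. Illustrative only.
import random
import string
from PIL import Image, ImageDraw, ImageFont

def make_captcha(length=6, size=(200, 70)):
    text = "".join(random.choices(string.ascii_uppercase, k=length))
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()

    # Scatter each glyph with random jitter so characters don't align.
    for i, ch in enumerate(text):
        x = 15 + i * 28 + random.randint(-4, 4)
        y = 25 + random.randint(-8, 8)
        draw.text((x, y), ch, fill="black", font=font)

    # Add crossing lines as clutter for naive segmentation.
    for _ in range(5):
        start = (random.randint(0, size[0]), random.randint(0, size[1]))
        end = (random.randint(0, size[0]), random.randint(0, size[1]))
        draw.line([start, end], fill="gray", width=1)

    return text, img

answer, image = make_captcha()
image.save("captcha.png")  # serve the image; compare the user's reply to `answer`
```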
This is not to say that the Amazon Mechanical Turk would be an ideal way for spammers to attack CAPTCHA defense systems, as both the cost and the financial accountability of those who submit jobs are probably deterrents. However, it is a good model of how an effectively distributed massive CAPTCHA attack might work. There are already non-task-based models of decentralized communities overcoming web security - the most popular of which is BugMeNot (“Bypass Compulsory Web Registration”). We may soon look back and see the Amazon Mechanical Turk as another round in the running firefight over restricting access or providing labor to either humans or scripted agents - a battle whose front line is constantly changing.
For artists interested in obfuscated (or, we might say, “aesthetically manipulated”) digital text, it should be interesting to watch these conflicts over text-image readability play themselves out.
Tags: CAPTCHAs, forums, images, commerce, researchers
re:
Do you think that the AMT (not Alan M. Turing, but the Turkish delight) website collects data about those who participate and use the system? In other words, are users providing a service for a payment that acknowledges part of their work (their HITs) but hides the work they do for Amazon product development (it is still beta, right?), sales, and marketing? Does the site, or its developers, use this data for optimization? Amazon already seems to be built on this kind of obvious work (the reviews you write) and hidden work (your contributions to the data on book-buying habits - readers of this book also like… - in which recommendations are the sign of the data you have contributed).
But perhaps I should just ask what you mean to say with “supposedly”?
Two answers, Mark:
First, I’m sure that there will be some value to the “human subjects”-type data that is a byproduct of AMT’s operation, although I’m not sure yet how valuable that will be to Amazon. What would be fascinating is what (if any) disclosure rules govern the selling of any subject data NOT submitted as work-for-hire… selling it back to the job owner, that is.
Example: I’m a market analyst, interested in how quickly people recognize the brand of a car and whether that is influenced by car color. I submit scads of jobs for “identify which picture has a Honda” etc. and pay - but I don’t care about the work data (car IDs), because that is a red herring - I already know the answers. What I care about is reaction time - how fast each job was completed correctly - and Amazon sells me the reaction times, which I correlate against car color. The question: should AMT workers know that they are selling their reaction times in that case - even if they normally give that data away for free?
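To make that scenario concrete, here is a minimal sketch of the analyst's side of the bargain in Python: the "work" (car IDs) is discarded, and only the byproduct (completion times) is analyzed by car color. The completion times and field names are fabricated for illustration.

```python
# Sketch of the hypothetical reaction-time analysis: group completion
# times by car color and compare means. All data below is invented.
from statistics import mean

# (car_color, seconds_to_correct_answer) per completed HIT -- fabricated
completions = [
    ("red", 2.1), ("red", 1.8), ("red", 2.4),
    ("gray", 3.0), ("gray", 3.4), ("gray", 2.9),
]

by_color = {}
for color, seconds in completions:
    by_color.setdefault(color, []).append(seconds)

for color, times in by_color.items():
    print(f"{color}: mean recognition time {mean(times):.2f}s over {len(times)} HITs")
```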
Second answer: By supposedly, I was actually just referring cynically to the term “Web 2.0,” which is so amorphous that it means just about anything. Social software / collectivizing is one of a number of Web 2.0 buzzwords… AJAX is another, etc….
Google Image Labeler is another recent phenomenon in leveraging collective intelligence - this time presented as a game (like Pictionary) with partners. Pairs compete with other teams to label as many photos as possible within a time limit - if both players suggest the same tag, a photo is considered labeled.
It is a great concept to make important data a byproduct of play. Still, the gameplay mechanic means I’m much more likely to label a picture “bird” than “cardinal.” An even bigger concern is that for people intent on winning, their goal is orthogonal to the goal of the system.
When I looked at the ratings board, one of the top-rated players was using the account AlwaysSayCool. It reminds me of the Southampton team that won an iterated prisoner’s dilemma contest by programming their bots to recognize each other and collude. There are many ways that “are we thinking the same thing?” is a different question from “are we seeing the same picture?” - it would be interesting to find out what kinds of data artifacts get pulled out of the Image Labeler result set.
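For concreteness, here is a minimal sketch of the agreement mechanic described above - an image is “labeled” only when both partners independently suggest the same word. The function name and the details of the taboo-list handling are my own assumptions about the general design, not Google's implementation.

```python
# Sketch of the ESP Game / Image Labeler agreement mechanic.
def agreed_label(guesses_a, guesses_b, taboo=()):
    """Return the first guess both players made that isn't taboo, or None."""
    seen_b = set(guesses_b) - set(taboo)
    for guess in guesses_a:
        if guess in seen_b:
            return guess
    return None

# Generic words match easily -- which is why "bird" beats "cardinal" ...
print(agreed_label(["bird", "tree"], ["sky", "bird"]))   # -> "bird"
# ... and why a colluding pair who always say "cool" scores every round.
print(agreed_label(["cool"], ["cool"]))                  # -> "cool"
# Taboo lists (a real ESP Game feature) block the most common labels.
print(agreed_label(["bird", "cardinal"], ["bird", "red"], taboo=["bird"]))  # -> None
```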
[via Unmediated]
For more on the research of Luis von Ahn, developer of the ESP Game on which Google Image Labeler is based, see the discussion at O’Reilly Radar of his July 2006 tech talk at Google [video].
The Sheep Market by Aaron Koblin is an interesting art project that sheds some light on both Labeler and the Turk:
The project design notes are quite clever, with references to sheep-like behavior, ‘Dolly’, and other cultural significations of sheep. But most captivating of all is the section on the book Le Petit Prince:
The Sheep Market raises issues not just for the Mechanical Turk, but also for Google Image Labeler, and indeed any informatic game that implicitly demands “label this a sheep!”