The New York Times, June 14, 2010
Edited by Andy Ross
The new IBM supercomputer system Watson can understand a question posed in
natural language and respond with a precise, factual answer. This fall, the
producers of the TV quiz show Jeopardy will pit Watson against some of the
game's best former players.
To test Watson's capabilities against
humans, IBM scientists have begun holding live Jeopardy tests. By the end of
one day's testing, the human contestants were impressed, and even slightly
unnerved, by Watson. Several made references to Skynet, the computer system
in the Terminator movies that achieves consciousness and decides humanity
should be destroyed.
IBM has a knack for pitting man against machine.
In 1997, the company's supercomputer Deep Blue famously beat the grandmaster
Garry Kasparov at chess. But this time, IBM wanted a grand challenge that
would meet a real-world need.
When an IBM executive suggested taking
on Jeopardy, he was immediately pooh-poohed. Deep Blue played chess well
because the game is perfectly logical and can be reduced easily to math. But
the rules of language are much trickier. Jeopardy's witty, punning questions
are especially tricky. And winning requires finding an answer in a few seconds.
David Ferrucci, IBM senior manager for its Semantic Analysis
and Integration department, heads the Watson project. An AI researcher who
has long specialized in question-answering systems, Ferrucci chafed at the
slow progress in the field. He craved an ambitious goal that would break
new ground. Jeopardy fit the bill.
Computer scientists now use
statistics to analyze huge piles of documents, like books and news stories.
Algorithms take any subject and automatically learn what types of words are
most strongly correlated with it. In theory, this sort of statistical
computation has been possible for decades, but it was impractical. All that
changed in the last ten years. Computer power became drastically cheaper and
the amount of online text exploded.
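To make the idea of learning which words are correlated with a subject concrete, here is a minimal sketch. It is not taken from the article or from IBM: the function names, the toy documents, and the choice of pointwise mutual information (PMI) as the correlation measure are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_subject_scores(docs):
    """Score how strongly each word is associated with each subject.

    docs is a list of (subject, text) pairs. The score is pointwise mutual
    information (an assumed measure, chosen for illustration): positive when
    a word appears with a subject more often than chance would predict,
    zero when the word is spread evenly across subjects.
    """
    word_counts = Counter()
    subject_counts = Counter()
    joint_counts = Counter()
    for subject, text in docs:
        words = set(text.lower().split())  # count each word once per document
        subject_counts[subject] += 1
        for word in words:
            word_counts[word] += 1
            joint_counts[(subject, word)] += 1

    total = len(docs)
    scores = {}
    for (subject, word), joint in joint_counts.items():
        ratio = (joint * total) / (subject_counts[subject] * word_counts[word])
        scores.setdefault(subject, {})[word] = math.log(ratio)
    return scores


def top_words(scores, subject, k=3):
    """Return the k words most strongly associated with a subject."""
    ranked = sorted(scores[subject].items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


# Hypothetical toy corpus; a real system would read millions of documents.
TOY_DOCS = [
    ("chess", "the grandmaster moved the rook"),
    ("chess", "the rook took the bishop"),
    ("quiz", "the contestant answered the clue"),
    ("quiz", "the clue was a pun"),
]
```

In this toy corpus a common word such as "the" appears in every document and so scores zero for both subjects, while "rook" scores positively for chess. With only four documents the numbers mean little; the point of the paragraph above is that the same counting becomes useful once the text collection is huge and computing is cheap.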
In 2006, Ferrucci tested IBM's
most advanced system by giving it 500 questions from previous Jeopardy
shows. The results, plotted on a graph and compared with those of human
Jeopardy winners, were dismal. But Ferrucci argued that with new hardware he
could make faster progress than ever before. If his team could succeed at Jeopardy,
IBM could bring the technology to market as question-answering systems. In
2007, his bosses gave him three to five years and a team of developers.
Watson has enormous speed and memory. Ferrucci's team input millions of
documents into Watson to build up its knowledge base, including books,
reference material, any sort of dictionary, thesauri, folksonomies,
taxonomies, encyclopedias, novels, and so on.
Watson is fast enough
to try thousands of parallel ways of tackling a Jeopardy clue. Ferrucci
concluded that previous systems did not work well because no single algorithm
can simulate the human ability to parse language and facts. Instead, Watson
uses parallel algorithms to analyze a question in different ways, generating
hundreds of possible solutions, and then ranks them by assessing how likely
each one is to answer the question.
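The generate-and-rank approach described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Watson's actual design: the tiny knowledge base, the two scoring strategies, and the simple averaging of their scores are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical knowledge base: candidate answer -> short description.
KNOWLEDGE = {
    "Deep Blue": "ibm chess computer that beat kasparov in 1997",
    "Watson": "ibm question answering computer built for jeopardy",
    "Skynet": "fictional computer in the terminator movies",
}


def keyword_overlap(clue):
    """Score each candidate by the share of clue words in its description."""
    clue_words = set(clue.lower().split())
    return {
        answer: len(clue_words & set(text.split())) / len(clue_words)
        for answer, text in KNOWLEDGE.items()
    }


def year_match(clue):
    """Score 1.0 when a four-digit number in the clue appears in the description."""
    years = {w for w in clue.split() if w.isdigit() and len(w) == 4}
    return {
        answer: 1.0 if years & set(text.split()) else 0.0
        for answer, text in KNOWLEDGE.items()
    }


def rank_answers(clue, strategies):
    """Run all strategies in parallel and rank candidates by average score."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda strategy: strategy(clue), strategies))
    totals = defaultdict(float)
    for result in results:
        for answer, score in result.items():
            totals[answer] += score
    averaged = {answer: total / len(strategies) for answer, total in totals.items()}
    # Highest score first; break ties alphabetically for a stable order.
    return sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_answers(
    "This IBM computer beat Kasparov at chess in 1997",
    [keyword_overlap, year_match],
)
best_answer, best_score = ranking[0]
```

Each strategy looks at the clue in a different way, and no single one has to be right on its own. The final ranking combines their evidence, and the top score can serve as a rough confidence that the best candidate actually answers the question.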
By 2008, Watson had edged into the Jeopardy winner's cloud on the graph.
IBM executives called up Harry Friedman, the executive producer of the show,
and suggested putting Watson on the air. Friedman quickly accepted the
challenge: "Because it's IBM, we took it seriously."
The show will hold a special match pitting Watson against one or more famous winners from
the past. If the contest includes the very best players, Watson may lose.
It's pretty far up in the winner's cloud, but it's not yet at the top.
Ferrucci says his team will continue to fine-tune Watson, but improving
its performance is getting harder. "When we first started, we'd add a new
algorithm and it would improve the performance by 10 percent, 15 percent,"
he says. "Now it'll be like half a percent is a good improvement." Watson
might lose merely because of bad luck.
IBM plans to sell versions of
Watson to companies in the next year or two. Watson could help
decision-makers sift through enormous piles of written material in seconds.
Its speed and quality could make it part of rapid-fire decision-making, with
users talking to Watson to guide their thinking process.
At first, a
Watson system could cost several million dollars, because it needs to run on
a big IBM supercomputer. But within ten years an artificial brain like
Watson could run on a much cheaper server, affordable by any small firm, and
later perhaps even on a laptop.
Watson-level AI could make it easier
for citizens to get answers quickly from big bureaucracies. But critics
wonder about the wisdom of relying on AI systems in the face of complex
reality. And while service companies can save money by relying more on such
systems, customers crave the ability to talk to a real human on the phone.
A lot of human knowledge isn't represented in words alone, and a
computer won't learn that stuff just by encoding English-language texts, as
Watson does. Watson can answer only questions asking for an
objectively knowable fact. It cannot produce an answer that requires
judgment. Watson doesn't come close to replicating human wisdom.
AR I'm enthused. In 1991,
at Springer, I edited a book on IBM research in question-answering systems
(LNAI 546 on LILOG) and saw that there was real hope of cracking the natural
dialog challenge. Then, at SAP, my team's TREX engine showed how close we
were to having the hardware and the algorithms (for parallel statistical
evaluations) to make working systems for answering questions. Other IBM
projects over the years have been closing in on this goal. Now at last the
prize is almost in our grasp: first Jeopardy and then ...