Keeping Pace with the IBM Watson “win”

February 18, 2011

Now that the Jeopardy “Man vs Machine” challenge is over, let’s spend a little time evaluating what Watson did (and what it didn’t) to get a better understanding of how applications like IBM’s Watson will “Keep Pace” in the future.

So what did we witness?

Regardless of your knowledge of the underlying technology, the performance of Watson was impressive (and entertaining). Just the steps needed to answer a question in a span of approximately 3 seconds (the time to read the question) consisted of:

  • Break apart the words of the “answer” into discrete pieces
  • Analyze the pieces against its database
  • Identify relationships and possible answers
  • Calculate “degrees of confidence”
  • Make the decision to answer the question, meaning “press” the button
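The steps above can be sketched in a few lines of Python. This is only a toy illustration and not how Watson itself was built: the `KNOWLEDGE_BASE` contents, the keyword-overlap score, and the `BUZZ_THRESHOLD` value are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical toy "database": each candidate answer maps to related keywords.
KNOWLEDGE_BASE = {
    "Chicago": {"airport", "ohare", "midway", "illinois", "city"},
    "Toronto": {"airport", "pearson", "ontario", "city"},
}

# Assumed cut-off: below this confidence the player stays silent.
BUZZ_THRESHOLD = 0.5


@dataclass
class Candidate:
    answer: str
    confidence: float


def tokenize(clue):
    """Step 1: break the clue into discrete lowercase pieces."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in clue)
    return set(cleaned.split())


def score_candidates(clue):
    """Steps 2-4: compare pieces to the database and rank by confidence."""
    pieces = tokenize(clue)
    if not pieces:
        return []
    ranked = [
        # Confidence here is simply the fraction of clue words that match.
        Candidate(answer, len(pieces & keywords) / len(pieces))
        for answer, keywords in KNOWLEDGE_BASE.items()
    ]
    ranked.sort(key=lambda c: c.confidence, reverse=True)
    return ranked


def decide(clue):
    """Step 5: 'press the button' only if the best candidate is confident enough."""
    ranked = score_candidates(clue)
    if ranked and ranked[0].confidence >= BUZZ_THRESHOLD:
        return ranked[0].answer
    return None  # not confident enough, so do not buzz
```

With this toy data, `decide("Midway airport Illinois city")` buzzes in with "Chicago", while a clue that matches nothing in the database returns `None` and the player stays quiet.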

Very impressive to say the least. But should mankind be worried?

What Watson did REALLY well!

Some of the things that stood out:

1.  When it “knew” the answer, Watson could press the button instantaneously. Several articles have stated that it took Watson exactly four milliseconds, so a key problem for the human contestants was trying to beat Watson to the punch (the contestants looked very frustrated at times with this aspect).

2.  The ability to derive CORRECT answers, QUICKLY!

Ignore the man behind the curtain ..

Now if you were watching closely, subtle flaws could be identified in Watson’s performance. Some of those flaws are based on:

  • Computers have an IQ of ZERO
  • Computers still can’t “think” OR reason like human beings
  • Watson wasn’t designed to THINK. It was designed to play Jeopardy (and understand the game’s nuances)

So with that as working background, some elements to watch on the eventual rerun:

1.  Short answers. If you watch day 2 of the show, somewhere around the middle was a stretch of “answers” that were very short (i.e. a few words). It appeared as though Watson didn’t have enough information to reach an answer with high enough confidence, and this allowed the human players to catch up.

2.  Reasonability. In the first “Final Jeopardy” the category was US Cities and Watson eventually came back with an answer of “Toronto”. The response from IBM was that “Cities” was not specifically mentioned in the “Answer”, but both humans were able to tie the category and question together. Watson couldn’t.

3.  Limitations. Watson was programmed to answer questions based on the rules of “Jeopardy”. You would be hard pressed to make it do something else, OR answer questions that weren’t in its database. It is likely that Watson didn’t have enough information to associate the US cities referenced in the “answers” with World War II battles.
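The “Toronto” slip in item 2 shows why the category matters as a constraint. The sketch below is a hypothetical illustration, not IBM’s method: the `ANSWER_TYPES` table, the raw confidence numbers, and the `penalty` factor are all invented for the example. It shows how down-weighting candidates that do not fit the category can change which answer wins.

```python
# Hypothetical lookup of what kind of thing each candidate answer is.
ANSWER_TYPES = {
    "Chicago": {"us_city"},
    "Toronto": {"canadian_city"},
}


def apply_category(candidates, required_type, penalty=0.1):
    """Scale down the confidence of candidates that do not fit the category.

    `candidates` maps answer -> raw confidence; `penalty` is an assumed factor.
    """
    adjusted = {}
    for answer, confidence in candidates.items():
        fits = required_type in ANSWER_TYPES.get(answer, set())
        adjusted[answer] = confidence if fits else confidence * penalty
    return adjusted


def best(candidates):
    """Return the answer with the highest confidence."""
    return max(candidates, key=candidates.get)


# Invented raw scores in which the wrong city narrowly leads.
raw_scores = {"Toronto": 0.30, "Chicago": 0.25}
with_category = apply_category(raw_scores, "us_city")
```

Here `best(raw_scores)` picks "Toronto", but `best(with_category)` picks "Chicago" once the US Cities category is taken into account, which is the connection both human players made.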

We can rebuild him ... Better, Stronger, Faster ...

The beauty of the Watson demonstration is that the technology is at an interesting point where this approach may be applied to tasks that are truly data intensive. It can evolve to be yet another tool that people use to get their jobs done. It will be VERY interesting to see what IBM does next with Watson.

But as we know, technology marches on. Right after the Jeopardy challenge, the following Wall Street Journal blog post discussed something that should be BETTER than Watson and is only 4–5 years away!

http://blogs.wsj.com/digits/2011/02/16/computer-scientist-racr-will-eclipse-watson/

To “Keep Pace” with future technologies


Comments (1)


  1. JEOPARDY!, as you may know, is one of the most popular game shows on television and is celebrating its 25th anniversary this season. During my time here at SONY, I have had the good fortune to work on several games based on JEOPARDY!, including multiple versions for PCs and mobile phones, as well as skill-based and ITV versions.
