Thursday, February 10, 2011

Knowledge Discovery: Questions & Answers, Clues & Response

Question-and-answer startups are popping up all over Silicon Valley. One in particular, Quora, has been getting a lot of press recently, and Quora enthusiasts abound. High-quality Q&A sites can support a great deal of community building and knowledge discovery, and Quora is one of them. As Quora builds up its collection of Q&As and expands its topics, it should appeal to a more general audience than the "tech-heavy" persona it now presents.

But the one Q&A (or, more accurately, Clues and Responses) that I am looking forward to is the IBM Watson challenge: an official Jeopardy tournament in which Ken Jennings and Brad Rutter, those famous Jeopardy champions, play against IBM's supercomputer Watson. A brief article, "Building Watson: An Overview of the DeepQA Project" (AAAI, Fall 2010), can easily be found online.

IBM says that Watson is an application of advanced natural language processing, information retrieval, knowledge representation and reasoning, and machine learning technologies to the field of open domain question answering. Watson took three years of intense R&D with a core team of 20 or so people.

At its core, Watson is built on IBM's DeepQA technology for hypothesis generation, massive evidence gathering, analysis, and scoring. Watson is a workload-optimized system designed for complex analytics. It integrates massively parallel POWER7 processors with the IBM DeepQA software so that it can answer Jeopardy! questions in under three seconds.
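To make the hypothesis-generation, evidence-gathering, and scoring stages a little more concrete, here is a toy sketch in Python. It is not IBM's DeepQA code. The two-entry corpus, the function names (`generate_hypotheses`, `gather_evidence`, `score`, `answer`), and the word-overlap scoring rule are all assumptions I made up for illustration; the real system is vastly more sophisticated.

```python
# Toy illustration only: this is NOT IBM's DeepQA software.
# The corpus, function names, and scoring rule are invented for this sketch.
import re

# Hypothetical mini knowledge base: candidate answer -> supporting passages.
CORPUS = {
    "Deep Blue": ["Deep Blue is the IBM chess computer that beat Garry Kasparov in 1997"],
    "Watson": ["Watson is an IBM computer built to play Jeopardy"],
}

def tokens(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def generate_hypotheses(clue):
    """Hypothesis generation: here, every corpus entry is a candidate."""
    return list(CORPUS)

def gather_evidence(candidate):
    """Evidence gathering: look up the passages that mention the candidate."""
    return CORPUS[candidate]

def score(clue, passages):
    """Scoring: count the words shared between the clue and each passage."""
    clue_words = tokens(clue)
    return sum(len(clue_words & tokens(p)) for p in passages)

def answer(clue):
    """Rank all candidates and return the best one with a crude confidence."""
    scored = {c: score(clue, gather_evidence(c)) for c in generate_hypotheses(clue)}
    total = sum(scored.values()) or 1  # avoid dividing by zero
    best = max(scored, key=scored.get)
    return best, scored[best] / total

if __name__ == "__main__":
    print(answer("This IBM chess computer beat Garry Kasparov"))
```

The point of the sketch is only the shape of the pipeline: many candidates are generated, each is checked against evidence, and the scores are turned into a confidence that decides whether it is worth buzzing in.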

Watson is made up of a cluster of ninety IBM Power 750 servers (plus additional I/O, network and cluster controller nodes, in 10 racks) with a total of 2880 POWER7 processor cores and 16 terabytes of RAM. The servers use 3.5 GHz POWER7 eight-core processors, with four threads per core; the totals work out to 32 cores per server, or four eight-core processors each. The POWER7 processor's massively parallel processing capability is an ideal match for Watson's IBM DeepQA software, which is embarrassingly parallel (that is, a workload that splits easily into many independent tasks that can run at the same time with little coordination between them).
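The hardware numbers and the "embarrassingly parallel" idea can be checked with a short Python sketch. The arithmetic uses only the figures quoted above. The `score_candidate` function and the candidate list are hypothetical stand-ins, and a thread pool on one machine is just a small-scale analogy for spreading independent tasks across many cores, not a description of how Watson schedules its work.

```python
from concurrent.futures import ThreadPoolExecutor

# Figures quoted in the post.
SERVERS = 90
TOTAL_CORES = 2880
CORES_PER_PROCESSOR = 8
THREADS_PER_CORE = 4

cores_per_server = TOTAL_CORES // SERVERS                        # 32
processors_per_server = cores_per_server // CORES_PER_PROCESSOR  # 4
hardware_threads = TOTAL_CORES * THREADS_PER_CORE                # 11520

def score_candidate(candidate):
    """Hypothetical stand-in for scoring one candidate answer.

    Each call depends only on its own input, which is what makes
    the workload embarrassingly parallel.
    """
    return candidate, len(candidate)

def score_all(candidates, workers=4):
    """Score every candidate independently using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(score_candidate, candidates))

if __name__ == "__main__":
    print(cores_per_server, processors_per_server, hardware_threads)
    print(score_all(["Deep Blue", "Watson"]))
```

Because no task needs to wait on another, adding more cores lets more candidates be scored at once, which is why this kind of software pairs so naturally with a machine that has thousands of cores.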

While primarily an IBM effort, the development team includes faculty and students from Carnegie Mellon University, University of Massachusetts, University of Southern California/Information Sciences Institute, University of Texas, Massachusetts Institute of Technology, University of Trento, and Rensselaer Polytechnic Institute.

If you are even more curious about this topic, do what I did and download the eBook "Final Jeopardy: Man vs. Machine and the Quest to Know Everything" from Amazon; the final chapter will be downloaded to your Kindle after the Jeopardy matches are concluded. Or, if you would rather wait until the outcome of the match is known, you can wait for the print version to arrive at your local bookseller.

In shades of the Deep Blue vs. Garry Kasparov chess match of an earlier era, the Jeopardy episodes will air on TV from February 14–16, 2011.

Question answering technologies have a business purpose: they can help support professionals in healthcare, customer service and support, business intelligence, knowledge discovery, and the like.

Disclosure: I'm a Jeopardy fan - and I'm looking for a good Jeopardy game for my game console - the ones that are out there are insipid. Does anyone have a recommendation?
