Reddit blog readers requested someone who worked on Watson over in IAMA, and the IBM Watson Research team stepped in to answer questions about Watson, how it was developed and how IBM plans to use it in the future. Below are answers to the top 10 questions, along with some bonus common ones as well.
Thanks for taking the time to answer, Watson Team!
Accelerating the innovation process -- making it easy to combine, weigh, evaluate and evolve many different independently developed algorithms that analyse language from different perspectives.
Watson is a leap in computers' ability to understand natural language, which will help people find the answers they need in the vast amounts of information they deal with every day.
Think of Watson as a technology that will enable people to have the exact information they need at their fingertips.
Step One: Parses the sentence to get some logical structure describing the answer. X is the answer: antagonist(X); antagonist_of(X, Stevenson's Treasure Island); modifies_possesive(Stevenson, Treasure Island); modifies(Treasure, Island).
Step Two: Generates semantic assumptions: island(Treasure Island); location(Treasure Island); resort(Treasure Island); book(Treasure Island); movie(Treasure Island); person(Stevenson); organisation(Stevenson); company(Stevenson); author(Stevenson); director(Stevenson); person(antagonist); person(X).
Step Three: Builds different semantic queries based on phrases, keywords and semantic assumptions.
Step Four: Generates hundreds of candidate answers based on the passages, documents and facts returned from Step Three. Hopefully Long-John Silver is one of them.
Step Five: For each answer, formulates new searches to find evidence in support or refutation of the answer, and scores the evidence. Positive examples: "Long-John Silver, the main character in Treasure Island..."; "The antagonist in Treasure Island is Long-John Silver"; "Treasure Island, by Stevenson, was a great book"; "One of the great antagonists of all time was Long-John Silver"; "Robert Louis Stevenson's book, Treasure Island, features many great characters, the greatest of which was Long-John Silver."
Step Six: Generates, gets evidence for and scores new assumptions. Positive examples (negative examples would support other characters, people, books, etc. associated with any Stevenson, Treasure or Island): Stevenson = Robert Louis Stevenson; 'by Stevenson' --> Stevenson's; main character --> antagonist.
Step Seven: Combines all the evidence and scores. Based on analysis of the evidence for all possible answers, computes a final confidence and links back to the evidence. Watson's correctness will depend on the evidence collection, analysis and scoring algorithms, and on the machine learning used to weight and combine the scores.
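To make Step Seven a little more concrete, here is a minimal Python sketch of combining several evidence scores into one confidence per candidate answer. Everything in it is an assumption made for illustration: the scorer names, the weights, the bias and the logistic formula are hypothetical stand-ins, not Watson's actual features or learned model.

```python
import math

# Hypothetical weights. In the description above these would be learned
# by machine learning; here they are simply made up for the example.
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5, "popularity": 0.5}
BIAS = -2.0


def confidence(features):
    """Combine per-scorer evidence scores (each 0..1) with a logistic model."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates):
    """Return (answer, confidence) pairs, most confident first."""
    scored = [(answer, confidence(feats)) for answer, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up evidence scores for three candidate answers to the clue.
candidates = {
    "Long-John Silver": {"passage_support": 0.9, "type_match": 1.0, "popularity": 0.6},
    "Jim Hawkins": {"passage_support": 0.4, "type_match": 1.0, "popularity": 0.5},
    "Treasure Island": {"passage_support": 0.7, "type_match": 0.0, "popularity": 0.8},
}

ranking = rank(candidates)
for answer, conf in ranking:
    print(f"{answer}: {conf:.2f}")
```

In this toy setup "Treasure Island" has decent passage support but fails the assumed person-type check, so it ranks below both characters.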
What is Watson's strategy for seeking out Daily Doubles, and how did it compute how much to wager on the Daily Doubles and the final clue?
Watson's strategy for seeking out Daily Doubles is the same as the human players' -- Watson hunts around the part of the grid where they typically occur.
In order to compute how much to wager, Watson uses input like its general confidence, the current state of the game (how much ahead or behind), its confidence in the category and prior clues, what is at risk and known human betting behaviours.
We ran Watson through many, many simulations to learn the optimal bet for increasing chances of winning.
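The idea of learning a bet from simulations can be sketched in a few lines of Python. This is only a toy model of a final-clue wager under loud assumptions: the scores, the confidence figures and the opponent's "bet everything" behaviour are all hypothetical, and Watson's real simulations used far richer inputs than this.

```python
import random


def win_rate(wager, my_score, opp_score, my_conf, opp_acc, trials=20000, seed=0):
    """Estimate the chance of winning outright with a given wager."""
    # The same seed is used for every wager, so all wagers are compared
    # on identical simulated games (common random numbers).
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        i_am_right = rng.random() < my_conf
        opp_is_right = rng.random() < opp_acc
        mine = my_score + wager if i_am_right else my_score - wager
        # Assumed human betting behaviour: the trailing player bets everything.
        theirs = 2 * opp_score if opp_is_right else 0
        wins += mine > theirs
    return wins / trials


def best_wager(my_score, opp_score, my_conf, opp_acc, step=1000):
    """Try wagers on a coarse grid and keep the one with the best win rate."""
    wagers = range(0, my_score + 1, step)
    return max(wagers, key=lambda w: win_rate(w, my_score, opp_score, my_conf, opp_acc))


bet = best_wager(20000, 12000, my_conf=0.7, opp_acc=0.5)
print(bet)
```

With these made-up numbers the search settles on the smallest wager on the grid that still beats the opponent's doubled score after a correct response, since betting more adds risk without adding wins.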
It seems like Watson had an unfair advantage with the buzzer. How did Jeopardy! and IBM try to level the playing field?
Jeopardy! and IBM tried to ensure that both humans and machines had equivalent interfaces to the game. For example, they both had to press down on the same physical buzzer. IBM had to develop a mechanical device that grips and physically pushes the button. Any given player, however, has different strengths and weaknesses relative to his/her/its competitors.
Ken had a fast hand relative to his competitors and dominated many games because he had the right combination of language understanding, knowledge, confidence, strategy and speed. Everyone knows you need ALL these elements to be a Jeopardy! champion. Both machine and human got the same clues at the same time -- they read differently, they think differently, they play differently, they buzz differently but no player had an unfair advantage over the other in terms of how they interfaced with the game. If anything the human players could hear the clue being read and could anticipate when the buzzer would enable.
This allowed them to buzz in almost instantly and considerably faster than Watson's fastest buzz. By timing the buzz just right like this, humans could beat Watson's fastest reaction. At the same time, one of Watson's strengths was its consistently fast buzz -- only effective, of course, if it could understand the question in time, compute the answer and confidence, and decide to buzz in before it was too late. The clues are in English -- Brad and Ken's native language; not Watson's.
Watson analyses the clue in natural language to understand what the clue is asking for. Once it has done that, it must sift through the equivalent of one million books to calculate an accurate response in 2-3 seconds and determine if it's confident enough to buzz in, because in Jeopardy! you lose money if you buzz in and respond incorrectly. This is a huge challenge, especially because humans tend to know what they know and know what they don't know. Watson has to do thousands of calculations before it knows what it knows and what it doesn't.
The calculating of confidence based on evidence is a new technological capability that is going to be very significant in helping people in business and their personal lives, as it means a computer will be able to not only provide humans with suggested answers, but also provide an explanation of where the answers came from and why they seem correct.
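One simple way to read the buzz decision is as an expected-value check: a correct response wins the clue's value and an incorrect one loses it. The Python sketch below works under that assumption only. The function names and the 0.5 threshold are hypothetical, and this is not presented as Watson's actual buzz rule.

```python
def expected_gain(confidence, clue_value):
    """Expected change in score from buzzing in with the given confidence."""
    return confidence * clue_value - (1 - confidence) * clue_value


def should_buzz(confidence, threshold=0.5):
    """Buzz only when the confidence clears the (assumed) threshold."""
    return confidence > threshold


for conf in (0.3, 0.5, 0.8):
    print(conf, should_buzz(conf), round(expected_gain(conf, 1000)))
```

Under this reading, a confidence of exactly 0.5 is the break-even point, which is why the sketch uses it as the default threshold.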
Watson is powered by 10 racks of IBM Power 750 servers running Linux, uses 15 terabytes of RAM and 2,880 processor cores, and is capable of operating at 80 teraflops. Watson was written mostly in Java, but significant chunks of code are written in C++ and Prolog; all components are deployed and integrated using UIMA.
Watson contains state-of-the-art parallel processing capabilities that allow it to run multiple hypotheses -- around one million calculations -- at the same time. Watson is running on 2,880 processor cores simultaneously, while your laptop likely contains four cores, of which perhaps two are used concurrently. Processing natural language is scientifically very difficult because there are many different ways the same information can be expressed.
That means that Watson has to look at the data from scores of perspectives and combine and contrast the results. The parallel processing power provided by IBM Power 750 systems allows Watson to do thousands of analytical tasks simultaneously to come up with the best answer in under three seconds.
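The fan-out-and-combine pattern described above can be sketched with Python's standard `concurrent.futures` module. This is an illustration of the structure only: the `keyword_overlap` scorer is a hypothetical stand-in for Watson's many analytics, and Python threads here show how the work is split up rather than delivering the kind of speed-up that 2,880 cores provide.

```python
from concurrent.futures import ThreadPoolExecutor


def keyword_overlap(clue, passage):
    """Fraction of clue words that also appear in the passage."""
    clue_words = set(clue.lower().split())
    passage_words = set(passage.lower().split())
    return len(clue_words & passage_words) / len(clue_words)


def best_passage(clue, passages, workers=4):
    """Score every passage concurrently, then keep the highest-scoring one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda p: keyword_overlap(clue, p), passages))
    return max(zip(scores, passages))


clue = "antagonist of Stevenson's Treasure Island"
passages = [
    "The antagonist in Treasure Island is Long-John Silver",
    "Treasure Island, by Stevenson was a great book",
    "Paris is the capital of France",
]

score, passage = best_passage(clue, passages)
print(score, passage)
```

A real system would run many different scorers per passage and feed all of their outputs into the final confidence, rather than picking a winner from a single score.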
We are pleased with Watson's performance on Jeopardy! While Watson did at times provide the wrong response to a clue, such as its Toronto response, its performance is still a giant leap in a computer's understanding of natural human language: it understood what the Jeopardy! clue was asking for and gave the correct response the majority of the time.
We envision Watson-like cloud services being offered by companies to consumers, and we are working to create a cloud version of Watson's natural language processing. However, IBM is focused on creating technologies that help businesses make sense of data in order to enable companies to provide the best service to the consumer.
So, we are first focused on providing this technology to companies so that those companies can then provide improved services to consumers. The first industry we will provide the Watson technology to is the healthcare industry, to help physicians improve patient care. Consider these numbers:
- Primary care physicians spend an average of only 10.7 - 18.7 minutes face-to-face with each patient per visit.
- Approximately 81% of physicians average 5 hours or less per month -- or just over an hour a week -- reading medical journals.
- An estimated 15% of diagnoses are inaccurate or incomplete.
In today's healthcare environment, where physicians are often working with limited information and little time, the results can be fragmented care and errors that raise costs and threaten quality. What doctors need is an assistant who can quickly read and understand massive amounts of information and then provide useful suggestions. In terms of other applications we're exploring, here are a few examples of how Watson might some day be used:
- Watson technology offered through energy companies could teach us about our own energy consumption. People querying Watson on how they might improve their energy management would draw on extensive knowledge of detailed smart meter data, weather and historical information.
- Watson technology offered through insurance companies would allow us to get the best recommendations from insurance agents and help us understand our policies more easily. For our questions about insurance coverage, the question answering system would access the text for that person's actual policy, the other policies that they might have purchased, and any exclusions, endorsements, and riders.
- Watson technology offered through travel agents would more easily allow us to plan our vacations based on our interests, budget, desired temperature, and more. Instead of having to do lots of searching, Watson-like technology could help us quickly get the answers we need among all of the information that is out there on the Internet about hotels, destinations, events, typical weather, etc, to plan our travel faster.
I am sure that you distilled down whatever source materials you were using into something quick to query, but I noticed that on some of the possible answers Watson had, it looked like you weren't sanitizing your sources too much; for example, some words were in all caps, or phrases included extraneous and unrelated bits.
Did such inconsistencies not cause you any problems? Couldn't Watson trip up an answer as a result?
Some of the source data was very messy and we did several things to clean it up. It was relatively rare, less than 1% of the time, that this issue overtly surfaced in a confident answer. Evidentiary passages might have been weighed differently if they were cleaner, however. We did not measure how much messy data affected evidence assessment.
In the training/testing materials I saw, it seemed to be limited to 'What is--' regardless of what is being talked about ('What is Shakespeare?'), which made me think that words were only words and Watson had no way of telling if a word was a person, place, or thing.
Then in the Jeopardy challenge, there was plenty of 'Who is--.' Was there a last-minute change to enable this, or was it there all along and I just never happened to catch it? I think that would help me understand the way that Watson stores and relates data.
Watson does distinguish between people, things, dates, events, etc., certainly for answering questions. It does not do it perfectly, of course; there are many ambiguous cases that it struggles to resolve. When formulating a response, however, since 'What is....' was acceptable regardless, early on in the project we did not make the effort to classify the answer for the response.
Later in the project, we brought in more of the algorithms used in determining the answer to help formulate a more accurate response phrase. So yes, there was a change in that we applied those algorithms, or the results thereof, to formulate the 'who'/'what' response.
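As a rough illustration of that last step, the Python sketch below picks the response phrase from an answer type. The type labels and the `response_phrase` function are hypothetical, and it assumes an earlier stage has already classified the answer.

```python
def response_phrase(answer, answer_type):
    """Phrase the response as 'Who is' for people and 'What is' otherwise."""
    lead = "Who is" if answer_type == "person" else "What is"
    return f"{lead} {answer}?"


print(response_phrase("Shakespeare", "person"))
print(response_phrase("Toronto", "place"))
```

Falling back to 'What is' for every non-person type mirrors the point above that 'What is' was acceptable regardless.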
We don't assign grand challenges; grand challenges arrive based on our scientists' insights and inspiration. One of the great things about working for IBM Research is that we have so much talent that we have ambitious projects going on in a wide variety of areas today. For example:
- We are working to make computing systems 1,000 times more powerful than they are today, moving from the petascale to the exascale.
- We are working to make nanoelectronic devices 1,000 times smaller than they are today, moving us from an era of nanodevices to nanosystems. One of those systems we are working on is a DNA transistor, which could decode a human genome for under $1000, to help enable personalised medicine to become reality.
- We are working on technologies that move from an era of wireless connectivity -- which we all enjoy today -- to the Internet of Things and people, where all sorts of unexpected things can be connected to the Internet.
If you give him traditional questions, i.e. not phrased in the form they are on Jeopardy, how well will he perform? How tailored is he to those questions, and how easy would it be to change that? Would it be unfeasible to hook him up to a website and let people run queries?
At this point, all Watson can do is play Jeopardy and provide responses in the Jeopardy format. However, we are collaborating with Nuance, Columbia University Medical Center and the University of Maryland School of Medicine to apply Watson technology to healthcare. You can read more about that here: http://www-03.ibm.com/press/us/en/pressrelease/33726.wss
After seeing the description of how Watson works, I found myself wondering whether what it does is really natural language processing, or something more akin to word association.
In the time it takes a human to even know they are hearing something (about .2 seconds) Watson has already read the question and done several million computations. It's got a huge head start. Do you agree or disagree with that assessment?
The clues are in English -- Brad and Ken's native language; not Watson's. Watson must calculate its response in 2-3 seconds and determine if it's confident enough to buzz in, because as you know, you lose money if you buzz in and respond incorrectly.
This is a huge challenge, especially because humans tend to know what they know and know what they don't know. Watson has to do thousands of calculations before it knows what it knows and what it doesn't.
The calculating of confidence based on evidence is a new technological capability that is going to be very significant in helping people in business and their personal lives, as it means a computer will be able to not only provide humans with suggested answers, but also provide an explanation of where the answers came from and why they seem correct. This will further human ability to make decisions.