Making Watson Fast
Daniel Brown
HON111
• Need for Watson to be fast to play Jeopardy successfully
  – All computations have to be done in a few seconds
  – Initial application speed: 1-2 hours processing time per question
• Unstructured Information Management Architecture (UIMA): framework for NLP applications; facilitates parallel processing
  – UIMA-AS: Asynchronous Scaleout
• UIMA chosen at start for these reasons; other optimization work only began after 2 years (after QA accuracy/confidence improved)
UIMA implementation of DeepQA
• Type System
• Common Analysis Structure (CAS)
• Annotator
  – CAS multiplier (CM): creates new “children” CASes
• Flow Controller
• CASes can be spread across multiple systems (processed in parallel) for efficiency
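The components listed on this slide can be illustrated with a minimal Python sketch. This is not the UIMA API: the `Cas` class, the `cas_multiplier`, `annotator`, and `flow_controller` functions, and the rule of splitting text on `;` are all hypothetical stand-ins, chosen only to show a parent CAS being multiplied into children that are then annotated in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Cas:
    """Hypothetical stand-in for a CAS: some text plus annotations added so far."""
    text: str
    annotations: dict = field(default_factory=dict)


def cas_multiplier(parent: Cas) -> list[Cas]:
    """Create one child CAS per piece of the parent (toy rule: split on ';')."""
    return [Cas(text=part.strip()) for part in parent.text.split(";") if part.strip()]


def annotator(cas: Cas) -> Cas:
    """Toy annotator: record the number of whitespace-separated tokens."""
    cas.annotations["token_count"] = len(cas.text.split())
    return cas


def flow_controller(parent: Cas, workers: int = 4) -> list[Cas]:
    """Send the parent through the multiplier, then annotate children in parallel."""
    children = cas_multiplier(parent)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input children.
        return list(pool.map(annotator, children))


if __name__ == "__main__":
    for child in flow_controller(Cas("first candidate; second longer candidate; third")):
        print(child.text, child.annotations)
```

Because each child carries its own data, the children can be handed to separate workers (threads here, separate machines in a real deployment) without coordinating with each other.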
Scaling out
• Two systems:
  – Development (+question processing)
    • Meant to analyze many questions accurately
  – Production (+speed)
    • Meant to answer one question quickly
Scaling out: UIMA-AS
• (UIMA-AS: Asynchronous Scaleout)
  – Manages multithreading, communication between processes necessary for parallel processing
• Feasibility test: simulated production system with 110 processes, 110 8-core machines
  – Goal: less than 3 seconds; actual: more than 3 seconds
  – Two sources of latency: CAS serialization, network communication
  – Optimizing CAS serialization resulted in runtime of <1s
Scaling out: Deployment
• 400 processes, 72 machines
• How to find time bottlenecks in such a system?
  – Monitoring tool
  – Integrated timing measurements (in flow controller component)
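One way to picture timing measurements built into a flow controller is the sketch below. It is an illustration only, not Watson's monitoring code: the `TimedFlow` class and its step names are hypothetical, and the idea shown is simply that the component dispatching each step is also a convenient place to record how long that step took.

```python
import time
from collections import defaultdict


class TimedFlow:
    """Hypothetical flow controller that times every step it dispatches."""

    def __init__(self, steps):
        self.steps = steps                  # list of (name, callable) pairs
        self.totals = defaultdict(float)    # step name -> cumulative seconds
        self.calls = defaultdict(int)       # step name -> number of invocations

    def run(self, item):
        """Pass the item through every step in order, timing each one."""
        for name, step in self.steps:
            start = time.perf_counter()
            item = step(item)
            self.totals[name] += time.perf_counter() - start
            self.calls[name] += 1
        return item

    def slowest(self):
        """Return step names sorted by cumulative time, slowest first."""
        return sorted(self.totals, key=self.totals.get, reverse=True)


def slow_double(x):
    time.sleep(0.05)  # simulate an expensive step
    return x * 2


if __name__ == "__main__":
    flow = TimedFlow([("increment", lambda x: x + 1), ("double", slow_double)])
    flow.run(1)
    print(flow.slowest(), dict(flow.totals))
```

Sorting by cumulative time points at the step worth optimizing first; dividing `totals` by `calls` would give a per-call average instead.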
RAM Optimizations
• Wanted to avoid disk read/write time delays, so all (production system) data was put into RAM
• Some optimizations:
  – Reference size reduction
  – Java object size reduction
  – Java object overhead
  – String size
  – Special hash tables
  – Java garbage collection with large heap sizes
    • *Full GC between games
Indri Search Optimizations
• Indri search: used to find most relevant 1-2 sentences from Watson database
• Using single processor, primary search takes too long (i.e. 100s)
  – Supporting evidence search even longer
• Solution?
  – Divide corpus (body of information to search) into chunks, then assign each search daemon a chunk
  – (specifically, 50GB corpus of 6.8 million documents, 79 chunks of roughly 100,000 documents each, 79 Indri search daemons with 8 CPU cores each; end result, 32 passage queries could be run at once)
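The chunk-per-daemon idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not Indri: the term-count scorer, the function names `split_corpus`, `search_chunk`, and `sharded_search`, and the tiny chunk size are all hypothetical. What it shows is the pattern of searching every chunk independently and then merging the per-chunk results into one global top-k.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_corpus(docs, chunk_size):
    """Split a list of documents into consecutive chunks of at most chunk_size."""
    return [docs[i:i + chunk_size] for i in range(0, len(docs), chunk_size)]


def search_chunk(chunk, query_terms, k):
    """Toy scorer: count query-term occurrences; return the chunk's top-k hits."""
    scored = []
    for doc in chunk:
        words = doc.lower().split()
        score = sum(words.count(term) for term in query_terms)
        if score > 0:
            scored.append((score, doc))
    return heapq.nlargest(k, scored)


def sharded_search(docs, query, k=2, chunk_size=2):
    """Search every chunk in parallel, then merge into a global top-k."""
    terms = query.lower().split()
    chunks = split_corpus(docs, chunk_size)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        partials = pool.map(lambda chunk: search_chunk(chunk, terms, k), chunks)
        merged = heapq.nlargest(k, (hit for part in partials for hit in part))
    return [doc for _, doc in merged]


if __name__ == "__main__":
    corpus = [
        "watson plays jeopardy",
        "indri search engine",
        "watson watson fast",
        "hadoop cluster",
        "fast search with indri search",
    ]
    print(sharded_search(corpus, "watson"))
```

Each chunk only needs to return its own top-k, because no document outside a chunk's top-k can appear in the global top-k. That keeps the merge step small no matter how large the corpus is.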
Preprocessing and Custom Content Services
• Watson must first analyze the passage texts before being able to use them
  – Deep NLP analysis: semantic/structural parsing, etc.
• Since Watson had to be self-contained, its corpus was fixed in advance, so this analysis could be done before run time (preprocessed)
  – Used Hadoop (distributed storage and processing framework)
  – 50 machines, 16GB RAM and 8 cores each
Preprocessing and Custom Content Services
• Retrieving the preprocessed data?
  – Preprocessed data much larger than unprocessed corpus (~300GB total)
  – Built custom content server; allocated data to 14 machines, ~20GB each
  – Documents then were accessed from these servers
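The slide does not say how documents were assigned to the 14 machines. As one plausible illustration, the sketch below assumes hash partitioning on a document ID; the `server_for` function and `ContentCluster` class are hypothetical, and the "servers" are plain in-memory dictionaries. The numbers are at least consistent with the slide: 300GB spread over 14 machines is about 21GB each.

```python
import hashlib

NUM_SERVERS = 14  # from the slide; everything else here is illustrative


def server_for(doc_id: str, num_servers: int = NUM_SERVERS) -> int:
    """Map a document ID to a server index using a stable hash (an assumption)."""
    digest = hashlib.sha1(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


class ContentCluster:
    """Hypothetical in-memory stand-in for a set of content servers."""

    def __init__(self, num_servers: int = NUM_SERVERS):
        self.shards = [{} for _ in range(num_servers)]

    def put(self, doc_id: str, analysis: dict) -> None:
        """Store preprocessed analysis on the shard that owns this document."""
        self.shards[server_for(doc_id, len(self.shards))][doc_id] = analysis

    def get(self, doc_id: str) -> dict:
        """Fetch preprocessed analysis from the shard that owns this document."""
        return self.shards[server_for(doc_id, len(self.shards))][doc_id]


if __name__ == "__main__":
    cluster = ContentCluster()
    cluster.put("doc-1", {"parse": "(S (NP Watson) (VP answers))"})
    print(server_for("doc-1"), cluster.get("doc-1"))
```

A stable hash means any client can compute which server holds a document without consulting a lookup table, so a retrieval costs a single request.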
End result
• Parallel processing combined with a number of other performance optimizations resulted in a final average latency of less than 3 seconds.
  – No one “silver bullet” solution