Lambda Architecture The Hive
![Page 1: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/1.jpg)
Lambda Architecture
Use Case: Mayo Clinic
FEBRUARY 2015
Altan Khendup – Leader, UDA Architecture COE
![Page 2: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/2.jpg)
Background of Lambda Architecture
Background
– Reference architecture for Big Data systems
– Designed by Nathan Marz (Twitter)
– Defined as a system that runs arbitrary functions on arbitrary data
– “query = function(all data)”
Design Principles
– Human fault-tolerance, immutability, computability
Lambda Layers
– Batch - Contains the immutable, constantly growing master dataset.
– Speed - Deals only with new data and compensates for the high-latency updates of the serving layer.
– Serving - Loads and exposes the combined view of the data so that it can be queried.
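The layering above can be illustrated with a minimal Python sketch. It is not taken from the presentation: the event shape, the term-count query, and the names `batch_view`, `speed_view`, and `query` are assumptions chosen for illustration. The batch layer recomputes a view from the whole master dataset, the speed layer covers only events that arrived after the last batch run, and the query merges the two.

```python
from collections import Counter

# Hypothetical immutable master dataset: append-only (timestamp, term) events.
master_dataset = [
    (1, "polyp"), (2, "colon"), (3, "polyp"),
]
# Hypothetical events that arrived after the last batch run.
recent_events = [
    (4, "polyp"), (5, "biopsy"),
]

def batch_view(events):
    """Batch layer: recompute the view from all data (query = function(all data))."""
    return Counter(term for _, term in events)

def speed_view(events):
    """Speed layer: the same function, applied only to data the batch view has not seen."""
    return Counter(term for _, term in events)

def query(term, batch, speed):
    """Serving side: merge the precomputed batch view with the recent view."""
    return batch[term] + speed[term]

batch = batch_view(master_dataset)
speed = speed_view(recent_events)
print(query("polyp", batch, speed))  # 3
```

Once the next batch run absorbs the recent events into the master dataset, the speed view for them can be discarded. In this sketch that would amount to appending `recent_events` to `master_dataset` and rebuilding `batch`.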
![Page 3: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/3.jpg)
Overview of Lambda Architecture
![Page 4: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/4.jpg)
© 2014 Teradata
USE CASE – MAYO CLINIC
![Page 5: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/5.jpg)
Mayo Clinic History
Every year, more than a million people from all 50 states and nearly 150 countries come for care
Dozens of locations in several states with major campuses in Rochester, Minn.; Scottsdale and Phoenix, Ariz.; and Jacksonville, Fla.
Mayo Clinic Rochester, Minn. recognized as the top hospital in the nation for 2014-2015 by U.S. News & World Report
![Page 6: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/6.jpg)
Why Big Data?
Challenges in Medical Data
Health data tends to be “wide”, not “deep”
New data types are becoming more important
Unstructured
Real-time streaming
Moving from retrospective “BI” viewing to event-based and predictive analytics usage is a general challenge
Multiple layers
Lots of events, data
Complex
Lots of different languages and data structures
Difficult to maintain
Lots of moving pieces/components/technologies
Lots of changes in the business
![Page 7: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/7.jpg)
Data Discovery
Many “Big Data” stories start with data discovery
The Data Lake, etc.
But data discovery is not predictable!
Mayo Clinic needed to define a real operational need that a “Big Data” technology stack could fulfill
![Page 8: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/8.jpg)
Project
Optimize an existing Natural Language Processing pipeline in support of critical Colorectal Surgery (move to tens of thousands of documents processed)
Replace an existing free-text search facility used by Clinical Web Service for colorectal cancer (move search to milliseconds)
![Page 9: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/9.jpg)
Overall Architecture
![Page 10: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/10.jpg)
Operational Statistics
• Current Storm throughput up to 1.5 million documents per hour
• Average of 140,000 HL7 messages actually processed per day with average latency of 60 milliseconds from ingest to persistence
• Average of 50,000 documents passed through annotators per day versus 5,000 historically
• Actual annotations of documents up to 6 times faster than previously accomplished
• Free-text search use cases that took over 30 minutes on old infrastructure completing in milliseconds in ElasticSearch
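To put the figures on this slide on a common scale, the short Python sketch below converts them to per-second rates and a volume ratio. Only the four input numbers come from the slide. The variable names are illustrative, and the calculation assumes load spread evenly over a 24-hour day, which the presentation does not state.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Figures quoted on the slide.
peak_docs_per_hour = 1_500_000      # Storm throughput, upper bound
hl7_messages_per_day = 140_000      # average HL7 messages processed
annotated_docs_per_day = 50_000     # average documents through annotators
historical_docs_per_day = 5_000     # previous daily annotator volume

# Derived values (assumes uniform load across the day).
peak_docs_per_second = peak_docs_per_hour / SECONDS_PER_HOUR
avg_hl7_per_second = hl7_messages_per_day / SECONDS_PER_DAY
annotation_volume_gain = annotated_docs_per_day / historical_docs_per_day

print(f"Peak Storm throughput: {peak_docs_per_second:.1f} documents/s")
print(f"Average HL7 rate: {avg_hl7_per_second:.2f} messages/s")
print(f"Daily annotation volume gain: {annotation_volume_gain:.0f}x")
```

Under that assumption, the peak throughput is roughly 417 documents per second and the average HL7 load is about 1.6 messages per second, so the average load sits far below the stated ceiling. The tenfold gain is in daily annotated volume, which is a different measure from the "up to 6 times faster" annotation speed.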
![Page 11: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/11.jpg)
Summary
• Benefits
– An architecturally-driven, internally-owned technology stack that blends:
- An event-based/“real-time” processing fabric
- A multi-destination distillation hub
- A foundation for “Classic” BI delivery techniques
- A foundation for “Services-based” delivery techniques
- A “serendipitous” discovery environment
– Mutually supportive components that combine in delivering novel clinical solutions
– Data continuity
- Historical data can be assessed as algorithms change over time
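The data continuity point can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline described in the presentation: the sample notes, both annotator functions, and `rebuild_view` are hypothetical. The idea is that raw data is kept unchanged, so a derived view can be recomputed over the full history whenever the algorithm is revised.

```python
# Hypothetical immutable store of raw clinical notes (never updated in place).
raw_documents = {
    "doc-1": "Patient reports no polyp recurrence.",
    "doc-2": "Polyp removed during colonoscopy.",
}

def annotator_v1(text):
    """First algorithm: flag any mention of the keyword."""
    return "polyp" in text.lower()

def annotator_v2(text):
    """Revised algorithm: also reject mentions preceded by a simple negation."""
    lowered = text.lower()
    return "polyp" in lowered and "no polyp" not in lowered

def rebuild_view(documents, annotator):
    """Recompute the derived view from the untouched raw data."""
    return {doc_id: annotator(text) for doc_id, text in documents.items()}

view_v1 = rebuild_view(raw_documents, annotator_v1)
view_v2 = rebuild_view(raw_documents, annotator_v2)
print(view_v1)  # {'doc-1': True, 'doc-2': True}
print(view_v2)  # {'doc-1': False, 'doc-2': True}
```

Because `raw_documents` is never modified, the change from `annotator_v1` to `annotator_v2` corrects the historical result for `doc-1` without any data migration.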
![Page 12: Lambda Architecture The Hive](https://reader033.vdocuments.us/reader033/viewer/2022042701/55ab61b71a28ab602f8b47bf/html5/thumbnails/12.jpg)
Thank you! We’re Hiring!
thinkbigcareers.teradata.com
Altan Khendup (@madmongol)
Ron Bodkin (@ronbodkin)