Building a Vibrant Search Ecosystem @ Bloomberg: Presented by Steven Bower & Ken LaPorte,...
October 11-14, 2016 • Boston, MA
Building a Vibrant Search Ecosystem @ Bloomberg
Steven Bower & Ken LaPorte
Copyright 2016 Bloomberg Finance L.P. All rights reserved.
Bloomberg
• Largest provider of financial news and information
• Our strength is quickly and accurately delivering data, news and analytics
• Creating high performance and accurate information retrieval systems is core to our strength
Why are we giving this talk?
What came before…
• Search has been around for a long time at Bloomberg
  - Rapid delivery of product to clients
  - Proprietary, commercial and open-source search technologies
• Fragmented solutions
  - Disparate search technologies
  - Custom code
  - Deployment patterns
  - Lack of standards
• Costly to maintain & evolve
How We Got Started
• Created a team to specialize in search
• Reviewed existing applications reliant upon search
• Selected a set of representative applications
  - Various scales
  - Data types
  - Distinct requirements
Why Solr?
• Evaluated other open source search engines
  - Already used at Bloomberg
• Large community & widely used
• Established & growing feature set
• Scalable
• Committed to open source
  - Ability to contribute to core engine
  - Ability to fix bugs ourselves
  - Contributions in almost every Solr release since 4.5.0
  - 3 Solr committers at the company
Search as a service
• Designed platform with application teams
• Middleware service to wrap Solr
  - Familiar & lightweight interface
  - Simplified APIs
  - Insulate clients from changes in Solr
• Pass-thru capability
• Basic monitoring/metrics
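To make the "middleware service to wrap Solr" idea concrete, here is a minimal Python sketch of a simplified request being translated into standard Solr `/select` parameters (`q`, `fq`, `start`, `rows`, `wt`). The `SimpleQuery` shape, the function names, and the base URL are hypothetical illustrations; the talk does not describe the actual Bloomberg API.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class SimpleQuery:
    """Hypothetical simplified request shape exposed to application teams."""
    text: str
    filters: dict = field(default_factory=dict)  # field name -> exact value
    page: int = 0
    page_size: int = 10


def to_solr_params(query: SimpleQuery) -> list:
    """Translate the simplified request into standard Solr /select parameters."""
    params = [
        ("q", query.text),
        ("start", str(query.page * query.page_size)),
        ("rows", str(query.page_size)),
        ("wt", "json"),
    ]
    # Each filter becomes its own fq clause so Solr can cache it independently.
    for name, value in sorted(query.filters.items()):
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{name}:"{escaped}"'))
    return params


def to_select_url(base_url: str, collection: str, query: SimpleQuery) -> str:
    """Build the request URL; base_url and collection are placeholders."""
    return f"{base_url}/{collection}/select?{urlencode(to_solr_params(query))}"
```

Because clients only ever see `SimpleQuery`, a change in how Solr expects a parameter can be absorbed inside `to_solr_params` without touching application code, which is one way to read "insulate clients from changes in Solr".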
Open for business!
• Hundreds of search applications
  - Diverse use cases and scale
  - Displaced other technologies
• >10 Billion documents
• >10 Million new documents daily
• >4000 Solr instances
• >100s of servers
• >2,000 queries per second
• Mission critical to Bloomberg and the financial markets
[Figure: Number of Collections over Time. Y-axis: Number of Collections (0 to 300); x-axis: Time (2012 to 2016).]
What have we done?!
• Human scaling
• Ineffective Alarming
• Manual build process
  - Limited automated testing
• Configuration Management
• Lots of known unknowns
Challenge: Ecosystem
• Ownership
  - Where’s the line?
• Planning for scale
• Education
  - Search != Database
  - Data types (text parsing)
  - Relevance
  - Features
Solution: Ecosystem
• Survey
  - Understand business requirements
  - Identify scale and complexity
  - Assist with schema and query design
  - Concerns
• Develop & Test
  - Best practices
  - Documentation & code samples
  - Office hours & support chat
  - Community development
Solution: Ecosystem
• Validate & Deploy
  - Hardware provisioning
  - Automated deployments
  - Hot & cold collections
  - Load testing
• Maintain and Grow
  - Applications change & grow
  - Solr & platform upgrades
  - Monitoring
Challenge: Monitoring Solr
• Very large monitoring footprint
• What should we monitor?
  - Ping
  - Cluster state
  - Process state
  - Server health
• False alarms
  - Flutter
  - Solr can lie to you! (SOLR-8599)
• Many different ways to view system health
  - Different people care about different things
  - Active vs Forensic
Solution: Monitoring Solr
• Monitor via multiple mechanisms
• Aggregate events
  - Alarm on multiple signals
  - Delay alarms
• Niteowl
  - Solr / ZooKeeper / Generic
  - Distributed / Scalable
  - Events indexed into Solr
• Led to massive stability improvements
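The "alarm on multiple signals" and "delay alarms" ideas can be sketched in a few lines of Python. This is not Niteowl, whose internals the talk does not show; the class name, check names, and thresholds below are assumptions chosen for illustration. An alarm is raised for a target only when at least `min_signals` distinct checks have been failing continuously for `delay_seconds`, which suppresses both single-check false alarms and flutter.

```python
from dataclasses import dataclass, field


@dataclass
class AlarmAggregator:
    """Hypothetical aggregator: alarm only on several sustained failing checks."""
    min_signals: int = 2          # distinct failing checks required
    delay_seconds: float = 60.0   # how long each failure must persist
    # (target, check) -> time the current failure streak started
    _failing_since: dict = field(default_factory=dict, init=False)

    def record(self, target: str, check: str, healthy: bool, now: float) -> None:
        key = (target, check)
        if healthy:
            # A recovery resets the timer, so a fluttering check never matures.
            self._failing_since.pop(key, None)
        else:
            # Keep the earliest failure time of an ongoing streak.
            self._failing_since.setdefault(key, now)

    def should_alarm(self, target: str, now: float) -> bool:
        sustained = [
            check
            for (t, check), since in self._failing_since.items()
            if t == target and now - since >= self.delay_seconds
        ]
        return len(sustained) >= self.min_signals
```

For example, a failing ping alone never alarms with the defaults above; it takes a second check, such as cluster state, failing for the full delay as well.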
What We Found
• Long Garbage Collections
  - Profiler interactions with Mmap
  - Young generation pressure during ingest
  - Use G1GC / Keep heap small
• Long Recovery Times
  - Transaction logs don’t hold enough
  - Always doing full replications when under ingest load
• Solr Bugs
• Out of Memory Exceptions
  - One off OOMs are not uncommon
  - Use DocValues!
  - OOM Killer
SOLR-9310, SOLR-9207, SOLR-9506: Long recovery times
SOLR-6931: Random connection reset issues
SOLR-8085: Replicas get out of sync
SOLR-8599: ZooKeeper client in inconsistent state
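The "Use DocValues!" advice from the out-of-memory findings above refers to Solr's column-oriented field storage, which is built at index time and read through memory-mapped files, so sorting and faceting do not have to un-invert the field onto the JVM heap. As a hedged illustration, the Python sketch below builds a request body for the Solr Schema API `add-field` command with `docValues` enabled. The field name `ticker` is hypothetical, and the `string` type is assumed to exist in the target schema.

```python
import json


def add_docvalues_field(name: str, field_type: str) -> str:
    """Return a Schema API request body that adds a field with docValues on."""
    command = {
        "add-field": {
            "name": name,          # hypothetical field name
            "type": field_type,    # must match a fieldType defined in the schema
            "indexed": True,
            "stored": False,
            "docValues": True,     # column-oriented, off-heap storage
        }
    }
    return json.dumps(command, indent=2)


if __name__ == "__main__":
    # The body would be POSTed to the collection's /schema endpoint.
    print(add_docvalues_field("ticker", "string"))
```

Note that enabling docValues on a field that already holds data generally requires reindexing that field's documents.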
Challenge: Configuration Management
• Deployment process
• Requires versioning / rollback
  - Some changes cannot be rolled back
• Template driven configuration
  - Good for simple things
  - Doesn’t scale for complex collections
• Lack of provenance
Solution: Configuration Management
• Convert to SDLC process
  - Configurations live in Git repository
  - Solr extensions linked as dependencies
  - Built with Maven / Jenkins
  - Published to artifact repository
• Validation of configurations during build
  - Static Analysis
    • Allowed schema changes
    • Access control of Solr configuration
  - Integration testing
• Deployed to ZooKeeper / Solr
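A build-time check for "allowed schema changes" might look like the Python sketch below. The talk does not say which changes were permitted, so the policy here is an assumption: adding a field is allowed, while removing a field or changing the `type`, `multiValued`, or `docValues` attribute of an existing field is flagged, since such changes typically require a reindex. The function name and the dict representation of a schema are hypothetical.

```python
def find_disallowed_changes(old_fields: dict, new_fields: dict) -> list:
    """Compare two schemas (field name -> attribute dict) and list violations.

    Assumed policy: new fields are fine; removals and changes to attributes
    that affect the on-disk index are reported so the build can fail.
    """
    problems = []
    for name, old_attrs in sorted(old_fields.items()):
        if name not in new_fields:
            problems.append(f"field '{name}' was removed")
            continue
        new_attrs = new_fields[name]
        for attr in ("type", "multiValued", "docValues"):
            if old_attrs.get(attr) != new_attrs.get(attr):
                problems.append(
                    f"field '{name}': {attr} changed from "
                    f"{old_attrs.get(attr)!r} to {new_attrs.get(attr)!r}"
                )
    return problems
```

In a pipeline like the one described, the old schema would come from the last published artifact and the new one from the Git commit under review; a non-empty result would fail the build before anything reaches ZooKeeper.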
Challenge: Infrastructure
• Substantial demand
• Large lead times
• Differing requirements
  - Security
  - Scale
  - Control
• Too many pets!
Solution: Infrastructure
• Streamlined process
• Shared and dedicated resources
• Built from the ground up
  - Well defined layers of abstraction
  - Cattle not pets
  - Infrastructure-as-code
  - SDLC / provenance
• Better hardware == better experience
  - SSDs
  - More RAM
  - Faster network
[Diagram: layers of abstraction, labeled Hardware / OS, Control Plane, Applications, APIs]
What’s next?
• Containerization
  - Simplify / decentralize operational procedures
  - Local testing and development
  - Security / Metrics / QoS
• Delegation of control
  - Mute / Direct alarms to tenants
  - Tenant managed
• Detect failures before they happen
  - Heuristics / ML models
• Solr
  - More work on streaming
  - Analytics
    • distributed analytics
    • pivot faceting
Building a Vibrant Search Ecosystem @ Bloomberg
QUESTIONS?
Steven Bower
Ken LaPorte