OpenTSDB + Bigtable
Integrating time series database with Google Cloud Bigtable
Danil Zburivsky, Big Data Practice Lead - @zburivsky
Christos Soulios, Big Data Architect - @c_soulios
Pythian specializes in design, implementation, and management of systems that directly contribute to revenue and business success.
History: 19 years in business
Growing at 30+% per year
400+ employees
300+ customers worldwide
HQ Ottawa, Canada - global reach
Technology agnostic = trusted advisor
Deep expertise: Oracle, Oracle Apps, MySQL, AWS, SQL Server, Cassandra/DataStax, Azure, PostgreSQL, Cloudera, MapR, Hortonworks etc.
Google Premier Partner Status (as of end Aug)
5 Certified Developers (soon to be 12)
Dedicated Google Technical Champion
Launch partner for: Kubernetes, Dataflow, Cloud SQL, Dataproc
Integrated OpenTSDB with Bigtable
DW Explorers Program Partner
Upcoming BigQuery & Cloud ML Launch Partner
Time series data
• (time, metric, value)
• OS and apps metrics
• Industrial equipment
• Web traffic
Storing time series data is a challenge
• Volume can be explosive
• Data arrival and access patterns are different
Better alternatives: specialized stores
• NoSQL
• Data model and storage optimized for time series
• Separate query language
OpenTSDB
• Open source
• Uses HBase as a data store
• Data model optimized for time series
• REST API
Row key: <metric_uid><timestamp><tagk1><tagv1>[...<tagkN><tagvN>]
Column qualifiers: <col_t+1>[...<col_t+N>]
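The row key layout above can be sketched in a few lines of Python. This is only an illustration, not OpenTSDB's own code: it assumes the commonly documented defaults of 3-byte UIDs, a 4-byte base timestamp aligned to the hour, tag pairs sorted by tag key UID, and a 2-byte column qualifier holding the offset in seconds plus 4 flag bits. The helper names `uid`, `row_key`, and `column_qualifier` are hypothetical.

```python
import struct

UID_WIDTH = 3            # bytes per metric/tag UID (assumed default)
ROW_SPAN_SECONDS = 3600  # one row per series per hour (assumed default)


def uid(n: int) -> bytes:
    """Encode a numeric UID as a fixed-width big-endian byte string."""
    return n.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid: int, timestamp: int, tags: dict) -> bytes:
    """Build <metric_uid><base_timestamp><tagk1><tagv1>...<tagkN><tagvN>."""
    # Align the timestamp to the start of its hour so that all points
    # of one series within that hour share a single row.
    base = timestamp - (timestamp % ROW_SPAN_SECONDS)
    key = uid(metric_uid) + struct.pack(">I", base)
    for tagk, tagv in sorted(tags.items()):  # assumed: ordered by tag key UID
        key += uid(tagk) + uid(tagv)
    return key


def column_qualifier(timestamp: int, flags: int = 0) -> bytes:
    """Encode the offset from the row's base hour plus 4 flag bits."""
    offset = timestamp % ROW_SPAN_SECONDS
    return struct.pack(">H", (offset << 4) | (flags & 0x0F))


ts = 1356998400 + 125  # 125 seconds past 2013-01-01 00:00:00 UTC
print(row_key(1, ts, {1: 1}).hex())  # 00000150e22700000001000001
print(column_qualifier(ts).hex())    # 07d0
```

Because the metric UID leads the key and the timestamp follows it, all rows of one metric sit next to each other in time order, which is what makes range scans over a time window cheap.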
OpenTSDB Architecture
[Diagram: servers send metrics to TSD instances over TSD RPC; TSDs read and write HBase over HBase RPC; the Web UI (HTTP) and scripts/alerting (TSD RPC) query the TSDs.]
HBase can be too much
• HBase requires a full Hadoop setup (3xZK, 2xNN, 3xDN, 2xHMaster, 3xHRegion)
• HBase tuning is a job for the brave (HFiles, WAL, MemStore, BucketCache, BlockCache)
But all I wanted was a time series database
Google Cloud Bigtable
• Highly Scalable NoSQL database
• Low latency, high throughput
• Powers most Google products
• Available as a Google Cloud Service
Migrate HBase apps to Cloud Bigtable
• The Bigtable client is API-compatible with the HBase client
• Only replace hbase-client.jar with bigtable-hbase.jar
• No code changes required!
Migrate OpenTSDB to Cloud Bigtable
• OpenTSDB does not use the standard hbase-client.jar
• OpenTSDB is built on the AsyncHBase library
AsyncHBase library
• Open source HBase client library
• Multi-threaded: multiple threads use the same instance
• Fully asynchronous, non-blocking
• Implements the low level HBase RPCs
Detour: Asynchronous programming
Detour: Why asynchronous?
• Efficient thread usage
• Fewer threads = less memory
• CPU scheduler friendly
• Extremely high concurrency
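The points above can be illustrated with a small sketch using Python's `asyncio`. It is not taken from the talk: `fake_rpc` is a hypothetical stand-in for a non-blocking network call, and the 10 ms delay is an arbitrary assumption. The idea is that a thousand in-flight requests need only one thread, because each request gives up the thread while it waits.

```python
import asyncio
import threading


async def fake_rpc(i: int) -> tuple:
    """Hypothetical non-blocking RPC: yields to the event loop while waiting."""
    await asyncio.sleep(0.01)  # simulated network latency (assumption)
    return i, threading.get_ident()


async def run_many(n: int) -> list:
    """Issue n RPCs concurrently and collect their results in order."""
    return await asyncio.gather(*(fake_rpc(i) for i in range(n)))


results = asyncio.run(run_many(1000))
thread_ids = {tid for _, tid in results}
print(len(results), "requests served by", len(thread_ids), "thread(s)")
# 1000 requests served by 1 thread(s)
```

A blocking client would need one thread per in-flight request to reach the same concurrency, each with its own stack and scheduling cost.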
AsyncHBase library
http://www.tsunanet.net/~tsuna/asynchbase/benchmark/viz.html
AsyncHBase library
“AsyncHBase client differs significantly from HBase's client. Switching to it is not easy as it requires to rewrite all the code that was interacting with any HBase API”
AsyncHBase documentation
AsyncBigtable library
● Complete rewrite of AsyncHBase API
● Uses standard hbase-client for Bigtable access
● Compatible with the bigtable-hbase API
AsyncBigtable challenges
● OpenTSDB jar dependencies
● AsyncBigtable is not async!
● BufferedMutator + Threadpool to emulate async
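The general pattern of emulating an asynchronous API on top of a blocking client with a thread pool might look like the sketch below. It is written in Python for brevity and is not the AsyncBigtable implementation: `EmulatedAsyncClient` and `_blocking_put` are hypothetical names, the in-memory dict stands in for the remote table, and the batching that a BufferedMutator performs is not modeled.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class EmulatedAsyncClient:
    """Async-looking facade over a blocking client (hypothetical)."""

    def __init__(self, workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._store = {}  # stand-in for the remote table
        self._lock = threading.Lock()

    def _blocking_put(self, key: str, value: int) -> str:
        time.sleep(0.01)  # simulated blocking round trip (assumption)
        with self._lock:
            self._store[key] = value
        return key

    def put(self, key: str, value: int) -> Future:
        """Return immediately; the blocking call runs on a pool thread."""
        return self._pool.submit(self._blocking_put, key, value)

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


client = EmulatedAsyncClient()
acked = []
for i in range(10):
    future = client.put(f"row{i}", i)
    # The callback fires once the write completes, like a Deferred callback.
    future.add_done_callback(lambda f: acked.append(f.result()))
client.shutdown()  # waits for all pending writes and their callbacks
print(sorted(acked))
```

The caller never blocks on a write, but every in-flight request still occupies a pool thread, so the thread and memory savings of a truly non-blocking client are lost. That is the sense in which such a wrapper is "not async".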
AsyncBigtable library
● Merged upstream in OpenTSDB v2.3.0
● http://opentsdb.net/docs/build/html/user_guide/backends/bigtable.html
● https://github.com/OpenTSDB/asyncbigtable
Future work
● Native Bigtable API
● Fully asynchronous
● Improve performance
● Add more unit tests
Questions?
https://github.com/opentsdb/asyncbigtable