from bloom filters to data pipelines
TRANSCRIPT
![Page 1: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/1.jpg)
Building Data applications with Go
01
from Bloom filters to Data pipelines
Sergii Khomenko, Data [email protected], @lc0d3r
FOSDEM - January 31, 2016
![Page 2: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/2.jpg)
Sergii Khomenko
2
Data scientist at one of the biggest fashion communities, Stylight.
Data analysis and visualisation hobbyist, working on problems not only in working time but in free time for fun and personal data visualisations.
First time faced Golang in ~ 2010. Fell in love with language channels and core concepts.
Speaker at Berlin Buzzwords 2014, ApacheCon Europe 2014, Puppet Camp London, Berlin Buzzwords 2015 , Tableau Conference on Tour, Budapest BI Forum 2015, Crunchsconf 2015 and others
![Page 3: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/3.jpg)
3
Munich, Germany Founded on Apr 5, 2014 Gophers: 323
![Page 4: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/4.jpg)
4
![Page 5: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/5.jpg)
5
https://www.pinterest.com/pin/38351034303708696/
![Page 6: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/6.jpg)
Profitable LeadsStylight provides its partners with high-quality leads enabling partner shops to leverage Stylight as a ROI positive traffic channel.
InspirationStylight offers
shoppable inspiration that
makes it easy to know what to
buy and how to style it.
Branding & ReachStylight offers a unique opportunity for brands to reach an audience that is actively looking for style online.
ShoppingStylight helps users search
and shop fashion and lifestyle products smarter across
hundreds of shops.
6
Stylight – Make Style HappenCore Target Group
Stylight help aspiring women between 18 and 35 to evolve their style through shoppable inspiration.
![Page 7: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/7.jpg)
Stylight – acting on a global scale
![Page 8: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/8.jpg)
Experienced & Ambitious Team
Innovative cross-functional organisation with flat hierarchy builds a unique team spirit.
• +200 employees• 40 PhDs/Engineers• 28 years average age
• 63% female• 23 nationalities• 0 suits
8
![Page 9: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/9.jpg)
Agenda
9
P r o b a b i l i s t i c d a t a s t r u c t u r e s B l o o m f i l t e r s , C o u n t M i n o r H y p e r l o g l o g
O p e n S o u r c e s t a c k
A m a z o n A W S
G o o g l e C l o u d S e r v e r l e s s a r c h i t e c t u r e
![Page 10: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/10.jpg)
The Nature of Data
10
![Page 11: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/11.jpg)
Sources of data:
11
• Web tracking • Metrics tracking • Behaviour tracking
• Business intelligence ETL • Internal Services • ML tagging service
![Page 12: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/12.jpg)
Access patterns
12
• Real-time • Nearly real-time • Daily batches
![Page 13: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/13.jpg)
Probabilistic data structures
13
![Page 14: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/14.jpg)
14
D a t a s t r u c t u r e s t h a t u s e d i f f e r e n t p r o b a b i l i s t i c
a p p r o a c h e s t o c o m p a c t l y s t o r e d a t a
![Page 15: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/15.jpg)
15
![Page 16: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/16.jpg)
16
![Page 17: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/17.jpg)
Bloom filter
17
Approximate Membership
![Page 18: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/18.jpg)
18
A B l o o m f i l t e r i s a s p a c e -e f f i c i e n t p r o b a b i l i s t i c d a t a
s t r u c t u r e , c o n c e i v e d b y B u r t o n H o w a r d B l o o m i n 1 9 7 0 , t h a t i s
u s e d t o t e s t w h e t h e r a n e l e m e n t i s a m e m b e r o f a s e t .
F a l s e p o s i t i v e m a t c h e s a r e p o s s i b l e , b u t f a l s e n e g a t i v e s a r e
n o t
![Page 19: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/19.jpg)
19
A B l o o m f i l t e r i s a s p a c e - e f f i c i e n t p r o b a b i l i s t i c d a t a s t r u c t u r e , c o n c e i v e d b y B u r t o n H o w a r d
B l o o m i n 1 9 7 0 , t h a t i s u s e d t o t e s t w h e t h e r a n e l e m e n t i s a
m e m b e r o f a s e t . F a l s e p o s i t i v e m a t c h e s a r e p o s s i b l e , b u t
f a l s e n e g a t i v e s a r e n o t
![Page 20: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/20.jpg)
20
• b i t a r r a y o f m b i t s . • k d i f f e r e n t h a s h f u n c t i o n s w i t h
a u n i f o r m r a n d o m d i s t r i b u t i o n
![Page 21: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/21.jpg)
21
![Page 22: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/22.jpg)
22
![Page 23: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/23.jpg)
23
https://www.jasondavies.com/bloomfilter/
![Page 24: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/24.jpg)
Size estimation
24
memory usage
hash functions n - estimated number of elementsp - false positive probabilitym - required bit array lengthExample:n=1,000,000FPR 10% ~= 4800000 Bit ~= 600 kByteFPR 0.1% ~= 14400000 Bit ~= 1.8 MByte
![Page 25: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/25.jpg)
Use-cases
25
• Caches• Databases
• HBase• Cassandra
• Networking
https://github.com/willf/bloomhttps://github.com/reddragon/bloomfilter.go
https://github.com/seiflotfy/dlCBFhttps://github.com/patrickmn/go-bloom
https://github.com/armon/bloomdhttps://github.com/geetarista/go-bloomd
![Page 26: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/26.jpg)
Extensions
26
• Cardinality estimate (increment counter when add a new) • Scalable Bloom filters (add another hash function on top)• Counting Bloom filters
• increment every time we see it
![Page 27: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/27.jpg)
Count-Min
27
Frequency estimator
![Page 28: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/28.jpg)
28
• m a t r i x o f w c o l u m n s a n d d r o w s • h a s h f u n c t i o n a s s o c i a t e d w i t h
e v e r y r o w
![Page 29: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/29.jpg)
29
![Page 30: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/30.jpg)
HyperLogLog
30
Cardinality estimator
![Page 31: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/31.jpg)
31
H y p e r L o g L o g i s a n a l g o r i t h m f o r t h e c o u n t - d i s t i n c t p r o b l e m ,
a p p r o x i m a t i n g t h e n u m b e r o f d i s t i n c t e l e m e n t s i n a m u l t i s e t .
![Page 32: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/32.jpg)
32
T h e H y p e r L o g L o g a l g o r i t h m i s a b l e t o e s t i m a t e c a r d i n a l i t i e s o f > 1 0 ^ 9 w i t h a t y p i c a l e r r o r
r a t e o f 2 % , u s i n g 1 . 5 k B o f m e m o r y [ 3 ]
![Page 33: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/33.jpg)
33
hash(x) -> stream of bits {1,0,0,1,0,1..}• hash generates uniformly distributed values• every bit is independent
Hash function
![Page 34: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/34.jpg)
34
p(first bit - 0) = 1/2p(second bit - 0) = 1/25 consecutive zeros - (1/2)^5
N consecutive zeros - (1/2)^N
Bit probability
![Page 35: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/35.jpg)
35
p(first bit - 0) = 1/2p(second bit - 0) = 1/25 consecutive zeros - (1/2)^5
N consecutive zeros - (1/2)^NN = 32, Odds = 1/4294967296 -> Expected 4294967296 samples
Guessing bits
![Page 36: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/36.jpg)
36
N = 32 = {1,0,0,0,0} = 6bitWith 6bits we can count 2^64Where the name is coming from Log(Log(64)) = 6
Storing bits
![Page 37: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/37.jpg)
37
• Create m registers• Partition the bit stream
• first log(m) - register index• rest used for actual values
Multiple registers
![Page 38: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/38.jpg)
38
HyperLogLog - add
![Page 39: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/39.jpg)
39
• Given m registers• Estimate aggregated value
• Min? Max? Avg? Median?• Geometric/Harmonic mean!
• Estimate A*m*H
HyperLogLog - size
![Page 40: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/40.jpg)
40http://content.research.neustar.biz/blog/hll.html
![Page 41: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/41.jpg)
Use-cases
41
• Databases• Redis• PostgreSQL• Redshift• Impala• Hive
• Spark
https://github.com/clarkduvall/hyperlogloghttps://github.com/armon/hlld
![Page 42: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/42.jpg)
42
I n c o m p u t i n g , a p i p e l i n e i s a s e t o f d a t a p r o c e s s i n g e l e m e n t s c o n n e c t e d i n s e r i e s , w h e r e t h e
o u t p u t o f o n e e l e m e n t i s t h e i n p u t o f t h e n e x t o n e .
![Page 43: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/43.jpg)
Open Source Stack
43
![Page 44: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/44.jpg)
44
http://lambda-architecture.net/
![Page 45: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/45.jpg)
45
A p a c h e K a f k a i s p u b l i s h - s u b s c r i b e m e s s a g i n g r e t h o u g h t a s a d i s t r i b u t e d c o m m i t l o g .
![Page 46: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/46.jpg)
46
![Page 47: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/47.jpg)
47
![Page 48: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/48.jpg)
48
Libraries• Sarama is an MIT-licensed Go client library for Apache Kafka version 0.8 (and later)https://github.com/Shopify/sarama
Go Kafka Clienthttps://github.com/elodina/go_kafka_client
![Page 49: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/49.jpg)
49
producer, err := NewAsyncProducer([]string{"localhost:9092"}, nil)if err != nil { panic(err)}
defer func() { if err := producer.Close(); err != nil { log.Fatalln(err) }}()
![Page 50: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/50.jpg)
50
var enqueued, errors intProducerLoop:for { select { case producer.Input() <- &ProducerMessage{Topic: "my_topic", Key: nil, Value: StringEncoder("testing 123")}: enqueued++
case err := <-producer.Errors(): log.Println("Failed to produce message", err) errors++ case <-signals: break ProducerLoop }}
log.Printf("Enqueued: %d; errors: %d\n", enqueued, errors)
![Page 51: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/51.jpg)
51http://www.ipponusa.com/wp-content/uploads/2014/10/spark-architecture.jpg
dmrgo is a Go library for writing map/reduce jobs.https://github.com/dgryski/dmrgo
![Page 52: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/52.jpg)
Results
52
• Scalable • Flexible
• High costs of maintenance • Not so easy to setup
![Page 53: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/53.jpg)
53
A p r o g r a m m i n g l a n g u a g e i s l o w l e v e l w h e n i t s p r o g r a m s r e q u i r e
a t t e n t i o n t o t h e i r r e l e v a n t .
Alan Jay Perlis / Epigrams on Programming
![Page 54: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/54.jpg)
Amazon AWS
54
![Page 55: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/55.jpg)
Kinesis Streams
![Page 56: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/56.jpg)
56
![Page 57: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/57.jpg)
57
![Page 59: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/59.jpg)
59
![Page 60: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/60.jpg)
60
![Page 61: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/61.jpg)
Kinesis Firehose
![Page 62: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/62.jpg)
Kinesis Analytics
![Page 63: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/63.jpg)
63
custom unificationpipeline
ProductProcessing
BusinessIntelligence
ML/TaggingProduct events
variety of event types and structures
![Page 64: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/64.jpg)
Google Cloud
64
![Page 65: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/65.jpg)
65
![Page 66: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/66.jpg)
66
![Page 67: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/67.jpg)
67
Libraries
• Google APIs Client Library for Gohttps://github.com/GoogleCloudPlatform/gcloud-golang
![Page 68: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/68.jpg)
68
![Page 69: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/69.jpg)
69
![Page 70: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/70.jpg)
![Page 71: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/71.jpg)
71
![Page 72: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/72.jpg)
Serverless architecture
72
![Page 73: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/73.jpg)
73
![Page 74: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/74.jpg)
74
![Page 75: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/75.jpg)
75
![Page 76: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/76.jpg)
76
![Page 77: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/77.jpg)
77
![Page 78: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/78.jpg)
78
![Page 79: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/79.jpg)
79
![Page 80: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/80.jpg)
80
Possibilities
• all Lambdas in one place with version control• integration tests with real events• proper CI/CD setup
![Page 81: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/81.jpg)
81
![Page 83: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/83.jpg)
Related links
83
1. Burton H. Bloom. Space/Time Trade-offs in Hash Coding with Allowable Errors. 1970
2. Interactive visualisation: Bloom Filters
3. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
4. HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm
5. HyperLogLog — Cornerstone of a Big Data Infrastructure
6. Armon Dadgar on Bloom Filters and HyperLogLog
![Page 84: from Bloom filters to Data pipelines](https://reader034.vdocuments.us/reader034/viewer/2022042611/58a1a1b71a28abac578b91c4/html5/thumbnails/84.jpg)
Related links
84
7. https://github.com/willf/bloom
8. Google’s Cloud Pub/Sub Real-Time Messaging Service Is Now In Public Beta
9. Streaming Data Processing with Amazon Kinesis and AWS Lambda
10. Google Cloud Dataflow Two Worlds Become a Much Better One
11. https://github.com/apex/apex