Unique ID generation in distributed systems
DESCRIPTION
A run-through of the various options available for generating unique IDs
TRANSCRIPT
![Page 1: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/1.jpg)
ID generation
PHP London 2012-08-02
@davegardnerisme
![Page 2: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/2.jpg)
@davegardnerisme
hailoapp.com/dave (for a £5 discount)
![Page 3: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/3.jpg)
Web App
MySQL
DC 1
MySQL auto increment
1,2,3,4…
![Page 4: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/4.jpg)
MySQL auto increment
• Numeric IDs
• Go up with time
• Not resilient
![Page 5: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/5.jpg)
Web App
MySQL
DC 1
MySQL multi-master replication
MySQL
1,3,5,7…
2,4,6,8…
![Page 6: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/6.jpg)
MySQL multi-master replication
• Numeric IDs
• Do not go up with time
• Some resilience
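The odd/even split can be pictured as each master stepping by the number of masters, starting from its own offset. In MySQL this is usually configured with the `auto_increment_increment` and `auto_increment_offset` settings. The sketch below only simulates the idea in plain PHP; `masterSequence` is a hypothetical helper, not part of any library.

```php
<?php
// Hypothetical helper: the IDs one master would hand out, given its
// offset (its 1-based position) and the step (the number of masters).
function masterSequence(int $offset, int $step, int $count): array
{
    $ids = [];
    for ($i = 0; $i < $count; $i++) {
        $ids[] = $offset + $i * $step;
    }
    return $ids;
}

$masterA = masterSequence(1, 2, 4); // odd IDs
$masterB = masterSequence(2, 2, 4); // even IDs

// The two sequences never collide, but nothing keeps them in step:
// a busier master races ahead, so a larger ID is not necessarily newer.
$collisions = array_intersect($masterA, $masterB);
```

This is why the IDs stay unique yet "do not go up with time" once more than one master is issuing them.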
![Page 7: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/7.jpg)
Going global…
DC 1
DC 2
DC 3
DC 4
DC 5
DC 6
![Page 8: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/8.jpg)
Web App
DC 1
MySQL in multi DC setup
MySQL
Web App
DC 2
?
1,2,3…
WAN LINK
![Page 9: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/9.jpg)
Web App
DC 1
Flickr MySQL ticket server
Ticket Server
Web App
DC 2
1,3,5…
WAN LINK
Ticket Server
4,6,8…
WAN link not required to generate an ID
![Page 10: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/10.jpg)
Flickr MySQL ticket server
• Numeric IDs
• Do not go up with time
• Resilient and distributed
• ID generation separated from data store
![Page 11: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/11.jpg)
DC
The anatomy of a ticket server
Web App
Web App
Web App
Web App
Ticket Server
![Page 12: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/12.jpg)
DC
Making things simpler
ID gen
Web App
ID gen
Web App
ID gen
Web App
ID gen
Web App
![Page 13: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/13.jpg)
UUIDs
• 128 bits
• Could use type 4 (Random) or type 1 (MAC address with time component)
• Can generate on each machine with no co-ordination
![Page 14: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/14.jpg)
Type 4 – random
xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
f47ac10b-58cc-4372-a567-0e02b2c3d479
version
variant (8, 9, A or B)
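As a rough illustration of the layout above, a type 4 UUID can be built from 16 random bytes by overwriting the version nibble and the two variant bits. This is a minimal sketch assuming PHP 7 or later (for `random_bytes`); the function name `uuid4` is made up for this example.

```php
<?php
// Sketch of a type 4 (random) UUID generator.
function uuid4(): string
{
    $bytes = random_bytes(16);                        // 128 random bits
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);  // version nibble = 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);  // variant bits = 10, giving 8, 9, a or b
    // 32 hex characters, grouped 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

echo uuid4(), "\n";
```

Six of the 128 bits are fixed this way, which leaves 122 random bits.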
![Page 15: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/15.jpg)
5.3 x 10^36
possible values for a type 4 UUID
![Page 16: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/16.jpg)
1.1 x 10^19
UUIDs we could generate per second since the Universe began
![Page 17: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/17.jpg)
2.1 x 10^27
Olympic swimming pools filled if each possible value contributed a millilitre
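The headline figures can be checked with a little floating-point arithmetic. The sketch below assumes a type 4 UUID has 122 random bits (128 minus 4 version bits and 2 variant bits) and that an Olympic pool holds about 2,500,000 litres; the pool volume is an assumption of this example, not a figure from the slides.

```php
<?php
// 122 random bits remain once the version and variant bits are fixed.
$possible = pow(2, 122);                    // float, on the order of 10^36

// Assumption: one Olympic pool is roughly 50m x 25m x 2m = 2,500,000 litres.
$millilitresPerPool = 2500000 * 1000;
$pools = $possible / $millilitresPerPool;   // on the order of 10^27

printf("%.1e possible values, %.1e pools\n", $possible, $pools);
```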
![Page 18: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/18.jpg)
Type 1 – MAC address
51063800-dc76-11e1-9fae-001c42000009
• Time component is based on 100 nanosecond intervals since October 15, 1582
• Timestamp is stored low-order field first (time_low, time_mid, time_hi), so its most significant bits do not lead the UUID
![Page 19: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/19.jpg)
Type 1 – MAC address
• The address (MAC) of the computer that generated the ID is encoded into it
• Lexical ordering essentially meaningless
• Deterministically unique
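To see why lexical ordering is meaningless, it helps to pull the timestamp back out of a type 1 UUID: the first three groups have to be reassembled in reverse order. The sketch below assumes 64-bit PHP 7 or later, and `uuid1UnixTime` is a hypothetical helper name.

```php
<?php
// Recover the Unix time (in seconds) embedded in a type 1 UUID.
// Assumes 64-bit PHP, since the 60-bit timestamp must fit in an int.
function uuid1UnixTime(string $uuid): int
{
    [$timeLow, $timeMid, $timeHiAndVersion] = explode('-', $uuid);
    $timeHi = substr($timeHiAndVersion, 1);          // drop the version nibble
    $ticks = hexdec($timeHi . $timeMid . $timeLow);  // 100 ns ticks since 1582-10-15
    $gregorianToUnix = 122192928000000000;           // ticks from 1582-10-15 to 1970-01-01
    return intdiv($ticks - $gregorianToUnix, 10000000);
}

// The example UUID from the slide decodes to the day of the talk.
echo gmdate('Y-m-d', uuid1UnixTime('51063800-dc76-11e1-9fae-001c42000009')), "\n";
// prints 2012-08-02
```

Because the fastest-changing bits of the timestamp come first in the string, two UUIDs generated moments apart can sort in either order.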
![Page 20: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/20.jpg)
There are some other options…
![Page 21: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/21.jpg)
No co-ordination needed
Deterministically unique
K-ordered (time-ordered lexically)
![Page 22: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/22.jpg)
Twitter Snowflake
• Under 64 bits
• No co-ordination (after startup)
• K-ordered
• Scala service, Thrift interface, uses Zookeeper for configuration
![Page 23: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/23.jpg)
Twitter Snowflake
41 bits: Timestamp (millisecond precision, bespoke epoch)
10 bits: Configured machine ID
12 bits: Sequence number
![Page 24: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/24.jpg)
Twitter Snowflake
77669839702851584
= (timestamp << 22) | (machine << 12) | sequence
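The packing formula can be run in both directions. The following sketch assumes 64-bit PHP and the 41/10/12-bit layout described for Snowflake; `composeId` and `decomposeId` are names invented for this example, not part of Snowflake itself.

```php
<?php
// Pack the three fields into one integer: timestamp in the high bits,
// then the 10-bit machine ID, then the 12-bit sequence number.
function composeId(int $timestamp, int $machine, int $sequence): int
{
    return ($timestamp << 22) | ($machine << 12) | $sequence;
}

// Reverse the packing with shifts and masks.
function decomposeId(int $id): array
{
    return [
        'timestamp' => $id >> 22,            // ms since the bespoke epoch
        'machine'   => ($id >> 12) & 0x3ff,  // 10 bits
        'sequence'  => $id & 0xfff,          // 12 bits
    ];
}

// Take the sample ID apart and put it back together.
$parts = decomposeId(77669839702851584);
$again = composeId($parts['timestamp'], $parts['machine'], $parts['sequence']);
```

Since the timestamp occupies the most significant bits, IDs minted in a later millisecond always compare greater, whichever machine produced them. That is what makes the scheme K-ordered.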
![Page 25: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/25.jpg)
Boundary Flake
• 128 bits
• No co-ordination at all
• K-ordered
• Erlang service
![Page 26: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/26.jpg)
Boundary Flake
64 bits: Timestamp (millisecond precision, 1970 epoch)
48 bits: MAC address
16 bits: Sequence number
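A 128-bit ID does not fit in a native PHP integer, so one way to picture this layout is as a fixed-width hex string. This is only an illustration of the bit layout, not Flake's actual output format (Flake is an Erlang service); `flakeStyleId` is a hypothetical helper, and the timestamp and MAC values are arbitrary examples.

```php
<?php
// Hex-encode a 128-bit ID laid out as
// 64-bit timestamp | 48-bit MAC address | 16-bit sequence.
function flakeStyleId(int $timestampMs, string $macHex, int $sequence): string
{
    return sprintf(
        '%016x%012s%04x',
        $timestampMs,           // 16 hex chars
        strtolower($macHex),    // 12 hex chars, zero-padded on the left
        $sequence & 0xffff      // 4 hex chars
    );
}

$first  = flakeStyleId(1343893652000, '001c42000009', 0);
$second = flakeStyleId(1343893652000, '001c42000009', 1);
$later  = flakeStyleId(1343893652001, '001c42000009', 0);
```

With fixed-width fields and the timestamp first, plain string comparison sorts the IDs by time, and the MAC address removes the need for any configured machine ID.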
![Page 27: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/27.jpg)
PHP Cruftflake
• Based on Twitter Snowflake
• No co-ordination (after startup)
• K-ordered
• PHP, ZeroMQ interface, uses Zookeeper for configuration
![Page 28: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/28.jpg)
Questions?
![Page 29: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/29.jpg)
References
Flickr distributed ticket server: http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
UUIDs: http://tools.ietf.org/html/rfc4122
How random are random UUIDs?: http://stackoverflow.com/a/2514722/15318
Twitter Snowflake: https://github.com/twitter/snowflake
Boundary Flake: https://github.com/boundary/flake
PHP Cruftflake: https://github.com/davegardnerisme/cruftflake
![Page 30: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/30.jpg)
private function mintId64($timestamp, $machine, $sequence)
{
    $timestamp = (int)$timestamp;
    $value = ($timestamp << 22) | ($machine << 12) | $sequence;
    return (string)$value;
}

private function mintId32($timestamp, $machine, $sequence)
{
    $hi = (int)($timestamp / pow(2,10));
    $lo = (int)($timestamp * pow(2, 22));

    // stick in the machine + sequence to the low bit
    $lo = $lo | ($machine << 12) | $sequence;

    // reconstruct into a string of numbers
    $hex = pack('N2', $hi, $lo);
    $unpacked = unpack('H*', $hex);
    $value = $this->hexdec($unpacked[1]);
    return (string)$value;
}
![Page 31: Unique ID generation in distributed systems](https://reader031.vdocuments.us/reader031/viewer/2022020206/547ba771b479599a098b4d56/html5/thumbnails/31.jpg)
public function generate()
{
    $t = floor($this->timer->getUnixTimestamp() - $this->epoch);
    if ($t !== $this->lastTime) {
        $this->sequence = 0;
        $this->lastTime = $t;
    } else {
        $this->sequence++;
        if ($this->sequence > 4095) {
            throw new \OverflowException('Sequence overflow');
        }
    }
    if (PHP_INT_SIZE === 4) {
        return $this->mintId32($t, $this->machine, $this->sequence);
    } else {
        return $this->mintId64($t, $this->machine, $this->sequence);
    }
}