Serialization and performance by Sergey Morenets
Serialization and performance
Sergey Morenets, November 21, 2013
About the author
• Has worked in IT since 2000
• 10 years of Java SE/EE experience
• Has held senior Java developer and team lead positions
• Winner of the 2013 JBoss Community Recognition Award. https://www.jboss.org/jbcra
Software witchcraft
• Software Development in a nutshell
• Art of interview
• Java under microscope
• Magic of refactoring
• Pattern-driven design
• Java Developer Toolkit
• Deconstructing Java
• From novice to architect
• Business and technical English
• Business thinking and communication
• Successful career in IT
Agenda
• Purpose of serialization
• Frameworks overview
• Performance testing
• Q & A
Serialization
• File storage
• Database
• Network communication
• Web usage
Serialization
• Simple
• Flexible
• Fast
• Compact
• Versioning
• Scalable
Data formats
• Binary
• XML
• JSON
• YAML
Differences
• Source code changes
• Schemas
• Optimization & customization
• Interoperability
• Output class knowledge
Java serialization
• Your class should implement the Serializable interface
• The least programming effort
• Out-of-the-box functionality
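A minimal sketch of the out-of-the-box mechanism described on this slide, using only the JDK. The `User` class, its `login` field and the `JavaSerializationDemo` helper are hypothetical names chosen for illustration; they are not taken from the talk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical domain class; implementing Serializable is the only requirement.
class User implements Serializable {
    private static final long serialVersionUID = 1L;
    final String login;

    User(String login) {
        this.login = login;
    }
}

class JavaSerializationDemo {
    // Writes the object graph into an in-memory byte array.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores an object from bytes produced by serialize().
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new User("alice"));
        User copy = (User) deserialize(bytes);
        System.out.println(copy.login + " restored from " + bytes.length + " bytes");
    }
}
```

No framework-specific code is needed in `User` itself, which is what makes this the lowest-effort option.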
Java serialization
• Decreases the flexibility to change a class’s implementation once it has been released
• Doesn’t allow exchanging data with C++/Python applications
• Deserialization bypasses constructors, which opens a hole for invariant corruption and illegal access
• No customization
• You need access to the source code
Java externalization
• Serialization, but the class implements the Externalizable interface to persist and restore the object
• The class is responsible for saving and restoring the contents of its instances
• Requires changes to the marshalling/unmarshalling code whenever the class contents change
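A minimal sketch of externalization with the JDK only. The `Account` class and its fields are hypothetical; the point is that the class itself decides what is written and in which order, so adding a field means editing both methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class that controls its own wire representation.
class Account implements Externalizable {
    String login;
    int orderCount;

    // Externalizable requires a public no-argument constructor.
    public Account() {
    }

    Account(String login, int orderCount) {
        this.login = login;
        this.orderCount = orderCount;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(login);
        out.writeInt(orderCount);
    }

    // Fields must be read in exactly the order they were written.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        login = in.readUTF();
        orderCount = in.readInt();
    }
}

class ExternalizationDemo {
    // Serializes and immediately deserializes the given instance.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```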
Avro
• Schema evolution
• Binary and JSON encoding
• Dynamic typing
• Supports Java, C, C++, C# and Python
Avro
{"namespace": "org.test.domain",
 "type": "record",
 "name": "User",
 "fields": [
   {"name": "login", "type": "string"}
 ]}
XML
• Interchangeable format
• Schema support
• Space-intensive, with a large performance cost
• Complex navigation
Simple
• High performance XML serialization and configuration framework for Java.
• Requires absolutely no configuration
• Can handle cycles in the object graph
Javolution
• Fast real-time library for safety-critical applications
• Based on OSGi context
• Parallel computing support
Json-io
• Doesn’t require custom interfaces, attributes or access to the source code
• Handles cyclic references
• Reader/writer customization
• Does not depend on any native or 3rd party libraries
Google gson
• A Java library to convert JSON to Java objects and vice-versa
• Doesn’t require the source code of serialized objects
• Allows custom representations
Jackson
• High-performance, ergonomic JSON processor Java library
• Extensive customization tools
• Mix-in annotations
• Materialized interfaces
• Multiple data formats
Jackson
• JSON
• CSV
• Smile (binary JSON)
• XML
• YAML (similar to JSON)
BSON for Jackson
• Binary encoded JSON
• Main data exchange format for MongoDB
• Allows writing custom extensions
Protocol buffers
• Way of encoding structured data in an efficient yet extensible format.
• Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.
Protocol buffers
message User {
  required string login = 1;
  repeated Order orders = 2;
}
message Order {
  required int32 id = 1;
  optional string date = 2;
}
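To show why this encoding is compact, here is a hand-written sketch of the Protocol Buffers wire format for the two messages above. It does not use the protobuf library or generated classes; `WireFormatSketch` and its methods are hypothetical names, and the optional `date` field is simply left unset. Each field is written as a key `(field_number << 3) | wire_type`, followed by a base-128 varint or a length-prefixed payload.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class WireFormatSketch {
    static final int WIRE_VARINT = 0;
    static final int WIRE_LENGTH_DELIMITED = 2;

    // Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Field key: field number in the upper bits, wire type in the low three bits.
    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, (fieldNumber << 3) | wireType);
    }

    // Strings and nested messages share the length-delimited wire type.
    static void writeBytesField(ByteArrayOutputStream out, int fieldNumber, byte[] payload) {
        writeTag(out, fieldNumber, WIRE_LENGTH_DELIMITED);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
    }

    // Order with only the required id (field 1) set.
    static byte[] encodeOrder(int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeTag(out, 1, WIRE_VARINT);
        writeVarint(out, id);
        return out.toByteArray();
    }

    // User with login (field 1) and zero or more nested orders (field 2).
    static byte[] encodeUser(String login, int... orderIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeBytesField(out, 1, login.getBytes(StandardCharsets.UTF_8));
        for (int id : orderIds) {
            writeBytesField(out, 2, encodeOrder(id));
        }
        return out.toByteArray();
    }
}
```

With this sketch, `encodeUser("bob", 300)` yields the ten bytes `0A 03 62 6F 62 12 03 08 AC 02`: field names never appear on the wire, only field numbers.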
FST
• Focus on speed, size and compatibility
• Use case is high-performance message-oriented software
• Drop-in replacement
• Custom optimization using annotations and custom serializers
GridGain
• Part of a distributed computing system
• Doesn’t require any custom interfaces or API
• Direct memory copying by invoking native "unsafe" operations
• Predefined fields introspection
Kryo
• Fast and efficient object graph serialization framework for Java
• Open source project on Google Code
• Automatic deep and shallow copying/cloning
• Doesn’t put requirements on the source classes (in most cases)
Kryo
• Twitter
• Apache Hive
• Akka
• Storm
• S4
Benchmark
• JDK 1.7.0.45
• Apache Avro 1.7.5
• Simple 2.7.1
• Json-io 2.2.32
• Google GSON 2.2.4
• Jackson 2.3.0
• BSON for Jackson 2.2.3
• Protocol buffers 2.5
• Kryo 2.22
• FST 1.28
• GridGain 5.3.0
Benchmark
• Speed (serialization and deserialization)
• Size (complex and ordinary objects)
• Flexibility
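The speed and size criteria can be measured with a harness along these lines. This is only a sketch under stated assumptions: the `Codec` interface, the `Result` holder and the iteration counts are hypothetical, and it is not the harness that produced the numbers in this talk. Only JDK serialization is plugged in here; other frameworks would be wrapped in the same interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class MiniBenchmark {
    // Hypothetical abstraction so that different frameworks can be compared.
    interface Codec {
        byte[] encode(Object value);
    }

    static final class Result {
        final long elapsedMillis;
        final int sizeBytes;

        Result(long elapsedMillis, int sizeBytes) {
            this.elapsedMillis = elapsedMillis;
            this.sizeBytes = sizeBytes;
        }
    }

    // Runs warm-up rounds first so the JIT compiler can optimize the hot path,
    // then times the measured iterations and records the encoded size.
    static Result measure(Codec codec, Object sample, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            codec.encode(sample);
        }
        int size = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            size = codec.encode(sample).length;
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(elapsedMillis, size);
    }

    // Baseline codec: standard Java serialization.
    static byte[] jdkEncode(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            sample.add("item-" + i);
        }
        Result result = measure(MiniBenchmark::jdkEncode, sample, 1_000, 10_000);
        System.out.println(result.elapsedMillis + " ms, " + result.sizeBytes + " bytes");
    }
}
```

A hand-rolled loop like this is sensitive to JIT and garbage collection effects, so results should be read as rough comparisons only.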
Serialization(complex)
# Framework Time(ms)
1 Kryo(optimized) 249
2 Protocol buffers 304
3 Kryo(unsafe) 356
4 FST 433
5 Jackson(smile) 480
6 Kryo 510
7 Java serialization 518
8 Jackson(XML) 634
9 GridGain 700
10 Jackson 803
11 Javolution 1346
12 Google GSON 1448
Serialization(simple)
# Framework Time(ms)
1 Protocol buffers 2
2 Google GSON 32
3 Java serialization 35
4 Kryo(optimized) 36
5 Kryo(unsafe) 52
6 Kryo 55
7 BSON for Jackson 72
8 Jackson(smile) 75
9 Jackson(XML) 77
10 Jackson 115
11 FST 149
12 Javolution 176
Deserialization(complex)
# Framework Time(ms)
1 Kryo(optimized) 254
2 Kryo(unsafe) 376
3 Protocol buffers 384
4 Jackson(smile) 469
5 GridGain 460
6 FST 492
7 Kryo 597
8 Java serialization 597
9 Jackson 1069
10 Jackson(XML) 1087
11 Google GSON 1484
12 BSON for Jackson 1857
Deserialization(simple)
# Framework Time(ms)
1 Protocol buffers 6
2 Google GSON 30
3 Kryo(optimized) 36
4 GridGain 47
5 Kryo(unsafe) 53
6 Kryo 54
7 Jackson(smile) 75
8 BSON for Jackson 61
9 FST 112
10 Jackson 116
11 Java serialization 123
12 Json-io 162
Size(complex)
# Framework Size(bytes)
1 Kryo(optimized) 33904
2 FST 34069
3 Kryo 35674
4 Protocol buffers 39477
5 Kryo(unsafe) 40554
6 Jackson(smile) 44840
7 Java serialization 49757
8 GridGain 58288
9 Jackson 67858
10 Google GSON 68338
11 Jackson(YAML) 79017
12 Javolution 80021
Size(simple)
# Framework Size(bytes)
1 Kryo(optimized) 18
2 Kryo 18
3 Protocol buffers 20
4 Kryo(unsafe) 21
5 GridGain 33
6 Jackson(smile) 40
7 Jackson 41
8 Google GSON 41
9 Jackson(YAML) 41
10 BSON for Jackson 46
11 Simple 48
12 FST 57
Usability
# Framework
1 Google GSON
2 Kryo
2 Kryo(unsafe)
3 Jackson
3 Jackson(XML)
3 Jackson(Smile)
3 Jackson(YAML)
3 BSON for Jackson
4 Json-io
5 FST
6 Java serialization
7 Kryo(optimized)
Overall rating
# Framework Rating
1 Kryo(optimized) 73
2 Kryo(unsafe) 65
3 Protocol buffers 63
4 Kryo 59
5 Jackson(smile) 51
6 Google GSON 45
7 FST 42
8 GridGain 34
9 Jackson 32
10 Java serialization 30
11 BSON for Jackson 24
12 Jackson(XML) 21
Advice
Framework Usage
Kryo When you need a fast and compact serializer for complex objects over the network
Protocol buffers When you need a fast serializer for simple objects
Jackson(smile) When you need a Jackson-based serializer for Web usage
Google GSON When you need a quick-and-dirty solution to serialize/deserialize objects
Apache Avro When you need to serialize objects into files with possible schema changes
Java When you need an out-of-the-box solution without additional libraries