Serialization and performance by Sergey Morenets, 21 November 2013

Upload: alex-tumanoff

Post on 10-May-2015


TRANSCRIPT

Page 1: Serialization and performance by Sergey Morenets

Serialization and performance

Sergey Morenets, 21 November 2013

Page 2: Serialization and performance by Sergey Morenets

About the author
• Has worked in IT since 2000
• 10 years of Java SE/EE experience
• Has held senior Java developer and team lead positions
• Winner of the 2013 JBoss Community Recognition Award. https://www.jboss.org/jbcra

Page 3: Serialization and performance by Sergey Morenets

Software witchcraft
• Software Development in a nutshell
• Art of interview
• Java under microscope
• Magic of refactoring
• Pattern-driven design
• Java Developer Toolkit
• Deconstructing Java
• From novice to architect
• Business and technical English
• Business thinking and communication
• Successful career in IT

Page 4: Serialization and performance by Sergey Morenets

Agenda
• Purpose of serialization
• Frameworks overview
• Performance testing
• Q & A

Page 5: Serialization and performance by Sergey Morenets

Serialization
• File storage
• Database
• Network communication
• Web usage

Page 6: Serialization and performance by Sergey Morenets

Serialization
• Simple
• Flexible
• Fast
• Compact
• Versioning
• Scalable

Page 7: Serialization and performance by Sergey Morenets

Data formats
• Binary
• XML
• JSON
• YAML

Page 8: Serialization and performance by Sergey Morenets

Difference
• Source code changes
• Schemas
• Optimization & customization
• Interoperability
• Output class knowledge

Page 9: Serialization and performance by Sergey Morenets

Java serialization
• Your class must implement the Serializable interface
• Requires the least programming effort
• Out-of-the-box functionality
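As an illustration of the points above, here is a minimal round-trip sketch that uses only the JDK. The `User` class and the helper names are hypothetical and are not taken from the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class JavaSerializationDemo {

    // Hypothetical domain class: implementing the marker interface is all that is required.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String login;

        User(String login) {
            this.login = login;
        }
    }

    // Writes the object graph into an in-memory byte array.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Reads an object back from bytes produced by serialize().
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new User("alice"));
        User copy = (User) deserialize(bytes);
        System.out.println(copy.login + " restored from " + bytes.length + " bytes");
    }
}
```

No extra library or configuration is involved, which is what makes this the lowest-effort option.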

Page 10: Serialization and performance by Sergey Morenets

Java serialization
• Decreases the flexibility to change a class’s implementation once it has been released
• Cannot exchange data with C++ or Python applications
• Deserialization bypasses the class’s constructors, which opens a hole for invariant corruption and illegal access
• No customization
• Requires access to the source code of the class
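The constructor-related risk can be made concrete with a small sketch. It assumes a hypothetical `Period` class whose constructor enforces `end >= start` and counts its own invocations. Deserializing an instance does not run that constructor, so the check is never repeated for data read from a stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ConstructorBypassDemo {

    // Hypothetical class: the constructor is the only place the invariant is checked.
    static class Period implements Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0; // static, so it is not part of the serialized state
        final long start;
        final long end;

        Period(long start, long end) {
            if (end < start) {
                throw new IllegalArgumentException("end before start");
            }
            constructorCalls++;
            this.start = start;
            this.end = end;
        }
    }

    // Serializes and immediately deserializes the given object.
    static Period roundTrip(Period value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Period) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Period copy = roundTrip(new Period(1L, 2L));
        // Two Period objects now exist, but the constructor ran only once.
        System.out.println("constructor calls: " + Period.constructorCalls);
        System.out.println("copy: " + copy.start + ".." + copy.end);
    }
}
```

Because the field values come straight from the byte stream, a stream that has been tampered with could yield a `Period` that the constructor would have rejected.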

Page 11: Serialization and performance by Sergey Morenets

Java externalization
• Serialization, but the class implements the Externalizable interface to persist and restore the object
• The class is responsible for saving and restoring the contents of its instances
• Requires changes to the marshalling/unmarshalling code whenever the class contents change
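A minimal sketch of the approach described above, using a hypothetical `User` class with two fields. The class writes and reads its own state, so adding a field later means editing both `writeExternal` and `readExternal`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {

    // Hypothetical class that controls its own wire format.
    public static class User implements Externalizable {
        String login;
        int age;

        // Externalizable requires a public no-argument constructor.
        public User() {
        }

        User(String login, int age) {
            this.login = login;
            this.age = age;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(login);
            out.writeInt(age);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Fields must be read in the same order they were written.
            login = in.readUTF();
            age = in.readInt();
        }
    }

    // Serializes and immediately deserializes the given user.
    static User roundTrip(User user) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(user);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (User) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        User copy = roundTrip(new User("alice", 30));
        System.out.println(copy.login + ", " + copy.age);
    }
}
```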

Page 12: Serialization and performance by Sergey Morenets

Java externalization

Page 13: Serialization and performance by Sergey Morenets

Avro

• Schema evolution
• Binary and JSON encoding
• Dynamic typing
• Supports Java, C, C++, C# and Python

Page 14: Serialization and performance by Sergey Morenets

Avro

{
  "namespace": "org.test.domain",
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "login", "type": "string"}
  ]
}

Page 15: Serialization and performance by Sergey Morenets

Avro

Page 16: Serialization and performance by Sergey Morenets

XML
• Interchangeable format
• Supports schemas
• Space-intensive, with a significant performance cost
• Complex navigation

Page 17: Serialization and performance by Sergey Morenets

Simple

• High-performance XML serialization and configuration framework for Java
• Requires absolutely no configuration
• Can handle cycles in the object graph

Page 19: Serialization and performance by Sergey Morenets

Simple

Page 20: Serialization and performance by Sergey Morenets

Javolution

• Fast real-time library for safety-critical applications
• Based on OSGi context
• Parallel computing support

Page 21: Serialization and performance by Sergey Morenets

Javolution

Page 22: Serialization and performance by Sergey Morenets

Json-io

• Doesn’t require custom interfaces, attributes or access to the source code
• Handles cyclic references
• Reader/writer customization
• Does not depend on any native or third-party libraries

Page 23: Serialization and performance by Sergey Morenets

Google gson

• A Java library to convert JSON to Java objects and vice versa
• Doesn’t require the source code of serialized objects
• Allows custom representations

Page 24: Serialization and performance by Sergey Morenets

Jackson

• High-performance, ergonomic JSON processing library for Java
• Extensive customization tools
• Mix-in annotations
• Materialized interfaces
• Multiple data formats

Page 25: Serialization and performance by Sergey Morenets

Jackson

• JSON
• CSV
• Smile (binary JSON)
• XML
• YAML (similar to JSON)

Page 26: Serialization and performance by Sergey Morenets

BSON for Jackson

• Binary-encoded JSON
• Main data exchange format for MongoDB
• Allows writing custom extensions

Page 27: Serialization and performance by Sergey Morenets

Protocol buffers

• Way of encoding structured data in an efficient yet extensible format.

• Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

Page 28: Serialization and performance by Sergey Morenets

Protocol buffers

message User {
  required string login = 1;
  repeated Order orders = 2;
}

message Order {
  required int32 id = 1;
  optional string date = 2;
}

Page 29: Serialization and performance by Sergey Morenets

Protocol buffers

Page 30: Serialization and performance by Sergey Morenets

FST

• Focus on speed, size and compatibility
• Targets high-performance, message-oriented software
• Drop-in replacement
• Custom optimization using annotations and custom serializers

Page 31: Serialization and performance by Sergey Morenets

GridGain

• Part of a distributed computing system
• Doesn’t require any custom interfaces or API
• Direct memory copying by invoking native "unsafe" operations
• Predefined fields introspection

Page 32: Serialization and performance by Sergey Morenets

Kryo

• Fast and efficient object graph serialization framework for Java
• Open-source project hosted on Google Code
• Automatic deep and shallow copying/cloning
• Places no requirements on the source classes (in most cases)

Page 33: Serialization and performance by Sergey Morenets

Kryo
• Twitter
• Apache Hive
• Akka
• Storm
• S4

Page 34: Serialization and performance by Sergey Morenets

Kryo

Page 35: Serialization and performance by Sergey Morenets

Kryo

Page 36: Serialization and performance by Sergey Morenets

Benchmark

• JDK 1.7.0.45
• Apache Avro 1.7.5
• Simple 2.7.1
• Json-io 2.2.32
• Google GSON 2.2.4
• Jackson 2.3.0
• BSON for Jackson 2.2.3
• Protocol buffers 2.5
• Kryo 2.22
• FST 1.28
• GridGain 5.3.0

Page 37: Serialization and performance by Sergey Morenets

Benchmark

• Speed (serialization and deserialization)
• Size (complex and ordinary objects)
• Flexibility
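To show how the speed and size criteria could be measured, here is a rough sketch for built-in Java serialization only. It is not the harness used for the results in this deck. The `User` class, the iteration counts and the method names are assumptions, and a dedicated tool such as JMH would give more reliable timings.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationBenchmarkSketch {

    // Hypothetical "simple" object for the measurement.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String login;

        User(String login) {
            this.login = login;
        }
    }

    // Serializes one object into a byte array.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Returns the elapsed time in milliseconds for the given number of serializations.
    static long timeSerializationMillis(Object value, int iterations) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            serialize(value);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        User user = new User("alice");
        timeSerializationMillis(user, 10_000); // warm-up so the JIT compiler can kick in
        long millis = timeSerializationMillis(user, 100_000);
        System.out.println("size: " + serialize(user).length + " bytes");
        System.out.println("time: " + millis + " ms for 100000 serializations");
    }
}
```

The same pattern applies to deserialization, and to other frameworks by swapping the body of `serialize`.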

Page 38: Serialization and performance by Sergey Morenets

Serialization (complex)

#    Framework            Time (ms)
1    Kryo(optimized)      249
2    Protocol buffers     304
3    Kryo(unsafe)         356
4    FST                  433
5    Jackson(smile)       480
6    Kryo                 510
7    Java serialization   518
8    Jackson(XML)         634
9    GridGain             700
10   Jackson              803
11   Javolution           1346
12   Google GSON          1448

Page 39: Serialization and performance by Sergey Morenets

Serialization (simple)

#    Framework            Time (ms)
1    Protocol buffers     2
2    Google GSON          32
3    Java serialization   35
4    Kryo(optimized)      36
5    Kryo(unsafe)         52
6    Kryo                 55
7    BSON for Jackson     72
8    Jackson(smile)       75
9    Jackson(XML)         77
10   Jackson              115
11   FST                  149
12   Javolution           176

Page 40: Serialization and performance by Sergey Morenets

Deserialization (complex)

#    Framework            Time (ms)
1    Kryo(optimized)      254
2    Kryo(unsafe)         376
3    Protocol buffers     384
4    Jackson(smile)       469
5    GridGain             460
6    FST                  492
7    Kryo                 597
8    Java serialization   597
9    Jackson              1069
10   Jackson(XML)         1087
11   Google GSON          1484
12   BSON for Jackson     1857

Page 41: Serialization and performance by Sergey Morenets

Deserialization (simple)

#    Framework            Time (ms)
1    Protocol buffers     6
2    Google GSON          30
3    Kryo(optimized)      36
4    GridGain             47
5    Kryo(unsafe)         53
6    Kryo                 54
7    Jackson(smile)       75
8    BSON for Jackson     61
9    FST                  112
10   Jackson              116
11   Java serialization   123
12   Json-io              162

Page 42: Serialization and performance by Sergey Morenets

Size (complex)

#    Framework            Size (bytes)
1    Kryo(optimized)      33904
2    FST                  34069
3    Kryo                 35674
4    Protocol buffers     39477
5    Kryo(unsafe)         40554
6    Jackson(smile)       44840
7    Java serialization   49757
8    GridGain             58288
9    Jackson              67858
10   Google GSON          68338
11   Jackson(YAML)        79017
12   Javolution           80021

Page 43: Serialization and performance by Sergey Morenets

Size (simple)

#    Framework            Size (bytes)
1    Kryo(optimized)      18
2    Kryo                 18
3    Protocol buffers     20
4    Kryo(unsafe)         21
5    GridGain             33
6    Jackson(smile)       40
7    Jackson              41
8    Google GSON          41
9    Jackson(YAML)        41
10   BSON for Jackson     46
11   Simple               48
12   FST                  57

Page 44: Serialization and performance by Sergey Morenets

Usability

#    Framework
1    Google GSON
2    Kryo
2    Kryo(unsafe)
3    Jackson
3    Jackson(XML)
3    Jackson(Smile)
3    Jackson(YAML)
3    BSON for Jackson
4    Json-io
5    FST
6    Java serialization
7    Kryo(optimized)

Page 45: Serialization and performance by Sergey Morenets

Overall rating

#    Framework            Rating
1    Kryo(optimized)      73
2    Kryo(unsafe)         65
3    Protocol buffers     63
4    Kryo                 59
5    Jackson(smile)       51
6    Google GSON          45
7    FST                  42
8    GridGain             34
9    Jackson              32
10   Java serialization   30
11   BSON for Jackson     24
12   Jackson(XML)         21

Page 46: Serialization and performance by Sergey Morenets

Advice

Framework            Usage
Kryo                 When you need a fast and compact serializer for complex objects sent over the network
Protocol buffers     When you need a fast serializer for simple objects
Jackson(smile)       When you need a Jackson-based serializer for Web usage
Google GSON          When you need a quick-and-dirty solution to serialize/deserialize objects
Apache Avro          When you need to serialize objects into files with possible schema changes
Java serialization   When you need an out-of-the-box solution without additional libraries

Page 47: Serialization and performance by Sergey Morenets

Q&A

• Sergey Morenets, [email protected]