serialization and performance in java

52
Serialization and performance Сергей Моренец 23 мая 2014 г.

Upload: strannik2013

Post on 18-Dec-2014

461 views

Category:

Technology


2 download

DESCRIPTION

1. Serialization overview 2. Java libraries for serialization. 3. Benchmarking

TRANSCRIPT

Page 1: Serialization and performance in Java

Serialization and performance

Сергей Моренец23 мая 2014 г.

Page 2: Serialization and performance in Java

About author

Works in IT since 2000

10 year of Java SE/EE experience

Occupied senior Java developer/Team Lead positions

Winner of 2013 JBoss Community Recognition Award. https://www.jboss.org/jbcra

Page 3: Serialization and performance in Java

Agenda

•Purpose of serialization•Frameworks overview•Performance testing•Q & A

Page 4: Serialization and performance in Java

Serialization

File storages

Database

Network communication

Web usage

Page 5: Serialization and performance in Java

Serialization

Simple

Flexible

Compact

Versioning

Fast Scalable

Page 6: Serialization and performance in Java

Data formats

Binary

XML

JSON

YAML

Page 7: Serialization and performance in Java

Performance

•Native memory copying using C operations•“Unsafe” operations•Ignore object introspection•Direct object-object copying

Page 8: Serialization and performance in Java

Java serialization

•The easiest programming effort•Out-of-the-box functionality

Page 9: Serialization and performance in Java

Java serialization

•Serializable interface•Decreases the flexibility to change a class’s implementation once it has been released•Doesn’t allow to exchange data with

C++/Python applications•Due to default constructors hole for invariant corruption and illegal access•No customization•You should have access to the source code

No customizationYou should have access to the source code

Page 10: Serialization and performance in Java

Java externalization

•Serialization but by implementing Externalizable interface to persist and restore the object•Responsibility of the class to save and

restore the contents of its instances•Requires modifications in

marshalling/unmarshalling code if the class contents changed

No customizationYou should have access to the source code

Page 11: Serialization and performance in Java

Java externalization

Page 12: Serialization and performance in Java

Avro

•Schema evolution•Binary and JSON encoding•Dynamic typing•Support of Java, C, C++, C# and Python•Apache Hadoop integration

Page 13: Serialization and performance in Java

Avro

Page 14: Serialization and performance in Java

Avro

Page 15: Serialization and performance in Java

XML

•Interchangeable format•Supported schemas•Space intensive and huge performance loss•Complex navigating

Page 16: Serialization and performance in Java

Simple

•High performance XML serialization and configuration framework for Java.•Requires absolutely no configuration•Can handle cycles in the object graph

Page 17: Serialization and performance in Java

Simple

Page 18: Serialization and performance in Java

Javolution

•Fast real-time library for safety-critical applications•Based on OSGi context•Parallel computing support

Page 19: Serialization and performance in Java

Javolution

Page 20: Serialization and performance in Java

Json-io

•Doesn’t require custom interfaces/attributes usage/source code•Handles cyclic references•Reader/writer customization•Does not depend on any native or 3rd party

libraries.

Page 21: Serialization and performance in Java

Google gson

•Java library to convert JSON to Java objects and vice-versa•Doesn’t require source code of serialized objects•Allow custom representatives

Page 22: Serialization and performance in Java

Google gson

Page 23: Serialization and performance in Java

Jackson

•High-performance, ergonomic JSON processor Java library•Extensive customization tools•Mix-in annotations•Materialized interfaces•Multiple data formats

Page 24: Serialization and performance in Java

Jackson

•JSON•CSV•Smile(binary JSON)•XML•YAML(similar to JSON)

Page 25: Serialization and performance in Java

BSON for Jackson

•Binary encoded JSON•Main data exchange format for MongoDB•Allows writing custom extensions

Page 26: Serialization and performance in Java

Protocol buffers

•Way of encoding structured data in an efficient yet extensible format. •Google uses Protocol Buffers for almost all of

its internal RPC protocols and file formats. • Supported in Java, C++, Python

Page 27: Serialization and performance in Java

Protocol buffers

message User { required string login = 1; repeated Order orders = 2;}

message Order { required int32 id = 1; optional string date = 2;}

Page 28: Serialization and performance in Java

Protocol buffers

Page 29: Serialization and performance in Java

FST

•Java-to-java library•No support for versioning•Use case is high performance message oriented

software•Drop-in replacement•Custom optimization using annotations, custom

serializers

Page 30: Serialization and performance in Java

FST

Page 31: Serialization and performance in Java

GridGain

•Part of distributed computing system•Don’t require any custom interfaces or API •Direct memory copying by invoking native

"unsafe" operations•Predefined fields introspection

Page 32: Serialization and performance in Java

GridGain

Page 33: Serialization and performance in Java

Kryo

•Fast and efficient object graph serialization framework for Java•Open source project on Google code•Automatic deep and shallow copying/cloning•Doesn’t put requirements on the source classes(in most cases)

Page 34: Serialization and performance in Java

Kryo

•Twitter•Apache Hive•Akka•Storm•S4

Page 35: Serialization and performance in Java

Kryo

Page 36: Serialization and performance in Java

Kryo

Page 37: Serialization and performance in Java

Benchmark

•JDK 1.8.0.5•Apache Avro 1.7.6•Simple 2.7.1•Json-io 2.5.2•Google GSON 2.2.4•Jackson 2.3.2•BSON for Jackson 2.3.1•Protocol buffers 2.5•Kryo 2.23•FST 1.54•GridGain 6.0.2

Page 38: Serialization and performance in Java

Benchmark

•Speed(serialization and deserialization)•Size(complex and ordinary objects)•Flexibility

Page 39: Serialization and performance in Java

Benchmark

Page 40: Serialization and performance in Java

Benchmark

Page 41: Serialization and performance in Java

Issues

Library Description

Gson, Jackson Crashed when serializing cyclic dependency

Simple Crashed for very big XML file

Avro Bug during deserialization

Page 42: Serialization and performance in Java

Serialization (complex)

# Library Time(ms)

1 Kryo(optimized) 134

2 Protocol buffers 165

3 GridGain 196

4 FST 207

5 Kryo 209

6 Jackson(smile) 275

7 Kryo(unsafe) 306

8 Jackson 491

9 Java serialization 605

10 Javolution 1043

Page 43: Serialization and performance in Java

Serialization (simple)

# Library Time(ms)

1 Protocol buffers <1

2 Google GSON 5

3 Java serialization 10

4 BSON for Jackson 10

5 Jackson(smile) 11

6 Kryo(optimized) 17

7 Kryo 18

8 Jackson 18

9 Kryo(unsafe) 20

10 Javolution 21

Page 44: Serialization and performance in Java

Deserialization (complex)

# Library Time(ms)

1 Kryo(optimized) 113

2 Protocol buffers 165

3 GridGain 196

4 FST 207

5 Kryo 209

6 Jackson(smile) 275

7 Kryo(unsafe) 306

8 Jackson 491

9 Java serialization 605

10 BSON for Jackson 930

Page 45: Serialization and performance in Java

Deserialization (simple)

# Library Time(ms)

1 Protocol buffers 1

2 GridGain 3

3 Google GSON 6

4 Jackson(smile) 9

5 BSON for Jackson 9

6 Kryo(optimized) 18

7 Kryo 18

8 Jackson 18

9 Kryo(unsafe) 20

10 FST 42

Page 46: Serialization and performance in Java

Size (complex)

# Library Size(bytes)

1 Kryo(optimized) 33904

2 FST 34069

3 Kryo 35674

4 Protocol buffers 39517

5 Kryo(unsafe) 40554

6 Jackson(smile) 44840

7 Java serialization 49757

8 GridGain 58288

9 Jackson 67858

10 Google GSON 68338

Page 47: Serialization and performance in Java

Size (simple)

# Library Size(bytes)

1 Kryo(optimized) 18

2 Kryo 18

3 Protocol buffers 20

4 Kryo(unsafe) 21

5 GridGain 33

6 Jackson(smile) 40

7 Jackson 41

8 Google GSON 41

9 Jackson(YAML) 41

10 BSON for Jackson 46

Page 48: Serialization and performance in Java

Usability

# Library

1 Google GSON

2 Kryo

2 Kryo(unsafe)

3 Jackson

3 Jackson(XML, Smile, YAML)

3 BSON for Jackson

4 Json-io

5 FST

6 Java serialization

7 Kryo(optimized)

Page 49: Serialization and performance in Java

Overall rating (2014)

# Library Rating

1 Kryo(optimized) 67

2 Protocol buffers 65

3 Kryo 58

4 Jackson(smile) 55

5 Kryo(unsafe) 46

6 GridGain 44

7 Google GSON 43

8 FST 43

9 Jackson 40

10

BSON for Jackson 33

11

Java serialization 32

Page 50: Serialization and performance in Java

Overall rating (2013)

# Library Rating

1 Kryo(optimized) 67

2 Kryo(unsafe) 65

3 Protocol buffers 63

4 Kryo 59

5 Jackson(smile) 51

6 Google GSON 45

7 FST 42

8 GridGain 34

9 Jackson 32

10

Java serialization 30

11

BSON for Jackson 24

Page 51: Serialization and performance in Java

Advices

Library Usage

Kryo Fast and compact serializer for complex objects over network

Protocol buffers Fast serializer for simple objects

Jackson(smile) Jackson-based serializer for Web usage

Google JSON Dirty solution to quickly serialize/deserialize objects

Apache Avro Serialize objects into files with possible schema changes

Java Out-of-the-box trusted solution without additional libraries

Page 52: Serialization and performance in Java

Сергей Моренец[email protected]

Q&A