© 2018 GridGain Systems, Inc.
Distributed Machine Learning with Zero ETL
Yury Babak
Head of development, GridGain
Long ETL
- X%
- X%
Distributed Training
Node Crash
Apache Ignite
Apache Ignite: Replicated Caches
[Diagram: a client and four server nodes (Server Node 1 to Server Node 4)]
Map Reduce
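A minimal sketch of the map-reduce pattern named on this slide, not Apache Ignite's API. It assumes the data is already split into partitions (one list per node) and computes a mean: each partition is mapped to a small partial result, and the partial results are reduced into one. The names map_partition, combine, and distributed_mean are hypothetical.

```python
from functools import reduce

def map_partition(rows):
    """Map step: summarize one partition as a (sum, count) pair."""
    return (sum(rows), len(rows))

def combine(a, b):
    """Reduce step: merge two partial (sum, count) results."""
    return (a[0] + b[0], a[1] + b[1])

def distributed_mean(partitions):
    # In a cluster each map call would run on the node that owns the partition;
    # here they run in a plain loop for illustration.
    partials = [map_partition(p) for p in partitions]
    total, count = reduce(combine, partials, (0.0, 0))
    return total / count

# Hypothetical data spread over three partitions.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean = distributed_mean(partitions)
```

Only the small (sum, count) pairs cross the network in this scheme; the rows themselves stay where they are stored.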
Iterative Optimization Algorithm
Partition Based Data Set
Restoration of partitions after a failure
Recovering calculations after failure
OLS sample
Loss function
Gradient of loss function
Node 1, Node 2, ..., Node M
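The formulas from this slide did not survive extraction, so the following is a sketch under the textbook assumption that the OLS loss is the sum of squared residuals, L(w) = sum_i (x_i . w - y_i)^2, with gradient 2 * sum_i (x_i . w - y_i) * x_i. Because both are sums over rows, each node can compute the gradient over its own partition and the results can simply be added. The function names, learning rate, and data below are hypothetical and are not taken from Apache Ignite.

```python
def local_gradient(partition, w):
    """Map step: gradient of the squared-error loss over one node's rows."""
    grad = [0.0] * len(w)
    for x, y in partition:
        residual = sum(xj * wj for xj, wj in zip(x, w)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * residual * xj
    return grad

def global_gradient(partitions, w):
    """Reduce step: add up the per-node gradients."""
    total = [0.0] * len(w)
    for part in partitions:
        for j, g in enumerate(local_gradient(part, w)):
            total[j] += g
    return total

def fit(partitions, dim, learning_rate=0.01, iterations=2000):
    """Plain gradient descent; every iteration is one map-reduce round."""
    w = [0.0] * dim
    for _ in range(iterations):
        grad = global_gradient(partitions, w)
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical rows ([1.0, x], y) for y = 2*x + 1, split over three "nodes".
partitions = [
    [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0)],
    [([1.0, 2.0], 5.0)],
    [([1.0, 3.0], 7.0), ([1.0, 4.0], 9.0)],
]
weights = fit(partitions, dim=2)  # approaches [1.0, 2.0]
```

Each iteration sends the current weights to the nodes and receives one gradient vector per node, so network traffic depends on the number of features rather than the number of rows.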
Sample 2: LSQR
Limitations of Applicability
Iteration time
Number of Iterations
SGD
BS 1 000
BS 10
Training time
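One possible reading of this slide is that training time is the number of iterations multiplied by the time per iteration, and that in a distributed setting each iteration carries a fixed synchronization cost. The sketch below illustrates that reading only. Every number in it is hypothetical, chosen for the arithmetic and not measured, and training_time is an illustrative helper, not part of any library.

```python
def training_time(iterations, compute_per_iteration, overhead_per_iteration):
    """Total time = iterations * (compute time + fixed per-iteration overhead)."""
    return iterations * (compute_per_iteration + overhead_per_iteration)

# Hypothetical fixed cost of one distributed round (network + synchronization).
overhead = 0.005  # seconds

# Hypothetical: batch size 1 000 needs few, expensive iterations.
large_batch = training_time(
    iterations=100, compute_per_iteration=0.010, overhead_per_iteration=overhead
)

# Hypothetical: batch size 10 needs many, cheap iterations.
small_batch = training_time(
    iterations=10_000, compute_per_iteration=0.0001, overhead_per_iteration=overhead
)
```

With these made-up values both runs spend the same 1 second on computation, but the small-batch run pays the fixed overhead 10 000 times, so the per-iteration cost dominates its total.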
Want to learn more?
https://ignite.apache.org
https://apacheignite.readme.io/docs
https://github.com/apache/ignite