modern data architectures for real time analytics and engagement
![Page 1: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/1.jpg)
![Page 2: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/2.jpg)
Modern Data Architectures for Real-Time Analytics & Engagement
Russell Nash, APAC Solutions Architect
![Page 3: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/3.jpg)
Russell Nash, APAC Solutions Architect, Amazon Web Services
![Page 4: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/4.jpg)
SCALABLE FLEXIBLE MANAGEABLE COST EFFECTIVE
Modern Data Architecture
![Page 5: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/5.jpg)
Ingest Serving
Speed (Real-time)
Scale (Batch)
Data analysts
Data scientists
Business users
Engagement platforms
Automation / events
Sources
Modern Data Architecture
![Page 6: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/6.jpg)
Ingest Serving
Speed (Real-time)
Scale (Batch)
Data analysts
Data scientists
Business users
Engagement platforms
Automation / events
Sources
Real-time Pipeline
Amazon Kinesis
Machines
Devices
Mobile
Clickstream
![Page 7: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/7.jpg)
Amazon Kinesis Streams
Amazon Kinesis Firehose
Amazon Kinesis Analytics
Kinesis Family
![Page 8: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/8.jpg)
Availability Zone
Availability Zone
Availability Zone
Amazon Kinesis
Stream
AWS Lambda
KCL App
Amazon EMR
Streaming
Logs
Alerts
Analysis
Dashboards
Predictions
![Page 9: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/9.jpg)
Amazon Kinesis Stream
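Shard capacity (cumulative as shards are added):
1 shard: in 1,000 TPS or 1 MB/s; out 5 TPS or 2 MB/s
2 shards: in 2,000 TPS or 2 MB/s; out 10 TPS or 4 MB/s
3 shards: in 3,000 TPS or 3 MB/s; out 15 TPS or 6 MB/s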
Retention: 24 hours to 7 Days
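The per-shard figures on this slide can be turned into a rough sizing calculation. The following is a minimal sketch, assuming the limits are per second (1,000 records or 1 MB in, 2 MB out per shard); the function name `shards_needed` and the example workload are hypothetical.

```python
import math

# Per-shard limits as shown on the slide (assumed to be per second).
WRITE_RECORDS_PER_SHARD = 1000
WRITE_MB_PER_SHARD = 1.0
READ_MB_PER_SHARD = 2.0


def shards_needed(records_per_sec, avg_record_kb, consumers=1):
    """Estimate the shard count for a workload (hypothetical helper)."""
    write_mb = records_per_sec * avg_record_kb / 1024.0
    read_mb = write_mb * consumers  # every consumer reads the full stream
    return max(
        1,
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(write_mb / WRITE_MB_PER_SHARD),
        math.ceil(read_mb / READ_MB_PER_SHARD),
    )


# Example: 2,500 records/s of 2 KB each, read by two consumer applications.
print(shards_needed(2500, 2, consumers=2))
```

The estimate takes the largest of the three constraints, since whichever limit is hit first determines how many shards are required.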
![Page 10: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/10.jpg)
Creating a Kinesis Stream
![Page 11: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/11.jpg)
Amazon Kinesis Stream
SHARD
SHARD
SHARD
EVENT PRODUCERS
Kinesis Endpoint
Specify Partition Key
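Kinesis maps each partition key to a shard by taking the MD5 hash of the key as a 128-bit integer and finding the shard whose hash key range contains it. The sketch below illustrates that mapping under the simplifying assumption that the hash space is split evenly across shards; `shard_for_key` is a hypothetical helper, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit integer


def shard_for_key(partition_key, shard_count):
    """Return the index of the shard whose hash range contains the key.

    Assumes the hash key space is split evenly across shards, which is
    an illustrative simplification.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


for key in ["device-1", "device-2", "device-3"]:
    print(key, "-> shard", shard_for_key(key, 3))
```

Because the mapping is deterministic, all records with the same partition key land on the same shard, which keeps them in order for that key.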
![Page 12: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/12.jpg)
• Writes to one or more Amazon Kinesis Streams
• Retry mechanism
• Uses PutRecords
• Aggregates
• Integrates with Amazon KCL to de-aggregate
• Submits Amazon CloudWatch metrics
Kinesis Producer Library
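The batching and retry behaviour listed for the producer library can be illustrated in plain Python. This is a sketch of the idea only, not the KPL itself: `send_batch` is a hypothetical callable standing in for a PutRecords call, and the real library also handles aggregation, backoff, and metrics.

```python
MAX_RECORDS_PER_CALL = 500  # PutRecords accepts up to 500 records per request


def chunk(records, size=MAX_RECORDS_PER_CALL):
    """Yield successive batches of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def put_with_retry(records, send_batch, max_attempts=3):
    """Send records in batches, retrying only the ones that failed.

    `send_batch` is a hypothetical callable that takes a list of records
    and returns the sublist that was rejected (for example, throttled).
    Returns the records still unsent after `max_attempts` passes.
    """
    pending = list(records)
    for _ in range(max_attempts):
        if not pending:
            break
        failed = []
        for batch in chunk(pending):
            failed.extend(send_batch(batch))
        pending = failed
    return pending
```

Retrying only the rejected records matters because a PutRecords request can partially succeed; resending the whole batch would create duplicates.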
![Page 13: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/13.jpg)
Kinesis Agent
• Monitors files and sends new data records to your delivery stream
• Handles file rotation, checkpointing, and retry upon failures
• Delivers all data in a reliable, timely, and simple manner
• Emits Amazon CloudWatch metrics
![Page 14: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/14.jpg)
Availability Zone
Availability Zone
Availability Zone
Amazon Kinesis
Stream
AWS Lambda
KCL App
Amazon EMR
Streaming
Logs
Alerts
Analysis
Dashboards
Predictions
![Page 15: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/15.jpg)
Kinesis Data Out – Kinesis Client Library
SHARD 1
SHARD 2
SHARD 3
SHARD N
EC2 Instance
Worker 1
Worker 2
EC2 Instance
Worker 3
Worker N
KCL: Java, Node.js, Python, .NET, Ruby
![Page 16: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/16.jpg)
twitter-trends.com
twitter-trends.com website
![Page 17: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/17.jpg)
twitter-trends.com
The solution: Local Top 10
My top-10
My top-10
My top-10
Global top-10
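The local-to-global merge on this slide can be sketched with standard-library counters. The helper names are hypothetical, and the merge is exact only under the assumption that each tag is counted by a single worker (for example, when the tag is used as the partition key).

```python
from collections import Counter
import heapq


def local_top(counts, n=10):
    """Top-n (tag, count) pairs seen by one worker."""
    return heapq.nlargest(n, counts.items(), key=lambda kv: (kv[1], kv[0]))


def global_top(local_tops, n=10):
    """Merge per-worker top-n lists into one global top-n.

    Exact only if each tag is counted by a single worker, for example
    when the tag itself is the partition key (an assumption here).
    """
    merged = Counter()
    for top in local_tops:
        for tag, count in top:
            merged[tag] += count
    return local_top(merged, n)
```

Each worker only ships its own short list, so the global step stays cheap regardless of how many distinct tags the stream contains.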
![Page 18: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/18.jpg)
KINESIS
twitter-trends.com
Challenges using the Kinesis API directly
Kinesis application
Manual creation of workers and assignment to shards
How many workers per EC2 instance?
How many EC2 instances?
![Page 19: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/19.jpg)
KINESIS
twitter-trends.com
Using the Kinesis Client Library
Kinesis application
Shard mgmt table
![Page 20: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/20.jpg)
KINESIS
twitter-trends.com
Elasticity and load balancing
Shard mgmt table
Auto scaling Group
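The load-balancing idea can be illustrated with a simple round-robin assignment of shards to workers. This is a simplified stand-in, not the KCL's actual lease algorithm; the function name and the shard and worker identifiers are hypothetical.

```python
def balance(shards, workers):
    """Spread shards across workers as evenly as possible (round robin).

    A simplified stand-in for the lease balancing the KCL performs via
    its shard management table; the real algorithm differs.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for index, shard in enumerate(sorted(shards)):
        assignment[workers[index % len(workers)]].append(shard)
    return assignment


shards = ["shard-1", "shard-2", "shard-3", "shard-4"]
print(balance(shards, ["worker-a", "worker-b"]))
# After the Auto Scaling group adds an instance, rerun with three workers.
print(balance(shards, ["worker-a", "worker-b", "worker-c"]))
```

Recomputing the assignment whenever the worker list changes is what lets the application grow and shrink with the Auto Scaling group.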
![Page 21: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/21.jpg)
KINESIS
twitter-trends.com
Fault tolerance support in KCL
Shard mgmt table
Availability Zone 1 (X, unavailable)
Availability Zone 3
![Page 22: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/22.jpg)
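Checkpoint, replay design pattern
[Figure: records in Kinesis shard Shard-i carry sequence numbers 2, 3, 5, 8, 10, 14, 17, 18, 21, 23. A shard management table (columns: Shard ID, Lock, Seq num) records which host, Host A or Host B, holds the lock on Shard-i and the last checkpointed sequence number (0, 10, 18 in the example). A second table (columns: Shard ID, Local top-10) stores the running counts for Shard-i, shown in two states: {#Movies: 10235, #Weather: 9835, …} and {#Movies: 10235, #Weather: 9910, …}. An X marks a host failure.]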
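The checkpoint and replay pattern can be sketched as follows. This is an illustrative simulation rather than KCL code: the function, its parameters, and the sample tags are hypothetical, while the sequence numbers match those on the slide. State is persisted only at checkpoints, so a host that takes over replays everything after the last checkpointed sequence number.

```python
def process_shard(records, checkpoint_seq, state, checkpoint_every=3,
                  fail_after=None):
    """Consume (seq, tag) records newer than the checkpoint.

    Returns (checkpoint_seq, checkpointed_state). `fail_after` simulates
    a crash after that many records; work done since the last checkpoint
    is then lost and must be replayed by the next host.
    """
    working = dict(state)
    saved_seq, saved_state = checkpoint_seq, dict(state)
    last_seq = checkpoint_seq
    handled = 0
    for seq, tag in records:
        if seq <= checkpoint_seq:
            continue  # already covered by the checkpoint
        if fail_after is not None and handled == fail_after:
            return saved_seq, saved_state  # crash: return last checkpoint
        working[tag] = working.get(tag, 0) + 1
        last_seq = seq
        handled += 1
        if handled % checkpoint_every == 0:
            saved_seq, saved_state = seq, dict(working)
    return last_seq, working  # clean finish: checkpoint everything


seqs = [2, 3, 5, 8, 10, 14, 17, 18, 21, 23]
tags = ["#movies", "#weather"]
records = [(s, tags[i % 2]) for i, s in enumerate(seqs)]

# Host A crashes after four records; Host B resumes from its checkpoint.
seq_a, state_a = process_shard(records, 0, {}, fail_after=4)
seq_b, state_b = process_shard(records, seq_a, state_a)
print(seq_a, state_a)
print(seq_b, state_b)
```

Because counts and sequence number are saved together, replaying from the checkpoint neither skips nor double-counts records in this simulation.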
![Page 23: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/23.jpg)
Availability Zone
Availability Zone
Availability Zone
Amazon Kinesis
Stream
AWS Lambda
KCL App
Amazon EMR
Streaming
Logs
Alerts
Analysis
Dashboards
Predictions
![Page 24: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/24.jpg)
Kinesis & Lambda
SHARD 1
SHARD 2
SHARD 3
SHARD N
AWS Lambda: Node.js, Java, Python, C#
AWS Lambda
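A Lambda function attached to a Kinesis stream receives records in batches, with each record's data base64-encoded under `Records[].kinesis.data`. The handler below is a minimal sketch in Python; it assumes producers send JSON payloads, and the returned summary is purely illustrative.

```python
import base64
import json


def handler(event, context):
    """Decode each Kinesis record in the batch and count them."""
    payloads = []
    for record in event["Records"]:
        # Kinesis record data arrives base64-encoded in the event.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))  # assumes producers send JSON
    return {"processed": len(payloads), "payloads": payloads}
```

The function can be exercised locally by passing a hand-built event dictionary with the same shape and `None` for the context.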
![Page 25: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/25.jpg)
Lambda Blueprints
![Page 26: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/26.jpg)
Availability Zone
Availability Zone
Availability Zone
Amazon Kinesis
Stream
AWS Lambda
KCL App
Amazon EMR
Streaming
Logs
Alerts
Analysis
Dashboards
Predictions
![Page 27: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/27.jpg)
Spark Core
SparkSQL
Spark Streaming
Spark R
Spark ML
GraphX
![Page 28: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/28.jpg)
![Page 29: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/29.jpg)
Stream
Micro Batches
Results
Amazon Kinesis
Apache Kafka
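The stream to micro-batches to results flow can be illustrated without Spark. The sketch below is plain Python, not the Spark Streaming API; the function names and sample events are hypothetical. It groups timestamped events into fixed-length batches and applies one computation per batch.

```python
from collections import defaultdict


def micro_batches(events, interval_s):
    """Group (timestamp, value) events into fixed-length batches.

    Returns a list of (batch_start, values) in time order. Batches with
    no events are omitted.
    """
    buckets = defaultdict(list)
    for timestamp, value in events:
        start = int(timestamp // interval_s) * interval_s
        buckets[start].append(value)
    return sorted(buckets.items())


def results(batches, fn):
    """Apply one computation to every batch, yielding a result stream."""
    return [(start, fn(values)) for start, values in batches]


events = [(0.5, 1), (1.2, 2), (2.1, 3), (3.9, 4), (4.0, 5)]
print(results(micro_batches(events, 2), sum))
```

Treating the stream as a sequence of small batches is what allows batch-style code to be reused on near real-time data, at the cost of latency equal to the batch interval.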
![Page 30: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/30.jpg)
Spark Core
SparkSQL
Spark Streaming
Spark R
Spark ML
GraphX
![Page 31: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/31.jpg)
Data Prep
Prediction Model
Split
Train 70%
Test 30%
Near Real-time Data
Training Data
SQL
ML
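The 70/30 split shown on this slide can be sketched with the standard library. This is an illustration rather than the Spark ML call the talk would use; the function name and the fixed seed are assumptions.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


train, test = train_test_split(range(100))
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which helps when comparing models trained on the same data.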
![Page 32: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/32.jpg)
Ingest Serving
Speed (Real-time)
Scale (Batch)
Data analysts
Data scientists
Business users
Engagement platforms
Automation / events
Sources
Amazon Kinesis
AWS Lambda
Application
Amazon EMR
Streaming
S3 (Log)
Amazon Elasticsearch (Dashboard)
Real-time Pipeline
![Page 33: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/33.jpg)
Amazon Elasticsearch
• Search and Analytics
• Scalable
• Fully Managed
• Integrated: Logstash, Kibana
![Page 34: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/34.jpg)
![Page 35: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/35.jpg)
Ingest Serving
Speed (Real-time)
Scale (Batch)
Data analysts
Data scientists
Business users
Engagement platforms
Automation / events
Sources
Amazon Kinesis
AWS Lambda
Application
Amazon EMR
Streaming
S3 (Logs)
Amazon Elasticsearch (Dashboards)
Amazon EMR (Predictions)
ML
Amazon SNS (Alerts)
Real-time Pipeline
Amazon Redshift (Analytics)
![Page 36: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/36.jpg)
Amazon Kinesis Streams
Amazon Kinesis Firehose
Amazon Kinesis Analytics
Kinesis Family
![Page 37: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/37.jpg)
S3
Redshift
Elasticsearch
Amazon Kinesis Firehose
Auto provisioning
Auto partition keys
End to End Elastic
Batch Compress
Encrypt
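The batch and compress steps can be sketched as a small buffering class. This illustrates the idea only and is not how Firehose is implemented: `deliver` is a hypothetical callable standing in for the write to the destination, the size threshold is arbitrary, and time-based flushing and encryption are left out.

```python
import gzip


class BufferedDelivery:
    """Buffer records and flush them as one gzip-compressed object.

    `deliver` is a hypothetical callable standing in for the write to the
    destination (for example an S3 put).
    """

    def __init__(self, deliver, max_bytes=1024):
        self.deliver = deliver
        self.max_bytes = max_bytes
        self.buffer = []
        self.size = 0

    def put(self, record: bytes):
        """Add a record and flush once the buffer reaches max_bytes."""
        self.buffer.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes:
            self.flush()

    def flush(self):
        """Compress the buffered records into one object and deliver it."""
        if not self.buffer:
            return
        blob = gzip.compress(b"\n".join(self.buffer) + b"\n")
        self.deliver(blob)
        self.buffer, self.size = [], 0
```

Batching before compression gives fewer, larger objects at the destination, which tends to compress better and is cheaper to query later.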
![Page 38: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/38.jpg)
Amazon Kinesis Streams
Amazon Kinesis Firehose
Amazon Kinesis Analytics
Kinesis Family
![Page 39: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/39.jpg)
Kinesis Analytics
Stream or Firehose
Kinesis Analytics
Data In
Data Out
SQL
Stream or Firehose
![Page 40: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/40.jpg)
![Page 41: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/41.jpg)
![Page 42: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/42.jpg)
Sonos
![Page 43: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/43.jpg)
New X1 Instance - Tons of Memory
• Large-scale, in-memory applications
• Intel® Xeon® E7 8880 v3 Haswell processors
• Up to 2TB of memory
• Up to 128 vCPUs per instance
![Page 44: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/44.jpg)
Intel® Processor Technologies
Intel® AVX – Dramatically increases performance for highly parallel HPC workloads such as life science engineering, data mining, financial analysis, media processing
Intel® AES-NI – Enhances security with new encryption instructions that reduce the performance penalty associated with encrypting/decrypting data
Intel® Turbo Boost Technology – Increases computing power with performance that adapts to spikes in workloads
Intel® Transactional Synchronization Extensions (TSX) – Enables independent transactions to execute concurrently, accelerating throughput
P state & C state control – provides granular performance tuning for cores and sleep states to improve overall application performance
![Page 45: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/45.jpg)
![Page 46: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/46.jpg)
twitter.com/awsawscloudseasia
facebook.com/amazonwebservices/
youtube.com/user/AmazonWebServices
slideshare.net/amazonwebservices
Thank you for joining us today. Please complete the survey & let us know what you think of the webinar.
![Page 47: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/47.jpg)
REGISTER NOW: http://amzn.to/2jFt11N
Complimentary labs are available only until 31 March 2017.
Get hands-on experience working with AWS technology. Access the complimentary Big Data on AWS self-paced labs.
![Page 48: Modern data architectures for real time analytics and engagement](https://reader037.vdocuments.us/reader037/viewer/2022103105/58ce71ea1a28abdc578b59ed/html5/thumbnails/48.jpg)
Q&A