[awskrug&jaws-ug meetup #1] serverless real-time analysis
TRANSCRIPT
Serverless Real-Time Analysis
Minyoung JeongCTO, The Beatpacking Company
Minyoung Jeong
CTO, The Beatpacking Company
Founder, AWSKRUG
AWS Community Hero
Love Python!
Agenda
• Real-Time Analysis ?
• Why Serverless?
• Architecture Overview
• AWS Products
Real-Time Analysis
BEATAd supported Streaming Radio
To sell ads,need Real-time analysis
To detect abusing, use Real-time analysis.
What's Real-time?
Differences between Unit
Day, Hour, Minute
Day, Hour, Minute
Second
충분히 적은 오차를 감수지연을 줄이는 분석
Why Serverless?
–William of Occam
“Pluralitas non est ponenda sine neccesitate”
Everything seems nails when you hold a hammer?
Flexible by demand Increase efficiency
Deploy, Scaling, CapacityFault tolerance,, ….
Works great with Microservice Architecture
Service that covers smaller areaQuick develop, higher reliability
MonitoringLoad dispersion and management
Forget everything other than codes.
Forget everything other than codes ?
Lambda has inconvenient development model Apex, Serverless
Architecture Overview
Peak: 1K RPS Avg: 300 RPS
API Gateawy+Lambda+ES<= $400/m < m4.2xlarge 1ea/m
API Gateway
REST EndpointAuth & Cache & Monitoring
HTTP ProxyLambda, AWS Service
By using Cloufront,reduce latency and prevent DDOS
Automatically expands within designated Range, then throttling works.
Utilizing Stage feature,Test/Verification
Request/Response Template
Kinesis
Platform for streaming data on AWS
Managed KafkaQueue/ETL/Analysis
Scale In/Out by ShardExplore forward, backward, and certain point of streamConfigurable Retention Period
Stream
Shard
Shard
Shard
Shard
TRIM_HORIZON LATESTAT(AFTER)_SEQUENCE
Connect various consumers to one streamInvocation is available within retaining period Idempotence
Lambda Integration(w/Event Source)Provide KCL/KPL SDKREST API
Lambda
Code in the Cloud
Execute by receiving event and contextCRON/AWS Event/API GatewayCloudwatch Log & Monitoring
Set only memory and execution timeoutAutoscale by throughputCharge by 100ms
Throtting and latency spike existExecute invocation and assure idempotence
event
function endpointCollect(event, context) { // api-gateway context let apiContext = event.context;
event.body._context = { es_name: apiContext['es-name'], es_index: apiContext['es-index'], es_doc_type: apiContext['es-doc-type'], s3_bucket: apiContext['s3-bucket'], s3_prefix: apiContext['s3-prefix'] }
putStream(apiContext['stream-name'], event).then((resolve, reject) => { context.succeed(); }).catch((err) => { context.fail(err); }); }
function putStream(streamName, event) { // inbound from AWS API Gateway
let injected = injectContext(event); let record = { Data: JSON.stringify(injected), PartitionKey: `${event.body.timestamp}`, StreamName: streamName }; return new Promise((resolve, reject) => { kinesis.putRecord(record, (err, data) => { if (err) { reject(err); } else { resolve(data); } }); }); }
const ESEndpoint = new AWS.Endpoint(es_host); const req = new AWS.HttpRequest(ESEndpoint);
req.method = 'POST'; req.body = body; req.path = '/_bulk'; req.region = 'ap-northeast-1';
req.headers['presigned-expires'] = false; req.headers['Host'] = ESEndpoint.host;
if (AWSCredential.accessKeyId) { const signer = new AWS.Signers.V4(req, 'es'); signer.addAuthorization(AWSCredential, new Date()); }
Demo
Thanks!