![Page 1: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/1.jpg)
Failsafe Mechanism for Yahoo Homepage
Using Apache Storm & Apache Traffic Server
Pushkar Sachdeva ([email protected])
Kit Chan ([email protected])
05/2016
![Page 2: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/2.jpg)
![Page 3: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/3.jpg)
Failsafe
“A fail-safe or fail-secure device is one that, in the event of a specific type of failure, responds in a way that will cause no harm, or at least a minimum of harm, to other devices or to personnel”
![Page 4: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/4.jpg)
Overall Architecture
Yahoo! Presentation, Confidential
[Diagram: overall architecture]
Components: Browser, ELB, EC2 ATS, S3, Property ATS, Property Serving Stack, Crawler on Storm
Zones: AWS, Yahoo
Annotations: Auto activate Failsafe; Switch traffic to AWS
Legend: Offstage Data Flow; Online Request Flow (Normal Operation); Online Request Flow (Failsafe Mode)
![Page 5: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/5.jpg)
AWS Failsafe Stack Architecture
[Diagram: AWS failsafe stack]
Components: Elastic Load Balancer, S3 Bucket, Security Group, VPC, Cloudwatch
Availability Zone #1: ATS EC2 Instances (ATS Server)
Availability Zone #2: ATS EC2 Instances (ATS Server)
Regions: Region (US W Oregon), Region (US E North Virginia), Region (Ireland), Region (Singapore)
Annotations: S3 Replication across regions; Crawled data from Yahoo; https; http
![Page 6: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/6.jpg)
EC2 Instance - ATS
● Instance (Amazon Linux)
  ○ t2.large - burstable
  ○ 2 vCPUs / 8 GB RAM / 1 Gbps network
● Apache Traffic Server
  ○ For caching
    ■ Negative caching enabled
    ■ Ramdisk used
  ○ Health Check / S3 Authentication plugin
  ○ Lua plugin
    ■ Query Parameters Sorting
    ■ Simple Device Detection
    ■ Error handling
● Cloudwatch Log Agent / Monitoring Scripts
● Autoscaling based on # of incoming requests
● Deployment Mechanism using Terraform / Packer
[Diagram: instance stack on Amazon Linux: ATS with 4 GB ramdisk cache, Cloudwatch Agent, Cloudwatch Monitoring Scripts]
![Page 7: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/7.jpg)
Lua script example - sorting query parameters
    function do_remap()
        local query = ts.client_request.get_uri_args()
        if (query ~= nil and query ~= '') then
            local result = {}
            local i = 1
            -- split the query string on '&', skipping empty segments
            for value in query:gmatch '([^&]*)' do
                if (value ~= '') then
                    result[i] = value
                    i = i + 1
                end
            end
            -- sort the "name=value" pairs and write them back to the request
            table.sort(result)
            local sorted_query = table.concat(result, '&')
            ts.client_request.set_uri_args(sorted_query)
        end
    end
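Sorting the parameters means that two requests that differ only in parameter order end up with the same query string, and so can share one cache entry. A minimal Python sketch of the same normalization, for illustration only (the function name `sort_query` is hypothetical and not part of the deck):

```python
def sort_query(query: str) -> str:
    """Return the query string with its '&'-separated parts in sorted order."""
    # Drop empty segments (for example from 'a=1&&b=2'), as the Lua script does.
    parts = [part for part in query.split("&") if part]
    parts.sort()
    return "&".join(parts)


if __name__ == "__main__":
    # Both orderings normalize to the same string.
    print(sort_query("q=1&a=2"))
    print(sort_query("a=2&q=1"))
```

Unlike the Lua script, which leaves an empty query untouched, this sketch simply returns an empty string for empty input.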
![Page 8: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/8.jpg)
Cloudwatch Log Agent Conf
    # /etc/awslogs/awslogs.conf
    # Custom ATS log enabled and in /usr/local/var/log/trafficserver/mon

    [monlog]
    datetime_format = %Y-%m-%d %H:%M:%S
    file = /usr/local/var/log/trafficserver/mon.*
    buffer_duration = 5000
    log_stream_name = {instance_id}
    initial_position = start_of_file
    log_group_name = monlog
![Page 9: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/9.jpg)
Perl Script calling Cloudwatch Monitoring Lib
    + if ($report_chr) {
    +   my $result = `/usr/local/bin/traffic_line -r proxy.node.cache_hit_ratio_avg_10s`;
    +   add_metric('CacheHitRatio', 'Percent', 100 * $result);
    + }
    + if ($report_tef) {
    +   my $connect_failed = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.connect_failed`;
    +   my $aborts = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.aborts`;
    +   my $possible_aborts = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.possible_aborts`;
    +   my $pre_accept_hangups = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.pre_accept_hangups`;
    +   my $early_hangups = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.early_hangups`;
    +   my $empty_hangups = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.empty_hangups`;
    +   my $other = `/usr/local/bin/traffic_line -r proxy.node.http.transaction_frac_avg_10s.errors.other`;
    +
    +   add_metric('TransErrorFraction', 'Percent', 100 * ($connect_failed + $aborts + $possible_aborts + $pre_accept_hangups + $early_hangups + $empty_hangups + $other));
    + }
![Page 10: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/10.jpg)
Cloudwatch Dashboard
![Page 11: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/11.jpg)
AWS Autoscaling - Terraform Configuration File
    resource "aws_autoscaling_group" "fsfb_base_load" {
      availability_zones        = ["${split(",", var.zones)}"]
      name                      = "${var.env}_fsfb_base_load-${aws_launch_configuration.fsfb_ats.name}"
      load_balancers            = ["${aws_elb.fsfb_elb.name}"]
      max_size                  = 8
      min_size                  = 2
      health_check_grace_period = 180
      health_check_type         = "ELB"
      desired_capacity          = 2
      launch_configuration      = "${aws_launch_configuration.fsfb_ats.name}"
      force_delete              = true
      wait_for_elb_capacity     = 2
      lifecycle {
        create_before_destroy = true
      }
    }
![Page 12: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/12.jpg)
AWS Autoscaling - Terraform Configuration File (Cont’d)
    resource "aws_autoscaling_policy" "fsfb_scale_out_med" {
      name                   = "${var.env}_fsfb_scale_out_med"
      scaling_adjustment     = 8
      adjustment_type        = "ExactCapacity"
      cooldown               = 300
      autoscaling_group_name = "${aws_autoscaling_group.fsfb_base_load.name}"
    }
![Page 13: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/13.jpg)
AWS Autoscaling - Terraform Configuration File (Cont’d)
    resource "aws_cloudwatch_metric_alarm" "fsfb_upper_medium_rps" {
      alarm_name          = "${var.env}_fsfb_upper_medium_rps"
      comparison_operator = "GreaterThanOrEqualToThreshold"
      evaluation_periods  = "1"
      period              = "60"
      metric_name         = "RequestCount"
      namespace           = "AWS/ELB"
      statistic           = "Sum"
      threshold           = "75000"
      dimensions {
        LoadBalancerName = "${aws_elb.fsfb_elb.name}"
      }
      alarm_description   = "This metric monitors medium elb traffic"
      alarm_actions       = ["${aws_autoscaling_policy.fsfb_scale_out_med.arn}", "${var.sns_email_topic}"]
    }
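Read together, the alarm and the policy say: sum the ELB RequestCount over one 60-second period and, if the sum reaches 75,000 (an average of 1,250 requests per second), set the group to exactly 8 instances. A rough Python sketch of that decision, for illustration only; the class and function names are hypothetical, and the 300-second cooldown and the SNS notification are not modeled:

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    threshold: int       # requests summed over one evaluation period
    exact_capacity: int  # instance count to set when the alarm fires


def desired_capacity(request_count: int, current: int, rule: ScalingRule) -> int:
    """Return the group capacity after evaluating one 60-second period."""
    if request_count >= rule.threshold:  # GreaterThanOrEqualToThreshold
        return rule.exact_capacity       # adjustment_type = "ExactCapacity"
    return current                       # alarm not triggered, nothing changes


if __name__ == "__main__":
    rule = ScalingRule(threshold=75000, exact_capacity=8)
    print(desired_capacity(80000, current=2, rule=rule))
    print(desired_capacity(10000, current=2, rule=rule))
```

With ExactCapacity the result does not depend on the current size, so a traffic spike takes the group from its base of 2 straight to 8.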
![Page 14: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/14.jpg)
Escalate Plugin in Apache Traffic Server (ATS)
● ATS is a proxy server that sits between the user and the origin server
● ‘Escalate’ is an ATS plugin that fetches content from failsafe servers when the origin server fails to provide a ‘good’ response.
[Diagram: User, ATS, Origin Server]
![Page 15: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/15.jpg)
Escalate Plugin in ATS (Continued)
● ‘Escalate’ is a remap plugin -
    map http://games.yahoo.com/ http://some_origin.yahoo.com/ @plugin=ats_escalate.so @pparam=some_label
● Loads global configuration with ‘label’ definitions
● Sample ‘label’ definition -
    "some_label" : {
      "enable" : 1,
      "response" : {
        "500" : {
          "mode" : "url",
          "url" : "http://brb.yahoo.net/$h/$d/$p$x"
        }
      }
    }
![Page 16: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/16.jpg)
Escalate Plugin in ATS (Continued)
● Runs in ‘READ_RESPONSE_HDR_HOOK’
● Uses ‘TSHttpTxnRedirectUrlSet’ to fetch content from failsafe servers
    if (EscalateLabel::ACTION_URL == entry->second.mode) {
      std::string content;
      MyExpander expander(txn, entry->second.url);
      if (!expander(entry->second.url, config->get_device_type_header(), config->get_default_device_type())) {
        TSError("[" PLUGIN_TAG "] invalid expansion");
        TSDebug(PLUGIN_TAG, "invalid expansion");
        goto finish;
      }
      expander.swap(content);
      url_str = TSstrdup(content.c_str());
      length = content.size();
      if (url_str) {
        TSHttpTxnRedirectUrlSet(txn, url_str, length); // Transfers ownership
      }
    }
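The decision the plugin makes can be sketched in a few lines of Python: look up the origin's response status in the label definition and, if an enabled "url" entry exists, hand back the expanded failsafe URL. This is only an illustration of the flow, not the plugin's code. The `expand` callback is a hypothetical stand-in for `MyExpander`, and the meaning of the `$h`, `$d`, `$p` and `$x` placeholders is not spelled out in the deck, so the sketch leaves the template opaque.

```python
from typing import Callable, Optional

# Label definition copied from the sample in the deck.
LABELS = {
    "some_label": {
        "enable": 1,
        "response": {
            "500": {"mode": "url", "url": "http://brb.yahoo.net/$h/$d/$p$x"},
        },
    },
}


def escalation_url(label: str, status: int,
                   expand: Callable[[str], str]) -> Optional[str]:
    """Return the failsafe URL to fetch for this response, or None."""
    config = LABELS.get(label)
    if not config or not config.get("enable"):
        return None  # unknown or disabled label: serve the origin response
    entry = config["response"].get(str(status))
    if entry is None or entry.get("mode") != "url":
        return None  # this status is not configured for escalation
    return expand(entry["url"])  # hypothetical expander fills the placeholders


if __name__ == "__main__":
    identity = lambda template: template
    print(escalation_url("some_label", 500, identity))
    print(escalation_url("some_label", 200, identity))
```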
![Page 17: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/17.jpg)
Apache Storm Crawler
● Based on scalable Apache Storm platform
● Topology
● Spouts
● Bolts
[Diagram: example topology of two spouts and three bolts]
![Page 18: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/18.jpg)
Apache Storm Crawler (Continued)
Simplified Topology
[Diagram: topology components: Cron Feeder, Changelog Feeder, IndexUrlConfigFetcher, UrlFetcher, Memory Storage Writer, Response Processor, Response Uploader, Custom Event Queue Updater, Custom Event Queue Feeder]
![Page 19: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/19.jpg)
Apache Storm Crawler (Continued)
● Crawls content for desktop, smartphone and tablet
● Supports domain level configuration for request headers, query params and output storage.
● Failsafe url path mapping example -
Mapping: http://{failsafe_host}/{original_domain}/{device}/{path};{sorted_query_params_as_matrix_params}
URL: https://www.yahoo.com/news/trump-unveils-foreign-policy-plan-201628138.html?q=1&a=2
S3 file path: http://brb.yahoo.net/www.yahoo.com/smartphone/news/trump-unveils-foreign-policy-plan-201628138.html;a=2;q=1
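The mapping can be sketched as a small Python function that reproduces the example above. It is an illustration under assumptions, not the crawler's code: the function name `failsafe_url` is hypothetical, `brb.yahoo.net` is taken from the example as the default failsafe host, and omitting the `;` when a URL has no query parameters is a guess, since the deck does not show that case.

```python
from urllib.parse import urlsplit


def failsafe_url(url: str, device: str, failsafe_host: str = "brb.yahoo.net") -> str:
    """Map an original URL and device type to its failsafe storage URL."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    # Sort the query parameters and append them as ';'-separated matrix params.
    params = sorted(p for p in parts.query.split("&") if p)
    matrix = "".join(";" + p for p in params)
    return f"http://{failsafe_host}/{parts.netloc}/{device}/{path}{matrix}"


if __name__ == "__main__":
    print(failsafe_url(
        "https://www.yahoo.com/news/trump-unveils-foreign-policy-plan-201628138.html?q=1&a=2",
        "smartphone",
    ))
```

Because the parameters are sorted before they are written into the path, the crawler and the proxy arrive at the same object name regardless of the order in which a client sent them.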
![Page 20: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/20.jpg)
High Level Architecture
[Diagram: User, Proxy Router, Proxy Cache, Origin Server, Failsafe Crawler, AWS storage (PUT); numbered steps (1 to 10 and 1 to 7) mark the request flows]
Legend: Offline Crawler Request Flow; User Request Flow; Optional Request Flow to fetch failsafe content
![Page 21: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/21.jpg)
Benefits
● No manual intervention needed to serve failsafe content
● Granular control
● More relevant content is shown to user
● Failsafe content is cached in proxy layer
![Page 22: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/22.jpg)
Pitfalls/Limitations
● Lagging Crawler
● Handling additional Crawler traffic
● Bucket specific experience
● Malformed Page
![Page 23: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/23.jpg)
Future on Resiliency - multi-cloud for failsafe
● Additional Cloud Vendor
  ○ E.g. Google Cloud Platform
  ○ S3 vs Google Cloud Storage
  ○ EC2/ELB vs Google Compute Engine
  ○ Cloudwatch vs StackDriver
● Changes in Apache Storm Crawler
  ○ Can use Apache jclouds to create objects in storage in S3 or Google Cloud Storage
● Changes in deployment using terraform / configuration using chef
  ○ GCP & AWS are supported
● Route 53 can be used to do failover to GCP
![Page 24: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/24.jpg)
Future on Resiliency
● Speculative Retry
    void SpeculativeRetryPlugin::handleInputComplete()
    {
      orig_url_ = transaction_.getClientRequest().getUrl().getUrlString();
      // fetch original request
      sendFetchRequest(orig_url_, false);
      // start a timer which would give a callback after 'time_' msecs
      Async::execute<AsyncTimer>(this, new AsyncTimer(AsyncTimer::TYPE_ONE_OFF, time_), getMutex());
    }

    void SpeculativeRetryPlugin::handleAsyncComplete(AsyncTimer &async_timer)
    {
      async_timer.cancel();
      // active_fetch keeps track if we have received the response of original request yet or not
      // if not initiate a retry request
      if (!active_fetch_) {
        sendFetchRequest(orig_url_, true);
      }
    }
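The idea behind speculative retry is: send the original request, wait a fixed time, and if no response has arrived yet, send a second request without giving up on the first. A self-contained Python `asyncio` sketch of that pattern follows. It is an illustration only: `speculative_fetch` and the `fetch` callback are hypothetical names, and the slide does not show how the two responses are reconciled, so the sketch assumes the first one to finish wins.

```python
import asyncio
from typing import Awaitable, Callable


async def speculative_fetch(fetch: Callable[[bool], Awaitable[str]],
                            delay: float) -> str:
    """Run fetch(False); if it is still pending after `delay` seconds,
    also run fetch(True) and return whichever finishes first."""
    original = asyncio.create_task(fetch(False))
    done, _ = await asyncio.wait({original}, timeout=delay)
    if done:
        return original.result()  # original answered in time, no retry needed

    # Timer fired before the original response arrived: start the retry.
    retry = asyncio.create_task(fetch(True))
    done, pending = await asyncio.wait(
        {original, retry}, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # assumption: the slower request is simply dropped
    return next(iter(done)).result()


async def demo_fetch(is_retry: bool) -> str:
    # Simulated backend: the original request is slow, the retry is fast.
    await asyncio.sleep(0.01 if is_retry else 0.5)
    return "retry" if is_retry else "original"


if __name__ == "__main__":
    print(asyncio.run(speculative_fetch(demo_fetch, delay=0.05)))
```

The trade-off is extra load on the origin in exchange for lower tail latency, so the delay is normally set well above the typical response time.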
![Page 25: Failsafe Mechanism for Yahoo Homepage](https://reader034.vdocuments.us/reader034/viewer/2022042610/58acffa21a28abca0c8b6753/html5/thumbnails/25.jpg)
Thank you. Questions?