How it's made - MyGet.org - AzureConf


Upload: maarten-balliauw

Post on 15-May-2015


Category:

Technology



TRANSCRIPT

Page 1: How it's made - MyGet.org - AzureConf

THANK YOU, LOCAL ORGANIZERS!

Over 60 community-led Windows Azure training events worldwide!

http://globalwindowsazure.azurewebsites.net

Page 2: How it's made - MyGet.org - AzureConf

Maarten Balliauw, @maartenballiauw

How it’s made: MyGet.org

Page 3: How it's made - MyGet.org - AzureConf

• Maarten Balliauw
  • Daytime: Technical Evangelist, JetBrains
  • Nighttime: Co-founder MyGet.org
• AZUG
• Focus on web
• Big passion: Windows Azure
• http://blog.maartenballiauw.be
• @maartenballiauw

Who am I?

Shameless self promotion: Pro NuGet - http://amzn.to/pronuget

Page 4: How it's made - MyGet.org - AzureConf

• NuGet? MyGet?
• How we started
• What we did not know
• Our first architecture
• Our second architecture
• ACS
• Tough times (learning moments)
• Conclusion

Agenda

Page 5: How it's made - MyGet.org - AzureConf

NuGet? MyGet?

Page 6: How it's made - MyGet.org - AzureConf

NuGet? MyGet?

Page 7: How it's made - MyGet.org - AzureConf

NuGet? MyGet?

Page 8: How it's made - MyGet.org - AzureConf

• Safely store your IP with us
• Creating packages is hard. We have Build Services!
• Granular security
• Activity streams
• Symbol server
• Analytics

Why MyGet?

Page 9: How it's made - MyGet.org - AzureConf

• Xavier Decoster, @xavierdecoster
• Yves Goeleven, @yvesgoeleven

• Also known as @MyGetTeam

I’m not alone!

Page 10: How it's made - MyGet.org - AzureConf

How we started

Page 11: How it's made - MyGet.org - AzureConf

The real beginning? May 09, 2011

Page 12: How it's made - MyGet.org - AzureConf

• Using OData as their feeds
• Which is some sort of WCF…
• Multiple feeds?
• Exchanged some ideas with Xavier
• Prototyped something during TechDays 2011

NuPack!

Page 13: How it's made - MyGet.org - AzureConf

Prototype online! May 31, 2011

Page 14: How it's made - MyGet.org - AzureConf

• Windows Azure (yay, new toy!)
• Windows Azure Table Storage & Blob Storage (cheap in case we fail!)
• Windows Azure ACS (no way I’m typing another user registration)
• ASP.NET MVC 2
• MEF

Technologies used?

Page 15: How it's made - MyGet.org - AzureConf

Best practices used?

Page 16: How it's made - MyGet.org - AzureConf

• One web role
• One storage account

Architecture at the time?

Page 17: How it's made - MyGet.org - AzureConf

What we did not know…

Page 18: How it's made - MyGet.org - AzureConf

• Grew from 5 feeds to 70 feeds in a few weeks, and we thought we had hit our max.

Users would come!

Page 19: How it's made - MyGet.org - AzureConf

• One user started pushing 1,300 packages worth 1 GB of storage.

• Others started pushing CI packages.

Data would come!

Page 20: How it's made - MyGet.org - AzureConf

• A lot of refactoring done
• Using best practices
• SOLID and DRY (well, not everywhere but refactoring takes time)
• Running on two instances (availability, yay!)

ReSharper time!

Page 21: How it's made - MyGet.org - AzureConf

• Someone mentioned they would pay for our service

• Business model
• Public site
• Volume of feeds kept going up
• Users in EU and US

We “started”

Page 22: How it's made - MyGet.org - AzureConf

Our first architecture

Page 23: How it's made - MyGet.org - AzureConf

[Architecture diagram. Region labels: EU-WEST, NORTH CENTRAL US. Component labels: WEB ROLE (twice), STORAGE (twice).]

Page 24: How it's made - MyGet.org - AzureConf

• Datacenters near our users
• Centralized storage
• Packages on CDN for faster throughput
• DNS fail-over if one of the DCs went down

Awesome!

Page 25: How it's made - MyGet.org - AzureConf

• Datacenters near our users: or not?
• Centralized storage: speed of light! USA was slow!
• Packages on CDN for faster throughput: sync issues, downtime, …
• DNS fail-over if one of the DCs went down: seems not every ISP follows DNS standards

Not so awesome…

Page 26: How it's made - MyGet.org - AzureConf

• Local caching in USA added
• 2 instances in EU, 1 in the USA
• Syncing data kept being slow
• Populating cache was a nightmare
• CDN kept having issues
• Of 3 instances, only 1 was being used with enough load (~60%)

We persisted!

Page 27: How it's made - MyGet.org - AzureConf

• We had public subscription plans
• We added enterprise tenants (multi-tenancy added)
• Resulting in…
  • Architecture became complex
  • Caching and syncing became complex

We pivoted!

Page 28: How it's made - MyGet.org - AzureConf

Our second architecture

Page 29: How it's made - MyGet.org - AzureConf

• Managing feeds and packages
  • Doesn’t matter much where (who cares about a little latency)
• Downloading packages
  • May matter where, let the tenant decide on storage account location
• Builds
  • Who cares where!

Workloads

Page 30: How it's made - MyGet.org - AzureConf

[Architecture diagram. Labels, in slide order: WEB ROLE, STORAGE (EU-WEST); STORAGE (EU-NORTH), STORAGE ACCT PER TENANT; OTHER DATACENTERS, STORAGE ACCT (SOME TENANTS); VIRTUAL MACHINE (BUILDS).]

Page 31: How it's made - MyGet.org - AzureConf

• … was scaled across the globe
• … but as synchronous as it could be
• … prone to all issues with latency vs. synchrony

• Event Driven Architecture?*

*some concepts borrowed from EDA

Our first architecture…

Page 32: How it's made - MyGet.org - AzureConf

• Some actions put an ICommand on a queue (ground rule: if it can’t be done in 1 write, use ICommand)

• All actions complete with an IEvent on a queue

• Handlers can subscribe to ICommand and IEvent

• Handlers are idempotent and do not depend on each other

EDA in MyGet
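The rules above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not MyGet's actual .NET code: `InMemoryBus` and `LoginCounter` are hypothetical names, the queue is an in-memory stand-in for a storage queue, and the `message_id` field is an assumption about how a handler could detect a redelivered message.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class UserLoggedInEvent:
    message_id: str   # assumed: unique per message, so redelivery can be detected
    username: str


class InMemoryBus:
    """Hypothetical stand-in for a queue plus a handler registry."""

    def __init__(self):
        self._queue = deque()
        self._handlers = {}  # message type -> list of handler callables

    def subscribe(self, message_type, handler):
        self._handlers.setdefault(message_type, []).append(handler)

    def publish(self, message):
        self._queue.append(message)

    def process_all(self):
        # Drain the queue, handing each message to every subscriber of its type.
        while self._queue:
            message = self._queue.popleft()
            for handler in self._handlers.get(type(message), []):
                handler(message)


class LoginCounter:
    """Idempotent handler: processing the same message twice changes nothing."""

    def __init__(self):
        self.logins = {}
        self._seen_ids = set()

    def handle(self, event):
        if event.message_id in self._seen_ids:
            return  # duplicate delivery, already applied
        self._seen_ids.add(event.message_id)
        self.logins[event.username] = self.logins.get(event.username, 0) + 1


bus = InMemoryBus()
counter = LoginCounter()
bus.subscribe(UserLoggedInEvent, counter.handle)

event = UserLoggedInEvent(message_id="evt-1", username="maarten")
bus.publish(event)
bus.publish(event)  # simulate the queue delivering the same message twice
bus.process_all()
print(counter.logins)  # prints {'maarten': 1}
```

Because the handler keeps its own record of processed ids and touches no other handler's state, it can be retried or run alongside other subscribers without coordination.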

Page 33: How it's made - MyGet.org - AzureConf

• 2 operations: 1 read, 1 write
  • Read the profile
  • Store the profile with LastLogin date
• No use of ICommand
• Finishes with UserLoggedInEvent

Example: log in

Page 34: How it's made - MyGet.org - AzureConf

• Many operations!
  • Read two user profiles
  • Read current access rights
  • Change access rights
  • Push new privileges to SymbolSource.org
• One command, one event
  • ChangeFeedOwnerCommand
  • FeedOwnerChangedEvent

Example: change feed owner
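A rough sketch of this flow, assuming a simple in-process dispatcher: one command handler does the writes and emits one event, and independent subscribers react to that event. Only the message names come from the slides; the `Dispatcher` class, the field names, and the in-memory `feed_owners` store are hypothetical, and the real system runs on .NET with queues rather than in-process calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeFeedOwnerCommand:
    feed: str
    new_owner: str


@dataclass(frozen=True)
class FeedOwnerChangedEvent:
    feed: str
    old_owner: str
    new_owner: str


@dataclass(frozen=True)
class SymSrcEvent:
    feed: str
    owner: str


class Dispatcher:
    """Hypothetical dispatcher: handlers return follow-up messages to dispatch."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, message_type, handler):
        self._handlers.setdefault(message_type, []).append(handler)

    def dispatch(self, message):
        pending = [message]
        while pending:
            current = pending.pop(0)
            for handler in self._handlers.get(type(current), []):
                pending.extend(handler(current) or [])


def run_demo():
    feed_owners = {"myfeed": "xavier"}   # stand-in for stored access rights
    activity_log = []
    symbol_server_calls = []             # stand-in for calls to SymbolSource.org

    def change_feed_owner(command):
        old_owner = feed_owners[command.feed]
        feed_owners[command.feed] = command.new_owner
        return [FeedOwnerChangedEvent(command.feed, old_owner, command.new_owner)]

    def push_to_symbol_server(event):
        symbol_server_calls.append((event.feed, event.new_owner))
        return [SymSrcEvent(event.feed, event.new_owner)]

    def log_activity(event):
        activity_log.append(
            f"{event.feed}: owner changed from {event.old_owner} to {event.new_owner}"
        )
        return []

    dispatcher = Dispatcher()
    dispatcher.subscribe(ChangeFeedOwnerCommand, change_feed_owner)
    dispatcher.subscribe(FeedOwnerChangedEvent, push_to_symbol_server)
    dispatcher.subscribe(FeedOwnerChangedEvent, log_activity)
    dispatcher.dispatch(ChangeFeedOwnerCommand("myfeed", "maarten"))
    return feed_owners, activity_log, symbol_server_calls
```

Adding a new reaction to an ownership change is one more `subscribe` call; the command handler itself does not change.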

Page 35: How it's made - MyGet.org - AzureConf

Example: change feed owner

[Diagram]
ChangeFeedOwnerCommand → ChangeFeedOwnerCommandHandler → FeedOwnerChangedEvent
FeedOwnerChangedEvent → SymSrcHandler<FeedOwnerChangedEvent> → SymSrcEvent
FeedOwnerChangedEvent → ActivityLogHandler<FeedOwnerChangedEvent>

Page 36: How it's made - MyGet.org - AzureConf

• We now run on 2 instances, mostly for redundancy (coming from 3)

• Average CPU usage? 20% (coming from 60%)

• Way easier to implement new features!
  • New feature: activity log
  • Simply subscribe to events we want to see in that log

Gain?

Page 37: How it's made - MyGet.org - AzureConf

• Why no relational database?

• With only PartitionKey as an index, how do you store a feed’s packages and versions in an optimal way?
  • Three important values: feed name, package id, package version
  • Table per feed
  • Package id = PartitionKey
  • Package version = RowKey

Storage
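The key layout above can be modelled without any storage SDK. The sketch below is an in-memory illustration only: `FeedTables` and `PackageEntity` are hypothetical names, and nothing here calls Windows Azure Table Storage. It shows why the layout works: a specific version is a point lookup on (PartitionKey, RowKey), and all versions of a package live in one partition of the feed's table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageEntity:
    partition_key: str  # package id
    row_key: str        # package version


class FeedTables:
    """Hypothetical in-memory model: one table per feed."""

    def __init__(self):
        self._tables = {}  # feed name -> {(partition_key, row_key): entity}

    def upsert(self, feed, package_id, version):
        table = self._tables.setdefault(feed, {})
        table[(package_id, version)] = PackageEntity(package_id, version)

    def get(self, feed, package_id, version):
        # Point lookup: table + PartitionKey + RowKey identify exactly one row.
        return self._tables.get(feed, {}).get((package_id, version))

    def versions(self, feed, package_id):
        # Partition scan: every version of one package within one feed,
        # returned in plain string order of the row key.
        table = self._tables.get(feed, {})
        return sorted(rk for (pk, rk) in table if pk == package_id)


tables = FeedTables()
tables.upsert("myfeed", "Newtonsoft.Json", "4.5.1")
tables.upsert("myfeed", "Newtonsoft.Json", "4.5.11")
tables.upsert("myfeed", "NUnit", "2.6.0")
print(tables.versions("myfeed", "Newtonsoft.Json"))
```

Note that row keys sort as strings, not as version numbers, so any "latest version" logic has to parse versions on the client side.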

Page 38: How it's made - MyGet.org - AzureConf

• Reading 1,000 rows and deserializing them is SLOW (many seconds)

• We cache some tables on blob storage
  • 1,000 rows in serialized JSON = small
  • Loading one file = fast
  • Searching in memory through 1,000 rows = fast

• Cache update subscribed to IEvent

Storage
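A sketch of that caching idea, under stated assumptions: `InMemoryBlobStore` stands in for blob storage, `load_rows` stands in for the slow table read, and the event is a plain dictionary with a `feed` key. None of these names come from MyGet's code base.

```python
import json


class InMemoryBlobStore:
    """Hypothetical stand-in for blob storage: name -> text."""

    def __init__(self):
        self._blobs = {}

    def put(self, name, text):
        self._blobs[name] = text

    def get(self, name):
        return self._blobs.get(name)


class FeedCache:
    def __init__(self, blobs, load_rows):
        self._blobs = blobs
        self._load_rows = load_rows  # slow path: reads and deserializes table rows
        self._memory = {}            # feed -> list of row dicts

    def rebuild(self, feed):
        # Serialize the whole table into one small JSON blob.
        rows = self._load_rows(feed)
        self._blobs.put(f"{feed}.json", json.dumps(rows))
        self._memory.pop(feed, None)

    def find(self, feed, package_id):
        if feed not in self._memory:
            text = self._blobs.get(f"{feed}.json")
            if text is None:
                self.rebuild(feed)
                text = self._blobs.get(f"{feed}.json")
            self._memory[feed] = json.loads(text)  # one file load
        # In-memory search through the cached rows.
        return [row for row in self._memory[feed] if row["id"] == package_id]

    def on_event(self, event):
        # Subscribed to events that change the feed; refreshes the blob.
        self.rebuild(event["feed"])
```

Reads stay on the fast path (one blob, then memory), and the cache is only rebuilt when an event says the underlying table changed.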

Page 39: How it's made - MyGet.org - AzureConf

Windows Azure Access Control Service

Page 40: How it's made - MyGet.org - AzureConf

• Multiple applications
  • www.myget.org
  • staging.myget.org
  • localhost:1196
  • Customer1.myget.org
  • Customer2.myget.org
  • …
• Multiple identity providers
  • Who wants Microsoft Account?
  • Google anyone?
  • Oh, your custom ADFS? Sure!

Imagine managing this!

Page 41: How it's made - MyGet.org - AzureConf

[Diagram: all environments federate through one service]
• Production tenants: www.myget.org, *.customer.myget.org, other domain names
• Staging: myget-staging.cloudapp.net
• Development: localhost:1196

Windows Azure Access Control Service

Page 42: How it's made - MyGet.org - AzureConf

• Users typically have some identity that allows federation

• ACS gives us Microsoft Account, Yahoo!, Google & Facebook accounts*

• We only care about ACS in our code

*we built many others and are working on a spin-off http://socialsts.com

No more user registration!

Page 43: How it's made - MyGet.org - AzureConf

• An identity is an identity, whether dev/staging/prod

• ACS handles subtle differences per environment

• Our app just gets and uses the claims

No difference between environments!

Page 44: How it's made - MyGet.org - AzureConf

• Easy multi-tenant logins with different identity providers

• ACS decides how to log in based on the audience
  • www.myget.org
  • some.customers.myget.org

• Our app just gets and uses the claims

No difference between tenants!
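"Our app just gets and uses the claims" can be illustrated with a small sketch. It assumes the token has already been validated and reduced to (claim type, value) pairs. The two claim type URIs are the name identifier and identity provider claim types that ACS is understood to issue; treat them as an assumption and check them against real tokens. `user_key` is a hypothetical helper, not part of any library.

```python
# Assumed claim type URIs; verify against the tokens your ACS namespace issues.
NAME_IDENTIFIER = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier"
IDENTITY_PROVIDER = (
    "http://schemas.microsoft.com/accesscontrolservice/2010/07/claims/identityprovider"
)


def user_key(claims):
    """Build a stable user key from (claim type, value) pairs.

    The application never asks which environment or tenant the login came
    from; it only combines the provider name with the provider's user id.
    """
    by_type = dict(claims)
    try:
        return f"{by_type[IDENTITY_PROVIDER]}|{by_type[NAME_IDENTIFIER]}"
    except KeyError as missing:
        raise ValueError(f"token is missing required claim {missing}") from None


google_login = [(IDENTITY_PROVIDER, "Google"), (NAME_IDENTIFIER, "abc123")]
adfs_login = [(IDENTITY_PROVIDER, "Contoso ADFS"), (NAME_IDENTIFIER, "jdoe")]
print(user_key(google_login))
print(user_key(adfs_login))
```

The same function serves www.myget.org, a customer subdomain, or localhost, because the differences between them are handled before the claims reach the application.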

Page 45: How it's made - MyGet.org - AzureConf

Tough times: learning moments

Page 46: How it's made - MyGet.org - AzureConf

• Symptoms:
  • Users complaining about “downtime”
  • No monitoring SMS alert
  • Half an hour later: “site up!”, “site down!”, “site up!”, “site down!” SMS alerts
  • No sign of issues in the Windows Azure Management portal

• But what’s the cause?
  • We just deployed our multi-tenant architecture
  • We just enabled storage analytics
  • ELMAH was showing storage throttling
  • 16,000 unprocessed commands and events in queue

Huge downtime on July 2nd, 2012
Full story at http://blog.myget.org/post/2012/07/02/Site-issues-on-July-2nd-2012.aspx

Page 47: How it's made - MyGet.org - AzureConf

• One simple piece of code…
  • GetHashCode() on Package object faulty

• “If two string objects are equal, the GetHashCode method returns identical values. However, there is not a unique hash code value for each unique string value. Different strings can return the same hash code.”

• GetHashCode() used to track object in data context (new vs. update)

• 2 objects with the same hashcode = UnhandledException

Huge downtime on July 2nd, 2012
Full story at http://blog.myget.org/post/2012/07/02/Site-issues-on-July-2nd-2012.aspx
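The failure mode can be reproduced in miniature. The slides do not say how the original GetHashCode() was faulty, so the sketch below uses a deliberately weak hash (`weak_hash`, a hypothetical method in which anagram ids collide) purely for illustration, and contrasts a context that tracks entities by hash code with one that tracks them by their real key.

```python
class Package:
    def __init__(self, package_id, version):
        self.package_id = package_id
        self.version = version

    def key(self):
        # The real identity of a package: id plus version.
        return (self.package_id, self.version)

    def weak_hash(self):
        # Deliberately weak for illustration: ids that are anagrams collide.
        return sum(ord(ch) for ch in self.package_id + self.version)


class HashTrackedContext:
    """Tracks entities by hash code, treating it as if it were unique."""

    def __init__(self):
        self._tracked = {}

    def track(self, package):
        code = package.weak_hash()
        existing = self._tracked.get(code)
        if existing is not None and existing.key() != package.key():
            raise RuntimeError("another entity is already tracked under this hash code")
        self._tracked[code] = package


class KeyTrackedContext:
    """Tracks entities by key: same key means update, new key means insert."""

    def __init__(self):
        self._tracked = {}

    def track(self, package):
        self._tracked[package.key()] = package

    def count(self):
        return len(self._tracked)
```

Tracking `Package("ab", "1.0")` and then `Package("ba", "1.0")` raises in the hash-tracked context and works in the key-tracked one. A hash code narrows a search; it never proves that two objects are the same.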

Page 48: How it's made - MyGet.org - AzureConf

• Back then, we caught every Exception and blindly retried operations
  • Resulting in 16,000 commands and events being retried continuously
  • Causing storage throttling
  • Causing the website to retry reads
  • Causing more throttling
  • Starving IIS worker threads
• Lessons learned?
  • A simple bug can halt the entire application
  • Only retry transient errors
  • Our monitoring sucked
  • Bad, untested code (code from back when MyGet was a blog post…)

An exception killed the site? WTF?!?
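"Only retry transient errors" can be sketched as follows. `TransientError` and `retry_transient` are hypothetical names; which failures count as transient (throttling, timeouts) is a decision the caller has to make when mapping real storage errors onto that exception type.

```python
import time


class TransientError(Exception):
    """Assumed marker for failures worth retrying, such as throttling or timeouts."""


def retry_transient(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation, retrying only on TransientError with exponential backoff.

    Any other exception is treated as a bug and propagates immediately,
    so a faulty message cannot be retried forever.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up; let the caller dead-letter or alert
            sleep(base_delay * 2 ** (attempt - 1))
```

The retry budget is bounded and the delay grows with each attempt, which avoids the feedback loop described above where retries cause throttling and throttling causes more retries.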

Page 49: How it's made - MyGet.org - AzureConf

• Symptoms:
  • Everything down
  • Furious users on social media
  • Windows Azure Management Portal down
  • Furious tweets about #WindowsAzure
• The cause?
  • Global outage of Windows Azure due to an expired SSL certificate on storage

Huge downtime February 23rd, 2013

Full story at http://blog.myget.org/post/2013/02/24/We-were-down.aspx

Page 50: How it's made - MyGet.org - AzureConf

• Move storage to HTTP instead of HTTPS?
• Windows Azure down globally impacts us quite a bit
• Fail-over to another solution costs money and lots of effort
• Decided against it for now

• Considering off-Windows Azure backups of at least all packages

Considerations and lessons learned

Full story at http://blog.myget.org/post/2013/02/24/We-were-down.aspx

Page 51: How it's made - MyGet.org - AzureConf

• “Retention policies” introduced
• Seemed to be a success! 3+ million commands and events in queue
• Solution: scale out (20 instances did it in a few minutes)
• Solution for the future: feature toggling

One more! New features…
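The slides name feature toggling without describing an implementation. One possible shape, as a sketch: a toggle that can be enabled for a few feeds first, so a new feature cannot flood the queue on day one. `FeatureToggles`, the `"retention-policies"` toggle name, and `ApplyRetentionPolicyCommand` are all hypothetical.

```python
class FeatureToggles:
    def __init__(self):
        self._enabled_for = {}  # feature -> set of feeds, or None meaning everyone

    def enable(self, feature, feeds=None):
        self._enabled_for[feature] = None if feeds is None else set(feeds)

    def disable(self, feature):
        self._enabled_for.pop(feature, None)

    def is_enabled(self, feature, feed):
        if feature not in self._enabled_for:
            return False
        allowed = self._enabled_for[feature]
        return allowed is None or feed in allowed


def enqueue_retention_commands(toggles, feeds, queue):
    # Only feeds covered by the toggle produce work on the queue.
    for feed in feeds:
        if toggles.is_enabled("retention-policies", feed):
            queue.append(("ApplyRetentionPolicyCommand", feed))
```

Rolling out to a handful of feeds, watching the queue length, and then widening the toggle keeps the load increase gradual and reversible.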

Page 52: How it's made - MyGet.org - AzureConf

But overall…

From: http://status.myget.org

Page 53: How it's made - MyGet.org - AzureConf

Bonus tip

Page 54: How it's made - MyGet.org - AzureConf

• “The Lean Startup” book says this
• Don’t build it yourself: Google Analytics

Measure everything, test assumptions

Page 55: How it's made - MyGet.org - AzureConf

this is why we built username/password registration, seems a lot of people prefer typing instead of one click

we must keep investing in Build Services

feed discovery is more popular than we imagined from zero reactions on our blog and Twitter

the technical fear we had about “download as ZIP” consuming too much server resources? That thing doesn’t show up in our stats, that’s how successful it is…

Page 56: How it's made - MyGet.org - AzureConf

Conclusion

Page 57: How it's made - MyGet.org - AzureConf

• NuGet? MyGet?
• How we started
• What we did not know
• Our architecture
• ACS
• Tough times provide learning
• Measurement as well

Conclusion

Page 58: How it's made - MyGet.org - AzureConf

Thank you!

http://blog.maartenballiauw.be

@maartenballiauw

http://amzn.to/pronuget

Page 59: How it's made - MyGet.org - AzureConf

Thank you!

http://blog.maartenballiauw.be

@maartenballiauw

http://amzn.to/pronuget

http://www.myget.org

Page 60: How it's made - MyGet.org - AzureConf

http://aka.ms/AzureConf-MemberOffers

http://aka.ms/AzureConf-FreeTrial

Get started with a 90 day free trial

Or, use your existing benefits…