apache-cloudandfosslessonslearned-combined

109
Apache httpd v2.4: Hello Cloud Jim Jagielski

Upload: matt-hudson

Post on 14-Mar-2016

213 views

Category:

Documents


0 download

DESCRIPTION

http://posscon.org/assets/Uploads/Apache-CloudandFOSSlessonslearned-combined.pdf

TRANSCRIPT

Apache httpd v2.4:Hello Cloud

Jim Jagielski

What we will cover

• Performance Related Enhancements

• Reverse Proxy Server Enhancements

Apache httpd 2.4Currently in beta release

Expected GA: This May!

Significant Improvements

high-performance

cloud suitability

Apache httpd 2.4

Support for async I/O w/o dropping support for older systems

Larger selection of usable MPMs: added Event, Simple, etc...

Leverages higher-performant versions of APR

Apache httpd 2.4

Bandwidth control now standard

Finer control of timeouts, esp. during requests

Controllable buffering of I/O

Support for Lua

Apache httpd 2.4Reverse Proxy Improvements

Supports FastCGI, SCGI

Additional load balancing mechanisms

Runtime changing of clusters w/o restarts

Support for dynamic configuration

mod_proxy

• An Apache module• Implements core proxy capability• Both forward and reverse proxy• In general, most people use it for reverse proxy

(gateway) functionality

How did we get here?• A stroll down mod_proxy lane– First available in Apache 1.1• “Experimental Caching Proxy Server”

– In Apache 1.2, pretty stable, but just HTTP/1.0– In Apache 1.3, much improved with added support

for HTTP/1.1– In Apache 2.0, break out cache and proxy– In Apache 2.2, lay framework

Proxy Improvements• Becoming a robust but generic proxy implementation• Support various protocols– HTTP, HTTPS, CONNECT, FTP– AJP, FastCGI, SCGI, WSGI (soon)– Load balancing

• Clustering, failover

AJP? Really?

• Yep, Apache can now talk AJP with Tomcat directly• mod_proxy_ajp is the magic mojo• Other proxy improvements make this even more

exciting• mod_jk alternative

But I like mod_jk• That’s fine, but...– Now the config is much easier and more

consistent ProxyPass /servlets ajp://tc.example.com:

8089

– Easier when Apache needs to proxy both HTTP and AJP– Leverage improvements in proxy module

Features of Proxy Server

• Performance

• Monitoring

• Filtering

• Caching (with mod_cache)

Reverse Proxy

Internet

Firewall Firewall

Cloud

Reverse Proxy Server

Transactional Servers

Browser

• Operated at the server end of the transaction

• Completely transparent to the Web Browser – thinks the Reverse Proxy Server is the real server

Features of Reverse Proxy • Security

– Uniform security policy can be administered

– The real transactional servers are behind the firewall

• Delegation, Specialization, Load Balancing

Configuring Reverse Proxy

• Set ProxyRequests Off

• Apply ProxyPass, ProxyPassReverse and possibly RewriteRule directives

Reverse Proxy Directives:• Allows remote server to be mapped into the space of

the local (Reverse Proxy) server

• Example:

– ProxyPass /secure/ http://secureserver/

– Presumably “secureserver” is inaccessible directly from the internet

Reverse Proxy Directives:• Used to specify that redirects issued by the remote

server are to be translated to use the proxy before being returned to the client.

• Syntax is identical to ProxyPass; used in conjunction with it

• Example:– ProxyPass /secure/ http://secureserver/

Simple Rev Proxy• All requests for /images to a backend server

• ProxyPass /images http://images.example.com/

• ProxyPass <path> <scheme>://<full url>

• Useful, but limited

• What if:

– images.example.com dies?

– traffic for /images increases

Baby got back• We need more backend servers

• And balance the load between them

• Before 2.2, mod_rewrite was your only option

• Some people would prefer spending an evening with an Life Insurance salesman rather than deal with mod_rewrite

Load Balancer

• mod_proxy_balancer.so

• mod_proxy can do native load balancing

– weight by actual requests

– weight by traffic

– weight by busyness

– lbfactors

Load Balancer

• LB algorithms are implemented as providers

– easy to add

– no core code changes required

– growing list of methods

Load Balancer• Backend connection pooling

• Available for named workers:

– eg: ProxyPass /foo http://bar.example.com

• Reusable connection to origin

– For threaded MPMs, can adjust size of pool (min, max, smax)

– For prefork: singleton

• Shared data held in shared memory

Pooling example<Proxy balancer://foo>

BalancerMember http://www1.example.com:80/ loadfactor=1

BalancerMember http://www2.example.com:80/ loadfactor=1

BalancerMember http://www3.example.com:80/ loadfactor=4 status=+h

ProxySet lbmethod=bytraffic

</Proxy>

Load Balancer• Sticky session support

– aka “session affinity”

• Cookie based

– stickysession=PHPSESSID

– stickysession=JSESSIONID

• Natively easy with Tomcat

• May require more setup for “simple” HTTP proxying

Load Balancer• Cluster set with failover

• Group backend servers as numbered sets

– balancer will try lower-valued sets first

– If no workers are available, will try next set

• Hot standby

Example<Proxy balancer://foo>

BalancerMember http://php1:8080/ loadfactor=1

BalancerMember http://php2:8080/ loadfactor=4

BalancerMember http://phpbkup:8080/ loadfactor=4 status=+h

BalancerMember http://offsite1:8080/ lbset=1

BalancerMember http://offsite2:8080/ lbset=1

ProxySet lbmethod=bytraffic

</Proxy>

ProxyPass /apps/ balancer://foo/

Embedded Admin• Allows for real-time

–Monitoring of stats for each worker

– Adjustment of worker params• lbset

• load factor

• route

• enabled / disabled

• ...

Embedded Admin• Allows for real-time

• Addition of new workers/nodes

• Change of LB methods

• Can be persistent

• More RESTful

• Can be CLI-driven

Easy setup

<Location /balancer-manager>

SetHandler balancer-manager

Order Deny,Allow

Deny from all

Allow from 192.168.2.22

</Location>

Admin

Admin

Admin

Changing the LBmethod

Adding new worker

Admin

Some tuning params• For workers:

– loadfactor• normalized load for worker [1]

– lbset• worker cluster number [0]

– retry• retry timeout, in seconds, for failed workers [60]

Some tuning params• For workers - connection pool:

–min• Initial number of connections [0]

–max• Hard maximum number of connections [1|TPC]

– smax:• soft max - keep this number available [max]

• time to live for connections above smax

Some tuning params• For workers - connection pool:

– disablereuse:

• bypass the connection pool

– ttl• time to live for connections above smax

Some tuning paramsFor workers (cont):– connectiontimeout/timout• Connection timeouts on backend [ProxyTimeout]

– flushpackets *• Does proxy need to flush data with each chunk of

data?– on : Yes | off : No | auto : wait and see

– flushwait *• ms to wait for data before flushing

Some tuning params

For workers (cont):– status (+/-)• D : disabled• S : Stopped• I : Ignore errors• H : Hot standby• E : Error

Some tuning paramsFor balancers:– lbmethod• load balancing algo to use [byrequests]

– stickysession• sticky session name (eg: PHPSESSIONID)

–maxattempts• failover tries before we bail

– Nofailover• Back-ends don't support failover so don't send session when failing over

Recent improvements• ProxyPassMatch– ProxyPass can now take regex’s instead of just

“paths”• ProxyPassMatch ^(/.*\.gif)$ http://backend.example.com$1

– JkMount migration• Or

– ProxyPass ~ ^(/.*\.gif)$ http://backend.example.com$1 • mod_rewrite is balancer aware

Recent improvements• ProxyPassReverse is NOW balancer aware!• The below will work:

<Proxy balancer://foo>

BalancerMember http://php1:8080/ loadfactor=1

BalancerMember http://php2:8080/ loadfactor=4

</Proxy>

ProxyPass /apps/ balancer://foo/

ProxyPassReverse /apps balancer://foo/

Useful Envars

• BALANCER_SESSION_STICKY– This is assigned the stickysession value used in the current request. It is

the cookie or parameter name used for sticky sessions

• BALANCER_SESSION_ROUTE– This is assigned the route parsed from the current request.

• BALANCER_NAME– This is assigned the name of the balancer used for the current request. The

value is something like balancer://foo.

Useful Envars• BALANCER_WORKER_NAME

– This is assigned the name of the worker used for the current request. The value is something like http://hostA:1234.

• BALANCER_WORKER_ROUTE– This is assigned the route of the worker that will be used for the current request.

• BALANCER_ROUTE_CHANGED– This is set to 1 if the session route does not match the worker route

(BALANCER_SESSION_ROUTE != BALANCER_WORKER_ROUTE) or the session does not yet have an established route. This can be used to determine when/if the client needs to be sent an updated route when sticky sessions are used.

Putting it all together<Proxy balancer://foo>

BalancerMember http://php1:8080/ loadfactor=1

BalancerMember http://php2:8080/ loadfactor=4

BalancerMember http://phpbkup:8080/ loadfactor=4 status=+h

BalancerMember http://phpexp:8080/ lbset=1

ProxySet lbmethod=bytraffic

</Proxy>

<Proxy balancer://javaapps>

BalancerMember ajp://tc1:8089/ loadfactor=1

BalancerMember ajp://tc2:8089/ loadfactor=4

ProxySet lbmethod=byrequests

</Proxy>

Putting it all together

ProxyPass /apps/ balancer://foo/

ProxyPass /serv/ balancer://javaapps/

ProxyPass /images/ http://images:8080/

Manipulating HTTP Headers:

• Modify HTTP request and response headers

– Can be used in Main server, Vhost, Directory, Location, Files sections

– Headers can be merged, replaced or removed

– Pass on client-specific data to the backend server

• IP Address, Request scheme (HTTP, HTTPS), UserAgent, SSL connection info, etc.

Manipulating HTTP Headers:

• Shield backend server’s info from the clients

– Strip out Server name

– Server IP address

– etc.

Header examples• Copy all request headers that begin with “TS” to response headers

– Header echo ^TS

• Say hello to Joe

– Header add JoeHeader “Hello Joe!”

• If header “MyRequestHeader: value” is present, response will contain “MyHeader” header:

– SetEnvIf MyRequestHeader value HAVE_MyRequestHeader

– Header add MyHeader “%D %t mytext” env=HAVE_MyRequestHeader

Header examples• Remember, sequence is important! Following will

result in “MHeader” to be stipped from the response:

– RequestHeader append MyHeader “value1”

– RequestHeader append MyHeader “value2”

– RequestHeader unset MyHeader

Example:• Pass additional info about Client Browsers to the App Server:

ProxyPass / http://backend.covalent.net

ProxyPassReverse / http://backend.covalent.net

RequestHeader set X-Forwarded-IP %{REMOTE_ADDR}e

RequestHeader set X-Request-Scheme %{REQUEST_SCHEME}e

• App Server receives the following HTTP headers:

– X-Forwarded-IP: 10.0.0.3

– X-Request-Scheme: https

Using mod-rewrite example# mod_proxy lb example using request parameter

RewriteEngine On

# Use mod_rewrite to insert a node name into the url

RewriteCond %{QUERY_STRING} accountId=.*([0-2])\b

RewriteRule ^/sampleApp/(.*) balancer://tc1/$1 [P]

RewriteCond %{QUERY_STRING} accountId=.*([3-6])\b

RewriteRule ^/sampleApp/(.*) balancer://tc2/$1 [P]

RewriteCond %{QUERY_STRING} accountId=.*([7-9])\b

RewriteRule ^/sampleApp/(.*) balancer://tc3/$1 [P]

# No ID - round robin to all nodes

ProxyPass /sampleApp/ balancer://all/

Using mod-rewrite example

<Proxy balancer://tc1>

# Default worker for this balancer

BalancerMember http://linux6401.dev.local:8080/sampleApp lbset=1

# Backup balancers for node failure - used in round robin

# no stickyness

BalancerMember http://linux6402.dev.local:8081/sampleApp lbset=1 status=H

BalancerMember http://linux6403.dev.local:8081/sampleApp lbset=1 status=H

# Maintenance balancer used to re-route traffic for upgrades etc

BalancerMember http://linux6404.dev.local:8080/sampleApp status=D

</Proxy>

Using mod-rewrite example<Proxy balancer://tc2>

BalancerMember http://linux6402.dev.local:8080/sampleApp lbset=1

# Backup balancers for node failure - used in round robin

# no stickyness

BalancerMember http://linux6401.dev.local:8081/sampleApp lbset=1 status=H

BalancerMember http://linux6403.dev.local:8081/sampleApp lbset=1 status=H

# Maintenance balancer used to re-route traffic for upgrades etc

BalancerMember http://linux6404.dev.local:8080/sampleApp status=D

</Proxy>

Using mod-rewrite example<Proxy balancer://tc3>

BalancerMember http://linux6403.dev.local:8080/sampleApp lbset=1

# Backup balancers for node failure - used in round robin

# no stickyness

BalancerMember http://linux6401.dev.local:8081/sampleApp lbset=1 status=H

BalancerMember http://linux6402.dev.local:8081/sampleApp lbset=1 status=H

# Maintenance balancer used to re-route traffic for upgrades etc

BalancerMember http://linux6404.dev.local:8080/sampleApp status=D

</Proxy>

Using mod-rewrite example<Proxy balancer://all>

BalancerMember http://linux6401:8080/sampleApp

BalancerMember http://linux6402:8080/sampleApp

BalancerMember http://linux6403:8080/sampleApp

</Proxy>

<Location /balancer-manager>

SetHandler balancer-manager

Order deny,allow

Deny from all

Allow from .dev.local

</Location>

What’s on the horizon?

• Improving AJP

• Adding additional protocols

• mass_vhost like clusters/proxies

• More dynamic configuration

Open Source: It’s just not for IT anymore!

Jim Jagielski

Agenda

Introduction

What is the ASF

What exactly is “Open Source”

The Lessons Learned by the ASF

Introduction

Jim Jagielski

Longest still-active developer/contributor

Co-founder of the ASF

Member, Director and President

The ASF

ASF == The Apache Software Foundation

Before the ASF there was “The Apache Group”

The ASF was incorporated in 1999

The ASFNon-profit corporation founded in 1999

501( c )3 charity

Volunteer organization

Virtual world-wide organization

Exists to provide the organizational, legal, and financial support for various OSS projects

The ASF’s MissionProvide open source software to the public free of charge

Provide a foundation for open, collaborative software development projects by supplying hardware, communication, and business infrastructure

Create an independent legal entity to which companies and individuals can donate resources and be assured that those resources will be used for the public benefit

The ASF’s MissionProvide a means for individual volunteers to be sheltered from legal suits directed at the Foundation’s projects

Protect the ‘Apache’ brand, as applied to its software products, from being abused by other organizations

Provide legal and technical infrastructure for open source software development and to perform appropriate oversight of

How We WorkThe Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.

How We Work, Take 2

Community created code

Our code should be exceptional

Structure of the ASF - dev

Volunteer Driven Organization

Software Projects are managed by Project Management Committees (PMCs)

PMCs vote in new PMC members and committers

At the end of the day: People / Individual focused

Structure of the ASF - legalMember-based corporation - individuals only

Members nominate and elect new members

Members elect a board - 9 seats

Semi-annual meetings via IRC

Each PMC has a Chair - eyes and ears of the board (oversight only)

ASF “Org Chart”Development Administrative

Users

Patchers/Buggers

Contributors

Committers

PMC Members

Members

Officers

Board

Issues with Dual Stacks

Despite clear differentiation, sometimes there are leaks

eg: PMC chair seen as “lead” developer

Sometimes officers are assumed to have too much power if they venture into development issues

“hats”

Why Open Source?Access to the source code

Avoid vendor lock-in (or worse!)

Much better software

Better security record (more eyes)

Much more nimble development - frequent releases

Direct user input

The draw of Open SourceHaving a real impact in the development and direction of IT

Personal satisfaction: I wrote that!

Sense of membership in a community

Sense of accomplishment - very quick turnaround times

Developers and engineers love to tinker - huge opportunity to do so

Open Source FUD

No quality or quality control

Prevents or slows development

Have to “give it away for free”

No real innovation

What is Open Source?Open Source Licensing

OSI Approved

Free Software

As in Free Speech, not Free Beer

What is Open Source?Basically, it’s a “new” way to develop, license and distribute code

Actually, there was “open source” even before it was called that

The key technologies behind the Internet and the Web are all Open Source based

True Open Source

For software to be Open Source, it must be under an OSI approved Open Source License

At last count, over 70 exist

Open Source LicensesGive Me Credit

AL, BSD, MIT

Give Me Fixes

(L)GPL, EPL, MPL

Give Me Everything

GPL

- Dave Johnson http://rollerweblogger.org/page/roller?entry=gimme_credit_gimme_fixes_gimmem

The Apache License (AL)A liberal open source software license - BSD-like

Business friendly

Requires attribution

Includes Patent Grant

Easily reused by other projects & organizations

Give Me Fixes

MPL / EPL / (L)GPL

Used mostly with platforms or libraries

Protects the licensed code, but allows larger derivative works with different licensing

Still very business friendly

Give Me EverythingGPL (copyleft)

Derivative works also under GPL

Linked works could also be under GPL

Viral nature may likely limit adoption

GPL trumps all others or else incompatible

License Differences

Mainly involve the licensing of derivative works

Only really applies during (re)distribution of work

Where the “freedom” should be mostly focused: the user or the code itself

One True License

There is no such thing

Licensing is selected to address what you are trying to do

In general, Open Standards do better with AL-like license

The Apache Way

Although the term is deprecated, “The Apache Way” relates to how the ASF (and its projects) work and operate

Basically, the least common denominators on how PMCs operate

Basic MemesMeritocracy

Peer-based

Consensus decision making

Collaborative development

Responsible oversight

Meritocracy

“Govern by Merit”

Merit is based on what you do

Merit never expires

Those with merit, get more responsibility

Peer-basedDevelopers represent themselves - individuals

Mutual trust and respect

All votes hold the same weight

Community created code

Healthy communities create healthy code

Poisonous communities don’t

Look Familiar?

These concepts are not new or unique

Best practices regarding how the Scientific and Health community works

Publish or Perish

In Open Source, frequent releases indicate healthy activity

What is collaborative s/w development other than peer review?

Think how restrictive research would be w/o open communication

Why Community -> CodeSince we are all volunteers, people’s time and interests change

A healthy community is “warm and inviting” and encourages a continued influx of developers

Poisonous people/communities turn people off, and the project will die

End result - better code, long-term code

Consensus decision makingKey is the idea of voting

+1 - yes

+0 - no real comment

-1 - veto

Sometimes you’ll also see stuff like -0, -0.5, etc...

Voting

The main intent is to gauge developer acceptance

Vetos must be justifiable and have sound technical merit

If valid, Vetos cannot be overruled

Vetos are very rare

Commit ProcessReview Then Commit (RTC)

A patch is submitted to the project for inclusion

If at least 3 +1s and no -1s, code is committed

Good for stable branches

Ensures enough “eyes on the code” on a direct-to-release path

Commit ProcessCommit Then Review (CTR)

A patch is committed directly to the code

Review Process happens post commit

Good for development branches

Depends on people doing reviews after the fact

Allows very fast development

Commit ProcessLazy Consensus

variant of RTC

“I plan on committing this in 3 days”

Provides opportunity for oversight, but with known “deadline”

As always, can be vetoed after the fact

Collaborative Development

Code is developed by the community

Voting ensures at least 3 active developers

Development done online and on-list

If it didn’t happen on-list, it didn’t happen

Collaborative Development

Mailing lists are the preferred method

Archived

Asynchronous

Available to anyone - public list

Collaborative DevelopmentOther methods are OK, if not primary

Wikis

IRC

F2F

Always bring back to the list

Responsible Oversight

Ensure license compliance

Track IP

Quality code

Quality community

The Apache Incubator

Entry point for all new projects and codebases

Indoctrinates the Apache Way to the podling

Ensures and tracks IP

Contributor License Agreement

aka: iCLA (for individual)

Required of all committers

Guarantees:

The person has the authority to commit the code

That the ASF can relicense the code

Does NOT assign copyright

Success Stories - HTTPDApache HTTP Server (“Apache”)

Reference implementation of HTTP

Most popular web server in existance

Found in numerous commercial web servers

Oracle, IBM,...

Influenced countless more

Success Stories - HTTPDBy having a “free” and open source reference implementation, the drive to create a separate proprietary version was reduced.

“Why spend time and money, when we can use this”

This allowed HTTP (and the Web) to grow and STAY usable (compare to the old browser wars)

Success Stories - TomcatApache Tomcat (Servlet Container)

The default standard servlet container

Each version maps to a specific spec.

Bundled with numerous Java apps out there

Likely a major influence on the diminishing relevance of JEE

Success Stories - OthersApache Geronimo (Server Framework - JEE )

Not “just” a JEE server

Apache Maven (Java project build/management tool)

Another industry standard

Apache Logging, Axis, Struts, ...

Internal AdvantagesMuch tighter community

better communication (esp. with “users”)

tighter feedback loop

Ready made audience

Don’t need to “sell” the aspects of Open Source which scare/confuse IT people

Governance Gotchas

Trust and Merit are earned, which implies time

The available pool might be relatively small

Some Concluding ThoughtsTrust your developers AND your users

Communication is key

Open Source is NOT the Good Housekeeping Seal Of Approval

But don’t believe in all the FUD either

Success is not measured in market share, but in adoption

Some Concluding Thoughts

Open Source should have a viable business or emotional reason

Give some thought to licensing

Make it easier for developers and users to “join”

Give them a reason to

Helpful links

The Apache Software Foundation

www.apache.org

Want to help a great organization?

www.marylandstateboychoir.org

That’s It

Thank you!

Any questions?