Workbook 1. The Apache Web Server

Table of Contents

1. Webserver Basics
    Discussion
        Web Servers
        Installing the Apache Web Server
        Web Server Layout
        The Document Root: /var/www/html/
        Content Types
        Directories
        Web Server Logging: /var/log/httpd/{access,error}_log
        The Anatomy of a Web Request: the HTTP Protocol (Optional, but Interesting)
        The Hyper Text Markup Language (HTML) (Optional)
    Exercises
        Specification
        Deliverables
        Clean Up
    Questions

2. Apache Configuration
    Discussion
        Apache Configuration: /etc/httpd/conf/httpd.conf
        The Global Section
        The Main Section
        The Answer Book: http://localhost/manual
    Exercises
        Specification
        Deliverables
    Questions

3. Apache Configuration: Containers
    Discussion
        Tailoring Customization to Particular Content: Containers
        Common Container Configuration
        Red Hat Enterprise Linux Default Configuration
        Location Containers: server-status and server-info
    Exercises
        Specification
        Deliverables
    Questions

4. Virtual Hosts
    Discussion
        Virtual Hosts
        IP Based Virtual Hosting
        Name Based Virtual Hosts
    Exercises
        Specification
        Deliverables
    Questions

5. The Squid Proxy Server
    Discussion
        Proxy Servers
        The squid Proxy Server
        Squid Configuration: /etc/squid/squid.conf
        The server’s identity: http_port
        Squid Access Control Lists: acl and http_access
        Configuring Proxies for Web Clients
        Squid Logging: /var/log/squid/access.log
        Finding Out More
    Exercises
        Specification
        Deliverables
        Challenge Exercises
    Questions

rha230-5.0-1-en-2008-01-21T07:12:18-0500

Copyright (c) 2003-2007 Red Hat, Inc. All rights reserved. For use only by a student enrolled in a Red Hat Academy course taught at a Red Hat Academy. Any other use is a violation of U.S. and international copyrights. No part of this publication may be photocopied, duplicated, stored in a retrieval system, or otherwise duplicated whether in electronic or print format without prior written consent of Red Hat, Inc. If you believe Red Hat course materials are being used, copied, or otherwise improperly distributed please email [email protected] or phone toll-free (USA) +1 866 626 2994 or +1 (919) 754 3700.


Chapter 1. Webserver Basics

Key Concepts

• The web server that ships with Red Hat Enterprise Linux is the Apache web server.

• In general terms, web servers map URL requests onto files within the local filesystem, using the Document Root (/var/www/html/) as the base of the translation.

• The web server associates meta-data with requested files, such as content types.

• When a client requests a directory instead of a file, Apache serves the file index.html (if it exists), generates a directory listing dynamically (if it’s allowed to), or returns an access denied error.

• Web servers and web clients communicate using the HTTP protocol.

• Often, the information served from a web server is structured using the HTML markup language.

Table 1-1. The Apache Web Server

Packages      httpd (with apr and httpd-suexec dependencies), plus other modules (usually starting mod_...), and httpd-manual.
Service       httpd
Daemon        /usr/sbin/httpd
Config Files  /etc/httpd/conf/httpd.conf, /etc/httpd/conf.d/*
Logging       /var/log/httpd/{access,error}_log
Ports         80/tcp (http), 443/tcp (https)

Discussion

Web Servers

This lesson focuses on installing and starting the Apache web server, and publishing information using the default configuration. We also introduce some of the basics of the HTTP protocol and the HTML markup language, for those who are interested.

Installing the Apache Web Server

In Red Hat Enterprise Linux, the Apache web server is easy to install and start in its default configuration, using the conventional trio of commands to install the httpd package and start the httpd service: yum install ...; service ... start; chkconfig ... on.

[root@station ~]# yum install httpd
...
Dependencies Resolved

=============================================================================
 Package        Arch        Version          Repository        Size
=============================================================================
Installing:
 httpd          i386        2.2.3-6.el5      rha-rhel          1.1 M

...
Installed: httpd.i386 0:2.2.3-6.el5
Complete!

The httpd service can now be started and "chkconfiged on".

[root@station ~]# service httpd start
Starting httpd: [ OK ]

[root@station ~]# chkconfig httpd on

The availability of the web server can be confirmed by using any web browser to reference http://localhost. The following example uses elinks, but the firefox browser could have been used just as easily.

[root@station ~]# elinks -dump http://localhost

Red Hat Enterprise Linux Test Page

This page is used to test the proper operation of the Apache HTTP server after it has been installed. If you can read this page, it means that the Apache HTTP server installed at this site is working properly.

...

Web Server Layout

Once a product is installed, an rpm query to list its files (rpm -ql) always serves as a good introduction to its layout.

[root@station ~]# rpm -ql httpd
/etc/httpd
/etc/httpd/conf
/etc/httpd/conf.d
/etc/httpd/conf.d/README
...

Skimming the output reveals the following relevant files and directories.

Table 1-2. Web Server Filesystem Layout


Directory                 Purpose
/etc/httpd/               Configuration files, including /etc/httpd/conf/httpd.conf.
/usr/lib/httpd/modules/   Dynamically loaded modules.
/var/log/httpd/           Log files, including access_log and error_log.
/var/www/html/            The Web Server Document Root (more on this in a moment).

The Document Root: /var/www/html/

The purpose of the Web Server is to serve information. Usually, this involves reading a file from the filesystem and transferring it to a web browser, which then displays or renders the file.

As an arbitrary example, the file /etc/sysctl.conf can be copied to the document root (/var/www/html) directory. Any web browser referencing http://localhost/sysctl.conf should display the contents of the file, just as could be done with the cat command. (Some web browsers may mangle the whitespace within the file, essentially placing the entire contents of the file on one line. This issue arises because of misguided "Content Type" negotiations. More on this later.)

[root@station ~]# cp /etc/sysctl.conf /var/www/html/

[root@station ~]# elinks http://localhost/sysctl.conf

[root@station ~]# elinks -source http://localhost/sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0
...

Instead of a single file, entire directory trees can be copied into the /var/www/html directory.

[root@station ~]# cp -a /etc/sysconfig /var/www/html/

Now, by accessing http://localhost/sysconfig with a web browser, the contents of the directory should be visible, with "clickable" file and subdirectory links.


Figure 1-1. Browsing the sysconfig Directory

Notice the shift in perspective. What we would call the directory /var/www/html/sysconfig, the web server refers to as just /sysconfig. This translation is the essence of the term "Document Root".

Web browsers request information using "Uniform Resource Locators", or more commonly just "URLs". Web related URLs are usually composed of a hostname and a file path.

http://hostname/dir1/dir2/filename

The hostname is simply the hostname or IP address of the host running the server, while dir1/dir2/filename is thought of as being a path to a particular file on the server. When locating the file, the web server assumes that the root of the "URL Namespace" is the document root directory (/var/www/html).

The http portion of the URL is the protocol, which tells the web browser both which port to connect to, and what "language" to speak with whatever is listening on that port. For web servers, the port is 80, and the language is known as the Hypertext Transfer Protocol, or HTTP.
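The translation from a URL to a file can be sketched in a few lines of shell. This is only an illustration of the idea, not how Apache is implemented: the function name url_to_path and the DOCROOT variable are made up for this example, and the sketch ignores port numbers, query strings, and any aliases a real server might have configured.

```shell
#!/bin/sh
# Hypothetical helper: map an http:// URL onto a path under the document root.
# DOCROOT is an assumed variable; it defaults to the RHEL document root.
DOCROOT=${DOCROOT:-/var/www/html}

url_to_path() {
    url=$1
    rest=${url#*://}                # strip the protocol ("http://")
    case $rest in
        */*) path=/${rest#*/} ;;    # keep everything after the hostname
        *)   path=/ ;;              # no path given: root of the URL namespace
    esac
    printf '%s%s\n' "$DOCROOT" "$path"
}

url_to_path http://localhost/sysconfig
url_to_path http://hostname/dir1/dir2/filename
```

With the default DOCROOT, the two calls print /var/www/html/sysconfig and /var/www/html/dir1/dir2/filename.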

Of course, it’s not a machine’s configuration files that one usually chooses to publish to the world. We’ll move on to more interesting content.

Content Types

The purpose of the web server is to serve the content of files, but web clients seem to learn not just the content of the file, but how to interpret the content, as well. As an example, consider a text file such as /etc/hosts, an HTML file such as /usr/share/doc/samba-version/htmldocs/manpages/net.8.html, and an image file such as /usr/share/backgrounds/tiles/neurons.png, each of which is copied to a web server’s document root.

[root@station ~]# mkdir /var/www/html/example

[root@station ~]# cd /var/www/html/example

[root@station example]# cp /etc/hosts .

[root@station example]# cp /usr/share/doc/samba-*/htmldocs/manpages/net.8.html .

[root@station example]# cp /usr/share/backgrounds/tiles/neurons.png .


[root@station example]# ls

hosts net.8.html neurons.png

How does a web client handle each of these? If you’re sitting at a student workstation, try for yourself. (Of course, you will first need to perform the above commands to put the files in place.)

http://localhost/example/hosts
http://localhost/example/net.8.html
http://localhost/example/neurons.png

Note: Make sure to create or copy files underneath the /var/www/html directory as the root user. Do not move already existing files into the directory. If you’re having trouble, give it a pass for now, until you read the section "But What Could Go Wrong?" below.

All of the files should have been treated reasonably by the client: the hosts file as a simple text file, the net.8.html file as a marked up man page, complete with bolded titles, italics, and hyperlinks, and neurons.png as a picture of blue blobs.

Now let’s shake things up a bit.

[root@station example]# cp hosts hosts.html

[root@station example]# cp net.8.html net.8.txt

[root@station example]# cp hosts hosts.png

[root@station example]# cp neurons.png neurons.txt

Again, if at a student workstation, try the following.

http://localhost/example/hosts.html
http://localhost/example/net.8.txt
http://localhost/example/hosts.png
http://localhost/example/neurons.txt

For those not able to follow along, hosts.html lost all of its formatting, net.8.txt dumped what you would see if you catted the file directly, hosts.png caused the browser to complain about a malformed image, and neurons.txt showed a bunch of glyphs representing binary data.

There are obviously some expectations on the part of the browser about how to interpret the data it receives: text to dump, marked up text (HTML) to format, or an image to render. The expectation about what type of data the client is receiving is known as the data’s content type.

Apparently, the content type is determined by the file’s filename extension. We still don’t know if the extension is being interpreted into a content type by the server (before the file’s content is transmitted) or by the client (after the content is received). The answer is the server, and the server communicates that content type, as well as a lot of other meta-data about the transfer, using the HTTP protocol.
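The server side mapping from filename extension to content type can be pictured as a simple lookup. The sketch below is a deliberately tiny stand-in: the function name guess_content_type is hypothetical, only three extensions are listed, and the text/plain fallback for anything else is an assumption made for this example. A real Apache server consults a much larger table, typically the file /etc/mime.types.

```shell
#!/bin/sh
# Hypothetical helper: pick a content type from a filename extension.
guess_content_type() {
    case $1 in
        *.html|*.htm) echo text/html ;;
        *.txt)        echo text/plain ;;
        *.png)        echo image/png ;;
        *)            echo text/plain ;;   # assumed fallback for this sketch
    esac
}

# The renamed files from the experiment above.
for f in hosts hosts.html net.8.txt hosts.png neurons.txt; do
    printf '%-12s %s\n' "$f" "$(guess_content_type "$f")"
done
```

Note that the lookup never examines the file’s contents, which is why hosts.png is announced as an image even though it holds plain text. On a live server, and assuming the curl client is installed, the header actually sent can be inspected with curl -I http://localhost/example/hosts.png.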

Directories

We’ve seen how the web server responds when a web client requests a file: it returns the contents of the file to the client. How does the web server handle directories? In general, a web server responds in one of three ways.


First, the web server checks to see if an index file (a file named index.html) exists in the directory. If so, the web server returns the contents of the file, as if the request for http://localhost/example were for http://localhost/example/index.html.

Secondly, if no index file exists, the web server checks to see if the Indexes option is enabled. If so, the web server returns a dynamically generated directory listing. Otherwise, the web server returns an error to the client. (How the Indexes option is set or not set will be covered in a following lesson. In Red Hat Enterprise Linux, the option is set by default.)

Table 1-3. Web Server Responses to Directory Requests

Configuration                      Response
index.html exists                  Return the contents of index.html
no index.html, Indexes enabled     Return a dynamically generated directory listing
no index.html, Indexes disabled    Return error 403 ("Access Denied")
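The three cases in the table amount to a short decision procedure, sketched below in shell. The function name directory_response and its "on"/"off" argument are invented for this illustration; Apache makes the equivalent decision internally, based on its configuration.

```shell
#!/bin/sh
# Hypothetical helper: decide how a directory request would be answered.
#   $1 = directory on disk
#   $2 = "on" if the Indexes option is enabled, anything else if not
directory_response() {
    if [ -f "$1/index.html" ]; then
        echo "serve index.html"
    elif [ "$2" = on ]; then
        echo "generate directory listing"
    else
        echo "403 Access Denied"
    fi
}

# Try all three cases against a scratch directory.
dir=$(mktemp -d)
directory_response "$dir" on     # no index file, Indexes enabled
directory_response "$dir" off    # no index file, Indexes disabled
touch "$dir/index.html"
directory_response "$dir" off    # index file present
rm -rf "$dir"
```

The index file is checked first, so once index.html exists the Indexes setting no longer matters.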

Assuming you followed along above, create the file /var/www/html/example/index.html with the following content (you should be able to cut and paste directly from the browser).

<h1>Examples</h1>
[<a href="hosts">hosts</a>]
[<a href="net.8.html">net man page</a>]
[<a href="neurons.png">picture of neurons</a>]

What happens when you now view http://localhost/example? You should see the marked up contents of the index file. Is the effect any different if you view http://localhost/example/index.html directly? (It shouldn’t be.)

Figure 1-2. Contents of http://localhost/example

What about the file /var/www/html/example/hosts.html? Is it still available? You should be able to access it by manually entering the URL http://localhost/example/hosts.html, but there is no way to click to it directly (except from this page, of course). Content behind an index file, which is not referenced directly, is obscured, but still available if someone knows it’s there.


Web Server Logging: /var/log/httpd/{access,error}_log

The Apache web server logs information about every request it handles to the file /var/log/httpd/access_log. A sample of the log file’s contents follows.

[root@station ~]# tail -3 /var/log/httpd/access_log
127.0.0.1 - - [13/Jul/2005:06:34:24 -0400] "GET /example/net.8.html HTTP/1.1" 200 26196 "http://localhost/rhasb/curr/rha230/html-instructor-classroom/rha230_httpd_http.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6"
127.0.0.1 - - [13/Jul/2005:06:34:24 -0400] "GET /example/samba.css HTTP/1.1" 404 290 "http://localhost/example/net.8.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6"
127.0.0.1 ➊ - - [13/Jul/2005:06:34:25 -0400] ➋ "GET /favicon.ico HTTP/1.1" 404 ➌ 284 ➍ "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6"

In each line, we find the following information.

➊ The IP address of the client who made the request.

➋ A timestamp of when the request occurred.

➌ The response code associated with the request. A response code of 200 implies success; anything else is usually some type of failure.

➍ The length of the content returned, not to be confused with the response code which precedes it.
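Because the fields are separated by whitespace, the pieces called out above can be picked out of a log line with awk. The following is a rough sketch rather than a complete log parser: the function name summarize_access_log is made up, the sample line uses a shortened browser string, and the field positions assume simple requests whose URLs contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: print client address, timestamp, requested path,
# response code and content length for each access log line on stdin.
summarize_access_log() {
    awk '{
        timestamp = substr($4, 2)    # drop the leading "[" from the date
        printf "%s %s %s %s %s\n", $1, timestamp, $7, $9, $10
    }'
}

# A sample line modelled on the log above (browser string abbreviated).
echo '127.0.0.1 - - [13/Jul/2005:06:34:25 -0400] "GET /favicon.ico HTTP/1.1" 404 284 "-" "Mozilla/5.0"' |
    summarize_access_log
```

For the sample line this prints 127.0.0.1 13/Jul/2005:06:34:25 /favicon.ico 404 284. Against a real log, the same function could be run as summarize_access_log < /var/log/httpd/access_log.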

Requests that do not complete successfully (such as the two 404 responses above) also generate entries in the error_log.

[root@station ~]# tail -3 /var/log/httpd/error_log

[Tue Jul 13 06:34:24 2005] [error] [client 127.0.0.1] File does not exist: /var/www/html/example/samba.css, referer: http://localhost/example/net.8.html
[Tue Jul 13 06:34:25 2005] [error] [client 127.0.0.1] File does not exist: /var/www/html/favicon.ico

The access_log and the error_log are among the first places an administrator should look when trying to figure out why something doesn’t seem to be working. The following table itemizes some of the return codes associated with various errors (or successes).

Table 1-4. HTTP return codes

Code    Meaning
200     Success
401     Authorization Required
403     Access Denied
404     File Not Found
500     Internal Server Error

There are many others, but these tend to be the most common. (In general, the HTTP protocol follows a response code convention used by many network services: partial successes are in the 100s, successes in the 200s, incomplete transactions in the 300s, client errors in the 400s, and server errors in the 500s. Watch closely the output the next time you use the simple ftp client, for example.)
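The convention is easy to apply mechanically, since only the first digit matters. A minimal sketch, using a made-up function name status_class and the category names from the paragraph above:

```shell
#!/bin/sh
# Hypothetical helper: classify a three digit response code by its first digit.
status_class() {
    case $1 in
        1??) echo "partial success" ;;
        2??) echo "success" ;;
        3??) echo "incomplete transaction" ;;
        4??) echo "client error" ;;
        5??) echo "server error" ;;
        *)   echo "unknown" ;;
    esac
}

for code in 200 403 404; do
    echo "$code: $(status_class "$code")"
done
```

Even an unfamiliar code seen in a log file can be placed in the right family this way before looking up its exact meaning.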

But What Could Go Wrong?

In its default configuration, there are really only two things that could cause problems: permissions, and SELinux.

First, files must be readable by the system user apache. The httpd process, like any other process, must have the right permissions to access a file. For security reasons, the web server runs as the user apache. Therefore, any file served by the web server must be readable by the user apache.

Secondly, the Apache web server is one of the services constrained by the Red Hat Enterprise Linux SELinux targeted policy. Therefore, any file served by the Apache web server must have an appropriate SELinux context. For now, the context of the /var/www/html directory (httpd_sys_content_t) will suffice. Any file created in this directory (including subdirectories) should inherit this context, and be fine. The problem occurs when files are created somewhere else, and moved to this directory: they then retain their original (inappropriate) SELinux context.

At any rate, whenever the web server complains in its log file that it cannot access a file you think it should be able to, try the following commands to set appropriate permissions and SELinux context.

[root@station ~]# chmod a+r filename

[root@station ~]# chcon --reference /var/www/html filename

or

[root@station ~]# restorecon /var/www/html/filename
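Before reaching for chmod, it can help to check whether a file is readable at all by users other than its owner. The following is a rough sketch, assuming the GNU version of stat (as found on Red Hat Enterprise Linux). The function name world_readable is hypothetical, and the check looks only at the "other" permission bits of the file itself: it ignores group membership, the permissions of parent directories, and the SELinux context.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the "other" read bit is set on a file.
world_readable() {
    mode=$(stat -c '%a' "$1") || return 2   # octal mode, for example 644
    other=$(( mode % 10 ))                  # last digit: bits for "other"
    [ $(( other & 4 )) -ne 0 ]              # 4 is the read bit
}

# Demonstrate on a scratch file.
f=$(mktemp)
chmod 600 "$f"
world_readable "$f" || echo "$f is not world readable"
chmod a+r "$f"
world_readable "$f" && echo "$f is now world readable"
rm -f "$f"
```

The current SELinux context of a file can be viewed with ls -Z, for comparison with that of /var/www/html.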

The Anatomy of a Web Request: the HTTP Protocol (Optional, but Interesting)

This section introduces the HTTP protocol. The intent is not to be thorough, but instead to give students an impression of what is meant when people use terms such as HTTP headers, GET, and Response Code. For those who can’t get enough, all of the details can be found at the World Wide Web Consortium’s (http://www.w3.org) website (http://www.w3.org/Protocols).

In order to introduce the HTTP protocol, it’s easiest to start with an example. The entire conversation between a web client and a web server can be captured using the wireshark network analyzer. If not already installed, yum install wireshark-gnome should do the trick. A capture is started by opening wireshark, choosing Capture:Start... from the menu, specifying a capture filter of (in this case) port 80, and "OK"ing. (Enabling "Update list of packets in real time" and "Automatic scroll in live capture" tends to make things more interesting for small captures, as well.)


Figure 1-3. Specifying a Wireshark Capture filter

Once Wireshark is capturing packets, any conversations between a web client and a web server which occur on the local machine should be captured. For example, the following displays a conversation between a web client requesting http://station53.rosemont.wlan/example/hosts and a web server providing the answer. Once wireshark has been stopped, the individual IP packets can be browsed from a list.


Figure 1-4. A Wireshark Capture Packet List

More interestingly for our purposes, wireshark can easily assemble the payload from each of the individual packets which compose a TCP/IP conversation by right clicking on any packet, and choosing Follow TCP Stream.


Figure 1-5. Viewing a TCP Conversation with Wireshark

The web client, in red, is making a request of the web server, in blue. The "language" the client and server use is the HTTP protocol.

The HTTP Protocol: the Request (Client to Server)

A web request is composed of three parts: a request line, a series of HTTP headers, and the "body" (or content).

Note: In the following, some portions of the text have been replaced with "..." for readability. The same convention is used in many places in the text.

GET ➊ /example/hosts ➋ HTTP/1.1 ➌
Host: station53.rosemont.wlan ➍
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Geck... ➎
Accept: text/xml,...text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5 ➏
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive


The entire first line is known as the Request-Line, and contains exactly three pieces of information in a specified order.

➊ The request method, which for our purposes can be thought of as being either a GET or a POST. With a GET, the client is requesting information. With a POST, the client is submitting information.

➋ The URI, or "Uniform Resource Identifier". Think of this as the path portion of a URL. (The server portion has already been used to open the TCP/IP connection.)

➌ The exact protocol that the client is speaking. Only two protocols are generally considered, HTTP/1.0 and HTTP/1.1, and any modern client should be using the latter.

The next series of lines, which all have the form header: data, are known as the HTTP headers. These are used to associate any metadata with the request. Some HTTP request headers relevant to our discussion are the following.

➍ Host: The content of the host portion of the URL requested by the client.

➎ User-Agent: The User Agent is the client software. In this case, the client is the Firefox web browser, which identifies itself as a variant of Mozilla.

➏ Accept: A list of the content types that the browser is willing to accept. This browser prefers to receive text/xml or text/html, but will also handle text/plain. For images, the browser prefers image/png, but in the end, the browser will accept */*, or anything the server will throw at it.
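The preferences expressed by the Accept header can be made concrete with a short sketch. The Python below is only an illustration: the function name parse_accept is hypothetical, and the header value is a shortened version of the one in the capture, with the elided portion simply left out. Each content type may carry a q weight between 0 and 1, and a type with no q is treated as weight 1.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted from most to least preferred."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # a type with no q parameter has the default weight of 1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so types with equal weight keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


# Illustrative header (assumption: the "..." in the capture is omitted here)
ACCEPT = "text/xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"

for media_type, q in parse_accept(ACCEPT):
    print(f"{q:.1f}  {media_type}")
```

The catch-all */* ends up last with a weight of 0.5, which matches the description of it as the browser's fallback.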

After a blank line, indicating the end of the HTTP headers, the content of the request would follow. For GET requests, such as this one, there is no content.
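The three-part structure of a request can be sketched in a few lines of Python. This is a minimal illustration, not a complete HTTP parser: the function name parse_request is hypothetical, and the request text is a shortened copy of the capture above with only a few headers kept.

```python
# A shortened copy of the captured request; lines end in CRLF on the wire.
RAW_REQUEST = (
    "GET /example/hosts HTTP/1.1\r\n"
    "Host: station53.rosemont.wlan\r\n"
    "Accept-Encoding: gzip,deflate\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request into request line fields, headers, and body."""
    # The first blank line separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The Request-Line holds exactly three fields, in a fixed order.
    method, uri, protocol = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, uri, protocol, headers, body


method, uri, protocol, headers, body = parse_request(RAW_REQUEST)
print(method, uri, protocol)
print(headers["host"])
print(repr(body))
```

For this GET request the body comes back as an empty string, as the text describes.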

The HTTP Protocol: the Response (Server to Client)

The server responds with the following, which is again composed of three parts: a response line, a set of response HTTP headers, and the response "body" (or content).

HTTP/1.1 ➊ 200 ➋ OK ➌
Date: Sat, 13 Aug 2005 11:09:51 GMT
Server: Apache/2.0.54 (Fedora)
Last-Modified: Sat, 13 Aug 2005 10:26:31 GMT
ETag: "406ee-104-105723c0"
Accept-Ranges: bytes
Content-Length: 260
Connection: close
Content-Type: text/plain; charset=UTF-8 ➍

# Do not remove the following line, or various programs ➎
# that require network functionality will fail.
127.0.0.1.localhost.localdomain.localhost
192.168.218.254.rosemont.
#192.168.218.254.s.
#192.168.218.53.w.
192.168.0.5.s.
192.168.0.6.w.
192.168.201.254 rw


The Response-Line, like the Request-Line, is composed of three ordered parts. In the case of the response, however, the latter two fields are redundant.

➊ The exact protocol the server is using.

➋ The response code of the transaction, which is used to indicate success, or qualify a type of failure. In this case, the response code 200 indicates success. (More on these later.)

➌ A text representation of the response code. This is supplied only for diagnostic (debugging) purposes, as the response code is what’s really important. The text OK is associated with the response code of 200.

Again, the next series of lines, which all have the form header: data, are known as the HTTP headers. We will only focus on one of the HTTP response headers.

➍ Content-Type: The server is providing the client with the type of the content, so the browser can render the data appropriately. For this response, the content type is text/plain, so the browser will display the content "as is", preserving whitespace. Other content types could include image/png, text/html, or application/msword.

After a blank line, indicating the end of the HTTP headers, the content of the response follows.

➎ For this response, the content is a simple text file. (In the output above, tabs have been replaced with periods, an artifact of how wireshark displays non-printing characters.)
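The response can be taken apart the same way as the request. The following Python is a minimal sketch under stated assumptions: the function name parse_response is hypothetical, and the response text is a made-up short example (a 12 byte body rather than the 260 byte file in the capture). It also shows that the text after the response code is redundant, by looking the code up in Python's standard http.HTTPStatus table.

```python
from http import HTTPStatus

# A made-up, shortened response; the body here is illustrative only.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 12\r\n"
    "Connection: close\r\n"
    "Content-Type: text/plain; charset=UTF-8\r\n"
    "\r\n"
    "hello world\n"
)


def parse_response(raw):
    """Split a raw response into status line fields, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The reason text may contain spaces, so split at most twice.
    protocol, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return protocol, int(code), reason, headers, body


protocol, code, reason, headers, body = parse_response(RAW_RESPONSE)

# The reason text can be derived from the code alone.
print(code, reason, HTTPStatus(code).phrase)

# The media type is the part of Content-Type before any ";" parameters.
media_type = headers["content-type"].split(";")[0].strip()
print(media_type, len(body))
```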

The Hyper Text Markup Language (HTML) (Optional)

This workbook is about managing the Apache web server as a system administrator, not about designing web content. However, during this workbook you will encounter files which use HTML to mark up their content, so a brief introduction will be useful. Again, those who do not get enough can find more at the World Wide Web Consortium’s (http://www.w3.org) website (http://www.w3.org/MarkUp).

Fundamentally, HTML provides three things.

1. Structure: HTML allows text to be identified as titles or inlined quotes, or organized into lists and tables.

2. Embedded Media: HTML allows authors to embed media into their text, usually in the form of images, but also as videos and sound.

3. Links: HTML allows authors to easily reference other information, so that anyone reading the text can locate the other information with the click of a mouse.

All three of the above capabilities rely on embedding HTML tags into the text, where a tag is any text embedded between brackets, such as <table>, <img>, or <a>.

Because the brackets are now considered syntax, there needs to be some way to represent the bracket. This is done using HTML entities, which begin with an ampersand (&) and end with a semicolon. For example, the entity for a left bracket is &lt; (for "less than"), and the entity for a right bracket is &gt; (for "greater than"). Entities are also used for glyphs not often found on keyboards, such as the copyright symbol. Since the ampersand starts entities, there must also be some way of representing it, and the answer is itself an entity: &amp;.
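The entity substitutions described above can be tried out with Python's standard html module. This is only a quick illustration; the sample strings are made up for the purpose.

```python
import html

# Text that mentions tags literally, so the brackets must not be
# interpreted as markup when the text is placed in a web page.
text = "Use <table>, <img> & <a> tags"

# Replace &, < and > with their entities (quote=False leaves quotes alone).
escaped = html.escape(text, quote=False)
print(escaped)

# Going the other way turns entities back into the characters they name,
# including glyphs such as the copyright symbol.
print(html.unescape(escaped))
print(html.unescape("&copy; 2007"))
```

Note that the ampersand is replaced along with the brackets, so that the escaped text can itself be unescaped without ambiguity.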


Rather than provide a full introduction to HTML in the text, a sample document is provided at http://rha-server/pub/rha/rha230/sample.html. Students are encouraged to examine this document, both as it is rendered by a web browser and the underlying text (which can usually be viewed in a browser by right clicking and choosing view page source).

Exercises

Lab Exercise

Objective: Install, start, and contribute content to an Apache web site.

Estimated Time: 45 mins.

This exercise has you download and install material for your web server, using the web server’s default configuration. The material consists of three texts which are not optimally organized for the Apache web server. The lab has you perform some simple renamings and repositioning of the material so that it is more naturally viewed using a web browser.

Specification

1. If the httpd package is not already installed on your machine, install it now.

2. Start the httpd service (if it is not already started), and configure the service to be started by default upon reboots.

3. Download a copy of the file http://rha-server/pub/rha/rha230/readings.tgz, and extract its contents into your web server’s document root directory (/var/www/html/). Properly extracting the contents should result in a new /var/www/html/readings directory. (See note 1.)

4. Using a web browser, browse the http://localhost/readings directory. You should be able to view the HTML files the_god_of_mars.html and war_of_the_worlds appropriately.

5. Correct a misnamed index file.

a. Again using a web browser, examine the contents of the http://localhost/readings/relat10h/ subdirectory. You should discover the file index.htm. Try examining this file through the web browser: http://localhost/readings/relat10h/index.htm.

b. Apparently, the intent of the author was that this page should serve as an index page, but the file is named incorrectly for Apache’s default configuration. In the /var/www/html/readings/relat10h/ directory, create a link to index.htm named index.html (either hard or soft).

c. Using a browser, again view the URL http://localhost/readings/relat10h/. You should now see the contents of the index page.

d. To make life a little easier for anyone browsing your site, in the /var/www/html/readings directory, create a symlink to the relat10h directory called relativity.


e. Confirm that you may now access the content of the file index.htm using http://localhost/readings/relativity/.

6. Correct a misnamed directory.

a. If you can stomach the physics (and, in fact, even if you cannot), skim the first appendix to Einstein’s theory of relativity, either by following the link from the main page, or by referencing http://localhost/readings/relativity/ap01.htm directly.

b. You might notice that many of the equations, such as equation 29, equation 30, etc., are missing. Examine the end of /var/log/httpd/access_log, and note the many requested image files which received a 404 ("File Not Found") response code.

c. Examine the end of the file /var/log/httpd/error_log, and you will discover more helpful messages.

[root@station ~]# tail /var/log/httpd/error_log

[Tue Jul 20 16:53:14 2005] [error] [client 127.0.0.1] File does not exist: /var/www/html/readings/relat10h/pics, referer: http://localhost/readings/relat10h/ap01.htm...

d. Examining the log messages closely, you may discover the problem. All of the web pages are expecting images to be in a directory named pics, but this directory does not exist.

e. Through a simple directory renaming, or perhaps another symlink, solve the problem, so that all of the images of equations are properly displayed.

7. Now that you have completed the hard work, relax a little, by deriving the equation for the Lorentz transformation, following the steps in chapter 11. Place your results in a file titled that_was_easy in your academy user’s home directory. (Just kidding.)

Deliverables

1. An installed and running httpd service, configured to start by default on bootup.

2. The text of three books, browsable from the URL http://localhost/readings.

3. The table of contents of Einstein’s theory of relativity at http://localhost/readings/relat10h.

4. The table of contents of Einstein’s theory of relativity, also at http://localhost/readings/relativity.

5. The images of equations in appendix 1 (found at http://localhost/readings/relativity/ap01.htm) are displayed properly.


Clean Up

You will want to leave the /var/www/html/readings directory in place, as you will need it in the next section.

Questions

1. In Red Hat Enterprise Linux 5, which of the following packages provides the Apache web server?

( ) a. httpd

( ) b. apache

( ) c. webserver

( ) d. apr

( ) e. None of the above

2. Which of the following directories serves as the web server’s document root?

( ) a. /opt/docroot

( ) b. /var/pub/

( ) c. /var/www/html/

( ) d. /etc/httpd

( ) e. None of the above

After migrating the contents of a web site from one operating system to another, web clients, when viewing the URL http://localhost/zsh.txt, are displaying raw HTML instead of a formatted page:


3. What is the simplest solution to the problem?

( ) a. Install the mod_html package.

( ) b. Create an index.html file to reference this page.

( ) c. Use the txt2html utility to assign the file the HTML file type.

( ) d. Rename the file zsh.html.

( ) e. Use chcon to assign the file the appropriate SELinux context.

Use the output of the following command to answer the next question, assuming the default Red Hat Enterprise Linux configuration of the Apache web server.

[root@station1 ~]# ls /usr/share/backgrounds/*
/usr/share/backgrounds/images:
default.png ladybugs.jpg riverstreet_rail.jpg
dewdop_leaf.jpg leafdrops.jpg sneaking_branch.jpg
...

/usr/share/backgrounds/tiles:
3dgreen.png dunes.png Planning-And-Probing-1.jpg
All-Good-People-1.jpg fibers.png plasma.png
...
[root@station ~]# cp -a /usr/share/backgrounds/ /var/www/html/

4. What would you expect to see if you pointed the Firefox web browser to the URL http://localhost/backgrounds/images/?

( ) a. A dynamically generated web page which displays the images as pictures.

( ) b. A "404: File not Found" error page.

( ) c. A "403: Forbidden" error page.

( ) d. A page containing binary data, because the web server tries to interpret the directory as if it were a file.


( ) e. A dynamically generated web page which lists the contents of the directory by filename.

5. If, when the directory above is referenced, you would prefer web clients to see the contents of a file, what should the relevant file be named?

( ) a. README.html

( ) b. HEADER.html

( ) c. index.htm

( ) d. DIR.htm

( ) e. None of the above

6. In what file are all web requests from clients ("hits") logged?

( ) a. /var/log/secure

( ) b. /var/log/httpd/error_log

( ) c. /var/log/messages

( ) d. /var/log/httpd/access_log

( ) e. Both C and D

7. If, when running service httpd start, the web server fails to start, what file might contain helpful debugging messages?

( ) a. /var/log/secure

( ) b. /var/log/xferlog

( ) c. /var/log/httpd/error_log

( ) d. /var/log/httpd/access_log

( ) e. Both B and D

8. In what file are web requests that generate errors logged?

( ) a. /var/log/secure

( ) b. /var/log/httpd/error_log

( ) c. /var/log/messages

( ) d. /var/log/httpd/access_log

( ) e. Both B and D


9. Which is the web server’s "well known" port?

( ) a. 8080

( ) b. 22

( ) c. 25

( ) d. 80

10. Apache’s dynamically loaded modules are conventionally found in what directory?

( ) a. /usr/lib/httpd/modules

( ) b. /usr/lib/apache

( ) c. /usr/libexec/apache

( ) d. /usr/share/httpd/modules

( ) e. None of the above

Notes

1. An excellent source for public domain texts is the Gutenberg project (http://www.gutenberg.org).


Chapter 2. Apache Configuration

Key Concepts

• The Apache server is configured using the /etc/httpd/conf/httpd.conf and /etc/httpd/conf.d/*.conf configuration files.

• The configuration file is informally divided into the Global, Main, and Virtual Server sections.

• The Global section defines aspects which pertain to the server as a whole, including client connection dynamics, server pool parameters, binding address, and which modules to load.

• The Main section defines aspects which may be redefined by any virtual server, such as the document root, logging behavior, and URL namespace remappings.

• Comprehensive documentation is provided by the httpd-manual package, which, when installed, can be accessed at http://localhost/manual.

Discussion

Apache Configuration: /etc/httpd/conf/httpd.conf

The Apache web server is configured with text configuration files which are read upon startup. The primary configuration file is /etc/httpd/conf/httpd.conf, but the files /etc/httpd/conf.d/*.conf are "slurped up" into the configuration, as well.

[root@station ~]# ls /etc/httpd/conf /etc/httpd/conf.d/
/etc/httpd/conf:
httpd.conf magic

/etc/httpd/conf.d/:
README welcome.conf

The Apache configuration file syntax is straightforward, and tends to be well documented (both as comments in the default configuration file, and in a separate manual to be discussed later). A sample of the configuration file’s syntax follows.

# ➊
# DocumentRoot: The directory out of which you will serve your
# documents. By default, all requests are taken from this directory, but
# symbolic links and aliases may be used to point to other locations.
#
DocumentRoot "/var/www/html" ➋

#
# Each directory to which Apache has access can be configured with respect
# to which services and features are allowed and/or disabled in that
# directory (and its subdirectories).
# ...
<Directory /> ➌
    Options FollowSymLinks
    AllowOverride None
</Directory>

➊ Any empty line, or line which begins with a hash ("#"), is considered a comment.

➋ Any line which is not a comment generally starts with a keyword referred to as a directive. Directives are not case sensitive, but of course spelling is important. The syntax of the remainder of the line depends on the directive, but all of a directive’s arguments must occur on a single line.

➌ The only other way a line can begin is with an XML-like tag, which begins a container. Containers end with an XML-like closing tag. Generally, all directives found within a container only take effect within the scope of the container. We will discuss the effects of different types of containers in a later lesson.
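The three kinds of lines described above can be told apart mechanically. The Python below is a rough sketch for illustration only: the function name classify and the category labels are hypothetical, and it makes no attempt to reproduce how Apache itself reads its configuration.

```python
# A fragment modeled on the sample above (callout markers removed).
SAMPLE = """\
# DocumentRoot: The directory out of which you will serve your documents.
DocumentRoot "/var/www/html"

<Directory />
    Options FollowSymLinks
    AllowOverride None
</Directory>
"""


def classify(line):
    """Return (kind, name) for one configuration line."""
    stripped = line.strip()
    # Empty lines and lines starting with "#" are treated as comments.
    if not stripped or stripped.startswith("#"):
        return ("comment", None)
    # A closing tag such as </Directory> ends a container.
    if stripped.startswith("</"):
        return ("container-end", stripped[2:-1].strip().lower())
    # An opening tag such as <Directory /> begins a container.
    if stripped.startswith("<"):
        return ("container-start", stripped[1:-1].split()[0].lower())
    # Anything else starts with a directive keyword; directives are not
    # case sensitive, so normalize the keyword to lower case.
    return ("directive", stripped.split(None, 1)[0].lower())


for line in SAMPLE.splitlines():
    print(classify(line))
```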

The file is thought of as occurring in three sections, although the syntax does not formally enforce them.

1. The Global Section: This section contains configuration which applies to the web server as a whole, including any virtual servers.

2. The Main Section: Configuration which applies to the main server (as opposed to any virtual servers) belongs in this section. Any configuration in this section can be overridden by a virtual server.

3. Virtual Servers: The Apache web server can take on the appearance of being multiple distinct servers. Virtual servers will be discussed in more detail in the next lesson.

We begin by examining configuration relevant to the server as a whole. You might want to open the file /etc/httpd/conf/httpd.conf in a pager or text editor and follow along as you read the following sections. (You should consider setting the editor into a "read only" mode, or making a backup of the file and browsing the copy.)

The Global Section

The Global section of the configuration file includes configuration that affects the server as a whole.

Figure 2-1. /etc/httpd/conf/httpd.conf

### Section 1: Global Environment
#
# The directives in this section affect the overall operation of Apache,
# such as the number of concurrent requests it can handle or where it
# can find its configuration files.

Configuration Context: ServerRoot

The ServerRoot directive establishes a home base for all of the remaining server configuration, while the second directive shown in the figure, PidFile, is a simple example of making use of this home base.


Figure 2-2. /etc/httpd/conf/httpd.conf

#
# ServerRoot: The top of the directory tree under which the server’s
# configuration, error, and log files are kept.
...
#
# Do NOT add a slash at the end of the directory path.
#
ServerRoot "/etc/httpd" ➊

#
# PidFile: The file in which the server should record its process
# identification number when it starts.
#
PidFile run/httpd.pid ➋

➊ The ServerRoot directive establishes context for future file references within the configuration file. Any relative file reference (one that does not begin with a "/") will be relative to the ServerRoot, which in Red Hat Enterprise Linux is /etc/httpd.

➋ In Unix, daemons traditionally record the fact that they are running by creating a file in the filesystem which contains their process id, called a pid file. The PidFile directive specifies where this file should be located.

Examining the /etc/httpd directory, we find it’s populated with several symbolic links.

[root@station ~]# ls -l /etc/httpd
total 28
drwxr-xr-x 4 root root 4096 Jul 25 06:33 conf
drwxr-xr-x 2 root root 4096 Jul 25 06:33 conf.d
lrwxrwxrwx 1 root root 19 Jul 25 06:33 logs -> ../../var/log/httpd
lrwxrwxrwx 1 root root 27 Jul 25 06:33 modules -> ../../usr/lib/httpd/modules
lrwxrwxrwx 1 root root 13 Jul 25 06:33 run -> ../../var/run

In the httpd.conf configuration file, file references that begin logs/, modules/, or run/ are mapped to the relevant directories. Can you convince yourself that the daemon’s pid file would be found at /var/run/httpd.pid?

It’s important to understand the role of the ServerRoot directive, and the use of the symbolic links in the /etc/httpd directory, but there’s seldom any reason to change these values.
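The path arithmetic behind the pid file question can be sketched in Python. This is an illustration only: the function name resolve is hypothetical, the symbolic links are copied into a dictionary from the listing above rather than read from a real filesystem, and only a link in the first path component is followed.

```python
import posixpath

SERVER_ROOT = "/etc/httpd"

# Symbolic links found in /etc/httpd, taken from the listing above.
SYMLINKS = {
    "logs": "../../var/log/httpd",
    "modules": "../../usr/lib/httpd/modules",
    "run": "../../var/run",
}


def resolve(path):
    """Map a file reference from httpd.conf to an absolute path."""
    # References that begin with "/" are already absolute.
    if posixpath.isabs(path):
        return path
    first, _, rest = path.partition("/")
    if first in SYMLINKS:
        # A relative link target is interpreted relative to the directory
        # that contains the link, which here is the ServerRoot.
        base = posixpath.normpath(posixpath.join(SERVER_ROOT, SYMLINKS[first]))
        return posixpath.join(base, rest) if rest else base
    # Everything else is simply relative to the ServerRoot.
    return posixpath.join(SERVER_ROOT, path)


print(resolve("run/httpd.pid"))
print(resolve("logs/error_log"))
print(resolve("conf/magic"))
```

The first result is /var/run/httpd.pid, which agrees with the location claimed in the text.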

Client Connection Dynamics: Timeout and KeepAlive

The following directives control how long the server will wait on badly behaved clients.

Figure 2-3. /etc/httpd/conf/httpd.conf

#
# Timeout: The number of seconds before receives and sends time out.
#
Timeout 120 ➊

#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive Off ➋

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100 ➌

#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 15 ➍

➊ A particular httpd process can only communicate with one client at a time. A badly behaved client, which opens a TCP/IP connection but never uses it, could therefore tie up a server indefinitely. The Timeout directive specifies how long, in seconds, a server waits before terminating a connection with a badly behaved client.

➋➌➍ These directives decide whether the server honors "Keep Alive" requests from a client, how many requests can be made over a "Keep Alive" connection, and how long before an inactive connection should time out.

The HTTP protocol is termed a "stateless" protocol, meaning that the server doesn’t record any information about the client between one request and the next. In the original HTTP/1.0 protocol, clients are required to open a new socket for every request. Downloading a web page with 10 images, therefore, would require the client to open 11 sockets (one for the page, and one for each referenced image).

The HTTP/1.1 protocol tried to improve efficiency by allowing a client to leave a single socket open for "follow up" requests. Such a persistent socket is called a "Keep Alive" socket. Clients are more likely to abuse such persistent connections, however, by leaving them open but not making any follow-up requests, so stricter timeout values are usually assigned to such connections.
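The socket arithmetic from the example of a page with 10 images can be written down as a small Python sketch. It rests on simplifying assumptions: the function name connections_needed is hypothetical, the client is assumed to send its requests one after another over a single connection, and no connection is assumed to time out along the way. Real browsers often open several connections in parallel.

```python
import math


def connections_needed(requests, keep_alive, max_keepalive_requests=100):
    """Estimate how many sockets a client opens for a number of requests."""
    if not keep_alive:
        # Without persistent connections, every request needs its own socket.
        return requests
    if max_keepalive_requests == 0:
        # 0 means an unlimited number of requests per connection.
        return 1 if requests else 0
    # Otherwise a new socket is needed each time the per-connection
    # request limit is used up.
    return math.ceil(requests / max_keepalive_requests)


# One page plus 10 referenced images makes 11 requests.
print(connections_needed(11, keep_alive=False))
print(connections_needed(11, keep_alive=True))
print(connections_needed(250, keep_alive=True, max_keepalive_requests=100))
```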

Managing the Server Pool: StartServers, {Min,Max}SpareServers, MaxClients, and MaxRequestsPerChild

Recall that most Unix daemons use a forking model. Upon receiving a new client connection, the server process forks (duplicates itself), dedicating the new child to the newly connected client, while the parent returns to listening for new connections.

In order to gain efficiency, the Apache web server takes the uncommon approach of "pre-forking" child daemons to handle client connections, before the clients ever arrive. Even on an unused web server, several httpd processes exist. The parent daemon is generally run as the user root, and the pre-forked child daemons as the user apache. The collection of httpd processes is often referred to as the "server pool".


[root@station ~]# ps aux | grep httpd
root 2334 0.0 2.0 19504 10488 ? Ss 05:57 0:00 /usr/sbin/httpd
apache 2359 0.0 2.0 19504 10624 ? S 05:57 0:00 /usr/sbin/httpd
apache 2360 0.0 2.0 19504 10624 ? S 05:57 0:00 /usr/sbin/httpd
apache 5248 0.0 2.0 19504 10628 ? S 07:04 0:00 /usr/sbin/httpd
root 7636 0.0 0.1 3768 716 pts/5 S+ 08:56 0:00 grep httpd

The following directives manage the dynamics of the server pool.

Figure 2-4. /etc/httpd/conf/httpd.conf

# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# ServerLimit: maximum value for MaxClients for the lifetime of the server
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule prefork.c>
StartServers 8 ➊
MinSpareServers 5 ➋
MaxSpareServers 20 ➌
ServerLimit 256 ➍
MaxClients 256 ➎
MaxRequestsPerChild 4000 ➏
</IfModule>

➊ StartServers: The initial size of the server pool (in number of processes).

➋➌ {Min,Max}SpareServers: The server pool scales dynamically. If a web server gets blitzed with many requests, more child daemons will be started. If things go quiet, unused child daemons will be killed. These directives place bounds on the number of idle (spare) daemons kept in the pool.

➍➎ ServerLimit, MaxClients: The number of concurrent requests can be limited. Connection requests above this limit are not handled immediately; they normally wait in a queue (whose length is governed by the ListenBacklog directive) until a child daemon becomes free. The distinction between the ServerLimit and MaxClients directives is subtle (ServerLimit is the maximum value MaxClients may take for the lifetime of the server), and in practice they are set together to the same value.

➏ MaxRequestsPerChild: In order to improve stability, a given child daemon will only serve so many requests until it kills itself, and a new daemon must be started. (This suicide helps curtail memory leaks in poorly written libraries and CGI executables.)
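
The scaling behavior described for ➋ and ➌ can be pictured as a small decision function. The following Python sketch is only an illustration of the idea, not Apache's actual algorithm; the function name pool_adjustment and its return convention are assumptions made for this example, and the defaults simply mirror the values in Figure 2-4.

```python
def pool_adjustment(idle, total, min_spare=5, max_spare=20, max_clients=256):
    """Return how many children to start (positive) or kill (negative).

    idle  -- child daemons currently waiting for a connection
    total -- all child daemons, busy or idle
    This is a simplified model, not the real prefork maintenance loop.
    """
    if idle < min_spare:
        # Too few spares: start more, but never grow past MaxClients.
        wanted = min_spare - idle
        return max(0, min(wanted, max_clients - total))
    if idle > max_spare:
        # Too many spares: kill the surplus idle children.
        return -(idle - max_spare)
    return 0


print(pool_adjustment(idle=2, total=10))    # busy server: start 3 more
print(pool_adjustment(idle=30, total=30))   # quiet server: prints -10
print(pool_adjustment(idle=1, total=255))   # capped by MaxClients: start 1
```

Note that only idle daemons count toward the spare limits; a pool of 200 busy daemons with 10 idle ones is left alone.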

Controlling the Server Address: Listen

Figure 2-5. /etc/httpd/conf/httpd.conf

#
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, in addition to the default. See also the <VirtualHost>
# directive.
#
# Change this to Listen on specific IP addresses as shown below to
# prevent Apache from glomming onto all bound IP addresses (0.0.0.0)
#
#Listen 12.34.56.78:80
Listen 80

The Listen directive controls which address the server binds to. In the default configuration (above), the server binds to the wildcard IP address 0.0.0.0 (implying every active interface), port 80. Multiple Listen lines can be used to specify that the daemon should bind to multiple ports and/or addresses.
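
To make the two argument forms concrete, the following Python sketch splits a Listen argument into an address and a port. It is an illustration only: the helper name parse_listen is hypothetical, and a bare port is reported as 0.0.0.0 to match the discussion above.

```python
def parse_listen(argument):
    """Split a Listen argument into an (address, port) pair.

    A bare port means "all interfaces", shown here as 0.0.0.0.
    """
    # rpartition splits on the last colon, so "80" yields an empty address.
    address, _, port = argument.rpartition(":")
    return (address or "0.0.0.0", int(port))


for argument in ["80", "12.34.56.78:80", "127.0.0.1:8888"]:
    print(argument, "->", parse_listen(argument))
```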

Extending the Web Server: LoadModule

The Apache web server is modular by design. The core web server is actually fairly minimal, with various modules providing much of the interesting behavior. Modules may either be "static", meaning that they’re part of the core executable and can never be removed, or "dynamic", meaning that an administrator can control whether or not the module is loaded during startup.

Apache dynamic modules are located in the /usr/lib/httpd/modules directory, and are loaded using the LoadModule directive.

Figure 2-6. /etc/httpd/conf/httpd.conf

#
# Dynamic Shared Object (DSO) Support
#
# To be able to use the functionality of a module which was built as a DSO you
# have to place corresponding ‘LoadModule’ lines at this location so the
# directives contained in it are actually available _before_ they are used.
# Statically compiled modules (those listed by ‘httpd -l’) do not need
# to be loaded here.
#
# Example:
# LoadModule foo_module modules/mod_foo.so
#
LoadModule auth_basic_module modules/mod_auth_basic.so ➊
LoadModule auth_digest_module modules/mod_auth_digest.so
LoadModule authn_file_module modules/mod_authn_file.so
LoadModule authn_alias_module modules/mod_authn_alias.so
LoadModule authn_anon_module modules/mod_authn_anon.so
...
LoadModule include_module modules/mod_include.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule logio_module modules/mod_logio.so
LoadModule env_module modules/mod_env.so
LoadModule ext_filter_module modules/mod_ext_filter.so
LoadModule mime_magic_module modules/mod_mime_magic.so
...
#
# Load config files from the config directory "/etc/httpd/conf.d".
#
Include conf.d/*.conf ➋


➊ The various modules tend to introduce new configuration directives to modify their behavior. For example, the log_config_module provides the LogFormat directive, which we will encounter later. In the configuration file, the module must be loaded (with LoadModule) before any directives it provides are encountered.

➋ In order to ease the distribution of modules using a package management system (such as RPM), the Include directive specifies external configuration files to include, either directly or by using pathname expansion (file globbing).
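
The pathname expansion performed by Include can be previewed with a few lines of Python. This is a sketch under stated assumptions: the helper name expand_include is hypothetical, relative patterns are resolved against the server root just as other relative paths in the configuration file are, and matching files are listed alphabetically, the order in which Apache reads wildcard includes.

```python
import glob
import os


def expand_include(server_root, pattern):
    """Return the files an Include pattern matches, in alphabetical order."""
    if not os.path.isabs(pattern):
        # Relative patterns are taken relative to the server root.
        pattern = os.path.join(server_root, pattern)
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # On a Red Hat Enterprise Linux host this lists the drop-in files.
    for path in expand_include("/etc/httpd", "conf.d/*.conf"):
        print(path)
```

Because only names ending in .conf match the default pattern, a drop-in file can be disabled by simply renaming it.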

The Main Section

The Main section of the configuration file includes configuration that affects the primary server, but directives in this section can be overridden by any virtual server.

Figure 2-7. /etc/httpd/conf/httpd.conf

### Section 2: ’Main’ server configuration
#
# The directives in this section set up the values used by the ’main’
# server, which responds to any requests that aren’t handled by a
# <VirtualHost> definition. These values also provide defaults for
# any <VirtualHost> containers you may define later in the file.
#
# All of these directives may appear inside <VirtualHost> containers,
# in which case these default settings will be overridden for the
# virtual host being defined.

Server Identity: ServerName and ServerAdmin

The first two directives in the main section help establish the identity of the server.

Figure 2-8. /etc/httpd/conf/httpd.conf

#
# ServerAdmin: Your address, where problems with the server should be
# e-mailed. This address appears on some server-generated pages, such
# as error documents. e.g. [email protected]
#
ServerAdmin root@localhost ➊

#
# ServerName gives the name and port that the server uses to identify itself.
# This can often be determined automatically, but we recommend you specify
# it explicitly to prevent problems during startup.
...
#ServerName www.example.com:80 ➋


➊ The ServerAdmin directive is mainly cosmetic. The email address is listed in the footer of the default error pages.

➋ For simple hosts, with a single external interface and therefore a clear concept of a hostname, the ServerName can be automatically determined. If in doubt, however, it should be specified manually. (For example, if the server is bound to multiple interfaces, the preferred name should be configured explicitly.)

Server Content: the DocumentRoot

The DocumentRoot directive, one of the most fundamentally important, identifies where in the filesystem the information to be served is found. Recall that when the file portion of a URL is translated to a file in the filesystem, the document root provides the base of that translation. This directive is probably the one most often overridden by a VirtualHost.

The following default specifies the Red Hat Enterprise Linux document root as /var/www/html.

Figure 2-9. /etc/httpd/conf/httpd.conf

# DocumentRoot: The directory out of which you will serve your
# documents. By default, all requests are taken from this directory, but
# symbolic links and aliases may be used to point to other locations.
#
DocumentRoot "/var/www/html"

Specifying the Directory Index File: DirectoryIndex

In a previous lesson, we discussed the role of an index file, called index.html. We now see that the name of the file is configurable.

Figure 2-10. /etc/httpd/conf/httpd.conf

#
# DirectoryIndex: sets the file that Apache will serve if a directory
# is requested.
#
# The index.html.var file (a type-map) is used to deliver content-
# negotiated documents. The MultiViews Option can be used for the
# same purpose, but it is much slower.
#
DirectoryIndex index.html index.html.var

Notice that if multiple file names are specified, each will be searched for in sequence. Specifying too many alternatives, however, could lead to poor performance.

For example, if migrating content from a Microsoft-based server, setting DirectoryIndex to the following would be easier than renaming every file named index.htm to index.html.

DirectoryIndex index.html index.htm
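
The "searched for in sequence" behavior amounts to a first-match loop over the listed names. The Python sketch below illustrates it; the function name pick_index is hypothetical, and the sketch only handles plain file names within the requested directory, not absolute references.

```python
import os


def pick_index(directory, candidates=("index.html", "index.htm")):
    """Return the first candidate that exists in directory, or None."""
    for name in candidates:
        path = os.path.join(directory, name)
        # Every miss costs a filesystem check, which is why a long
        # candidate list can hurt performance.
        if os.path.isfile(path):
            return path
    return None
```

With both index.html and index.htm present in a directory, index.html wins because it is listed first.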


Tip: Index files can even be specified as an absolute reference. What do you think would be the effect of a configuration such as the following?

DirectoryIndex index.html /cgi-bin/index.cgi

Collecting Client Identities: HostnameLookups

Buried deep within the configuration file is an important directive called HostnameLookups.

Figure 2-11. /etc/httpd/conf/httpd.conf

#
# HostnameLookups: Log the names of clients or just their IP addresses
# e.g., www.apache.org (on) or 204.62.129.132 (off).
# The default is off because it’d be overall better for the net if people
# had to knowingly turn this feature on, since enabling it means that
# each client request will result in AT LEAST one lookup request to the
# nameserver.
#
HostnameLookups Off

The web server can easily determine the IP address of any client which is making a web request: it’s part of the request’s IP protocol header. In order to determine the hostname of the client, however, the web server must work harder: it must perform a reverse DNS lookup on the client’s IP address. This reverse lookup increases both time and network traffic on the part of the server, so by default, it’s disabled. As a result, all logging and access control lists are implemented by IP address, not by hostname.

If you desire logs and access control lists to use client hostnames instead of IP addresses, and are willing to pay the price in performance, HostnameLookups can be set to On.
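
The extra work implied by a reverse lookup can be seen in a short Python sketch. It is an illustration only: the function client_label and its resolver parameter are hypothetical names, and socket.gethostbyaddr stands in for whatever lookup the server performs.

```python
import socket


def client_label(ip, hostname_lookups=False, resolver=socket.gethostbyaddr):
    """Return the name to log for a client: hostname if resolvable, else IP."""
    if not hostname_lookups:
        return ip  # lookups off: no DNS traffic at all
    try:
        # Reverse lookup: one nameserver round trip (at least) per client.
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        return ip  # no reverse record: fall back to the address
```

The fallback branch is why logs may still contain bare IP addresses even with lookups turned on: not every address has a reverse DNS entry.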

Logging: ErrorLog, LogLevel, LogFormat, and CustomLog

The Apache web server maintains two types of logs: transaction logs and error logs. Transaction logging occurs with every web request ("hit"), and is highly configurable, potentially logging to multiple files. In contrast, there is only one error log, and only two questions associated with it: where, and how much. We start with the simpler of the two.

Error Logging: ErrorLog and LogLevel

Figure 2-12. /etc/httpd/conf/httpd.conf

#
# ErrorLog: The location of the error log file.
# If you do not specify an ErrorLog directive within a <VirtualHost>
# container, error messages relating to that virtual host will be
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host’s errors will be logged there and not here.
#
ErrorLog logs/error_log

#
# LogLevel: Control the number of messages logged to the error_log.
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#
LogLevel warn

By default, the web server logs to the file /var/log/httpd/error_log (recall the role of the ServerRoot directive, and the /etc/httpd/logs symlink). For the main server, it’s hard to think of a reason to ever change it, though virtual hosts often override it.

More interesting is the LogLevel, which determines how much information is logged. The vocabulary draws directly from the syslog service. When troubleshooting, an administrator often ratchets up the logging by setting the LogLevel to debug, for example. Of course, more copious logging slows down overall performance, so once a problem has been resolved, logging is returned to a more suitable default.
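
The effect of LogLevel is a threshold on a severity ordering. The following Python sketch shows the comparison; the names LEVELS and is_logged are assumptions made for this illustration, while the level names themselves are the ones listed in Figure 2-12.

```python
# Most severe first, least severe last (the syslog vocabulary).
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def is_logged(message_level, log_level="warn"):
    """True if a message of message_level passes the configured LogLevel."""
    return LEVELS.index(message_level) <= LEVELS.index(log_level)


print(is_logged("error"))           # True: error is more severe than warn
print(is_logged("info"))            # False: filtered out at the default
print(is_logged("info", "debug"))   # True: debug lets everything through
```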

Transaction Logging: LogFormat and CustomLog

For every web request, there is a large amount of information that an administrator can choose to log (or not). Such transaction logs are often referred to as "access logs". The LogFormat directive allows administrators to assign names to collections of information, so that they are easy to refer to later. This is all LogFormat does, however. In order to use one of the formats, it must be associated with a CustomLog.

Figure 2-13. /etc/httpd/conf/httpd.conf

#
# The following directives define some format nicknames for use with
# a CustomLog directive (see below).
#
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

# "combinedio" includes actual counts of actual bytes received (%I) and sent (%O); this
# requires the mod_logio module to be loaded.
#LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio

The following table illustrates some of the parameters most commonly used in access logs.

Table 2-1. Apache Log Parameters

Parameter   References                               Example
%h          Remote host (IP or hostname)             127.0.0.1
%u          Remote user (for HTTP authentication)    elvis
%t          Timestamp                                [15/Jul/2005:06:55:44 -0400]
%r          Request line (from HTTP protocol)        GET /icons/compressed.gif HTTP/1.1
%s          HTTP response status code                200
%b          Response size (in bytes)                 1079
%{name}i    HTTP header name                         (depends on name)

Many more exist as well. As usual, with all of this flexibility comes the need for convention. Two commonly used conventions are the common format and the combined format, which are the first two formats defined above. The common format records IP address, username (if any), timestamp, request line, response status, and number of bytes transferred [1].

The combined format adds the identity of the client application, and the referring page (if any). While the combined format is used by default in Red Hat Enterprise Linux, administrators could well choose to drop back to the common format to save space and improve performance.

Many external log analysis utilities (such as webalizer) rely on logs being in a standard format, so an administrator should consider the consequences before changing the log format arbitrarily.
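
To see why analysis tools depend on a fixed format, consider how a common format line might be taken apart. The Python sketch below is a simplified illustration, not how any particular utility works: the names COMMON and parse_common are hypothetical, the sample line is assembled from the example values in Table 2-1 rather than taken from a real log, and request lines containing escaped quotes are not handled.

```python
import re

# One named group per field of the common format: %h %l %u %t "%r" %>s %b
COMMON = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_common(line):
    """Return a dict of fields from a common-format line, or None."""
    match = COMMON.match(line)
    return match.groupdict() if match else None


sample = ('127.0.0.1 - elvis [15/Jul/2005:06:55:44 -0400] '
          '"GET /icons/compressed.gif HTTP/1.1" 200 1079')
fields = parse_common(sample)
print(fields["host"], fields["user"], fields["status"], fields["size"])
```

Since the combined format begins with the same fields, the pattern also matches the front of a combined format line. A log written with a rearranged custom format would simply fail to parse.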

Finally, once a format has been decided, it can be associated with a log file using the CustomLog directive.

Figure 2-14. /etc/httpd/conf/httpd.conf

# The location and format of the access logfile (Common Logfile Format).
# If you do not define any access logfiles within a <VirtualHost>
# container, they will be logged here. Contrariwise, if you *do*
# define per-<VirtualHost> access logfiles, transactions will be
# logged therein and *not* in this file.
#
#CustomLog logs/access_log common

#
# If you would like to have separate agent and referer logfiles, uncomment
# the following directives.
#
#CustomLog logs/referer_log referer
#CustomLog logs/agent_log agent

#
# For a single logfile with access, agent, and referer information
# (Combined Logfile Format), use the following directive:
#
CustomLog logs/access_log combined

As the above configuration suggests, multiple log files, each containing different information, could be updated with each hit, though of course performance is a consideration. By default, Red Hat Enterprise Linux only updates the single file /var/log/httpd/access_log, using the combined format.


Remapping the URL Namespace: Alias

Up until now, we have had a very clean concept of the URL namespace: the file portion of a URL maps directly to a file which exists underneath the document root directory. The Alias directive allows administrators to make arbitrary mappings from a portion of the URL namespace to any directory in the filesystem.

Figure 2-15. /etc/httpd/conf/httpd.conf

# Aliases: Add here as many aliases as you need (with no limit). The format is
# Alias fakename realname
#
# Note that if you include a trailing / on fakename then the server will
# require it to be present in the URL. So "/icons" isn’t aliased in this
# example, only "/icons/". If the fakename is slash-terminated, then the
# realname must also be slash terminated, and if the fakename omits the
# trailing slash, the realname must also omit it.
#
# We include the /icons/ alias for FancyIndexed directory listings. If you
# do not use FancyIndexing, you may comment this out.
#
Alias /icons/ "/var/www/icons/"

As an example, the default Red Hat Enterprise Linux configuration aliases http://localhost/icons/ to the directory /var/www/icons/, which is not underneath the document root, but a sibling of it. The remapping should be easy enough to confirm by following the above link, and taking an ls of the icons directory.

For better or for worse, we now have a way to expose portions of our filesystem which are not under the document root. Another option is the use of symbolic links, which will be discussed in more detail shortly.

Also, notice the comments about trailing slashes, which have often been a source of confusion. The Apache webserver automatically redirects clients which refer to directories without the trailing slash to an equivalent URL which includes it (watch closely as you access http://localhost/example, and note that the browser ends up showing the omitted trailing slash). This causes some directory-related configuration which doesn’t specify the omitted slash to be interpreted twice, which can cause confusion.
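
The translation from URL to filesystem path, with aliases taking precedence over the document root, can be sketched in Python as follows. This is a deliberately simplified model: the function name translate is hypothetical, matching is a plain prefix test, and no attempt is made to reject paths containing "..".

```python
import posixpath


def translate(url_path, aliases, document_root="/var/www/html"):
    """Map the path portion of a URL to a filesystem path.

    aliases is a list of (fakename, realname) pairs, checked in order;
    anything that matches no alias falls under the document root.
    """
    for fakename, realname in aliases:
        if url_path.startswith(fakename):
            return realname + url_path[len(fakename):]
    return posixpath.join(document_root, url_path.lstrip("/"))


aliases = [("/icons/", "/var/www/icons/")]
print(translate("/icons/compressed.gif", aliases))  # /var/www/icons/compressed.gif
print(translate("/icons", aliases))                 # /var/www/html/icons
print(translate("/index.html", aliases))            # /var/www/html/index.html
```

The second call shows the trailing slash rule from the configuration comments: because the fakename is "/icons/", a request for "/icons" is not aliased and falls through to the document root.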

The Answer Book: http://localhost/manual

By now you could well be bewildered by the many different configuration directives, and in many ways we’ve just touched the tip of the iceberg. This seems a good time to introduce the manual, which in Red Hat Enterprise Linux ships as the separate httpd-manual package. Once installed, the manual can be accessed at http://localhost/manual.

[root@station ~]# yum install httpd-manual

...
=============================================================================
 Package          Arch     Version        Repository     Size
=============================================================================
Installing:
 httpd-manual     i386     2.2.3-6.el5    rha-rhel       831 k
...
Installed: httpd-manual.i386 0:2.2.3-6.el5
Complete!
[root@station ~]# service httpd restart
Stopping httpd: [ OK ]
Starting httpd: [ OK ]

The manual provides comprehensive documentation, organized by directive name, module name, or by topic (such as "Log Files" or "Virtual Hosts"). Anyone wishing to quickly refresh memories, or learn more about Apache configuration, should definitely load the manual as well.

Exercises

Lab Exercise

Objective: Configure the Apache web server.

Estimated Time: 45 mins.

Specification

You will probably want to make a backup of the main Apache configuration file (/etc/httpd/conf/httpd.conf) before starting this exercise, so that you can later restore the default configuration. If you have not already downloaded http://rha-server/pub/rha/rha230/readings.tgz and extracted its contents into the /var/www/html directory (as specified in the previous exercise), do so now.

Edit your Apache configuration so that the server meets the following specifications. The suggested technique is to duplicate the relevant lines of your configuration file, comment out the original configuration, and edit the new line to make your changes. You will probably want to make incremental changes, checking your configuration as you go.

1. Configure the Apache webserver so that it accepts HTTP/1.1 KeepAlive requests, but will only wait 3 seconds for a followup request before closing the connection.

Hint: you can confirm this configuration by capturing a transaction between the Firefox browser and your webserver with ethereal, and examining the HTTP headers of both the request and response.

2. Manage the bounds of the server pool, such that there are always between 2 and 4 (inclusive) child daemons present.

3. The Apache server should be bound to port 8888 (of at least the loopback address), in addition to port 80 (on all interfaces). (Note: you will need to drop SELinux into permissive mode in order to allow Apache to bind to a port other than 80 and 443.)

4. Configure the web server such that index.htm is recognized as an index file, as well as index.html. Confirm your configuration by removing the file /var/www/html/readings/relat10h/index.html that you created in the previous exercise, if it exists, leaving the original /var/www/html/readings/relat10h/index.htm, and referencing http://localhost/readings/relat10h/.

5. Configure the server so that clients are logged by hostname (when available) as opposed to IP address. (Hint: You are not expected to need to edit any LogFormat directives.)

6. Set the log level for the error log to debug.

7. In addition to the default logging, have every web request logged to the file /var/log/httpd/common_log, using what is commonly referred to as the common format.

8. In the separate configuration file /etc/httpd/conf.d/rha.conf, establish an Alias, so that the URL http://localhost/images/ refers to the directory /var/www/html/readings/relat10h/pics. (If the relevant directory is still named picts, rename it or symlink it to pics.)

Deliverables

1. A running Apache webserver that accepts Keep-Alive requests, but will close connections after 3 seconds of inactivity.

2. The server should maintain a server pool of between 2 and 4 pre-forked child daemons.

3. The server should be bound to the loopback address’s port 8888, in addition to the normal port 80.

4. The server should treat files named index.htm as index files, in addition to the standard index.html.

5. Transaction logging should log clients by hostname, if available.

6. The error log should log all messages with debug and higher priority.

7. In addition to the standard access_log, a transaction log named /var/log/httpd/common_log should be kept, logging in the common format.

8. The URL http://localhost/images/ should resolve to /var/www/html/readings/relat10h/pics, due to an alias established in the /etc/httpd/conf.d/rha.conf configuration file.

Questions

For all of the following questions, assume the default Red Hat Enterprise Linux configuration of the Apache webserver, unless the question states otherwise.

1. Which directory serves as the ServerRoot directory (i.e., the directory used as the base for all relative file references in the configuration file)?

( ) a. /var/www/html


( ) b. /var/log/httpd

( ) c. /etc/httpd

( ) d. /etc

( ) e. None of the above

2. Which file(s) is (are) used to configure the Apache web server upon startup?

( ) a. /etc/httpd/conf/httpd.conf

( ) b. /etc/apache.conf

( ) c. /etc/httpd/conf.d/*.conf

( ) d. /etc/sysconfig/apache

( ) e. Both A and C

3. Which of the following directives could be used to improve the performance of a heavily loaded web server?

( ) a. KeepAlive

( ) b. MaxClients

( ) c. MaxSpareServers

( ) d. Timeout

( ) e. All of the above

4. Which of the following directives can be used to defend against memory leaks and other instabilities in poorly written libraries and CGI scripts?

( ) a. MaxClients

( ) b. MaxRequestsPerChild

( ) c. ServerLimit

( ) d. KeepAlive

( ) e. Listen

5. Which of the following best describes the default Apache server model?

( ) a. The server uses a traditional Unix forking model, where a new daemon is forked to handle connections for a particular client.

( ) b. The server uses a pre-forking model, whereby clients are distributed amongst a dynamic pool of pre-existing daemons.

( ) c. The server uses a multi-threaded model, whereby a single process clones multiple threads, each handling a distinct client.


( ) d. The server uses a single process polling model, whereby the single process polls a collection of active connections for activity.

6. Which of the following lines would cause the web server to bind to port 8080 on the loopback address?

( ) a. Bind 127.0.0.1:8080

( ) b. Bind 127.0.0.1 8080

( ) c. Listen 127.0.0.1:8080

( ) d. Listen 127.0.0.1 8080

( ) e. None of the above

7. The Apache manual states that %h is used to log the remote hostname or IP address. Yet, even using this parameter, an administrator finds that a log file logs IP addresses instead. Which of the following configurations would allow client hostnames to be logged?

( ) a. DNS /etc/resolv.conf

( ) b. HostnameLookups On

( ) c. LogNames On

( ) d. LogLevel info

( ) e. None of the above

8. Which of the following directives would have the same end effect as cd /var/www/html/data; ln -s ../images images?

( ) a. Alias /data/images/ /var/www/html/images/

( ) b. Symlink /images/ /data/images/

( ) c. Alias /images/ /data/images/

( ) d. View /var/www/html/images/ /data/images/

( ) e. None of the above

9. Assuming the httpd-manual package is installed, where can Apache documentation be found?

( ) a. http://localhost/help

( ) b. http://localhost/guide

( ) c. http://localhost/apache

( ) d. http://localhost/man

( ) e. None of the above

rha230-5.0-1-en-2008-01-21T07:12:18-0500Copyright (c) 2003-2007 Red Hat, Inc. All rights reserved. For use only by a student enrolled in a Red Hat Academy course taught at a Red Hat Academy. Any other use is a violationof U.S. and international copyrights. No part of this publication may be photocopied, duplicated, stored in a retrievalsystem, or otherwise duplicated whether in electronic or printformat without prior written consent of Red Hat, Inc. If you b elieve Red Hat course materials are being used, copied, or otherwise improperly distributed please [email protected] or phone toll-free (USA) +1 866 626 2994 or +1 (919) 754 3700.


10. After editing an Apache configuration file, what should be done for changes to take effect?

( ) a. chkconfig httpd on

( ) b. service httpd restart

( ) c. chkconfig httpd reload

( ) d. service httpd status

( ) e. No action is required, because the Apache daemon actively monitors its configuration file.

Notes

1. The observant might notice the omission of the second field, inevitably a hyphen ("-"). This field used to refer to the username as returned by the legacy identd service, which is seldom implemented today.


Chapter 3. Apache Configuration: Containers

Key Concepts

• The Apache web server allows context dependent configuration through the use of Directory, Location, Files, and VirtualHost containers.

• Often, the Options directive is used within containers to allow or disallow symbolic link resolution (with FollowSymLinks) and dynamic directory generation (with Indexes), among other parameters.

• Often, the Order, Allow from, and Deny from directives are used within containers to implement access control based on the client’s IP address or hostname.

• The default Red Hat Enterprise Linux configuration allows the resolution of symbolic links almost everywhere, but limits the generation of dynamic indexes to the intended document root directory.

• Dynamic information about the Apache web server can be obtained using custom handlers which are conventionally associated with the /server-status and /server-info locations.

Discussion

Tailoring Customization to Particular Content: Containers

The Apache web server allows configuration to be customized to particular files or directories using containers. Containers start with an XMLish opening tag, such as <Directory ...>, and end with an XMLish closing tag, such as </Directory>. Directives found within the container only affect files which fall under the container’s scope.

There are essentially four types of scoping containers, which are exemplified below and itemized in the following table.

Figure 3-1. Sample Apache Containers

<Directory "/var/www/icons">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from .example.com
</Location>

<Files "*.hide">
    Order allow,deny
    Deny from all
</Files>

<VirtualHost *:80>
    ServerAdmin [email protected]
    DocumentRoot /www/docs/dummy-host.example.com
    ServerName dummy-host.example.com
    ErrorLog logs/dummy-host.example.com-error_log
    CustomLog logs/dummy-host.example.com-access_log common
</VirtualHost>

Table 3-1. Apache Scoping Containers

Directive       Scope

Directory       All files which exist in or underneath the specified directory in the filesystem, after URL to filename translation occurs.

Location        All files which exist in or underneath the specified location in the URL namespace, before URL to filename translation occurs.

Files           All files which match the specified pattern, no matter where they exist in the filesystem or URL namespace.

VirtualHost     All files served by a particular virtual server. Virtual hosts will be covered in detail in a later lesson.

The argument to the opening tag specifies the relevant file or directory (or, in the case of VirtualHost, IP address). The filename may either be explicit, or shell-like pathname expansion (file globbing) can be used.
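
As a hedged illustration of both argument styles, the sketch below pairs one explicit path with two wildcard patterns. The paths and the .bak suffix are hypothetical, chosen only for illustration, and the access control lines use the same syntax as the sample containers above.

```apache
# Explicit argument: applies to exactly this directory tree.
# (Hypothetical path.)
<Directory "/var/www/html/private">
    Options None
</Directory>

# Wildcard argument: "*" stands in for one path component (it does
# not match "/"), so this covers /home/<anyuser>/public_html.
<Directory "/home/*/public_html">
    Options Indexes
</Directory>

# Wildcard in a Files container: any file whose name ends in .bak,
# wherever it is found. (Hypothetical suffix.)
<Files "*.bak">
    Order allow,deny
    Deny from all
</Files>
```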

Common Container Configuration

Skimming the containers exemplified above, one finds that container configuration often involves the following three concepts.

1. Options: Various capabilities of the web server are grouped under a general Options directive.

2. ACLs: The web server allows access control lists (or ACLs, informally pronounced "Ack-uls") to specify which clients are allowed to access information, using the Order, Allow, and Deny directives. (Access control can also be based on authenticated users, unfortunately a topic beyond the scope of the current course.)

3. Overrides: If allowed with the AllowOverride directive, local configuration files intermixed with web server content can dynamically override the startup configuration.

We look at each of these syntaxes in turn.


General Options: Options

The Apache server supports the following options, which are specified as arguments to the Options directive, usually within a scoping container. Of these, the first two are most commonly used.

Table 3-2. Apache Options

Option                      Effect

Indexes                     When a URL references a directory (as opposed to a regular file), and no index.html file is present (more on this in a bit), and this option is enabled, the web server will return an automatically generated directory listing. If Indexes is disabled, a 403 error page will be returned to the client (Access Forbidden).

FollowSymLinks              This option must be enabled in order for the web server to resolve (follow) a symbolic link.

SymLinksIfOwnerMatch        A qualification of the FollowSymLinks option, where the symlink will only be followed if the file owner of the resulting file is the same as the file owner of the link itself.

ExecCGI                     Allow CGI executables to be executed from within this scope. (More on these later.)

Includes, IncludesNOEXEC    Server side includes are allowed (or, in the latter case, mostly allowed) from within this scope. Server side includes are beyond the scope of this course.

MultiViews                  If enabled, content negotiation between the client and the server is supported. This allows a server to serve a document in the most appropriate of multiple languages, for example. Further discussion of MultiViews is beyond the scope of this course.

All                         This option refers to all of the previous options collectively, with the exception of MultiViews. Unless otherwise specified, this is the default configuration. (Recall that in Red Hat Enterprise Linux, however, a different policy applies to the root directory, effectively establishing a different default.)

Why not Indexes?

The decision to allow the web server to automatically generate indexes or not is really a matter of control. If indexes are automatically generated, then merely locating a file underneath the document root allows anyone to view it or copy it (often with automated command line clients such as wget), unless an index.html file is created to hide files within a particular directory. In contrast, if indexes are not allowed, files must be explicitly linked from other files (index.html or otherwise) to be easily discovered.

Many low maintenance, public web sites leave indexes on (such as the official Linux kernel repository (http://www.kernel.org/pub/linux)). Other web sites, hoping for a more professional look or more refined control of information, do not.

Why not resolve Symbolic Links?

Again, the decision to allow symlink resolution is basically one of control. If symlinks are not allowed, an administrator has a clear concept of what portions of the file system are exposed through the web server (only files underneath the document root). If symlinks are resolved, however, a symlink underneath the document root could expose any other part of the filesystem.

More subtly, the decision to not resolve symlinks can degrade performance. When resolving a path to reference a file, the kernel automatically resolves symlinks. (If you were to cat the file /foo/biz/baz/buzz, you do not need to worry if the directory biz or baz is actually a symlink.) If symlinks are disabled, however, the web server must make a system call on each of the nodes within a file path, asking "is it a symlink? is it a symlink? is it a symlink?" This degradation is one of the reasons why the default Red Hat Enterprise Linux configuration leaves FollowSymLinks enabled.
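
One way to balance control against performance, sketched here under assumed paths, is to leave FollowSymLinks enabled in general and tighten symlink handling only where less trusted users can create files. The directory /var/www/html/uploads is hypothetical and is not part of the default configuration.

```apache
# Everywhere: follow symlinks, so no per-component checks are needed.
<Directory />
    Options FollowSymLinks
</Directory>

# Hypothetical upload area: follow a symlink only when the link and
# its target have the same owner. The extra symlink checks are then
# paid only for requests that resolve beneath this directory.
<Directory "/var/www/html/uploads">
    Options SymLinksIfOwnerMatch
</Directory>
```

Because the second Options line carries no "+" or "-" prefixes, it replaces the option set for that directory outright, so SymLinksIfOwnerMatch is the only option in effect there.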

Options Syntax

The Options directive takes effect for the scope specified by its enclosing container. For example, the following container would enable indexes and symlink resolution for all files underneath the directory /var/www/html.

<Directory /var/www/html>
    Options FollowSymLinks Indexes
</Directory>

The following container, however, would enable indexes and server side includes underneath /var/www/html/widgets.

<Directory /var/www/html/widgets>
    Options Indexes Includes
</Directory>

The directory /var/www/html/widgets does not inherit its options from /var/www/html, but instead gets its configuration entirely from the new Options line. Because FollowSymLinks is not mentioned, symlinks underneath /var/www/html/widgets will not be resolved.

In contrast, options can be preceded by a "+" or "-", implying that options should be inherited from the enclosing scope, with the simple addition or stripping of a particular option. Consider rewriting the above container as follows.

<Directory /var/www/html/widgets>
    Options +Includes
</Directory>

In this case, the /var/www/html/widgets directory would have Includes, Indexes, and FollowSymLinks enabled (the latter two inherited from /var/www/html).

Similarly, the following container would leave /var/www/html/widgets with only the FollowSymLinks option enabled.

<Directory /var/www/html/widgets>
    Options -Indexes
</Directory>
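
The two prefixes can also be combined on one line. As a minimal sketch, assuming the /var/www/html container from the start of this section (Options FollowSymLinks Indexes) is in effect, the following would leave /var/www/html/widgets with FollowSymLinks (inherited) and Includes (added), but without Indexes (stripped).

```apache
<Directory /var/www/html/widgets>
    # Relative form: start from the inherited options,
    # then add Includes and strip Indexes.
    Options +Includes -Indexes
</Directory>
```

Mixing prefixed and unprefixed options on a single Options line (for example, Options Indexes +Includes) is best avoided: the Apache documentation describes it as invalid syntax, and newer releases reject it at startup.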


Client Access Control: Order, Allow, Deny

The Apache web server allows an administrator to impose access control restrictions on a directory by directory (or even file by file) basis using access control lists. These ACLs are composed of the following directives.

The Allow Directive

The Allow directive uses the following syntax to specify which clients are allowed to connect to a given resource.

Allow from client_specification

The client_specification is composed of a whitespace separated list of any of the following elements.

Table 3-3. Apache ACL client specification

Syntax                          Example                         Meaning

ALL                             ALL                             All clients

Full IP addresses               192.168.0.3                     The specified client

Partial IP addresses            172.63.                         All clients whose IP address begins as specified

Network/Netmask notation        192.168.1.64/255.255.255.192    All clients who belong to the specified subnet

CIDR notation                   192.168.1.64/26                 All clients who belong to the specified subnet (this example is completely equivalent to the preceding example)

A full or partial domain name   .example.com                    All clients whose reverse lookup domain name ends as specified (for this comparison the server performs a double reverse DNS lookup on the client address, regardless of the HostnameLookups setting)
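
Because the client_specification is a whitespace separated list, several of the forms in Table 3-3 can share one line. A minimal sketch, reusing the example addresses from the table (these directives belong inside a container such as Directory):

```apache
# One Allow line may list several elements, separated by whitespace:
# here a full address, a partial address, and a CIDR subnet.
Allow from 192.168.0.3 172.63. 192.168.1.64/26

# The same clients, spread over several Allow lines instead.
# Multiple Allow directives in one scope accumulate.
Allow from 192.168.0.3
Allow from 172.63.
Allow from 192.168.1.64/26
```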

The Deny Directive

The Deny directive uses an identical syntax to specify which clients are not allowed to connect to a given resource.

Deny from client_specification

The client_specification is composed of the same elements as for the Allow directive.

The Order directive

Here’s where things get interesting. Whenever client ACLs are specified with the Allow and Deny directives, the order of precedence must be specified with the Order directive.

The Order directive usually comes in one of two forms.

Order Allow,Deny


In this case, any clients which are unspecified (not matching any rule) or over specified (they match both an allow and deny rule) are denied.

Order Deny,Allow

In this case, any clients which are unspecified or over specified are allowed. Surprisingly, no spaces are allowed around the comma in either case.

Some examples are in order.

Example 1

<Directory /some/sensitive/content>
    Order Deny,Allow
    Deny from All
    Allow from 192.168.0.
</Directory>

In this case, only clients from within the 192.168.0.0/255.255.255.0 subnet are allowed to access files underneath /some/sensitive/content.

Example 2

<Directory /keep/them/out>
    Order Allow,Deny
    Allow from 192.168.0.
    Deny from 192.168.0.4
</Directory>

In this case, clients from within the 192.168.0.0/255.255.255.0 subnet are allowed to access files underneath /keep/them/out, except for client 192.168.0.4. All clients outside of the subnet are not allowed access.

Example 3

<Directory /only/for/example>
    HostNameLookups on
    Order Allow,Deny
    Allow from .example.com
</Directory>

In this case, clients from within the example.com domain are allowed to access files underneath /only/for/example.

If you are having trouble figuring out how the term "order" applies to the effect of the Order directive, your author sympathizes. However, with a little experience, a certain sense of the syntax can be made. Until then, make sure that you confirm any ACLs by actually trying to access the material from the appropriate clients.
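
The difference between the two forms is easiest to see when the same Allow and Deny lines are evaluated under each. The sketch below uses two hypothetical directories (policy-a and policy-b are illustration names only). The outcomes in the comments follow from the rules stated above for unspecified and over specified clients.

```apache
# Policy A: Allow,Deny. Unspecified clients are denied, and a client
# matching both an Allow and a Deny line is denied.
<Directory "/var/www/html/policy-a">
    Order Allow,Deny
    Allow from 192.168.0.
    Deny from 192.168.0.4
</Directory>
# 192.168.0.7   matches Allow only        -> allowed
# 192.168.0.4   matches Allow and Deny    -> denied
# 10.1.1.1      matches neither           -> denied

# Policy B: Deny,Allow. Unspecified clients are allowed, and a client
# matching both an Allow and a Deny line is allowed.
<Directory "/var/www/html/policy-b">
    Order Deny,Allow
    Allow from 192.168.0.
    Deny from 192.168.0.4
</Directory>
# 192.168.0.7   matches Allow only        -> allowed
# 192.168.0.4   matches Allow and Deny    -> allowed
# 10.1.1.1      matches neither           -> allowed
```

Policy B admits every client, so it restricts nothing. With Order Deny,Allow, a broad Deny (such as Deny from All) is what gives the Allow lines their meaning, as in Example 1.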


Red Hat Enterprise Linux Default Configuration

Now that we know a little about containers, we’re ready to examine some of the containers that come in the default Red Hat Enterprise Linux Apache configuration. The first container encountered establishes a fairly paranoid default policy.

Figure 3-2. /etc/httpd/conf/httpd.conf

#
# Each directory to which Apache has access can be configured with respect
# to which services and features are allowed and/or disabled in that
# directory (and its subdirectories).
#
# First, we configure the "default" to be a very restrictive set of
# features.
#
<Directory />
    Options FollowSymLinks
    AllowOverride None
</Directory>

In this case, the "/" in the opening tag is not syntax, but a reference to the root directory. So from the root directory on down (i.e., everywhere), the specified policies apply. Specifically, the only allowed option is FollowSymLinks, and no overrides are allowed.

The next container loosens things up a bit for the directory /var/www/html. (Why was this directory picked for special attention?)

Figure 3-3. /etc/httpd/conf/httpd.conf

<Directory "/var/www/html">

#
# Possible values for the Options directive are "None", "All",
# or any combination of:
#   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
#
# Note that "MultiViews" must be named *explicitly* --- "Options All"
# doesn’t give it to you.
#
# The Options directive is both complicated and important. Please see
# http://httpd.apache.org/docs-2.0/mod/core.html#options
# for more information.
#
    Options Indexes FollowSymLinks

#
# AllowOverride controls what directives may be placed in .htaccess files.
# It can be "All", "None", or any combination of the keywords:
#   Options FileInfo AuthConfig Limit
#
    AllowOverride None

#
# Controls who can get stuff from this server.
#
    Order allow,deny
    Allow from all

</Directory>

In answer to the above question, access to content beneath /var/www/html is loosened a bit because that directory contains the expected content to be served from the web server. The container also contains some client access control configuration, but only as an example, as the effect of the configuration is to allow everyone.

Location Containers: server-status and server-info

We find the following two examples of Location containers within the default configuration file, both commented out.

Figure 3-4. /etc/httpd/conf/httpd.conf

#
# Allow server status reports generated by mod_status,
# with the URL of http://servername/server-status
# Change the ".example.com" to match your domain to enable.
#
#<Location /server-status>
#    SetHandler server-status
#    Order deny,allow
#    Deny from all
#    Allow from .example.com
#</Location>

#
# Allow remote server configuration reports, with the URL of
#  http://servername/server-info (requires that mod_info.c be loaded).
# Change the ".example.com" to match your domain to enable.
#
#<Location /server-info>
#    SetHandler server-info
#    Order deny,allow
#    Deny from all
#    Allow from .example.com
#</Location>

Both of these provide examples of virtual locations, in that, if enabled (and customized a bit), the server would respond to requests for http://localhost/server-info and http://localhost/server-status. The URLs do not map to any particular directory on the filesystem, however, so a Directory container would have been inappropriate.

Each of these containers implements a custom handler using the SetHandler directive. A thorough discussion of the concept of a handler is beyond the scope of the current class, but essentially a handler determines how the server responds to a request. The default handler, which returns the contents of the referenced file to the client, is the only handler we’ve encountered so far. Other handlers allow the web server to respond differently to requests.

The server-status Handler

The server-status handler, when invoked, returns a page of status information (formatted as HTML) back to the client. The following configuration would attach this handler to the http://localhost/server-status URL, but restrict access to 127.0.0.1.

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

The Apache web server responds to http://localhost/server-status with a page of status information similar to the following.

Figure 3-5. Apache Web Server Status Page
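
How much detail the status page reports depends on further mod_status settings. As a hedged sketch that goes slightly beyond the configuration discussed above, the ExtendedStatus directive asks the server to keep extended, per-request statistics. It is a server-level directive, so it sits outside the Location container, and it assumes mod_status is loaded.

```apache
# Server-level setting (outside any container): collect extended,
# per-request statistics for the status page.
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

With the handler enabled, requesting http://localhost/server-status?auto returns the same information in a plain, machine-readable form, which can be convenient for scripts.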


The server-info Handler

Similarly, the server-info handler returns a dynamically generated page which reports the web server’s current configuration.

<Location /server-info>
    SetHandler server-info
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

With this configuration active, the Apache web server responds to http://localhost/server-info with a page of configuration information similar to the following.

Figure 3-6. Apache Web Server Status Page

Exercises

Lab Exercise

Objective: Configure the Apache web server using containers.

Estimated Time: 45 mins.


Specification

If you have not already downloaded http://rha-server/pub/rha/rha230/readings.tgz and extracted its contents into the /var/www/html directory (as specified in the previous exercise), do so now. Also, apply the image directory fix by renaming /var/www/html/readings/relat10h/picts to /var/www/html/readings/relat10h/pics if you have not already somehow resolved the problem.

Edit your Apache configuration so that the server meets the following specifications. Place all of your configuration in the file /etc/httpd/conf.d/rha.conf.

You should be starting with a directory structure similar to the following.

[root@station ~]# tree /var/www/html/readings/
/var/www/html/readings/
|-- relat10h
|   |-- ap01.htm
|   |-- ap02.htm
|   |-- ...
|   |-- index.htm
|   |-- index.html -> index.htm
|   |-- pics -> picts/
|   |-- picts
|   |   |-- arrow.gif
|   |   |-- eq01.gif
|   |   |-- ...
|   |-- preface.htm
|   `-- works-blue.css
|-- relativity -> relat10h/
|-- the_god_of_mars.html
`-- war_of_the_worlds.html

1. You decide that the use of symbolic links makes it too difficult to maintain control over a web site. Set options such that symbolic links are disabled everywhere underneath the /var/www/html/readings directory. (Notice that if you solved the image directory name problem with a symbolic link, you will now need to rename the directory instead.)

2. You are willing to allow people to read Einstein’s relativity starting from the table of contents, but do not want people browsing the directory structure directly. Disable directory indexes underneath the /var/www/html/readings/relat10h directory.

3. You decide that you would like to restrict access to all of the readings to local clients only. Implement a policy whereby the contents underneath the /var/www/html/readings directory are only available to clients whose IP address starts 127.0.

4. However, you would like your graphics consultant to be able to review your images. For the directory /var/www/html/readings/relat10h/pics, allow access to all clients whose IP address starts 127.0., and to the special IP address 127.1.1.1. Also, enable directory indexes for this directory.

5. Because symbolic links are now disabled, you will no longer be able to make use of the relativity symbolic link to access the relat10h directory. Instead, establish an alias such that http://localhost/readings/relativity references the /var/www/html/readings/relat10h directory.


6. Again, because symbolic links are now disabled, you will no longer be able to solve the index.htm problem with a symbolic link. Make sure that index.htm is considered a directory index as well. (Note, you might just need to make sure that your implementation from the previous lesson’s exercise is still in place.)

7. You would like to monitor the performance of your web server. In the main /etc/httpd/conf/httpd.conf configuration file, enable the http://localhost/server-status and http://localhost/server-info location containers, so that you may view dynamically generated performance and configuration information.

8. You would like your graphics consultant to be able to monitor the performance as well, so allow both 127.0.0.1 and 127.1.1.1 to access these locations, but only these IP addresses.

Deliverables

1. The web server will not resolve symbolic links underneath the /var/www/html/readings directory.

2. The web server will not generate directory indexes underneath the /var/www/html/readings/relat10h directory.

3. Only clients whose IP address begins 127.0 may access content under the /var/www/html/readings directory.

4. However, the /var/www/html/readings/relat10h/pics directory allows access to 127.1.1.1 in addition to the 127.0 clients. Dynamically generated indexes are also allowed for this directory.

5. The URL http://localhost/readings/relativity resolves to /var/www/html/readings/relat10h.

6. The URL http://localhost/server-status presents dynamically generated status information, but is only available to 127.0.0.1 and 127.1.1.1.

7. The URL http://localhost/server-info presents dynamically generated configuration information, but is only available to 127.0.0.1 and 127.1.1.1.

Questions

1. Which of the following is not a legitimate keyword for opening an Apache scoping container?

( ) a. Files

( ) b. Directory

( ) c. Location

( ) d. Virtual Host

( ) e. All of these keywords are legitimate.


Use the following excerpt from an Apache configuration file, and the following directory structure, to answer the next 7 questions. You may assume there are no relevant URL aliases, and that all ownerships, permissions, and SELinux contexts are correct.

<Directory /var/www/html/pics>
    Options -Indexes -FollowSymLinks
    Order deny,allow
    deny from 192.168.1.
</Directory>

<Location /ogg>
    Options +Indexes
    Order allow,deny
    allow from 192.168.0.
</Location>

[root@station ~]# tree /var/www/html
/var/www/html/
|-- ogg/
|   |-- 01_track_1.ogg
|   |-- 02_track_2.ogg
|   |-- 03_track_3.ogg
|   `-- _hidden/
|       |-- 04_track_4.ogg
|       `-- 05_track_5.ogg
`-- pics/
    |-- demo/
    |   |-- 00001.jpg
    |   |-- 00004.jpg
    |   |-- 00010.vga.jpg
    |   `-- index.html
    |-- feb/
    |   |-- 15479.vga.jpg
    |   `-- 15491.jpg
    |-- index.html
    |-- mar/
    |   |-- 15651.jpg
    |   `-- 15659.vga.jpg
    `-- spring -> mar

(Note that /var/www/html/pics/spring is a symbolic link to mar.)

2. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/pics/mar/?

( ) a. A dynamically generated index.

( ) b. A 403 "Forbidden" error.

( ) c. A 404 "File Not Found" error.

( ) d. The contents of the file /var/www/html/pics/demo/index.html

( ) e. None of the above


3. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/pics/demo/?

( ) a. A 404 "File Not Found" error.

( ) b. The contents of the file /var/www/html/pics/demo/index.html

( ) c. A 403 "Forbidden" error.

( ) d. A dynamically generated index.

( ) e. None of the above

4. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/pics/spring/?

( ) a. The contents of the file /var/www/html/pics/index.html

( ) b. A dynamically generated index.

( ) c. A 404 "File Not Found" error.

( ) d. A 403 "Forbidden" error.

5. What would be the result of the client 192.168.1.1 trying to access the URL http://server.example.com/ogg/02_track_2.ogg?

( ) a. A 403 "Forbidden" error.

( ) b. The contents of the file /var/www/html/ogg/02_track_2.ogg

( ) c. A dynamically generated index.

( ) d. A 404 "File Not Found" error.

( ) e. None of the above

6. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/ogg/_hidden/05_track_5.ogg?

( ) a. A 404 "File Not Found" error.

( ) b. A 403 "Forbidden" error.

( ) c. The contents of the file /var/www/html/ogg/_hidden/05_track_5.ogg

( ) d. A dynamically generated index.

( ) e. None of the above

7. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/ogg/*.ogg?

( ) a. The contents of all files matched by the glob.

( ) b. A 403 "Forbidden" error.

( ) c. A 404 "File Not Found" error.

( ) d. A dynamically generated index including all files matched by the glob.

( ) e. None of the above

8. What would be the result of the client 192.168.1.1 trying to access the URL http://server.example.com/ogg/i_dont_exist.ogg?

( ) a. A 403 "Forbidden" error.

( ) b. A 404 "File Not Found" error.

( ) c. The contents of the file /var/www/html/ogg/i_dont_exist.ogg.

( ) d. A dynamically generated index.

( ) e. None of the above

Use the following excerpt from an Apache configuration file to answer the next 2 questions. You may assume there are no other relevant URL aliases, and that all ownerships, permissions, and SELinux contexts are correct.

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

9. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/server-status?

( ) a. A dynamically generated summary of the state of each process in the Web Server Pool.

( ) b. A 404 "File Not Found" error.

( ) c. A 403 "Forbidden" error.

( ) d. The contents of the file /var/www/html/server-status.

( ) e. None of the above

10. What would be the result of the client 127.0.0.1 trying to access the URL http://localhost/server-status?

( ) a. A dynamically generated summary of the contents of the /var/www/html/server-status/ directory.

( ) b. A 404 "File Not Found" error.

( ) c. A 403 "Forbidden" error.

( ) d. The contents of the file /var/www/html/server-status.

( ) e. None of the above

Chapter 4. Virtual Hosts

Discussion

Virtual Hosts

One of the reasons for the popularity of the Apache web server is that it can easily take on the personality of any of multiple web servers, each of which is referred to as a virtual host.

As a prerequisite to virtual hosting, DNS (domain name service) must resolve multiple hostnames to the single machine which is running the Apache web server. As you will discover in the workbook on DNS, this is not difficult to arrange. In our current discussion, however, we will assume that DNS is appropriately configured.

There are two approaches to virtual hosting supported by the Apache web server: IP based virtual hosting and name based virtual hosting. We look at each of these in turn.

IP Based Virtual Hosting

For IP based virtual hosting, the machine running the Apache server must be assigned multiple IP addresses. These addresses could be the result of multiple Ethernet cards (and thus multiple distinct network interfaces), or the result of a Linux trick called IP aliasing, which assigns multiple IP addresses to a single Ethernet card.

For IP based virtual hosting, distinguishing the virtual hosts of the web server is trivial. The web server merely needs to examine the server (destination) IP address which is part of the incoming client request's TCP/IP packet. Consider a machine which answers to the hostname www.republican.pol, with an IP address of 192.168.0.1, and to the hostname www.democrat.pol, with an IP address of 192.168.0.2. (No, there is no top level domain .pol; this is just an example.)

<VirtualHost 192.168.0.1>
    ServerAdmin [email protected]
    ServerName www.republican.pol
    DocumentRoot /var/www/republican.pol
    ErrorLog logs/republican.pol-error_log
    CustomLog logs/republican.pol-access_log common
</VirtualHost>

<VirtualHost 192.168.0.2>
    ServerAdmin [email protected]
    ServerName www.democrat.pol
    DocumentRoot /var/www/democrat.pol
    ErrorLog logs/democrat.pol-error_log
    CustomLog logs/democrat.pol-access_log common
</VirtualHost>

Now, requests for http://www.republican.pol/propaganda.html would be mapped to the file /var/www/republican.pol/propaganda.html, and similarly, requests for http://www.democrat.pol/propaganda.html would be mapped to the file /var/www/democrat.pol/propaganda.html. The same web server would be serving both websites, but the client has no way of knowing. To the client, they seem to be completely independent sites.

What configuration can be found within a VirtualHost container? Anything found within the Main section of the configuration file. The example above has the two hosts using distinct document roots and logs. Just as easily, they could add distinct Aliases, Options, ACLs, and a host of other configuration.

Name Based Virtual Hosts

While IP based virtual hosting is simple, it suffers from the fact that each distinct virtual host must be assigned a distinct IP address, and publicly routable IP addresses are often a precious resource. For this reason, name based virtual hosting was developed.

With name based virtual hosting, multiple hostnames resolve to the same IP address. For example, the hostnames www.democrat.pol, www.libertarian.pol, and www.green.pol could all resolve to the IP address 192.168.0.2. In this case, however, the web server has a harder time distinguishing the various hosts, because the IP address of the server in the TCP/IP request packet for each is the same.

The solution is that the web server needs to "dig deeper" into the request, down to the HTTP protocol. Starting with HTTP/1.1, clients are required to supply a Host HTTP header with every web request, which identifies the hostname of the requested site. The server can then attempt to match the supplied hostname against the ServerName of each virtual host that shares the address.
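As a rough illustration of where that hostname travels, the following Python sketch builds a raw HTTP/1.1 request by hand and extracts its Host header. The function name host_header and the User-Agent value are invented for the example, and the parsing is deliberately simplified (it does not handle IPv6 literals); it is not a description of how Apache itself parses requests.

```python
from typing import Optional


def host_header(raw_request: str) -> Optional[str]:
    """Return the hostname from the Host header of a raw HTTP request, or None."""
    # Keep only the request head, then skip the request line itself.
    head = raw_request.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        # Header names are case-insensitive; drop an optional ":port" suffix.
        if sep and name.strip().lower() == "host":
            return value.strip().split(":")[0].lower()
    return None


# A hand-written HTTP/1.1 request for one of the hostnames used in the text.
request = (
    "GET /propaganda.html HTTP/1.1\r\n"
    "Host: www.green.pol\r\n"
    "User-Agent: example-client\r\n"
    "\r\n"
)
print(host_header(request))
```

A request without a Host header (as some HTTP/1.0 clients send) yields None here, which is the case a name based server has to handle with a default.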

In order to configure the Apache web server to "dig deeper" into the HTTP protocol in this manner, the NameVirtualHost directive must be used to identify a server IP address as one which is being used for name based virtual hosting. Consider the following extension to the example above.

<VirtualHost 192.168.0.1>
    ServerAdmin [email protected]
    ServerName www.republican.pol
    DocumentRoot /var/www/republican.pol
    ErrorLog logs/republican.pol-error_log
    CustomLog logs/republican.pol-access_log common
</VirtualHost>

NameVirtualHost 192.168.0.2 ➊

<VirtualHost 192.168.0.2>
    ServerAdmin [email protected]
    ServerName www.democrat.pol ➋
    DocumentRoot /var/www/democrat.pol
    ErrorLog logs/democrat.pol-error_log
    CustomLog logs/democrat.pol-access_log common
</VirtualHost>

<VirtualHost 192.168.0.2>
    ServerAdmin [email protected]
    ServerName www.libertarian.pol ➌
    DocumentRoot /var/www/libertarian.pol
    ErrorLog logs/libertarian.pol-error_log
    CustomLog logs/libertarian.pol-access_log common
</VirtualHost>

<VirtualHost 192.168.0.2>
    ServerAdmin [email protected]
    ServerName www.green.pol ➍
    DocumentRoot /var/www/green.pol
    ErrorLog logs/green.pol-error_log
    CustomLog logs/green.pol-access_log common
</VirtualHost>

➊ NameVirtualHost: The IP address 192.168.0.2 has now been identified as an address for which the server is implementing name based virtual hosting. Any request received over this IP address will now have its HTTP headers examined for the name of the server.

➋➌➍ ServerName: The hostname supplied by the HTTP headers will be matched against the ServerName directive of all virtual hosts which share the relevant IP address. The ServerName directive now takes on new importance.

What if the same virtual host should answer to more than one hostname (such as www.democrat.pol and just democrat.pol)? The ServerAlias directive can be used to add multiple names to consider when attempting to find a matching virtual host, as in the ServerAlias line of the following example.

<VirtualHost 192.168.0.2>
    ServerAdmin [email protected]
    ServerName www.democrat.pol
    ServerAlias democrat.pol democrat www.donkey.pol donkey.pol donkey
    DocumentRoot /var/www/democrat.pol
    ErrorLog logs/democrat.pol-error_log
    CustomLog logs/democrat.pol-access_log common
</VirtualHost>

What if, probably due to a misconfiguration, a match is not found amongst the various 192.168.0.2 virtual hosts? The answer is that Apache defaults to the first defined virtual host on that IP address, in this case www.democrat.pol. Once a virtual host has been defined for a NameVirtualHost IP address, requests over that IP address will never fall through to the main server.

Notice that, in the example above, the server is really simultaneously implementing IP based virtual hosting (over IP address 192.168.0.1) and name based virtual hosting (over IP address 192.168.0.2).
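The selection rules described in this discussion can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not Apache's actual implementation: the VirtualHost class and the select_vhost function are hypothetical names, matching is an exact case-insensitive comparison, and only two of the ServerAlias names from the text are listed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class VirtualHost:
    ip: str
    server_name: str
    document_root: str
    aliases: List[str] = field(default_factory=list)


def select_vhost(vhosts: List[VirtualHost], name_based_ips: Set[str],
                 server_ip: str, host: Optional[str]) -> Optional[VirtualHost]:
    """Pick the virtual host for a request; None means the main server."""
    # Candidates are the hosts bound to the address the request arrived on,
    # kept in the order in which they were defined.
    candidates = [v for v in vhosts if v.ip == server_ip]
    if not candidates:
        return None
    if server_ip in name_based_ips and host:
        wanted = host.lower()
        for v in candidates:
            if wanted in (n.lower() for n in [v.server_name] + v.aliases):
                return v
    # IP based hosting, a missing Host header, or no matching name:
    # the first virtual host defined for the address is used.
    return candidates[0]


VHOSTS = [
    VirtualHost("192.168.0.1", "www.republican.pol", "/var/www/republican.pol"),
    VirtualHost("192.168.0.2", "www.democrat.pol", "/var/www/democrat.pol",
                ["democrat.pol", "donkey.pol"]),
    VirtualHost("192.168.0.2", "www.libertarian.pol", "/var/www/libertarian.pol"),
    VirtualHost("192.168.0.2", "www.green.pol", "/var/www/green.pol"),
]
NAME_BASED = {"192.168.0.2"}  # addresses named by NameVirtualHost

print(select_vhost(VHOSTS, NAME_BASED, "192.168.0.2", "www.green.pol").document_root)
print(select_vhost(VHOSTS, NAME_BASED, "192.168.0.2", "www.unknown.pol").document_root)
```

In this model an unmatched name on 192.168.0.2 lands on the www.democrat.pol entry simply because it is listed first, which is why the ordering of definitions matters.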

Exercises

Lab Exercise

Objective: Configure Apache virtual hosts

Estimated Time: 45 mins.

Specification

This lab will consist of setting up virtual hosts for four distinct trade organizations which all share a common web server. The various virtual hosts will be bound to variants of the loopback address, so all configuration will be local to your machine. The skills required to configure a "real world" external web server would be nearly identical, however; only the IP addresses would need to change.

1. Create appropriate DNS entries.

As a prerequisite, DNS should be configured to resolve all relevant hostnames appropriately. For our purposes, simply adding the following entries to your local /etc/hosts file will suffice.

127.1.1.1 www.peanutbutterisgood.rha
127.1.1.2 www.jellyisgood.rha
127.1.1.2 www.jamisgood.rha
127.1.1.2 www.marmaladeisgood.rha

If you have configured the file correctly, you should be able to individually ping each of the hostnames, and confirm that they resolve correctly. (Don’t be concerned that there’s not really a top level domain called rha. We’ll fix that in an upcoming workbook.)

2. Four advocacy organizations, one each promoting peanut butter, jelly, jam, and marmalade, want to use common infrastructure to support what looks like four independent sites. You are to configure your web server so that it serves four virtual hosts, with the following parameters. In the following table, all document roots are relative to the directory /var/www/vhostlab, represented by ’...’. You will probably have to create this directory.

Hostname                      IP Address   Type         Document Root
www.peanutbutterisgood.rha    127.1.1.1    IP based     .../pb_root
www.jellyisgood.rha           127.1.1.2    Name based   .../namevhost/jelly_root
www.jamisgood.rha             127.1.1.2    Name based   .../namevhost/jam_root
www.marmaladeisgood.rha       127.1.1.2    Name based   .../namevhost/marmalade_root

The content for the various websites can be found at http://rha-server/pub/rha/rha230/pbandj_website.tgz. Each site consists of a single index.html file found in the relevantly named directory. Each index.html file also references a background image as /images/some_name.jpg.

a. Extract the tar archive, and position the index.html files so that they are located within the appropriate document roots.

b. Within the tar archive, all four images are found in a single images directory. Install this directory on your web server as the directory /var/www/vhostlab/images. Configure your web server so each virtual host can reference images in this directory using a URL of the form http://vhostname/images/some_name.jpg. You may use whatever method you like, as long as the images are not moved (or copied) from the images directory, and you do not modify the index.html files.

If installed correctly, your site should have the following minimum structure. (You may have added some additional links or whatnot to solve the image directory problem.)

/var/www/vhostlab/
|-- images
|   |-- jam.jpg
|   |-- jelly.jpg
|   |-- marmalade.jpg
|   `-- peanutbutter.jpg
|-- namevhost
|   |-- jam_root
|   |   `-- index.html
|   |-- jelly_root
|   |   `-- index.html
|   `-- marmalade_root
|       `-- index.html
`-- pb_root
    `-- index.html

3. Set options such that clients accessing http://www.peanutbutterisgood.rha/images receive a dynamically generated index, but dynamically generated indexes for http://www.jamisgood.rha/images, http://www.jellyisgood.rha/images, and http://www.marmaladeisgood.rha/images are prohibited.

4. The site http://www.peanutbutterisgood.rha should log hits (client accesses) to the file /var/log/httpd/pb_access_log, using the common format. The three name based virtual hosts should all log hits to the file /var/log/httpd/fruity_access_log, again using the common format.

5. Older web clients use the HTTP/1.0 protocol, instead of the HTTP/1.1 protocol, and do not always provide the HTTP Host: header required to resolve name based virtual hosts. As a result, when accessing a site which uses name based virtual hosting, they are always bound to the default (first defined) virtual host.

In order to accommodate these older clients, create a new name based virtual host, with a ServerName of DummyPlaceholder, and assign it a document root of /var/www/vhostlab/namevhost. Make sure that its definition occurs before any other virtual host definitions for IP address 127.1.1.2.

Create the file /var/www/vhostlab/namevhost/index.html, with the following content.

<p>Which of the following high quality sites are you trying to access?</p>
<ul>
<li><a href="/jelly_root">www.jellyisgood.rha</a></li>
<li><a href="/jam_root">www.jamisgood.rha</a></li>
<li><a href="/marmalade_root">www.marmaladeisgood.rha</a></li>
</ul>

You may confirm your configuration by accessing the web server by IP address, instead of hostname: http://127.1.1.2. Make sure that pages accessed through this new (unnamed) virtual host resolve images correctly.
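Before testing the virtual hosts, it can help to double check the /etc/hosts entries from step 1. The following Python sketch parses hosts-style text and reports which address each name maps to. The function name parse_hosts is hypothetical, the text is embedded in the script rather than read from the real file, and keeping only the first address seen for a name is a simplification of real resolver behavior.

```python
def parse_hosts(text):
    """Map each hostname found in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace; skip empty lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Simplification: keep the first address seen for a name.
            table.setdefault(name, ip)
    return table


LAB_HOSTS = """\
127.1.1.1 www.peanutbutterisgood.rha
127.1.1.2 www.jellyisgood.rha
127.1.1.2 www.jamisgood.rha
127.1.1.2 www.marmaladeisgood.rha
"""

for name, ip in sorted(parse_hosts(LAB_HOSTS).items()):
    print(name, "->", ip)
```

On the lab machine the same function could be pointed at the contents of /etc/hosts; pinging each hostname, as suggested in step 1, remains the more direct check.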

Deliverables

1. A local DNS configuration which resolves www.peanutbutterisgood.rha to 127.1.1.1, and each of www.jellyisgood.rha, www.jamisgood.rha, and www.marmaladeisgood.rha to 127.1.1.2.

2. An IP based virtual host on 127.1.1.1, with a document root of /var/www/vhostlab/pb_root, with the specified content, which logs hits to /var/log/httpd/pb_access_log using the common format.

3. Three name based virtual hosts (www.jellyisgood.rha, www.jamisgood.rha, and www.marmaladeisgood.rha) which all share the IP address 127.1.1.2, mapped to the document roots /var/www/vhostlab/namevhost/jelly_root, /var/www/vhostlab/namevhost/jam_root, and /var/www/vhostlab/namevhost/marmalade_root, respectively, with the specified content.

4. Each name based host logs hits to the shared log file /var/log/httpd/fruity_access_log using the common format.

5. Requests for all four virtual hosts should resolve the URL /images to the directory /var/www/vhostlab/images.

6. For the IP based virtual host 127.1.1.1, requests to the URL /images should result in a dynamically generated index. For all name based virtual hosts, dynamic index generation of /images should be disabled.

7. In order to support legacy clients, all requests which resolve to the host 127.1.1.2 but do not directly reference one of the specified name based virtual hosts by name should resolve to the document root /var/www/vhostlab/namevhost, which contains the file index.html with the specified content.

Questions

1. Which of the following protocols does the Apache web server use to associate an IP based virtual host with a client request?

( ) a. TCP/IP

( ) b. DNS

( ) c. ARP

( ) d. HTTP

( ) e. None of the above

2. Which of the following protocols does the Apache web server use to associate a name based virtual host with a client request?

( ) a. TCP/IP

( ) b. DNS

( ) c. ARP

( ) d. HTTP

( ) e. None of the above

3. Which of the following directives would you not be able to override using an Apache virtual host?

( ) a. DocumentRoot

( ) b. ServerName

( ) c. KeepAliveTimeout

( ) d. ErrorLog

( ) e. DirectoryIndex

Use the following excerpt from an Apache web server’s main configuration file to answer the following 7 questions.

...
DocumentRoot /var/www/html
...
ErrorLog logs/error_log
CustomLog logs/access_log combined
DirectoryIndex index.html
...

<VirtualHost 192.168.24.32>
    DocumentRoot /var/www/virtual/chipmunk.edu
    ServerName www.chipmunk.edu
    ErrorLog logs/chipmunk-error-log
    CustomLog logs/chipmunk-access-log combined
</VirtualHost>

NameVirtualHost 192.168.24.33

<VirtualHost 192.168.24.33>
    DocumentRoot /var/www/virtual/hamster.edu
    ServerName www.hamster.edu
    ErrorLog logs/hamster-error-log
    CustomLog logs/hamster-access-log custom
    DirectoryIndex nuts.html
    Alias /seeds/ /usr/share/seeds/
</VirtualHost>

<VirtualHost 192.168.24.33>
    DocumentRoot /var/www/virtual/gerbil.edu
    ServerName www.gerbil.edu
    ErrorLog /var/www/virtual/gerbil.edu/.hterrors
    CustomLog logs/gerbil-access-log combined
    <Location /seeds>
        Options -Indexes
    </Location>
</VirtualHost>

You may assume that no omitted configuration affects URL to filename translation, and that an external DNS server appropriately maps the following hostnames.

Hostname IP Address

www.chipmunk.edu 192.168.24.32

www.rat.edu 192.168.24.32

www.hamster.edu 192.168.24.33

www.gerbil.edu 192.168.24.33

www.lemming.edu 192.168.24.33

4. To what file does the URL http://www.chipmunk.edu/seeds/sunflower.html resolve?

( ) a. /var/www/html/seeds/sunflower.html

( ) b. /var/www/virtual/hamster.edu/seeds/sunflower.html

( ) c. /usr/share/seeds/sunflower.html

( ) d. /var/www/html/sunflower.html

( ) e. /var/www/virtual/chipmunk.edu/seeds/sunflower.html

5. To what file does the URL http://www.hamster.edu/seeds/sunflower.html resolve?

( ) a. /var/www/virtual/hamster.edu/seeds/sunflower.html

( ) b. /var/www/html/sunflower.html

( ) c. /var/www/html/seeds/sunflower.html

( ) d. /var/www/virtual/chipmunk.edu/seeds/sunflower.html

( ) e. /usr/share/seeds/sunflower.html

6. To what file does the URL http://www.lemming.edu/seeds/sunflower.html resolve?

( ) a. /var/www/virtual/chipmunk.edu/seeds/sunflower.html

( ) b. /var/www/html/seeds/sunflower.html

( ) c. /var/www/html/sunflower.html

( ) d. /var/www/virtual/hamster.edu/seeds/sunflower.html

( ) e. /usr/share/seeds/sunflower.html

7. To what file does the URL http://www.rat.edu/seeds/sunflower.html resolve?

( ) a. /var/www/virtual/chipmunk.edu/seeds/sunflower.html

( ) b. /var/www/html/sunflower.html

( ) c. /usr/share/nuts/sunflower.html

( ) d. /var/www/virtual/hamster.edu/seeds/sunflower.html

( ) e. /var/www/html/seeds/sunflower.html

8. When accessing the URL http://www.gerbil.edu/seeds/acorns/, a 403 Access Denied error is generated. Assuming all filesystem ownerships, permissions, and SELinux contexts are correct, which of the following would allow access to the URL?

( ) a. Commenting out the Location directive from the appropriate container.

( ) b. Creating the file /var/www/virtual/gerbil.edu/seeds/acorns/nuts.html

( ) c. Creating the file /var/www/virtual/gerbil.edu/seeds/acorns/index.html

( ) d. Any of the above

( ) e. Either A or C

9. To what file(s) would information about the above (403 Access Denied) transaction be logged?

( ) a. /var/log/httpd/log/gerbil-error-log

( ) b. /var/log/httpd/log/gerbil-access-log

( ) c. /var/www/virtual/gerbil-edu/.hterrors

( ) d. A and B

( ) e. A and C

10. In the standard Red Hat Enterprise Linux configuration, which of the following files could also be used to provide virtual host configuration?

( ) a. /etc/httpd/gerbil.virtual

( ) b. /etc/httpd/conf.d/gerbil.conf

( ) c. /etc/httpd/conf.d/gerbil

( ) d. /var/www/html/.htgerbil

( ) e. B or C

Chapter 5. The Squid Proxy Server

Discussion

Proxy Servers

A proxy server acts as a middleman between a client and a server. The use of a proxy server usually involves the following.

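Figure 5-1. The Role of a Proxy Server

[Figure: a Client Machine (mozilla), a Proxy Server (squid), and a Web Server (httpd). Addresses labeled in the figure: 192.168.0.1, 192.168.0.254, 1.1.1.1, 2.2.2.2. Ports labeled in the figure: 8080 and 80.]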

1. A client is configured to use the proxy server. This is a one time configuration, which usually requires the IP address and port of the proxy server.

2. When asked to connect to a service, instead of connecting directly, the client connects to the proxy server.

3. The proxy server accepts the request as if it were the server, but sends nothing back to the client immediately. Instead, the proxy server initiates the request to the real service, as if it were the client.

4. The true service receives the connection, and returns a response to the proxy server.

5. The proxy server then resends the response it received from the server to the client, as if it were the server.

Why would anyone want to use such a convoluted scheme? The answer usually involves one of the following.

• Access. The client may be on a machine that does not have a direct connection to the Internet, so it needs the services of a proxy server which does. In the scenario diagrammed above, the client is on a 192.168.0.0/24 private subnet, which by convention should not be routed directly to the Internet.

• Caching. The proxy server may store the response of the server, as well as returning it to the client. If the client (or another client) asks for the same information again, the proxy server merely needs to ask the real server "has your information changed?" If not, the proxy server can return the local copy, reducing traffic between the proxy server and the true service.

• Filtering. The proxy server becomes a single control point for all clients which it serves. Therefore, traffic can be filtered or logged for later auditing at the proxy server.
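The caching and logging roles can be sketched in a few lines of Python. This is a toy, in-process model under stated assumptions: the classes OriginServer and CachingProxy, the version counter used to answer "has your information changed?", and the page contents are all invented for illustration, and no real network traffic or Squid behavior is involved.

```python
class OriginServer:
    """Stands in for the real service; a version number marks each change."""

    def __init__(self):
        self.pages = {"/index.html": ("hello", 1)}  # path -> (body, version)
        self.full_responses = 0

    def changed_since(self, path, version):
        # Cheap check: has the page changed since the given version?
        return self.pages[path][1] != version

    def fetch(self, path):
        # Expensive operation: send the whole page.
        self.full_responses += 1
        return self.pages[path]


class CachingProxy:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}  # path -> (body, version)
        self.log = []    # single control point: every request is recorded

    def get(self, client, path):
        self.log.append((client, path))
        cached = self.cache.get(path)
        # Reuse the local copy only if the origin says nothing has changed.
        if cached and not self.origin.changed_since(path, cached[1]):
            return cached[0]
        body, version = self.origin.fetch(path)
        self.cache[path] = (body, version)
        return body


origin = OriginServer()
proxy = CachingProxy(origin)
proxy.get("192.168.0.1", "/index.html")   # first request: fetched from origin
proxy.get("192.168.0.2", "/index.html")   # second client: served from cache
print(origin.full_responses, len(proxy.log))
```

The second request still costs a small "changed?" exchange with the origin, but the page body itself is transferred only once, which is the traffic saving the text describes.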

Although our figure diagrams a web proxy server, our discussion has been intentionally vague about what client and what service we’re talking about, because the idea of a proxy server is a general concept.

The service in question could be a web server, an FTP server, or even an LDAP server, and the same concepts would apply.

The squid Proxy Server

Most often, if people use the term proxy server without elaboration, they are referring to an HTTP (web) proxy server. Red Hat Enterprise Linux ships with a full featured and sophisticated proxy server, known as Squid. Squid supports FTP, gopher, and HTTP requests, SSL encapsulation, robust caching, extensive access controls, and full transaction logging. Much like the Apache web server, a whole course could be devoted to deploying and maintaining the squid proxy server.

Like most Red Hat Enterprise Linux packaged products, however, the out-of-the-box configuration makes it fairly easy to set up and use the proxy server in a basic configuration. We will cover how to install the server, define which port it should bind to, and specify which clients are able to connect to the service.

The proxy server is packaged in the squid package, and is managed as the squid service. Therefore, the standard techniques can be used for installing the software and starting the service in its default configuration.

[root@station ~]# yum install squid
...
=============================================================================
 Package     Arch     Version                 Repository     Size
=============================================================================
Installing:
 squid       i386     7:2.6.STABLE6-3.el5     rha-rhel       1.2 M
...
Installed: squid.i386 7:2.6.STABLE6-3.el5
Complete!
[root@station1 ~]# service squid start
init_cache_dir /var/spool/squid... Starting squid: .       [  OK  ]
[root@station1 ~]# chkconfig squid on

The out-of-the-box configuration is not useful directly, however, as the default access control lists do not let any useful clients connect.

Squid Configuration: /etc/squid/squid.conf

Upon startup, the squid daemon reads the /etc/squid/squid.conf configuration file for its configuration. The configuration file follows a very traditional Linux (and Unix) syntax.

• All white lines (lines which are empty or contain only white space) are ignored, as are all comment lines that begin with a "#".

• All other lines begin with a keyword, referred to as a "TAG". The syntax for arguments after the tag depends on the tag, but the arguments must all occur on the same line.
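The two syntax rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not how squid itself reads the file; the function name active_directives and the sample text are made up for the example, and trailing comments on a directive line are deliberately left in place.

```python
def active_directives(config_text):
    """Return (tag, arguments) pairs for every line that is not blank or a comment."""
    directives = []
    for line in config_text.splitlines():
        stripped = line.strip()
        # White lines and comment lines beginning with "#" are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        # The first word is the tag; everything else on the line is its arguments.
        parts = stripped.split(None, 1)
        tag = parts[0]
        arguments = parts[1] if len(parts) > 1 else ""
        directives.append((tag, arguments))
    return directives


# A made-up fragment in the style of /etc/squid/squid.conf.
sample = """\
# Squid normally listens to port 3128
http_port 3128

#acl our_networks src 192.168.1.0/24 192.168.2.0/24
acl all src 0.0.0.0/0.0.0.0
http_access deny all
"""

for tag, arguments in active_directives(sample):
    print(tag, "->", arguments)
```

Only the three uncommented lines of the sample survive, each split into its tag and its arguments.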

Like many Red Hat Enterprise Linux default configuration files, the file attempts to be self-documenting and provides copious comments with default configuration values commented out. Usually, changing a value to something other than the default involves uncommenting the default line, and changing its value (perhaps first duplicating the line to preserve documentation of the default value).

rha230-5.0-1-en-2008-01-21T07:12:18-0500

Copyright (c) 2003-2007 Red Hat, Inc. All rights reserved. For use only by a student enrolled in a Red Hat Academy course taught at a Red Hat Academy. Any other use is a violation of U.S. and international copyrights. No part of this publication may be photocopied, duplicated, stored in a retrieval system, or otherwise duplicated whether in electronic or print format without prior written consent of Red Hat, Inc. If you believe Red Hat course materials are being used, copied, or otherwise improperly distributed please email [email protected] or phone toll-free (USA) +1 866 626 2994 or +1 (919) 754 3700.

While the default configuration file is intimidating, weighing in at over 4300 lines, the relevant configuration is a mere 25 lines, as illustrated below.

[root@station ~]# wc /etc/squid/squid.conf
   4325   24616  148129 /etc/squid/squid.conf
[root@station ~]# grep -v \# /etc/squid/squid.conf | sed "/^$/d" | wc
     25      91     756

For our purposes, we are only going to examine three relevant tags: http_port, acl, and http_access.

The server’s identity: http_port

Opening the /etc/squid/squid.conf configuration file with any text editor, you should be able to quickly find the first configuration tag, http_port.

Figure 5-2. /etc/squid/squid.conf: http_port

# NETWORK OPTIONS
# -----------------------------------------------------------------------------

#  TAG: http_port
#       Usage:  port
#               hostname:port
#               1.2.3.4:port
#
#       The socket addresses where Squid will listen for HTTP client
#       requests. You may specify multiple socket addresses.
#       There are three forms: port alone, hostname with port, and
#       IP address with port. If you specify a hostname or IP
#       address, Squid binds the socket to that specific
#       address. This replaces the old ’tcp_incoming_address’
#       option. Most likely, you do not need to bind to a specific
#       address, so you can use the port number alone.
...
# Squid normally listens to port 3128
http_port 3128

By default, squid binds to port 3128, although by convention, HTTP proxy servers usually use port 8000 or 8080. An administrator could well want to add a line akin to the following.

http_port 8080

Note that, as the comment says, multiple http_port lines can be added, causing squid to bind to more than one port or interface, if necessary.


Squid Access Control Lists: acl and http_access

More interestingly, we also explore manipulating the client access control configuration. Finding the access control configuration can be difficult, as the relevant configuration is found deep within the rather long file. Searching for the term acl, however, and pounding on Find Next about 10 times, you should be able to discover the following.

Figure 5-3. /etc/squid/squid.conf: acl Documentation

# ACCESS CONTROLS
# -----------------------------------------------------------------------------

#  TAG: acl
#       Defining an Access List
#
#       acl aclname acltype string1 ...
#       acl aclname acltype "file" ...
#
#       when using "file", the file should contain one item per line
#
#       acltype is one of the types described below
#
#       By default, regular expressions are CASE-SENSITIVE. To make
#       them case-insensitive, use the -i option.
#
#       acl aclname src      ip-address/netmask ... (clients IP address)
#       acl aclname src      addr1-addr2/netmask ... (range of addresses)
#       acl aclname dst      ip-address/netmask ... (URL host’s IP address)
#       acl aclname myip     ip-address/netmask ... (local socket IP address)
#
...
#
#       acl aclname srcdomain   .foo.com ...    # reverse lookup, client IP
#       acl aclname dstdomain   .foo.com ...    # Destination server from URL

The acl tag assigns a name to a specification. The tag itself has no observable effect, but may instead be referenced by other tags (such as http_access, below). Skimming the comments here and in the file itself, we find that acl specifications can involve a wide range of parameters, including the following.

Table 5-1. Squid acl Specifications

Keyword      Parameter
src          Requesting client’s IP address
dst          Real server’s IP address
port         Real server’s port
myip         squid’s IP address
srcdomain    Requesting client’s domain name
dstdomain    Real server’s domain name
time         Time of day
url_regex    Regular Expression matched against the Requested URL
proto        Proxied protocol (HTTP, FTP, etc.)
reqheader    Regular Expression matched against HTTP request headers
repheader    Regular Expression matched against HTTP response headers

And these are only some of the parameters that can be specified. Obviously, squid is highly configurable in terms of who it will let connect, and what content it is willing to proxy. We now turn our attention to the default configuration, which consists of the uncommented values found a few lines below.

Figure 5-4. /etc/squid/squid.conf: acl

#Recommended minimum configuration:
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT

These lines define the following names, which can be referred to later.

Table 5-2. Default squid acl Definitions

Name          Members
all           All requests
manager       squid internal cache management requests
localhost     All requests originating from the loopback address
to_localhost  All requests to the loopback address
Safe_ports    All requests to the well known ports of services squid is willing to proxy
CONNECT       All requests to initiate an SSL encapsulated connection

As the Safe_ports acl illustrates, a name may be assigned multiple times, resulting in the values being "or"ed together (i.e., a match on any of the individual values is considered a match on the acl as a whole).

Lastly, an access control policy is defined using multiple http_access tags which reference the acls defined above. On any client request, squid will use a "stop on first match" policy while searching the following list of http_access controls. Order is important. Once squid finds a specification that matches the client request, it stops searching and immediately implements the specified allow or deny policy.

Figure 5-5. /etc/squid/squid.conf: http_access

#  TAG: http_access
#       Allowing or Denying access based on defined access lists
#
#       Access to the HTTP port:
#       http_access allow|deny [!]aclname ...                 ➊
#
...
#Recommended minimum configuration:
#
# Only allow cachemgr access from localhost
http_access allow manager localhost                           ➋
http_access deny manager
# Deny requests to unknown ports
http_access deny !Safe_ports                                  ➌
...
# Example rule allowing access from your local networks. Adapt
# to list your (internal) IP networks from where browsing should
# be allowed
#acl our_networks src 192.168.1.0/24 192.168.2.0/24           ➍
#http_access allow our_networks

# And finally deny all other access to this proxy
http_access allow localhost                                   ➎
http_access deny all                                          ➏

To the experienced eye, the comments leave little more to add, but we’ll walk through these lines just in case.

➊ The first argument to the http_access tag is either the keyword allow or deny, followed by one or more acl names, each possibly preceded by a "!". The acl names are effectively "and"ed - all must apply to the client request for the http_access policy to apply. The presence of a "!" inverts the meaning of the acl.

➋ The first line allows management requests, but only from the loopback address (i.e., from processes running on the proxy server). Notice that both the manager and localhost acls must apply for the policy to take effect. The second line denies management requests from all other sources.

➌ Any request for a port other than one for which squid is willing to proxy is denied. (Notice the convenient use of "!" to invert the meaning of the Safe_ports acl.)

➍ This is where the good guys are defined. More on this in a second.

➎ Any requests from the loopback interface are considered good.

➏ Any request not meeting the above policies is prohibited by deny all.
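The rules walked through above ("or" within an acl, "and" within an http_access line, "!" to invert, stop on first match) can be mimicked with a small Python sketch. This is our own toy model of the default policy, not squid’s implementation: the request dictionaries, the helper names, and the simplified acl types are all assumptions made for illustration.

```python
import ipaddress


def src_in(network):
    """Build a test that matches requests whose client address is in network."""
    net = ipaddress.ip_network(network)
    return lambda request: ipaddress.ip_address(request["src"]) in net


def port_in(low, high=None):
    """Build a test that matches requests for a destination port or port range."""
    high = low if high is None else high
    return lambda request: low <= request["port"] <= high


def proto_is(name):
    """Build a test that matches requests using the named protocol."""
    return lambda request: request["proto"] == name


# Each acl name maps to a list of tests; values for one name are "or"ed together.
ACLS = {
    "all": [src_in("0.0.0.0/0")],
    "manager": [proto_is("cache_object")],
    "localhost": [src_in("127.0.0.1/32")],
    "Safe_ports": [port_in(80), port_in(21), port_in(443), port_in(70),
                   port_in(210), port_in(1025, 65535), port_in(280),
                   port_in(488), port_in(591), port_in(777)],
}

# The uncommented http_access lines of the default configuration, in file order.
DEFAULT_RULES = [
    ("allow", ["manager", "localhost"]),
    ("deny", ["manager"]),
    ("deny", ["!Safe_ports"]),
    ("allow", ["localhost"]),
    ("deny", ["all"]),
]


def acl_matches(name, request):
    """Evaluate one acl name, honoring a leading "!" as negation."""
    negate = name.startswith("!")
    matched = any(test(request) for test in ACLS[name.lstrip("!")])
    return not matched if negate else matched


def decide(request, rules=DEFAULT_RULES):
    """Return the action of the first rule whose acls all match the request."""
    for action, names in rules:
        if all(acl_matches(name, request) for name in names):
            return action
    # Fallback chosen for this sketch only; the final "deny all" always matches.
    return "deny"


print(decide({"src": "127.0.0.1", "port": 80, "proto": "http"}))
print(decide({"src": "192.168.0.5", "port": 80, "proto": "http"}))
print(decide({"src": "127.0.0.1", "port": 25, "proto": "http"}))
```

Under this model a request from the loopback address to port 80 is allowed, a request from 192.168.0.5 falls through to deny all, and a loopback request for port 25 is stopped early by the !Safe_ports rule.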

Once we work our way through the default configuration, we realize that it only allows connections from the loopback address! If the proxy server is to be useful, the identities of the intended clients need to be specified. How to do so should be evident from the comments. First, define the our_networks acl to match requests from the clients for whom squid should be willing to proxy. Second, add an http_access rule allowing connections that match the our_networks acl. (Of course, some name other than our_networks could have been used.)

Order is important. The matching rule should occur after requests for bad ports are filtered out, but before the deny all sledge hammer. For example, to allow clients to connect from the 192.168.0.0/24 subnet, we could add the following lines just beneath the our_networks comments.

acl our_networks src 192.168.0.0/255.255.255.0
http_access allow our_networks

Or, the equivalent IP subnet CIDR notation 192.168.0.0/24 could have been used. Of course, after modifying the configuration file, the squid service should be restarted.

[root@station ~]# service squid restart
Stopping squid: .                                          [  OK  ]
Starting squid: .                                          [  OK  ]
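For readers who want to convince themselves that the dotted netmask and CIDR forms mentioned above describe the same subnet, the following short check uses Python’s standard ipaddress module. It is a side calculation only and has nothing to do with how squid parses its acl lines; the client addresses tested are arbitrary examples.

```python
import ipaddress

# The same subnet written with a dotted netmask and in CIDR notation.
dotted = ipaddress.ip_network("192.168.0.0/255.255.255.0")
cidr = ipaddress.ip_network("192.168.0.0/24")

print(dotted == cidr)
print(dotted.prefixlen)

# Membership test for two example client addresses.
print(ipaddress.ip_address("192.168.0.5") in cidr)
print(ipaddress.ip_address("192.168.1.5") in cidr)
```

The two networks compare equal with a prefix length of 24; 192.168.0.5 falls inside the subnet, while 192.168.1.5 does not.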

It took a while to understand why, but in the end, configuring squid to allow clients only involves a two-line edit, both lines of which can be easily deduced from existing comments: one to define who the good guys are, and another to modify the access control list chain to let them in.

Configuring Proxies for Web Clients

Once a proxy server is up and running, clients must be configured to use it. The details will vary from client to client, but the essence is the same. Somehow, the client needs to be configured with the IP address and port number of the proxy server.

Configuring Firefox

The firefox web browser’s proxy configuration is found by choosing the Connection Settings... button from the Preferences dialog, which is opened by choosing the Edit:Preferences menu item.


Figure 5-6. Firefox Proxy Configuration

Once open, the dialog allows you to specify an independent proxy server for each of several protocols, or, conveniently, to set all protocols to use the same server. A list of domains and IP addresses for which the client should not proxy can also be specified, which is very useful for maintaining access to servers the proxy server might not be aware of (such as localhost or rha-server).

Configuring curl

Command line web clients are often configured to use proxy servers through command line switches or environment variables. Opening the curl man page, for example, and searching for proxy, one can (eventually) find the following.

-x/--proxy <proxyhost[:port]>
       Use specified HTTP proxy. If the port number is not specified,
       it is assumed at port 1080.

       This option overrides existing environment variables that sets
       proxy to use. If there’s an environment variable setting a
       proxy, you can set proxy to "" to override it.

And, a little further down, the following.

ENVIRONMENT
       http_proxy [protocol://]<host>[:port]
              Sets proxy server to use for HTTP.

       HTTPS_PROXY [protocol://]<host>[:port]
              Sets proxy server to use for HTTPS.

       ...

       NO_PROXY <comma-separated list of hosts>
              list of host names that shouldn’t go through any proxy. If set
              to a asterisk

For example, to download the Red Hat home page using the proxy server defined above, either of the following techniques could be used.

[root@station ~]# curl -x http://station:8080 http://www.redhat.com

[root@station ~]# export http_proxy=http://station:8080

[root@station ~]# curl http://www.redhat.com
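The same two approaches (an explicit proxy setting versus the http_proxy environment variable) exist in other clients as well. As a hedged illustration outside the scope of this lab, the sketch below uses Python’s standard urllib.request module; the proxy URL http://station:8080 is simply the example value from the commands above, and no request is actually sent.

```python
import os
import urllib.request

# Example proxy location taken from the curl commands above.
PROXY_URL = "http://station:8080"


def build_proxied_opener(proxy_url):
    """Build an opener that sends plain HTTP requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def proxies_from_environment():
    """Return the proxy settings found in *_proxy environment variables."""
    return urllib.request.getproxies_environment()


# Approach 1: name the proxy explicitly, much like curl -x.
opener = build_proxied_opener(PROXY_URL)

# Approach 2: export http_proxy, as in the shell example, and let the library find it.
os.environ["http_proxy"] = PROXY_URL
print(proxies_from_environment().get("http"))
```

Calling opener.open("http://www.redhat.com") would then route the request through the proxy, assuming a proxy is really listening at that address.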

Squid Logging: /var/log/squid/access.log

Like the Apache web server, squid maintains a transaction log, found at /var/log/squid/access.log.

Squid uses its own log format, which displays details more pertinent to a proxy server than the standard common format used by web servers. The emulate_httpd_log directive can be set to use the traditional common format instead, though information will be lost.

Table 5-3. Squid Log Format

Position  Example                Content
1         1124596159.068         A Unix standard timestamp. [a]
2         60355                  Request duration, in milliseconds.
3         192.168.0.25           Client IP address
4         TCP_MISS/200           Squid result code
5         1381                   Number of bytes transferred to the client
6         GET                    The request method
7         http://www.redhat.com  The requested URL
8-10      ...                    Parameters relevant to internal cache

Notes:

a. The Unix world (including Linux) conventionally records timestamps internally using "seconds since the epoch", with the epoch being January 1st, 1970. Using a signed 32-bit integer, this conveniently records times from around 1900 until around 2038. The Unix world was not concerned about "Y2K" problems, but instead worries about "Y2038" problems. Your author feels this would be the perfect time to come out of retirement and consult for legacy Linux systems.
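The limits mentioned in the note can be worked out directly. The following Python sketch converts the smallest and largest values of a signed 32-bit "seconds since the epoch" counter into calendar dates; the helper name from_epoch_seconds is ours, and the calculation is done in UTC.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: January 1st, 1970, 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_seconds(seconds):
    """Convert seconds since the epoch (possibly negative) to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


# A signed 32-bit integer ranges from -2**31 to 2**31 - 1.
earliest = from_epoch_seconds(-(2 ** 31))
latest = from_epoch_seconds(2 ** 31 - 1)

print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00
print(latest.isoformat())    # 2038-01-19T03:14:07+00:00
```

So "around 1900" is, more precisely, December 1901, and the counter runs out in January 2038.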

Three sample log messages are found below.

1124596032.120 2 192.168.1.1 TCP_DENIED/403 1355 GET http://www.redhat.com/ - NONE/- text/html
1124596159.068 60355 192.168.0.25 TCP_MISS/200 1381 GET http://www.redhat.com/ - DIRECT/209.132.177.50
1124596167.650 1 192.168.0.25 TCP_HIT/200 12115 GET http://www.redhat.com/ - NONE/- text/html

The first is from a client which was not accepted by the client access control configuration, and so received a TCP_DENIED. The second is a request from a client for data not already in the cache, a TCP_MISS. The third is a followup request (perhaps from a reload of the same page), whose data was already cached locally, generating a TCP_HIT. Notice that the only request which took a significant amount of time to fulfill was the cache miss, which consumed around 60000 milliseconds of cache time, as opposed to 1 or 2.
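Because the log fields are separated by white space, the three sample lines can be picked apart with very little code. The Python sketch below follows the field positions from Table 5-3; the function name and dictionary keys are our own choices, and the parser makes no attempt to handle other log formats (such as the one produced with emulate_httpd_log).

```python
from datetime import datetime, timezone


def parse_access_line(line):
    """Split one native-format squid log line into named fields (per Table 5-3)."""
    fields = line.split()
    # Position 4 combines the squid result code and the HTTP status, e.g. TCP_MISS/200.
    result_code, _, http_status = fields[3].partition("/")
    return {
        "time": datetime.fromtimestamp(float(fields[0]), tz=timezone.utc),
        "duration_ms": int(fields[1]),
        "client": fields[2],
        "result": result_code,
        "status": int(http_status),
        "bytes": int(fields[4]),
        "method": fields[5],
        "url": fields[6],
        "cache_fields": fields[7:],
    }


samples = [
    "1124596032.120 2 192.168.1.1 TCP_DENIED/403 1355 GET http://www.redhat.com/ - NONE/- text/html",
    "1124596159.068 60355 192.168.0.25 TCP_MISS/200 1381 GET http://www.redhat.com/ - DIRECT/209.132.177.50",
    "1124596167.650 1 192.168.0.25 TCP_HIT/200 12115 GET http://www.redhat.com/ - NONE/- text/html",
]

for entry in map(parse_access_line, samples):
    print(entry["client"], entry["result"], entry["status"], entry["duration_ms"], "ms")
```

Printing the duration next to the result code makes the contrast between the cache miss and the cache hit easy to spot.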

Finding Out More

We have only touched upon a few of Squid’s basics. Those interested in more, such as using squid as a transparent proxy server (or "accelerator"), can consult the FAQ (which reads almost like a manual) found at /usr/share/doc/squid-version/FAQ-html, or consult the Squid home page (http://www.squid-cache.org/).

Exercises

Lab Exercise

Objective: Configure the Squid Proxy Server

Estimated Time: 10 mins.

Specification

This lab will have you install, configure, and use the squid proxy server. A "real world" use of squid would require 3 machines: one to host the web server, one to host the proxy server, and of course the client machine running a web browser.

Figure 5-7. Standard Squid Proxy Server Configuration

[Figure: firefox on station1.example.com (192.168.0.1, client port 43523), squid on port 8080 of proxy.example.com (192.168.0.254 and 165.23.84.5), and httpd on port 80 of www.widgets.org (118.23.53.1).]

The machine hosting the web server would need a publicly accessible IP address, as would the proxy server. It could well be the case, however, that the client machine does not, with squid running on a multi-homed host. The squid application would receive requests from clients over a private IP address, and forward them to the Internet through its public IP address.

For our lab, we will instead run the web server, the client, and the squid proxy server on the same machine. The concepts map directly to the real world scenario. In the following diagram, 192.168.0.1 should be replaced with your eth0 IP address.

Figure 5-8. Lab Squid Proxy Server Configuration

[Figure: a single student station running firefox (127.1.1.2:35476), squid (192.168.0.1:8080), and httpd (127.1.1.2:80).]

1. Configure the squid proxy server.

a. Ensure that the squid package is installed.

b. As a precaution, make a backup of the file /etc/squid/squid.conf, copying it to /etc/squid/squid.conf.orig, for example.

c. In the file /etc/squid/squid.conf, search for the http_port option, around line 54. Set the http_port to 8080.

d. In the file /etc/squid/squid.conf, search for the term our_network, around line 1860(!). Administrators are expected to set local access control policies at this location. Following the commented out examples, define an acl our_networks, which matches all requests sourced from your eth0 interface. For example, if ifconfig eth0 reports your IP address as 192.168.0.5 and your network mask as 255.255.255.0, then the following lines would be appropriate. (If in doubt, you can specify your IP address directly, with a mask of 255.255.255.255.) Once defined, add an http_access directive which allows the acl.

acl our_networks src 192.168.0.0/255.255.255.0
http_access allow our_networks

e. Use the standard service and chkconfig commands to start the squid service, and enable the service to start automatically on reboots. You might want to use the netstat command to confirm that squid is LISTENing for connections on port 8080.

2. Monitor squid and httpd requests. In two separate windows (or two separate virtual consoles), use less to open the files /var/log/httpd/access_log and /var/log/squid/access.log, respectively. Within less, hit SHIFT-F to enter "follow" mode. As new requests are made for each service, you should see a log line generated within the respective file. (Pressing CTRL-C will return less to normal browsing mode.)

3. Configure firefox to use the proxy server. Using the firefox browser, open the Edit:Preferences dialog, and follow the path to General and Connection Settings.... In the resulting dialog, choose Manual proxy configuration, and set the HTTP Proxy to be your eth0 IP address, port 8080. Also, remove any text from the No Proxy For text entry. OK your way out of the various dialogs.

4. Browse your webserver. Now use firefox to browse the content of your webserver. If some of your previous labs are still in place, you may try http://localhost/relativity, http://www.peanutbutterisgood.rha, or http://www.jamisgood.rha. Otherwise, simply create a file in your document root directory, and reference it. With each request, you should see a line similar to the following in your /var/log/squid/access.log file.

1132416737.269 699 192.168.0.1 TCP_MISS/304 200 GET http://localhost/readings/the_god_of_mars.html - DIRECT/127.0.0.1 -

If not, make sure you reload a page from within the browser. If the page is in the browser’s cache, then it will not actually generate a request.

Deliverables

1. A running squid server, bound to port 8080, which allows requests over the IP address assigned to the eth0 interface.

2. The squid service is configured to start automatically upon reboot.

Challenge Exercises

1. Assuming your neighbors have set access control configuration appropriately, you should be able to use your proxy server to browse a neighbor’s website, or a neighbor’s proxy server to browse your website, or a neighbor’s proxy server to browse another neighbor’s website. Explore.

2. Configure your access control specifications so that one particular neighbor may access your squid proxy server, but another may not.

3. Notice the following line in the /etc/squid/squid.conf configuration file.

# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost

What concern is this addressing? In order to convince yourself that denying the to_localhost acl is a good idea, enable the /server-status location within your Apache web server, but take the precaution of only allowing requests from the loopback address 127.0.0.1. Then have a neighbor use your proxy server to access http://localhost/server-status from their machine.

(Realize, of course, that fixing this security hole by denying requests matching the to_localhost acl would break the original lab.)


Questions

Use the /etc/squid/squid.conf excerpts below to answer the next 6 questions:

acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl public_terminal src 192.168.0.100/255.255.255.255
acl public_hours time M-F 09:00-17:00
acl intranet src 192.168.0.0/24
acl vpn src 10.0.1.0/24
acl media_files url_regex \.mp3$
acl media_files url_regex \.avi$
acl media_files url_regex \.mpeg$
acl media_files url_regex \.wma$
acl media_files url_regex \.wmv$
acl hostile dstdomain cracker.org
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443 563
acl Safe_ports port 80
acl Safe_ports port 21
acl Safe_ports port 443 563
acl Safe_ports port 70
acl Safe_ports port 210
acl Safe_ports port 1025-65535
acl Safe_ports port 280
acl Safe_ports port 488
acl Safe_ports port 591
acl Safe_ports port 777
acl CONNECT method CONNECT
...
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access deny media_files
http_access deny public_terminal !public_hours
http_access allow intranet
http_access deny all
http_access allow vpn

1. What is the likely purpose of the "media_files" acl and associated http_access rule?

( ) a. To speed up access to music and video files by caching them

( ) b. To make it impossible to download music and video through the proxy

( ) c. To make downloading music and video through the proxy more difficult by blocking common file extensions.

( ) d. To stop external systems from retrieving audio and video files from internal systems

( ) e. None of the above


2. What would happen if a request for http://www.somesite.com/files/hit_song.mp3 was sent to the proxy server from 127.0.0.1?

( ) a. Not enough information to tell

( ) b. Access would be granted

( ) c. Access would be denied, but other URLs might work

( ) d. Access would be denied unless the file was already cached

( ) e. Access would be denied for any destination

3. What would happen if a request for http://www.somesite.com/files/hit_song.mp3 was sent to the proxy server from 192.168.0.5?

( ) a. Not enough information to tell

( ) b. Access would be granted

( ) c. Access would be denied, but other URLs might work

( ) d. Access would be denied unless the file was already cached

( ) e. Access would be denied for any destination

4. What would happen if a request for http://www.somesite.com/files/hit_song.mp3 was sent to the proxy server from 209.132.177.60?

( ) a. Not enough information to tell

( ) b. Access would be granted

( ) c. Access would be denied, but other URLs might work

( ) d. Access would be denied unless the file was already cached

( ) e. Access would be denied for any destination

5. To what extent could the system with IP address 10.0.1.5 use this proxy server?

( ) a. It would be able to access any url

( ) b. It would be able to access some URLs

( ) c. It would not be able to use the proxy at all

( ) d. Not enough information to tell

( ) e. None of the above


6. To what extent could the system with IP address 192.168.0.100 use this proxy server?

( ) a. It would be able to access any URL

( ) b. It would be able to access some URLs

( ) c. It would not be able to use the proxy at all

( ) d. Not enough information to tell

( ) e. None of the above

7. What port does squid listen on by default?

( ) a. 8080

( ) b. 443

( ) c. 4400

( ) d. 8139

( ) e. None of the above

8. What is another common port for proxy servers to use?

( ) a. 8080

( ) b. 443

( ) c. 4400

( ) d. 8139

( ) e. None of the above

9. How does one configure proxy settings in the Firefox web browser?

( ) a. Tools:Proxies

( ) b. Edit:Preferences:Web Features:Proxies

( ) c. Tools:Settings:Connection Settings

( ) d. File:Use Proxy

( ) e. Edit:Preferences:General:Connection Settings

Use the following excerpt from a squid log to answer the next question:

1137789430.068 50405 192.168.0.50 TCP_MISS/200 1381 GET http://academy.redhat.com/ - DIRECT/209.132.177.60
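As a reading aid, the excerpt can be split into named fields. The following is a minimal Python sketch, assuming the line follows Squid's native access log layout (time, elapsed milliseconds, client address, result code/HTTP status, size in bytes, method, URL, ident, hierarchy code/peer address). The function name parse_squid_line and the field labels are hypothetical names chosen for this illustration; they do not come from the workbook or from Squid itself.

```python
from datetime import datetime, timezone

# Hypothetical labels for the whitespace-separated columns of the excerpt,
# assuming Squid's native access log layout.
FIELDS = (
    "timestamp", "elapsed_ms", "client", "result",
    "size_bytes", "method", "url", "ident", "hierarchy",
)


def parse_squid_line(line):
    """Split one access log line into a dictionary of named fields."""
    entry = dict(zip(FIELDS, line.split()))
    # "TCP_MISS/200" holds the cache result code and the HTTP status.
    entry["cache_result"], entry["http_status"] = entry["result"].split("/", 1)
    # "DIRECT/209.132.177.60" holds the hierarchy code and the peer address.
    entry["peer_status"], entry["peer_host"] = entry["hierarchy"].split("/", 1)
    entry["elapsed_ms"] = int(entry["elapsed_ms"])
    entry["size_bytes"] = int(entry["size_bytes"])
    # The first column is seconds since the Unix epoch.
    entry["when"] = datetime.fromtimestamp(float(entry["timestamp"]), tz=timezone.utc)
    return entry


line = ("1137789430.068 50405 192.168.0.50 TCP_MISS/200 1381 GET "
        "http://academy.redhat.com/ - DIRECT/209.132.177.60")
entry = parse_squid_line(line)
for name in ("when", "client", "cache_result", "http_status", "method", "url", "peer_host"):
    print(name, "=", entry[name])
```

The sketch only labels the columns; deciding what the result code implies for the request is left to the question that follows.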


10. What does this log message indicate?

( ) a. The client was denied access to the requested URL

( ) b. The client’s request is being "held" pending approval by an administrator

( ) c. The client was granted access to the URL, which was found in cache

( ) d. The client was granted access to the URL, which was not in cache and will be retrieved by the proxy

( ) e. None of the above
