Server Management Bell College

Configuring Web Servers page 1

Configuring the Apache Web Server

• Introduction to web servers
• Web server processing steps
• Running Apache
• Configuring Apache
• Configuring by editing httpd.conf
• Using .htaccess to password protect a directory
• Lab: Configuring an Apache server

Introduction to web servers

At a very basic level, Web servers serve simple, static content: HTML documents and images. A browser's request for a file is received by the Web server, which looks the file up in the host file system. The server loads the file from disk and sends it back across the network to the Web client (browser).

When we talk about web servers we can actually mean two different things:

• the computer which contains the web site files, or
• the software which runs on the computer

In these notes we are mostly referring to the software.

HTTP

The browser and the Web server talk to each other using Hypertext Transfer Protocol (HTTP). The browser first requests the HTML document and then requests each image which the document requires. In HTTP/1.0 each request uses its own TCP connection; in HTTP/1.1 several requests can share a single connection. HTTP is used to transmit resources, not just files. A resource is some chunk of information that can be identified by a URL. The most common kind of resource is a file, but a resource may also be server-side script output.

A browser is an HTTP client because it sends requests to an HTTP server (Web server), which then sends responses back to the client. The standard (and default) port for HTTP servers to listen on is 80, though they can use any port.

Like most network protocols, HTTP uses the client-server model: An HTTP client opens a connection and sends a message called a request to an HTTP server; the server then returns a response, usually containing the resource that was requested.

In HTTP/1.0 the server closes the connection after delivering the response; HTTP/1.1 keeps the connection open by default so that further requests can reuse it. Either way, HTTP is a stateless protocol: the server keeps no information about one transaction when it handles the next. HTTP communication is like asking a single question and getting an answer, rather than having a conversation.

The formats of the request and response messages are similar. Both kinds of message consist of:

• an initial line,
• zero or more lines known as headers,
• a blank line,
• an optional message body (e.g. a file, or query data, or query output).

Header lines provide information about the request or response.
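The message layout described above can be sketched in a few lines of Python. This is only an illustration, not how Apache itself parses requests: the function name parse_http_message and the sample request are made up for this example, and details such as repeated or folded headers are ignored.

```python
def parse_http_message(raw):
    """Split a raw HTTP message into (initial_line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the head from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header line has the form "Name: value".
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body


# A hypothetical request for the example URL used later in these notes.
request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: hamilton.bell.ac.uk\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)
initial_line, headers, body = parse_http_message(request)
print(initial_line)
print(headers)
print(repr(body))
```

A response has the same shape, except that its initial line is a status line such as "HTTP/1.1 200 OK" and the body usually carries the requested resource.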

Dynamic content

Many web sites nowadays feature dynamic content and user interaction, for example for online shopping. This requires accessing databases or processing other program code. The Web server delivers to the browser a dynamic Web page that is created in response to the user's input (direct or indirect). One long-established way of doing this is CGI.

Common Gateway Interface (CGI)

CGI is a Web server extension protocol which defines how a Web server passes a client's request to an external program and returns that program's output. CGI is not language specific; it is a protocol that allows the Web server to communicate with a program. The CGI standard defines how the Web server should run programs locally and transmit their output to the Web browser. For example, as a result of a client request, the Web server launches a CGI program (e.g. processform.cgi) and passes it the parameters sent by the client browser. It then collects the output of processform.cgi and passes it back to the browser. This is how CGI programs dynamically serve HTML data based on user input. CGI's main disadvantage is that it is slow, since each request for dynamic content requires a new program to be launched. CGI scripts can be written in many languages, including Perl, C and Python.
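As a rough sketch of what a CGI program does, the Python script below reads the query string from the QUERY_STRING environment variable (which the CGI standard defines) and writes a header, a blank line and an HTML body to standard output. The function name build_cgi_response and the "name" parameter are assumptions made for this example; they are not part of processform.cgi or of Apache.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def build_cgi_response(environ):
    """Return the text a CGI program would write to standard output."""
    # The web server passes the query string in the QUERY_STRING variable.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to "world" if it is absent.
    name = params.get("name", ["world"])[0]
    body = "<html><body><p>Hello, " + escape(name) + "!</p></body></html>\n"
    # A CGI response is a header block, then a blank line, then the body.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    sys.stdout.write(build_cgi_response(os.environ))
```

If the script were installed in the CGI bin and requested with ?name=Ann on the end of its URL, the server would start a new process for that request and relay whatever the script prints back to the browser.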

Hypertext Transfer Protocol Secure (HTTPS)

HTTPS is a secure version of HTTP. It makes it safe to exchange sensitive data between the user and the server across an insecure network. URLs that begin with 'https' are handled using SSL (or its successor, TLS), which sets up a secure, encrypted link between a Web browser and a Web server.

Multipurpose Internet Mail Extension (MIME)

The MIME type header is the primary mechanism which lets the browser decide how to display downloaded content. It tells the browser what type of content is being delivered. MIME types are identified using a type/subtype syntax associated with a file extension. Here are some examples:

MIME type          File extensions
text/xml           xml
video/mpeg         mpeg mpg
video/quicktime    qt mov

MIME headers are used by HTTP to specify the contents of any transported file. The header will specify a file's type.
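The link between file extensions and MIME types can be sketched as a simple lookup. The table below is illustrative only: it is built from the examples above (using the registered name video/mpeg) plus text/html, and the function name content_type_for and the fallback type are assumptions for this example. A real server reads a much larger table from its configuration.

```python
import os.path

# Illustrative table only; a real server loads many more entries.
MIME_TYPES = {
    ".xml": "text/xml",
    ".mpeg": "video/mpeg",
    ".mpg": "video/mpeg",
    ".qt": "video/quicktime",
    ".mov": "video/quicktime",
    ".html": "text/html",
}


def content_type_for(filename, default="application/octet-stream"):
    """Return the MIME type for a filename, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    return MIME_TYPES.get(extension, default)


print(content_type_for("holiday.MOV"))
print(content_type_for("index.html"))
```

The value returned is what the server would send in the Content-Type header of the response.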

Serving Web Pages

Today's Web servers can process and deliver multiple requests simultaneously, so they can serve more users at a time and handle a much higher load. This is done using multi-threading, multi-processing, or both.

Commonly used web server software

The most widely used web server software is Apache – according to www.netcraft.com more than 60% of web sites use it. Apache is an open source project and the server is available for Unix, Linux, Windows and other systems. If you have web space with an ISP account, it is likely to be hosted on a Unix or Linux system using the Apache server.

In this course you will look at the Apache server on a Linux system.

The next most popular is Microsoft’s Internet Information Services, which is only used on Windows systems. Other servers include SunONE and Zeus.

Application servers

Dynamic content is now often created by application servers. An application server sits in the middle of other programs and serves to process data for those other programs. In a web site, the application server usually sits between the web server and a database. It is sometimes referred to as middleware. The web server, application server and database can be on the same computer, or all on different computers. Common application server technologies are ColdFusion, ASP.NET, PHP and JSP/Java Servlets. A web server can often be configured to use an application server for processing – for example, Apache and the ColdFusion server can be used together.

Web server processing steps

Web servers are designed around a certain set of basic goals:

• Accept network connections from browsers.
• Retrieve content from disk.
• Run local CGI programs or application server programs.
• Transmit data back to clients.
• Keep a log of user activity.
• Be as fast as possible.

The following diagram shows the steps used by Apache to process a request.

[Diagram: request processing steps, in order]

1. Translate URL to filename
2. Parse request headers
3. Access control
4. Check user
5. Check MIME type
6. Invoke handler (sends response)
7. Log the request

Translate URL to filename

For example the URL of a document may look like:

http://hamilton.bell.ac.uk/index.html

The internal path in the filesystem is

/var/www/html/index.html

Thus this step converts the URL into the internal path where the document can be found on the server.
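A minimal sketch of this translation step in Python is shown below. It assumes the document root is /var/www/html, as in the example, and ignores aliases, user directories and virtual hosts. The function name url_to_filename is hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = "/var/www/html"  # assumed document root, as in the example


def url_to_filename(url, document_root=DOCUMENT_ROOT):
    """Translate a URL into a path below the document root."""
    # Keep only the path part of the URL and decode %xx escapes.
    path = unquote(urlsplit(url).path)
    # Normalise relative to "/" so that ".." cannot climb above the root.
    relative = posixpath.normpath("/" + path.lstrip("/")).lstrip("/")
    return posixpath.join(document_root, relative)


print(url_to_filename("http://hamilton.bell.ac.uk/index.html"))
```

Normalising the path before joining it to the document root matters: without it, a request containing ".." segments could refer to files outside the web site.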

Parse request headers

The server analyzes the HTTP headers of the request.

Access control

Access restrictions can be defined on the resources of the server, according to certain characteristics of the client (IP address, or hostname).

Check user

If a resource is password protected, Apache checks that the username and password provided by the client exist and are valid.

Check MIME type of the object requested

Determines the MIME type of the document required in order to carry out certain actions (for example if it is a CGI file, the program is run).

Invoke handler (sends response)

The HTTP response is made up and sent to the client. The response can be a static document, or can be generated dynamically, depending on the request.

Log the request

Records a trace of the transaction in one or more logfiles. The logfiles can be analysed to obtain information about site visitors.

Running Apache

You can download and install the appropriate version of Apache for your system from www.apache.org. Alternatively, many Linux distributions include Apache, and it can be installed at the same time as the operating system. In these notes we will look at running Apache 2.0 as installed on Fedora Linux. Different versions of Linux or Unix may be configured slightly differently.

The Apache server on Linux runs as the httpd daemon (a daemon is a program that runs continuously and exists for the purpose of handling requests that a computer system expects to receive). Usually the system initialisation files are set up to start httpd when the system starts. It is also possible to start and stop httpd manually. You need to stop and restart Apache when you change anything in its configuration.

GUI: Use the Services tool under the Red Hat->System Settings->Server Settings in the menu on the Fedora desktop. This is similar to Services in Windows 2000.

You can select the httpd service and click Start, Stop or Restart. To check whether httpd is running, open a browser and enter http://localhost in the address bar. If it is running, you should see the Apache test page.
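The same check can be made from a script, which is handy on a server with no desktop. The sketch below is one possible approach using only Python's standard library; the function name server_is_up and the 3-second timeout are assumptions for this example.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost/", timeout=3.0):
    """Return True if any HTTP server answers at url, False otherwise."""
    # Bypass any configured proxy so the request really goes to this host.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An error status (403, 404, ...) still proves a server is listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, unknown host and similar failures.
        return False


if __name__ == "__main__":
    print("httpd running:", server_is_up())
```

Note that this only shows that something is answering HTTP requests on the port; it does not prove that the program answering is Apache.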

Command line:

To start Apache, type the following command in a terminal (you need to be logged in as the root user):

/usr/sbin/apachectl start

To stop Apache, type

/usr/sbin/apachectl stop

Note that the location of the apachectl file may be different on other Linux systems.

It can be very useful to know how to work with Apache using the command line. Many web servers are located in data centres with fast internet access. Web server administrators often need to access their servers remotely using a simple interface such as telnet or ssh.

Configuring Apache

Apache’s configuration information is kept in a configuration file called httpd.conf. Like most Linux configuration information, this is a text file which can be edited with a text editor such as Pico or Vi. Some versions of Linux, including Fedora, provide GUI tools for configuring httpd.conf. In Red Hat Linux, you can use the HTTP Configuration Tool.

Some common web server configuration tasks include:

Basic configuration

• Server name
• IP address
• TCP port
• Webmaster email address

Site configuration

• Default filename(s)
• Root directory

Access control and authentication

• Allowing or denying access from specific hosts
• Password protecting pages

Virtual Hosting

• Hosting multiple web sites on the same server

Log files

• Defining what information is logged

Apache directories

An Apache web site typically has the following directories:

• Document root: all web pages go in here
• CGI bin: all CGI scripts go in here
• Log directory: all server logs are created in here
• Manual: all Apache documentation is in here

These are usually all located within one directory, such as /var/www.

What is Virtual Hosting?

Virtual hosts are Web sites with different names that all run on the same server hardware. The idea is that Apache knows which site the user is trying to get, even though they are all on the same server, and serves content from the right one.

This trick lets you run several Web sites on a single machine, from a variety of different domain names, and several names within one domain, so that one machine looks like a room full of servers.

More commonly, web hosting companies and ISPs can host many web sites on a single server computer using virtual hosting. Many web sites run for smaller businesses or individuals are hosted this way. Some companies, usually larger ones, have their own server computer on their own premises, or a dedicated server computer located at a hosting company’s data centre.

Types of virtual hosting

Virtual hosts can be specified in one of two ways. These are configuration differences on the server, and are not visible to the client: the user cannot tell what sort of virtual host they are using, or even that they are using a virtual host at all.

The two types are IP-based virtual hosting and name-based virtual hosting. In a nutshell, the difference is that IP-based virtual hosts have a different IP address for each virtual host, while name-based virtual hosts have the same IP address, but use different names for each one.

IP-based Virtual Hosting

In IP-based virtual hosting, you are running more than one web site on the same server machine, but each web site has its own IP address. In order to do this, you have to first tell your operating system about the multiple IP addresses.

Once you have given your machine multiple IP addresses, you will need to make sure that each IP address and host name is added to your DNS server.

Name-Based Virtual Hosts

Sometimes, you don't have the luxury of giving your machine multiple IP addresses. Public IP addresses are in short supply, and frequently, for example if you have a DSL connection, you may only have one. In this case, you need to use name-based virtual hosting.

You don't have to give your machine multiple IP addresses, but you still need to set up more than one DNS record on your DNS server for your machine. These extra records are called CNAME records, or CNames. (The main record, which maps the machine's name to its IP address, is called the A record.) A CName is an alias that points to the machine's main name rather than directly to an IP address, and you can have as many CNames as you like pointing to a particular machine.
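With name-based virtual hosting every request arrives at the same IP address, so the server has to pick the site from the Host header that the browser sends. The Python sketch below illustrates the idea using the host names and directories from the lab at the end of these notes. The function name document_root_for is hypothetical, and falling back to a single default root is a simplification of what Apache really does for unmatched names.

```python
# Host names and document roots borrowed from the lab in these notes.
VIRTUAL_HOSTS = {
    "myotherserver": "/var/otherwww/html",
    "yetanotherserver": "/var/yetanotherwww/html",
}
DEFAULT_ROOT = "/var/www/html"  # assumed root of the main server


def document_root_for(host_header):
    """Choose a document root from the value of the HTTP Host header."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_ROOT)


print(document_root_for("myotherserver"))
print(document_root_for("YetAnotherServer:80"))
print(document_root_for("myserver"))
```

This also shows why name-based hosting depends on DNS: each name must resolve to the server's address before the browser can send a request carrying that name.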

Configuring by editing httpd.conf

The httpd.conf file is a text file which contains directives which define the configuration. Apache can be configured by opening the file /etc/httpd/conf/httpd.conf in a text editor, for example KEdit, Pico or Vi. You can then change or add directives as required.

Some Linux distributions include GUI tools to configure Apache. These simply provide a convenient way of editing parts of the file – when you change a setting on the GUI tool, the file is altered accordingly. In this course we will configure by editing httpd.conf, and the techniques learned here should work on any Linux or Unix server.

The file contains several sections, for different types of configuration. You can usually search the file for the item you want to change, then edit the relevant lines.

Basic settings are configured with Global directives in the section headed:

### Section 1: Global Environment

For example, to configure the server to listen for requests on 192.168.1.4 on port 80, httpd.conf must include the line

Listen 192.168.1.4:80

The following example configures the server to listen on any assigned IP address on port 80

Listen 80
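The two forms of the Listen directive differ only in whether an address is given before the port. The Python sketch below makes that explicit by splitting a Listen line into an address and a port. It is an illustration, not Apache's own parser: the function name parse_listen is hypothetical and IPv6 addresses in square brackets are not handled.

```python
def parse_listen(directive):
    """Return (address, port) for a Listen line; address is None for 'all'."""
    keyword, value = directive.split()
    if keyword != "Listen":
        raise ValueError("not a Listen directive: " + directive)
    # "192.168.1.4:80" has an address part; a bare "80" does not.
    address, separator, port = value.rpartition(":")
    return (address if separator else None), int(port)


print(parse_listen("Listen 192.168.1.4:80"))
print(parse_listen("Listen 80"))
```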

Site configuration settings are configured in the section headed

### Section 2: 'Main' server configuration.

Examples of settings:

ServerAdmin root@localhost

DocumentRoot "/var/www/html"

ServerName mywebserver

DirectoryIndex index.htm index.html

(DirectoryIndex lists the default filenames, which are tried in order.)
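When a URL names a directory rather than a file, the server tries each DirectoryIndex name in turn and serves the first one that exists. A minimal Python sketch of that lookup, with a hypothetical function name find_index_file, might look like this:

```python
import os


def find_index_file(directory, index_names=("index.htm", "index.html")):
    """Return the first default file that exists in directory, or None."""
    for name in index_names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    # No default file found; a real server would then list the directory
    # or return an error, depending on its configuration.
    return None


print(find_index_file("/var/www/html"))
```

The order of the names matters: with the setting above, index.htm is served in preference to index.html if both exist.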

Access control and authentication for specific directories are also configured here, for example:

<Directory "/var/www/html">
    Order allow,deny
    Allow from all
    AllowOverride all
</Directory>

Note that directives which define subsections are in bracketed, XML-like elements.
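The effect of "Order allow,deny" can be hard to picture, so here is a small Python sketch of the rule as it applies in the Apache version used in these notes: a client must match at least one Allow rule and must not match any Deny rule, and a client that matches neither is refused. The function name is_allowed is hypothetical, and the sketch only understands "all", single IP addresses and address ranges in CIDR form, not hostnames.

```python
import ipaddress


def is_allowed(client_ip, allow, deny=()):
    """Evaluate 'Order allow,deny' for one client IP address."""
    ip = ipaddress.ip_address(client_ip)

    def matches(rules):
        # A rule is either the word "all" or an address/network such as
        # "192.168.1.4" or "192.168.1.0/24".
        return any(
            rule == "all" or ip in ipaddress.ip_network(rule, strict=False)
            for rule in rules
        )

    # Allow rules are checked first, then Deny rules; Deny wins on a match.
    return matches(allow) and not matches(deny)


print(is_allowed("10.0.0.5", allow=["all"]))
print(is_allowed("10.0.0.5", allow=["192.168.1.0/24"]))
print(is_allowed("192.168.1.7", allow=["all"], deny=["192.168.1.7"]))
```

With "Allow from all" and no Deny lines, as in the example above, every client is admitted.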

Virtual Hosting configuration settings are configured in the section headed

### Section 3: Virtual Hosts

To use name-based virtual hosting you need a NameVirtualHost directive. The following directive allows name-based hosting for all IP addresses which listen on port 80.

NameVirtualHost *:80

To create a virtual server you enclose the directives for that server in a <VirtualHost> directive. The following directive sets up a virtual server which uses the directory /another/directory as its root directory, and responds to requests for the host myotherhost.

<VirtualHost *:80>
    ServerAdmin other@localhost
    DocumentRoot /another/directory
    ServerName myotherhost
</VirtualHost>

Logging for the default site is configured in Section 2, and for a virtual host is configured inside the <VirtualHost> directive. For example:

ErrorLog logs/error_log
LogLevel warn
TransferLog logs/access_log

User Directories is an example of an additional configuration option in httpd.conf. If it is enabled, then each user on the system can store web pages in a directory called public_html inside their own home directory. If a user called myuser has a page called mypage.html in this directory on the server myhost, then it can be viewed using the URL

http://myhost/~myuser/mypage.html

User Directories is enabled with these directives

UserDir enabled <list of usernames>
UserDir public_html

Note that Apache runs with its own user id, so the permissions on the user directory need to be set to allow users other than the owner to read the files. Permissions of 755 on the user’s home directory, public_html directory and web pages will allow this.
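The mapping from a /~user/ URL to a file can be sketched in Python as follows. The sketch assumes that home directories live under /home, which is common on Linux but is not guaranteed (Apache looks up each user's real home directory). The function name userdir_path is hypothetical.

```python
import posixpath


def userdir_path(url_path, userdir="public_html", home_base="/home"):
    """Map '/~user/rest' to '<home_base>/user/<userdir>/rest', else None."""
    if not url_path.startswith("/~"):
        return None
    # Split "myuser/mypage.html" into the username and the rest of the path.
    user, _, rest = url_path[2:].partition("/")
    if not user:
        return None
    return posixpath.join(home_base, user, userdir, rest)


print(userdir_path("/~myuser/mypage.html"))
```

For the example URL above this gives /home/myuser/public_html/mypage.html, which is why both the home directory and public_html must be readable by the user id that Apache runs as.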

Using .htaccess to password protect a directory

An .htaccess file is simply a text file containing Apache directives. Those directives apply to the documents in the directory where the .htaccess file is located, and to all subdirectories under it as well. Other .htaccess files in subdirectories may change or nullify the effects of those in parent directories.

What you can put in these files is determined by the AllowOverride directive for a directory in httpd.conf. This directive specifies, in categories, what directives will be used if they are found in a .htaccess file. The following directive only allows the .htaccess file to control user authorisation.

AllowOverride AuthConfig

To restrict access to a directory /var/www/html/private to users who log in with a username of privateuser and a password of privatepassword, you need to do the following:

1. Edit httpd.conf to include the following <Directory> directive:

<Directory "/var/www/html/private">
    AllowOverride all
</Directory>

2. Create a file called .htaccess inside the directory /var/www/html/private, containing the following:

AuthName "Private Directory"
AuthType Basic
AuthUserFile /var/www/html/private/.htpasswd
Require user privateuser

3. You need to create a file called .htpasswd in /var/www/html/private which contains the allowed username and password. You can do this using the following command:

htpasswd -c /var/www/html/private/.htpasswd privateuser

You are prompted for a password. After you have entered and confirmed the password, there will be a file called .htpasswd, containing the username and an encrypted version of the password, something like this:

privateuser:e9Ad7d7qpbvAw

Now, when a user attempts to access a page stored in that directory, the user’s browser will show a pop-up box and ask for a username and password to access the page.
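Behind the pop-up box, the browser sends the username and password in an Authorization header using the Basic scheme selected by AuthType Basic. The Python sketch below shows how that header is built and read back. The function names are hypothetical, and the code does not check the password against .htpasswd; it only illustrates the encoding.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode((username + ":" + password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("privateuser", "privatepassword")
print(header)
print(parse_basic_auth(header))
```

Because the credentials are only Base64-encoded, not encrypted, anyone who can read the network traffic can recover them, so Basic authentication is best combined with HTTPS.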

Lab: Configuring an Apache server

In this lab you will configure an Apache server on a Red Hat Linux system as an intranet server on a private LAN.

You will need the following set up for you before you start:

• Linux system with Apache installed
• Server name set to myserver
• A test web page in the directory /var/www/html
• A test web page in the directory /var/www/html/secretstuff
• A test web page in the directory /var/otherwww/html
• A test web page in the directory /var/yetanotherwww/html
• A user on the system called labuser with a password of password
• A test web page in the directory /home/labuser/public_html
• Permissions set to 755 for all these directories and files and for labuser’s home directory
• A root password of password
• A /etc/hosts file set up to resolve myserver, myotherserver, and yetanotherserver to the local loopback address 127.0.0.1

Start up the system and log in as root with a password of password (in practice, Linux administrators avoid logging on directly as root, but we will do so here for simplicity).

Start Apache. You can use the GUI Services dialog or the command line.

Access the URL http://myserver in your browser. You can use any web browser which is configured on the system, for example Konqueror, accessed from the Red Hat->Internet->More Internet Tools desktop menu. You should see the test page.

Open httpd.conf for editing. For example, you can open a terminal window and type:

kedit /etc/httpd/conf/httpd.conf &

Make the configuration changes listed on the following pages. Take a note of what you did to achieve each change. Test each change as you make it – do not try to make all the changes at once! Tick the appropriate box when you have tested a change successfully.

You need to restart Apache after you make changes.

CONFIGURATION CHANGE 1: DEFAULT FILENAME

Add default.html to the list of default file names

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

TEST:

Open a browser.

Enter the URL http://myserver. You should see the test web page default.html.

CONFIGURATION CHANGE 2: PORT

Configure the server to listen on all addresses on port 8084. Restart Apache.

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

TEST:

Open a browser.

Enter the URL http://myserver:8084. You should see the test web page default.html. Note – this requires that configuration change 1 has been done.

When you have done this, reconfigure the server to listen on all addresses on port 80 and restart Apache.

Change 1 tested OK: [ ]

Change 2 tested OK: [ ]

CONFIGURATION CHANGE 3: PASSWORD PROTECTION

Password protect the directory /var/www/html/secretstuff with a username of secret and a password of stuff. Restart Apache.

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

TEST:

Open a browser.

Enter the URL http://myserver/secretstuff/. You should be prompted for a username and password. Enter secret and wrong. You should be denied access.

Enter the URL http://myserver/secretstuff/. You should be prompted for a username and password. Enter secret and stuff. You should see the test web page default.html. Note – this requires that configuration change 1 has been done.

Change 3 wrong password test OK: [ ]

Change 3 correct password test OK: [ ]

CONFIGURATION CHANGE 4: USER DIRECTORIES

Configure the server to allow access to pages in labuser’s public_html directory. Restart Apache.

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

TEST:

Open a browser.

Enter the URL http://myserver/~labuser/. You should see the test web page default.html. Note – this requires that configuration change 1 has been done.

CONFIGURATION CHANGE 5: VIRTUAL HOSTS

Add two name-based virtual hosts as follows, and restart Apache.

hostname: myotherserver
document root: /var/otherwww/html

hostname: yetanotherserver
document root: /var/yetanotherwww/html

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

TEST:

Open a browser.

Enter the URL http://myotherserver/. You should see the appropriate test web page default.html. Note – this requires that configuration change 1 has been done.

AND

Enter the URL http://yetanotherserver/. You should see a different test web page default.html.

Change 4 tested OK: [ ]

Change 5 (myotherserver) tested OK: [ ]

Change 5 (yetanotherserver) tested OK: [ ]