APACHE Web Server

A web server is a file server that serves files, usually in HTML format, over a specific port using HTTP. The browser client interprets these HTML files to draw the screen on the remote client system. Content served can be static web pages or dynamic content generated by embedded scripting (PHP, ASP), Common Gateway Interface (CGI) scripts or binary programs, or a combination technology (Java, Flash).

A GET or POST request from the browser client names two things: the web server itself and the resource wanted on it. The full address is the URL (Uniform Resource Locator), which is one kind of URI (Uniform Resource Identifier). Its host part selects the web server. Its path part, the request URI sent in the HTTP request line, names the local target resource stored on the server, optionally followed by input data for that resource. How this optional data is passed depends on the HTTP method used: GET carries it in the query string of the URL, POST carries it in the body of the request.
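
The split between server, resource and input data can be seen by taking a URL apart. The following is only a sketch using POSIX shell parameter expansion; the function name split_url and the example URL are invented for illustration and do not come from Apache.

```shell
# Split a URL into the server, the resource on it, and the GET input data.
# split_url and the sample URL are illustrative assumptions.
split_url() {
    url=$1
    rest=${url#*://}          # drop the scheme ("http://")
    host=${rest%%/*}          # everything up to the first "/" is the server
    case $rest in
        */*) path_query=/${rest#*/} ;;   # resource path plus optional query
        *)   path_query=/ ;;             # no path given: the site root
    esac
    path=${path_query%%\?*}   # the resource on the server
    case $path_query in
        *\?*) query=${path_query#*\?} ;; # input data passed with GET
        *)    query= ;;
    esac
    printf 'host=%s\npath=%s\nquery=%s\n' "$host" "$path" "$query"
}

split_url 'http://www.example.com/cgi-bin/search.cgi?term=apache'
```

With a POST request the query part would be empty and the same data would travel in the request body instead.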

Installing Apache

Apache is the most popular open source web server. A Windows port is available as a replacement for IIS. On most Red Hat and Fedora Linux systems it is distributed in packages whose file names start with "httpd" followed by a version number. On Debian/Ubuntu, apt-get installs packages named "apache" plus a version number (for example apache2).

Installing Apache components under RedHat:

yum install httpd httpd-tools httpd-devel system-config-httpd

After installation, use the chkconfig command to configure Apache to start at boot:

chkconfig httpd on

Use the httpd init script in the /etc/init.d directory to start, stop, and restart Apache after booting:

service httpd start

service httpd stop

service httpd restart

You can test whether the Apache process is running with

pgrep httpd

You should get a response listing one or more plain process ID numbers.

It is not recommended to run Apache services as an XINETD application for performance reasons.

Configuring Apache

The configuration file used by Apache is /etc/httpd/conf/httpd.conf. Apache must be restarted for changes to this configuration file to take effect. This file is a series of directives: global, per site, and per container. Some directives use HTML-style <tag> </tag> delimiters. If all virtual sites share a single IP address, all site directives need to remain in the main httpd.conf file. The directives indicate server/site attributes, file locations, loadable modules, and access control lists.

Files in the /etc/httpd/conf.d directory are read and automatically appended to the configuration in the httpd.conf file every time Apache is restarted. This is usually done for servers supporting multiple sites on multiple IP addresses. Create one configuration file in this directory per website per dedicated IP address, with its own set of NameVirtualHost, <VirtualHost> and <Directory> directives; then remove the corresponding directives from the main httpd.conf file (if applicable). You don't have to refer to the individual files in httpd.conf; in the stock Red Hat configuration a single Include conf.d/*.conf line pulls them in, so the names need no special form beyond ending in .conf. By convention the config file is named after the unqualified hostname of the corresponding site.
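
A per-site file of this kind could be generated as sketched below. The helper name make_site_conf, the server name www.example.com and the file name site2.conf are assumptions made for the example; the directives follow the older Apache syntax used throughout this document (newer Apache releases no longer need NameVirtualHost).

```shell
# Print a per-site configuration file of the kind kept in /etc/httpd/conf.d.
# make_site_conf and the sample values are illustrative assumptions.
make_site_conf() {
    ip=$1 name=$2 docroot=$3
    cat <<EOF
NameVirtualHost $ip

<VirtualHost $ip>
    ServerName $name
    DocumentRoot $docroot
</VirtualHost>

<Directory "$docroot">
    Options -Indexes
</Directory>
EOF
}

# Write to a scratch directory here; on a real server the target would be
# /etc/httpd/conf.d/site2.conf, followed by "service httpd restart".
conf_dir=$(mktemp -d)
make_site_conf 97.158.253.26 www.example.com /home/www/site2 > "$conf_dir/site2.conf"
cat "$conf_dir/site2.conf"
```

Keeping each site in its own file means a site can be added or withdrawn by adding or removing one file, without editing httpd.conf.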

Web Contents

All the statements that define the features of each web site are grouped together inside their own <VirtualHost> section, or container, in the httpd.conf file. The most commonly used statements, or directives, inside a <VirtualHost> container are:

  • ServerName: Defines the name of the website managed by the <VirtualHost> container. This is needed in name-based virtual hosting only.
  • ServerRoot: Defines the directory where server configuration information is found. This directive is set once for the whole server rather than inside a <VirtualHost> container.
  • DocumentRoot: Defines the directory in which the web pages for the site can be found.

By default, Apache expects to find all its web page files in the /var/www/html/ directory, set by a generic DocumentRoot statement at the beginning of httpd.conf. Apache searches the DocumentRoot directory for an index (home) page named index.html. For example, for a site with a DocumentRoot directory of /home/www/site1/, the first page Apache displays is /home/www/site1/index.html. Apache does not recognize the index page names common on Windows-based systems (index.htm, default.html or default.htm) unless they are specified in the DirectoryIndex parameter. So a link to index.html is sometimes required for content generated by packages using this convention.
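
The link mentioned above can be created as in this sketch. A scratch directory stands in for a real DocumentRoot, and the page content is invented for the demonstration.

```shell
# Make a Windows-style home page reachable under the name Apache looks for.
# The scratch directory stands in for a real DocumentRoot such as /home/www/site1.
docroot=$(mktemp -d)
printf '<html><body>Home</body></html>\n' > "$docroot/default.htm"

# Add the link only when no index.html exists yet.
if [ ! -e "$docroot/index.html" ]; then
    ln -s default.htm "$docroot/index.html"
fi

ls -l "$docroot"
```

Apache follows the link only if the Options for that directory allow it (FollowSymLinks or SymLinksIfOwnerMatch). The alternative is to leave the files alone and list the extra names in the configuration, for example DirectoryIndex index.html index.htm default.htm.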

File Security

Apache will display web page files as long as they are world readable, and will enter directories as long as they are world readable and executable (chmod 755). Make sure all the files and subdirectories in your DocumentRoot have the correct permissions. It is a good idea to have the files owned by a nonprivileged user so that web developers can update the files using FTP or SCP without requiring the root password.

  1. Create a user with a home directory of /home/www: useradd -g users www
  2. Recursively change the file ownership of the /home/www directory and all its subdirectories: chown -R www:users /home/www
  3. Change the permissions on the /home/www directory to 755, which allows all users, including Apache's httpd daemon, to read the files inside: chmod 755 /home/www

Use FTP or SCP to transfer new files to your web server as this new user. This makes all the transferred files automatically have the correct ownership. A "403 Forbidden" error indicates incorrect permissions on files or directories under DocumentRoot.
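
Before hunting for the cause of a 403, it can help to list anything under the document root that lacks the world-read (or, for directories, world-search) permission. The helper name check_docroot is hypothetical, and the demonstration runs in a scratch directory rather than /home/www.

```shell
# List anything under a document root that the httpd daemon could not read.
# check_docroot is a hypothetical helper name; pass it your DocumentRoot.
check_docroot() {
    root=$1
    # Directories need world read and search bits (r-x); files need world read.
    find "$root" \( -type d ! -perm -005 \) -o \( -type f ! -perm -004 \)
}

# Demonstration in a scratch directory instead of /home/www.
root=$(mktemp -d)
chmod 755 "$root"
echo ok      > "$root/index.html";  chmod 644 "$root/index.html"
echo private > "$root/secret.html"; chmod 600 "$root/secret.html"

check_docroot "$root"    # prints only the path of secret.html
```

An empty result means the permission bits are not the problem, and other causes of a 403 are worth checking next.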

Virtual Hosting and DNS
HTTP 1.0

Apache web servers are usually known by a www A record in the DNS zone. The default under HTTP 1.0 is one IP address per <VirtualHost> block of content to be served.

EXAMPLE 1: Apache listens on all interfaces and gives the same content for any IP address that resolves to the server. The <VirtualHost *> directive enforces a single <VirtualHost> container for every IP address, ignoring any ServerName directives you may use inside it.

<VirtualHost *>

DocumentRoot /home/www/site1

</VirtualHost>

EXAMPLE 2: Apache listens on all interfaces, but gives different content for addresses 97.158.253.26 and 97.158.253.27. Web surfers get the site1 content if they try to access the web server on any of its other IP addresses:

<VirtualHost *>

DocumentRoot /home/www/site1

</VirtualHost>

<VirtualHost 97.158.253.26>

DocumentRoot /home/www/site2

</VirtualHost>

<VirtualHost 97.158.253.27>

DocumentRoot /home/www/site3

</VirtualHost>

HTTP 1.1

Under HTTP 1.1, the Host request header allows more than one site to be served per IP address, which is enabled with the NameVirtualHost directive in the /etc/httpd/conf/httpd.conf file. The DocumentRoot directive defines the directory that contains the index page for that site.

The <VirtualHost> containers tell Apache where it should look for the web pages used on each website, selected by the ServerName directive. You must specify the IP address to which each <VirtualHost> container applies and the primary website domain name for that IP address with the ServerName directive. You can list secondary domain names to serve the same content using the ServerAlias directive.

Apache searches for a perfect match of NameVirtualHost, <VirtualHost>, and ServerName. If there is no match, Apache uses the first <VirtualHost> in the list that matches the target IP address. A <VirtualHost *> statement indicates it should be used for all other (non-matched) web queries. A <VirtualHost> with a specific IP address always gets higher priority than a <VirtualHost *> covering the same IP address, even if the ServerName directive doesn't match. As a result, always place <VirtualHost *> statements at the beginning of the list to cover any other addresses your server may have. You can also have multiple NameVirtualHost directives, each with a single IP address, in cases where your web server has more than one IP address.

Example: a server is configured to provide content on 97.158.253.26.

NameVirtualHost 97.158.253.26

<VirtualHost *>

Default Directives. (In other words, not site #1 or site #2)

</VirtualHost>

<VirtualHost 97.158.253.26>

servername

Directives for site #1

</VirtualHost>

<VirtualHost 97.158.253.26>

servername

Directives for site #2

</VirtualHost>

With SSL

If you installed Apache with support for secure HTTPS/SSL, always direct the SSL request to a specific IP address. Virtual host wild cards don't work because Apache SSL module demands at least one explicit <VirtualHost> directive for IP-based virtual hosting. When you use wild cards, Apache interprets it as an overlap of name-based and IP-based <VirtualHost> directives and gives error messages because it can't make up its mind about which method to use:

Starting httpd: [Sat Oct 12 21:21:49 2002] [error] VirtualHost _default_:443 -- mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results

If you try to load any Web page on your web server, you'll see the error:

Bad request!

Your browser (or proxy) sent a request that this server could not understand.

If you think this is a server error, please contact the webmaster

Don't use virtual hosting statements with wild cards except for the very first <VirtualHost> directive, which defines the web pages to be displayed when matches to the other <VirtualHost> directives cannot be found:

NameVirtualHost *

<VirtualHost *>

Directives for other sites

</VirtualHost>

<VirtualHost 97.158.253.28>

Directives for site that also run on SSL

</VirtualHost>

Compressing / Compacting Web Pages

Apache can dynamically compress static web pages into gzip or deflate format and send the result to the browser using the mod_deflate.so loadable module (see the web for the current Apache directives implementing this module). Most web browsers support this format, transparently uncompressing the data and presenting it on the screen. Most commercial websites don't use this format, as compression can be very CPU intensive. Instead, such web servers have SSL encryption and compression performed by built-in hardware modules or by outboard network devices to take the CPU load off the web server.
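
To get a feel for the trade-off, the sketch below compresses a generated, repetitive HTML page with the gzip command-line tool and compares sizes. It only illustrates the format; it does not exercise mod_deflate, and the page content is invented.

```shell
# Compare the size of a repetitive HTML page before and after gzip compression.
# The page is generated here purely for illustration.
page=$(mktemp)
i=0
while [ "$i" -lt 200 ]; do
    printf '<tr><td>row %d</td><td>some table cell text</td></tr>\n' "$i"
    i=$((i + 1))
done > "$page"

# The arithmetic expansion strips any padding some wc versions add.
plain=$(( $(wc -c < "$page") ))
packed=$(( $(gzip -c "$page" | wc -c) ))
echo "uncompressed: $plain bytes, gzip: $packed bytes"
```

Markup this repetitive shrinks considerably, which is the bandwidth gain; the CPU time spent producing the compressed copy on every request is the cost discussed above.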

Apache Website Security

Behind A NAT Firewall

If your web server sits behind a NAT firewall (public->Private NAT), you may want to have the server respond on both public and private IP addresses. Apache allows you to specify multiple IP addresses in the <VirtualHost> statements to serve the same content on both IP addresses:

NameVirtualHost 192.168.1.100

NameVirtualHost 97.158.253.26

<VirtualHost 192.168.1.100 97.158.253.26>

DocumentRoot /www/server1

ServerName

ServerAlias bigboy,

</VirtualHost>

In addition to running the latest Apache code, review Internet information on the latest Apache security patches and procedures, proper application coding techniques, and code isolation (SELinux, CGI). Apache security is set up on a per-directory basis.

Disable Directory Listings

Include an index.html page in each subdirectory under your DocumentRoot directory; otherwise Apache will provide a directory listing of all the files in that subdirectory, unless you disable the directory listing by using the -Indexes option in the <Directory> directive for the DocumentRoot like this:

<Directory "/home/www/*">

Options MultiViews -Indexes SymLinksIfOwnerMatch IncludesNoExec

</Directory>

Users attempting to access the nonexistent index page will instead get a "403 Forbidden" message.

Access Control Lists (By Hosts)

Apache can restrict site access by host or network, the same as TCP Wrappers under xinetd. This is usually applied per site DocumentRoot directory as follows:

<Directory "/var/www/site1">

Options Indexes FollowSymLinks

# Order in which the Allow and Deny rules are evaluated
Order allow,deny

# "all" can also be a subnet or domain
Allow from all

# "none" can also be a subnet or domain
Deny from none

# Permits use of user-level security (.htaccess files)
AllowOverride all

</Directory>

Password Protected Web Pages

Web pages served from the main directory and subdirectories of DocumentRoot can be password protected using the htpasswd command. This command creates a user/password file similar to /etc/passwd. You can keep one password file for each directory to be protected in the site, all the way up to DocumentRoot. The password file can have any name; by convention ".htpasswd" is used. The password file SHOULD NOT be placed in a directory path under DocumentRoot where it might be exposed to a browser. There are two ways to specify a password file location: by placing a file called .htaccess in the directory to be protected, or by specifying AuthUserFile inside a Directory or Location directive within the site directives.

1) Use Apache's htpasswd password utility to create username/password combinations in the .htpasswd file (the web server's own file, not the system passwd file) for web page access. Specify the location of the password file (/etc/httpd/conf is good) and, if it doesn't yet exist, include the -c (create) switch on the command line to create it. Any directory away from the DocumentRoot tree will do, so that web users cannot possibly view the file.

htpasswd -c /etc/httpd/conf/.htpasswd peter    # -c for the first user only; it creates the file

htpasswd /etc/httpd/conf/.htpasswd paul    # each successive user

You will be prompted to supply a password as in the passwd command.

2) Make the .htpasswd file readable by all users: chmod 644 /etc/httpd/conf/.htpasswd

3) Create a .htaccess file in the directory to which you want password control with these entries.

AuthUserFile /etc/httpd/conf/.htpasswd

AuthGroupFile /dev/null

AuthName EnterPassword

AuthType Basic

require valid-user

or as follows in the site directives:

<Directory "/var/www/site1">

# Permits use of user-level security (.htaccess files)
AllowOverride all

AuthUserFile /etc/httpd/conf/.htpasswd

# Or a separate file in /etc/group format
AuthGroupFile /dev/null

# Name of the security "realm" displayed on the login box
AuthName "EnterPassword"

# Digest authentication is not always supported; use SSL instead
AuthType Basic

# Any user listed in the AuthUserFile may log in
require valid-user

</Directory>

The .htaccess file password protects the directory and all its subdirectories. AuthUserFile tells Apache to use the .htpasswd file. The require valid-user statement gives access to every user in the .htpasswd file. If you want only user peter to have access, replace this line with require user peter. AuthType Basic instructs Apache to accept basic unencrypted passwords from the remote user's web browser.
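
The word "unencrypted" deserves a closer look: with Basic authentication the browser sends the user name and password merely base64-encoded. The sketch below shows this; the user name peter comes from the text, while the password "secret" is made up, and the decode flag assumes the GNU coreutils version of base64.

```shell
# Show what a browser sends for Basic authentication: base64, not encryption.
# The user name comes from the text; the password "secret" is made up.
credentials='peter:secret'
token=$(printf '%s' "$credentials" | base64)
echo "Authorization: Basic $token"

# Anyone who captures the header can reverse it (GNU coreutils "base64 -d").
printf '%s' "$token" | base64 -d
echo
```

This is why Basic authentication is normally paired with SSL, so that the header travels inside an encrypted connection.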

4) Make the .htaccess file readable by all users: chmod 644 /home/www/site1/.htaccess

5) Make sure your /etc/httpd/conf/httpd.conf file has an AllowOverride statement in a <Directory> directive for any directory requiring password authorization.

<Directory /home/www/*>

AllowOverride AuthConfig

</Directory>

6) Make sure that you have a <VirtualHost> directive that defines access to /home/www or another directory higher up in the tree.

<VirtualHost *>

ServerName 97.158.253.26

DocumentRoot /home/www

</VirtualHost>

7) You can combine host ACLs and password protection under specific Directory or Location site directives using the "satisfy any" or "satisfy all" parameter:

<Directory "/var/www/site1">

Options Indexes FollowSymLinks

# Order in which the Allow and Deny rules are evaluated
Order allow,deny

# "all" can also be a subnet or domain
Allow from all

# "none" can also be a subnet or domain
Deny from none

# Permits use of user-level security (.htaccess files)
AllowOverride all

AuthUserFile /etc/httpd/conf/.htpasswd

# Or a separate file in /etc/group format
AuthGroupFile /dev/null

# Name of the security "realm" displayed on the login box
AuthName "EnterPassword"

# Digest authentication is not always supported; use SSL instead
AuthType Basic

# Any user listed in the AuthUserFile may log in
require valid-user

# "any": passing either the host ACL or the password check is enough
satisfy any

</Directory>

Troubleshooting Apache

Testing Basic HTTP Connectivity

Use telnet to connect to port 80 (HTTP), or to the port specified by the Listen directive, at the desired URL. Failure to connect indicates a connectivity issue (network or ACL), an incorrect Listen port, or that the service is not started.
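
Once connected, the server waits for a request. The sketch below builds the text you would otherwise type into the telnet session; http_request is a hypothetical helper, and the host and path are placeholders.

```shell
# Build the request you would otherwise type into a telnet session.
# http_request is a hypothetical helper; host and path are placeholders.
http_request() {
    host=$1 path=${2:-/}
    # HTTP lines end in CR LF; HEAD asks for the status line and headers only.
    printf 'HEAD %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$path" "$host"
}

http_request www.example.com /index.html
```

Assuming a TCP client such as nc is installed, the output can be piped straight to the server, for example http_request localhost / | nc localhost 80. A reply that starts with a status line such as HTTP/1.1 200 OK shows that Apache is listening and answering.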

Basic HTTP Status Codes

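These codes are referenced in the main httpd.conf file (for example by ErrorDocument) and are recorded for each request in Apache's access log.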

HTTP_Code / Description
200 / Successful request
304 / Successful request, but the web page requested hasn't been modified since the current version in the remote web browser's cache. This means the web page will not be sent to the remote browser, it will just use its cached version instead. Frequently occurs when a surfer is browsing back and forth on a site.
401 / Unauthorized access. Someone entered an incorrect username / password on a password protected page.
403 / Forbidden. File permissions prevent Apache from reading the file. Often occurs when the web page file is owned by user "root" even though it has universal read access.
404 / Not found. Page requested doesn't exist.
500 / Internal server error. Frequently generated by CGI scripts that fail due to bad syntax. Check your error_log file for further details on the script's error message.
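
A quick way to see which of these codes a server is actually returning is to tally them from the access log. The sketch assumes the log is in the common log format, where the status code is the ninth whitespace-separated field; count_status is a hypothetical helper and the sample entries are invented.

```shell
# Tally the status codes in an access log written in the common log format,
# where the status code is the ninth whitespace-separated field.
count_status() {
    awk '{ count[$9]++ } END { for (code in count) print code, count[code] }' "$@" | sort
}

# Sample entries invented for the demonstration; on a Red Hat system the real
# log is usually /var/log/httpd/access_log.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.1.10 - - [12/Oct/2002:21:21:49 -0400] "GET /index.html HTTP/1.1" 200 1043
192.168.1.10 - - [12/Oct/2002:21:21:50 -0400] "GET /index.html HTTP/1.1" 304 -
192.168.1.11 - - [12/Oct/2002:21:22:01 -0400] "GET /private/ HTTP/1.1" 401 401
192.168.1.12 - - [12/Oct/2002:21:22:09 -0400] "GET /nothere.html HTTP/1.1" 404 292
192.168.1.12 - - [12/Oct/2002:21:22:15 -0400] "GET /index.html HTTP/1.1" 200 1043
EOF

count_status "$log"
```

Each output line is a status code followed by the number of requests that ended with it, so a sudden rise in 403, 404 or 500 entries stands out at a glance.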

Missing Web Pages – 404 Messages

The default action in Apache for missing web pages is to display a generic "404 File Not Found" message. You can tell Apache to display a predefined HTML file whenever a web surfer attempts to access a non-index page that doesn't exist by placing this statement in the httpd.conf file:

ErrorDocument 404 /missing.htm

Then put a file with the name missing.htm in each DocumentRoot directory.

Browser 403 Forbidden Messages

Browser 403 Forbidden messages are usually caused by file permissions or by security context issues, as with SELinux.

A sure sign of problems related to security context is "avc: denied" messages in your /var/log/messages log file:

Nov 21 20:41:23 bigboy kernel: audit(1101098483.897:0): avc: denied { getattr } for pid=1377 exe=/usr/sbin/httpd path=/home/www/index.html dev=hda5 ino=12 scontext=root:system_r:httpd_t tcontext=root:object_r:home_root_t tclass=file
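
Messages like this can be filtered out of a busy log with a short pipeline. The sketch below is one way to list the files httpd was denied; httpd_denied_paths is a hypothetical helper, and the scratch log holds the sample message from above plus one invented unrelated line.

```shell
# Show which files httpd was denied access to, according to SELinux.
# httpd_denied_paths is a hypothetical helper; pass it a log such as /var/log/messages.
httpd_denied_paths() {
    grep 'avc: *denied' "$@" |
        grep 'exe=/usr/sbin/httpd' |
        sed -n 's/.* path=\([^ ]*\).*/\1/p' |
        sort -u
}

# Scratch log holding the sample message from the text plus an unrelated line.
log=$(mktemp)
cat > "$log" <<'EOF'
Nov 21 20:41:23 bigboy kernel: audit(1101098483.897:0): avc: denied { getattr } for pid=1377 exe=/usr/sbin/httpd path=/home/www/index.html dev=hda5 ino=12 scontext=root:system_r:httpd_t tcontext=root:object_r:home_root_t tclass=file
Nov 21 20:41:30 bigboy sshd[1401]: session opened for user root
EOF

httpd_denied_paths "$log"    # prints /home/www/index.html
```

The paths it prints are the files whose security context needs attention, as opposed to files with plain permission problems.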

Only The Default Apache Page Appears

When only the default Apache page appears, there are two main causes. The first is the lack of an index.html file in your Web site's DocumentRoot directory. The second cause is usually related to an incorrect security context for the Web page's file.

Server Name Errors

All ServerName directives must list a domain that is resolvable in DNS, or else you'll get an error similar to these when starting httpd.

Starting httpd: httpd: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName