NAME
httpd, save, imagemap, man2html, webls – HTTP server

SYNOPSIS
ip/httpd/httpd [-a srvaddr] [-c cert [-C certchain]] [-d domain] [-n namespace] [-w webroot]

ip/httpd/save [-b inbuf] [-d domain] [-r remoteip] [-w webroot] [-N netdir] method version uri [search]
ip/httpd/imagemap ...
ip/httpd/man2html ...
ip/httpd/webls ...

DESCRIPTION
Httpd serves the webroot directory of the file system described by namespace (default /lib/namespace.httpd), using version 1.1 of the HTTP protocol. It announces the service srvaddr (default tcp!*!http), and listens for incoming calls. If an X.509 certificate is supplied with the -c option, then the service is instead tcp!*!https. There should already be a factotum holding the corresponding private key. If the specified certificate has been signed by a certificate authority, the -C option may be used to specify a file containing a chain of signed certificates.

Httpd supports only the GET and HEAD methods of the HTTP protocol; some magic programs support POST as well. Persistent connections are supported for HTTP/1.1 or later clients; all connections close after a magic command is executed. The Content-type (default application/octet-stream) and Content-encoding (default binary) of a file are determined by looking for suffixes of the file name in /sys/lib/mimetype.
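
The suffix lookup can be pictured with a short Python sketch. It is an illustration only: the SUFFIXES table and the classify function are hypothetical, the real entries live in /sys/lib/mimetype, and the rule that suffixes are examined from the right until an unknown one is found is an assumption about how httpd combines them.

```python
import os

# Hypothetical suffix table; the real data is read from /sys/lib/mimetype.
# Each suffix supplies a content type, a content encoding, or neither.
SUFFIXES = {
    ".html": ("text/html", None),
    ".txt": ("text/plain", None),
    ".tar": ("application/x-tar", None),
    ".gz": (None, "x-gzip"),
}

def classify(name):
    """Return (content_type, content_encoding) for a file name."""
    ctype, encoding = None, None
    while True:
        name, suffix = os.path.splitext(name)
        if suffix not in SUFFIXES:      # no suffix left, or an unknown one
            break
        t, e = SUFFIXES[suffix]
        if ctype is None:
            ctype = t
        if encoding is None:
            encoding = e
    # Fall back to the defaults named in the text above.
    return (ctype or "application/octet-stream", encoding or "binary")
```

Under these assumptions, a name such as archive.tar.gz takes its type from .tar and its encoding from .gz, while a name with no known suffix gets both defaults.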

Redirection

Each requested URI is looked up in a redirection table, read from /sys/lib/httpd.rewrite. Fields are separated by spaces and tabs. Anything following a # is ignored. The first field of each line is a URI; the second is a replacement path. If a prefix of the URI matches a redirection path, the URI is rewritten using the corresponding replacement path instead of the prefix, and a temporary redirect is sent to the HTTP client. If the replacement path does not specify a server name, and the request has no explicit host, then domain is the host name used in the redirection.

The prefix can either be a domain root like http://system/ (which matches that URL only) or a path like /who/rob (which matches that path no matter what the requested server), but not both: http://system/who/rob will never match a request. If the first field ends in a slash, this is an exact match; otherwise it is a prefix match. The first field is a literal string, matched against each file prefix of each URL. The most specific, i.e., longest, pattern wins, and is applied once (there is no rescanning).

Several replacement prefixes modify this behavior. Httpd matches only the prefix and not subordinate pages if a replacement is prefixed with >. Httpd omits the unmatched part of the original URI from the rewritten URI if the replacement is prefixed with *. This permits many-to-one mappings; for example, to send all references to an old subtree to a single error page.

Httpd handles replacements prefixed with @ internally, treating the request as if it were for the replacement (without the @) but not informing the client of the rewritten name. Replacement URLs prefixed with = generate a permanent redirection instead of a temporary one.

Httpd checks whether /sys/lib/httpd.rewrite has changed once every 50 new TCP connections. Because an HTTP/1.1 persistent connection can carry many page requests over a single browser connection, ordinary browsing may not trigger the check for some time; to kick-start httpd, try
for(i in `{seq 50}) hget http://www.your-domain.com/ >/dev/null
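
The table lookup described in this section can be sketched in Python. This is a simplified model, not httpd's code: the function names are hypothetical, only path patterns are handled (domain roots such as http://system/ and the trailing-slash exact-match rule are left out), and treating a > rule as a candidate only when the URI equals the pattern is an assumption. The /old rule in the usage note is invented for illustration.

```python
def parse_rules(text):
    """Parse rewrite-table text into (pattern, replacement) pairs."""
    rules = []
    for line in text.splitlines():
        # Drop anything after '#', then split on spaces and tabs.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            rules.append((fields[0], fields[1]))
    return rules

def is_file_prefix(pattern, uri):
    """True if pattern is uri itself or a whole-component prefix of it."""
    return uri == pattern or uri.startswith(pattern.rstrip("/") + "/")

def rewrite(uri, rules):
    """Return (action, new_uri), or None when no rule applies."""
    best = None
    for pattern, repl in rules:
        if not is_file_prefix(pattern, uri):
            continue
        if repl.startswith(">") and uri != pattern:
            continue                    # assumed: '>' skips subordinate pages
        if best is None or len(pattern) > len(best[0]):
            best = (pattern, repl)      # longest pattern wins
    if best is None:
        return None
    pattern, repl = best
    rest = uri[len(pattern):]           # unmatched part of the URI
    action = "temporary"
    if repl.startswith("="):
        action, repl = "permanent", repl[1:]
    elif repl.startswith("@"):
        action, repl = "internal", repl[1:]
    elif repl.startswith("*"):
        repl, rest = repl[1:], ""       # drop the unmatched part
    elif repl.startswith(">"):
        repl = repl[1:]
    return action, repl + rest          # applied once, no rescanning
```

With a table holding the rules `/netlib/lapack/lawns =http://netlib.org/lapack/lawns` and `/old */gone.html`, a request for /netlib/lapack/lawns/lawn41.ps would yield a permanent redirect to http://netlib.org/lapack/lawns/lawn41.ps, and any request under /old would yield a temporary redirect to /gone.html.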

Access Control

Before opening any file, httpd looks for a file in the same directory called .httplogin. If the file exists, the directory is considered locked and the client must specify a user name and password matching a pair in the file. .httplogin contains a list of space- or newline-separated tokens, each possibly delimited by single quotes. The first is a domain name presented to the HTTP client. The rest are pairs of user name and password. Thus, there can be many user name/password pairs valid for a directory.
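
The token layout of .httplogin can be illustrated with a small Python reader. It is a sketch under stated assumptions: parse_httplogin is a hypothetical name, quotes embedded inside a quoted token are not handled, and rejecting a trailing user name with no password is a choice made here rather than documented behavior.

```python
import re

# A token is either a single-quoted string or a run of non-blank characters.
TOKEN = re.compile(r"'([^']*)'|(\S+)")

def parse_httplogin(text):
    """Return (domain, {user: password}) from .httplogin contents."""
    tokens = [m.group(1) if m.group(1) is not None else m.group(2)
              for m in TOKEN.finditer(text)]
    if not tokens:
        raise ValueError("empty .httplogin")
    domain, rest = tokens[0], tokens[1:]
    if len(rest) % 2:
        raise ValueError("user name without a password")
    # Remaining tokens alternate user name, password.
    return domain, dict(zip(rest[0::2], rest[1::2]))
```

For example, the contents `'My Site' alice secret` followed on the next line by `bob 'pass word'` would give the domain My Site and two valid user name/password pairs.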

Auxiliaries (magic)

If the requested URI begins with /magic/server/, httpd executes the file /bin/ip/httpd/server to finish servicing the request. All the auxiliaries take the same arguments. Method and version are those received on the first line of the request. Uri is the remaining portion of the requested URI. Inbuf contains the rest of the bytes read by the server, and netdir is the network directory for the connection. There are routines for processing command arguments, parsing headers, etc. in the httpd library, /sys/src/cmd/ip/httpd/libhttpd.a.$O. See httpd.h in that directory and existing magic commands for more details.
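
The mapping from a magic URI to an auxiliary program can be sketched as follows. The helper names are hypothetical, option flags such as -b and -N are omitted, and whether the remaining URI keeps a leading slash is a guess; the code only mirrors the argument order given in the synopsis.

```python
MAGIC_PREFIX = "/magic/"
AUX_DIR = "/bin/ip/httpd/"

def magic_command(uri):
    """Return (program, remaining_uri) for a magic URI, else None."""
    if not uri.startswith(MAGIC_PREFIX):
        return None
    name, slash, rest = uri[len(MAGIC_PREFIX):].partition("/")
    if not name or not slash:       # the text requires /magic/server/
        return None
    return AUX_DIR + name, rest

def aux_argv(program, method, version, uri, search=None):
    """Build the positional arguments: method version uri [search]."""
    argv = [program, method, version, uri]
    if search is not None:
        argv.append(search)
    return argv
```

Under these assumptions, a GET for /magic/save/feedback would run /bin/ip/httpd/save with the remaining URI feedback.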

Save writes a line to /usr/web/save/uri.data and returns the contents of /usr/web/save/uri.html. Both files must be accessible for the request to succeed. The saved line includes the current time and either the search string from a HEAD or GET or the first line of the body from a POST. It is used to record form submissions.

Imagemap processes an HTML imagemap query. It looks up the point search in the image map file given by uri, and returns a redirection to the appropriate page. The map file defaults to NCSA format. Any entries after a line starting with the word #cern are interpreted in CERN format.

Man2html converts man(6) format manual pages into HTML. It includes some abilities to search the manuals.

Webls produces directory listings on the fly, with output in the style of ls(1). /sys/lib/webls.allowed and /sys/lib/webls.denied contain regular expressions describing what parts of httpd's namespace may and may not be listed, respectively. Webls.denied is first searched to see if access is by default denied. If so, webls.allowed is then searched to see if access is explicitly allowed. Thus one can have very general expressions in the denied list (like .*), yet still allow exceptions. If webls.denied does not exist or is unreadable, all accesses are assumed to be denied unless explicitly allowed in webls.allowed.

Note that if neither webls.denied nor webls.allowed exists, any portion of httpd's namespace can be listed (however, webls will always endeavor to prevent listing of `.' and `..'). If webls.allowed exists but webls.denied does not, any directory to be listed must be described by a regular expression in webls.allowed. Similarly, if webls.denied exists but webls.allowed does not, any directory to be listed must not be described by a regular expression in webls.denied. If both exist, a directory is listable if either it doesn't appear in webls.denied, or it appears in both webls.denied and webls.allowed. In other words, webls.allowed overrides webls.denied. If a listing for a directory is requested and access is denied, or another error occurs, a simple error page is returned.
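
The four cases can be summarized in a short Python sketch. It is illustrative only: listable is a hypothetical name, a missing file is modeled as None, Python regular expressions stand in for the regular expressions webls actually uses, unanchored matching is an assumption, and the special handling of `.' and `..' is omitted.

```python
import re

def matches(patterns, path):
    """True if any regular expression in the list matches the path."""
    return any(re.search(p, path) for p in patterns)

def listable(path, allowed=None, denied=None):
    """Decide whether a directory may be listed.

    allowed and denied are lists of regular expressions, or None when
    the corresponding file does not exist.
    """
    if allowed is None and denied is None:
        return True                         # neither file: list anything
    if denied is None:
        return matches(allowed, path)       # must be explicitly allowed
    if allowed is None:
        return not matches(denied, path)    # must not be denied
    # Both exist: webls.allowed overrides webls.denied.
    return not matches(denied, path) or matches(allowed, path)
```

With .* in the denied list and ^/usr/web/pub in the allowed list, /usr/web/pub is listable under this model and /usr/web/private is not.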

EXAMPLES
These are all examples of how to use httpd.rewrite.

A local redirection:
/netlib/c++/idioms/index.html.Z /netlib/c++/idioms/index.html

Redirection to another site:
/netlib/lapack/lawns           =http://netlib.org/lapack/lawns
http://inferno.bell-labs.com    =http://www.vitanuova.com

Root directory for virtual host:
http://www.ampl.com        /cm/cs/what/ampl

FILES
/sys/lib/mimetype        content type description file
/lib/namespace.httpd     default namespace file for httpd
/sys/lib/httpd.rewrite   redirection file
/sys/lib/webls.allowed   regular expressions describing explicitly listable pathnames; overrides webls.denied
/sys/lib/webls.denied    regular expressions describing explicitly unlistable pathnames

SOURCE
/sys/src/cmd/ip/httpd

SEE ALSO
newns in auth(2), listen(8), rsa(8)