2002/04/02 update
2002/02/12
The server root of a traditional web server does nothing to restrict access to the name space that the server is serving.
The problem becomes clear when we run CGI programs on the server:
every file in the system is visible to the CGI programs.
This is potentially a serious security problem.
For this reason, users' CGI programs are either prohibited or placed under the control of the system administrator.
Fig.1 illustrates the relation among three name spaces: console space, service space, and document space.
Console space is the set of files that can be seen from the console; this is also the set of all files on the system. (Console space is shown by the bold rectangle.)
Service space is the set of files that the web server serves to clients. It is also the set of files that can be accessed by CGI programs.
In a traditional web server, service space is exactly equal to console space.
Document space is a set of documents that make up Web pages.
The document space of alice is the set of documents that can be accessed using the URI /~alice/.
The document spaces of all users lie within the service space, and therefore they can be accessed equally by any CGI program.
An essential advance was first made by the httpd of Plan9 second edition.
The server could encapsulate the service space under the server root (Fig.2).
The technique relied on a special ability of Plan9: per-process name spaces.
The server root became "/" in the name space seen by the CGI programs of this server.
That is, CGI programs were confined to the name space specified by the server root.
Fig.2 illustrates the encapsulation. The gray area is the set of files outside the service space; it is essentially hidden from CGI programs.
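The effect of this encapsulation can be modelled in a few lines of Python. This is only a sketch of the idea, not the httpd's implementation: on Plan9 the kernel applies the per-process name space, while here a hypothetical function to_console_path merely rewrites path strings. The server root /usr/web is an assumed example, not a path taken from the server.

```python
import posixpath

def to_console_path(server_root, cgi_path):
    """Map a path as seen by a CGI program to the console-space path.

    Model only: the real mechanism is Plan9's per-process name space,
    not string rewriting.
    """
    # Normalise relative to "/" so that ".." cannot climb above the root.
    inside = posixpath.normpath(posixpath.join("/", cgi_path))
    # Whatever the CGI program names, it lands under the server root.
    return posixpath.normpath(server_root + inside)

# "/usr/web" is a hypothetical server root.
print(to_console_path("/usr/web", "/doc/index.html"))
print(to_console_path("/usr/web", "../../adm/users"))
```

In this model no path that a CGI program can write down refers to a file in the gray area of Fig.2, because every name is interpreted below the server root.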
However, the httpd of Plan9 second edition was based on HTTP/1.0 and handled the CGI environment poorly:
the location of CGI programs was fixed and the POST method was not supported. These problems were fixed by Charles Forsyth. (Therefore I used his httpd for a long time, though it was based on HTTP/1.0.)
The httpd of Plan9 third edition supports HTTP/1.1; however, no advance was made in the CGI environment.
I reconstructed the httpd of Plan9 third edition to introduce a new design, and named the new web server Pegasus.
http://some.dom.com/pathname          # host document (main document)
http://some.dom.com/~alice/pathname   # user's document
http://other.dome.com/pathname        # a document of virtual host (different IP)
http://vertual.dom.com/pathname       # a document of virtual host (same IP)
Generally speaking these documents are managed and administrated by different persons.
One of the problems of traditional web servers (including that of Plan9) is that the service space is shared among these persons (see Fig.1 and Fig.2).
(Note that the problem is very similar to that of the address space of personal computers in the early days, when logical addresses were not supported.)
This means that a CGI program of one person can access the documents of other persons. There is a potential for interference among the persons who have documents on the web server.
Pegasus fixed this problem.
All documents have a single document root, /doc, in the service space. That is, the service space contains the document root of only one person at a time (see Fig.3a and 3b).
Fig.3a illustrates the relation between service space and document space when the server is accessed by /~alice.
If the server is accessed by /~bob, the relation changes to that of Fig.3b.
This means that every CGI program is isolated from the documents of others.
CGI programs cannot access those files, not because of the access control on each file but because of the scope of the name space.
There is no need to access the files of other persons via CGI. Therefore this server design should work without problems.
Currently only Pegasus has this ability.
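The switch between Fig.3a and Fig.3b can be sketched as a small Python model. It is an illustration under stated assumptions, not Pegasus code: the directories in DOC_DIRS and HOST_DOC_DIR are hypothetical, the functions build_namespace and resolve are invented for this sketch, and only /doc is modelled although the real service space contains other files as well.

```python
import posixpath

# Hypothetical console-space locations of each person's documents.
DOC_DIRS = {
    "alice": "/usr/alice/web/doc",
    "bob": "/usr/bob/web/doc",
}
HOST_DOC_DIR = "/usr/web/doc"  # hypothetical location of the host document

def build_namespace(uri):
    """Choose whose documents appear at /doc for this request."""
    first = uri.lstrip("/").split("/", 1)[0]
    user = first[1:] if first.startswith("~") else None
    # In this sketch, unknown names fall back to the host document.
    return {"/doc": DOC_DIRS.get(user, HOST_DOC_DIR)}

def resolve(namespace, path):
    """Translate a service-space path to console space.

    Returns None when the path has no name in this name space.
    """
    path = posixpath.normpath(posixpath.join("/", path))
    for mount, target in namespace.items():
        if path == mount or path.startswith(mount + "/"):
            return target + path[len(mount):]
    return None

ns = build_namespace("/~alice/index.html")
print(resolve(ns, "/doc/data"))              # alice's own file
print(resolve(ns, "/usr/bob/web/doc/data"))  # bob's file has no name here
```

The second lookup fails without any permission check being consulted, which is the point made above: the file is out of scope rather than access-protected.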
Suppose alice has a file `data' that should be accessed only by herself and by her CGI.
Pegasus resolves this problem.
1. Run the server as user web.
   (User web is not a real user, so it does not need to own any files.)
2. Add web to /adm/users as a group member of alice:
       alice:alice:web
3. alice sets the access permission of data to:
       -lrw-rw---- alice alice .... data
   (in case of `write' access)
Note: other files should be set as:
       -lrw-r--r-- alice alice .... data
   (if the file may be read by anyone.)
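The effect of the steps above can be checked with a simplified permission model in Python. It is a sketch, not the file server's actual algorithm: the function may_access is hypothetical, it grants access if any applicable class (owner, group, other) allows it, it models only the rw bits and ignores the `l' flag, and the group table is written by hand to mirror the /adm/users line in step 2 rather than parsed from that file.

```python
READ, WRITE = 4, 2

def may_access(user, want, mode, owner, group, members):
    """Return True if `user` may perform `want` (READ or WRITE) on a file.

    members maps a group name to the set of its additional members.
    """
    allowed = mode & 7                        # "other" bits apply to everyone
    if user == group or user in members.get(group, set()):
        allowed |= (mode >> 3) & 7            # group bits
    if user == owner:
        allowed |= (mode >> 6) & 7            # owner bits
    return (allowed & want) == want

# Group alice with member web, as in step 2.
groups = {"alice": {"web"}}

# rw-rw---- : the server (user web) may write, other users see nothing.
print(may_access("web", WRITE, 0o660, "alice", "alice", groups))
print(may_access("bob", READ, 0o660, "alice", "alice", groups))

# rw-r--r-- : anyone may read, but the server cannot write.
print(may_access("web", WRITE, 0o644, "alice", "alice", groups))
```

Under this model only the files that alice has deliberately made group-writable are exposed to a CGI program running as web.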
Why is the problem of access protection solved so simply?
Because the service space of Pegasus is encapsulated for each user.
Unix resolves this problem using a CGI wrapper (for example, see http://download.sourceforge.net/cgiwrap).
That is, the CGI wrapper is installed set-UID root, and httpd is forced to run CGI programs only via the CGI wrapper.
Comparing the two methods, we can conclude that:
1. The Pegasus method is safer than a CGI wrapper, because under a CGI wrapper all files of a user are endangered if the user writes a problematic CGI. Under Pegasus, on the other hand, only files that permit write access to `web' are endangered.
2. The Pegasus method is much easier to administer. There is almost nothing to administer. The only thing to do is to run Pegasus as user `web'.
CGI program of Pegasus is, if standard configuration is applied,
NCSA and Apache servers also have an option that permits users to place CGI files in document space. However, when safety is required, this is prohibited and all CGI programs are placed in the directory /cgi-bin/.
Pegasus realizes virtual documents using execution handlers.
CGI in Pegasus is one special configuration of an execution handler.
The term `execution handler' may need explanation, because it may be original to Pegasus.
An `execution handler' is a program that processes files requested by clients.
The user defines the relation between a path pattern of the request and the program that processes it. (The definition is written in /etc/handler in service space.) We call the program the `handler' of the file. If the requested file is the same as the handler, the file is a CGI.
A special handler can be assigned to files with a particular suffix. Thus we can introduce Server Side Include using an execution handler.
A special handler can also be assigned to specific directories. Thus we can introduce an auto-indexing mechanism for the directories of an FTP service.
`Execution handlers' can be applied to a wide range of applications, and I would like to emphasize that the execution handler is completely controllable by users (not by the system administrator).
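The dispatch rule can be illustrated with a short Python sketch. Everything concrete in it is an assumption made for illustration: the syntax of /etc/handler is not shown here, so the table is an in-memory list, the patterns are shell-style globs, the first matching rule wins, and the handler paths /bin/ssi and /bin/autoindex are hypothetical names.

```python
from fnmatch import fnmatchcase

# Hypothetical handler table: (path pattern, handler program).
HANDLERS = [
    ("/doc/*.shtml", "/bin/ssi"),             # suffix rule, e.g. Server Side Include
    ("/doc/ftp/*", "/bin/autoindex"),         # directory rule, e.g. auto-indexing
    ("/doc/hello.cgi", "/doc/hello.cgi"),     # handler is the file itself
]

def find_handler(path):
    """Return the handler for a requested path, or None if no rule matches."""
    for pattern, handler in HANDLERS:
        if fnmatchcase(path, pattern):
            return handler
    return None

def is_cgi(path):
    """A file is a CGI when the requested file is the same as its handler."""
    return find_handler(path) == path

print(find_handler("/doc/news.shtml"))
print(find_handler("/doc/ftp/pub/"))
print(is_cgi("/doc/hello.cgi"))
```

Seen this way, CGI needs no separate mechanism: it is the case in which a rule names the requested file as its own handler.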