Webfs presents a file system interface to the parsing and retrieving
of URLs. Webfs mounts itself at mtpt (default /mnt/web), and,
if service is specified, will post a service file descriptor in
/srv/service. The -d flag enables general debug printing to standard
error while the -D flag enables 9P debug prints.
If the environment variable httpproxy is set, all HTTP requests
initiated by webfs will be made through that proxy URL.
Webfs presents a three-level file system suggestive of the network
protocol hierarchies ip(3) and ether(3).
The top level contains two files: ctl and clone.
The top level ctl file is used to maintain parameters global to
the instance of webfs. Reading the ctl file yields the current
values of the parameters. Writing strings of the form ``attr value''
sets a particular attribute.
The following global parameters can be set:
useragent
    Sets the HTTP user agent string.
timeout
    Sets the request timeout in milliseconds.
flushauth url
    Flushes any associated authentication information for resources
    under url, or for all resources if no url is given.
preauth url realm
    Preauthenticates all resources under url with the given realm
    using HTTP Basic authentication. This causes webfs to send the
    resulting authorization information preemptively, without waiting
    for the server to respond with an HTTP 401 Unauthorized status.
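The ctl protocol above can be sketched in Python as follows. This is only an illustration: it assumes the webfs tree is reachable through ordinary file I/O, the helper names set_global and read_globals are hypothetical, and the layout of the text returned by ctl is not described here, so it is returned unparsed.

```python
import os

def set_global(mtpt, attr, value=""):
    """Write one 'attr value' string to the top-level ctl file."""
    # A bare attribute (e.g. flushauth with no url) is sent without a value.
    msg = f"{attr} {value}".strip()
    fd = os.open(os.path.join(mtpt, "ctl"), os.O_WRONLY)
    try:
        # A single write carries the whole control string.
        os.write(fd, msg.encode())
    finally:
        os.close(fd)

def read_globals(mtpt):
    """Return the raw text of the top-level ctl file."""
    with open(os.path.join(mtpt, "ctl"), "rb", buffering=0) as ctl:
        return ctl.read().decode()
```

For instance, set_global("/mnt/web", "timeout", "5000") would request a five second timeout, assuming the default mount point.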
The top-level directory also contains numbered directories corresponding
to connections, each of which may be used to fetch a single URL. To
allocate a connection, open the clone file and read a number n from it.
After opening, the clone file is equivalent to the file n/ctl.
A connection is assumed closed once all files
in its directory have been closed, and will then be reallocated.
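Allocation might look like the following Python sketch. The function name allocate_connection is hypothetical, and the code assumes the number is returned as decimal text, possibly followed by whitespace.

```python
import os

def allocate_connection(mtpt="/mnt/web"):
    """Open clone and return (fd, n); fd then acts as n/ctl."""
    fd = os.open(os.path.join(mtpt, "clone"), os.O_RDWR)
    try:
        n = int(os.read(fd, 32).decode().strip())
    except Exception:
        os.close(fd)
        raise
    # The caller must keep fd (or another file in the connection
    # directory) open: once all of them are closed, the connection
    # is assumed closed.
    return fd, n
```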
Each connection has a URL attribute url associated with it. This
URL may be an absolute URL such as http://www.lucent.com/index.html
or a relative URL such as ../index.html. The baseurl attribute
sets the URL against which relative URLs are interpreted. Once
the URL has been set by writing to the ctl file of the
connection, its pieces can be retrieved via individual files in
the parsed directory:
parsed/url
    http://pete:secret@www.example.com:8000/cgi/search?q=kittens#results
parsed/scheme
parsed/user
parsed/pass
parsed/host
parsed/port
parsed/path
parsed/query
parsed/fragment
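To illustrate how the example URL divides into these pieces, the following Python sketch splits it with the standard urllib.parse module. It shows generic URL splitting only; the exact text webfs returns in each file (for instance, whether delimiters are included) is not specified here.

```python
from urllib.parse import urlsplit

u = urlsplit("http://pete:secret@www.example.com:8000/cgi/search?q=kittens#results")

# Map each component to the name of the corresponding file under parsed/.
parts = {
    "scheme": u.scheme,
    "user": u.username,
    "pass": u.password,
    "host": u.hostname,
    "port": u.port,
    "path": u.path,
    "query": u.query,
    "fragment": u.fragment,
}

for name, value in parts.items():
    print(f"parsed/{name}\t{value}")
```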
If there is associated data to be posted with the request, it
can be written to postbody. Opening postbody or body initiates
the request. If the request fails, then opening body or writing
to postbody will fail and return an error string.
When the body file has been opened, response headers appear as
files in the connection directory. For example, reading the contenttype
file yields the MIME content type of the body data. If the request
was redirected, the URL represented by the parsed directory will
change to the final destination.
The resulting data may be read from body as it arrives.
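Putting the steps together, a simple fetch could be sketched in Python as below. The function name fetch is hypothetical, the code assumes the tree is reachable through ordinary file I/O, and it assumes the server sent a content type so that the contenttype file exists.

```python
import os

def fetch(url, mtpt="/mnt/web"):
    """Fetch url through a webfs tree; return (content_type, body_bytes)."""
    ctl = os.open(os.path.join(mtpt, "clone"), os.O_RDWR)
    try:
        n = os.read(ctl, 32).decode().strip()   # connection number
        conn = os.path.join(mtpt, n)
        os.write(ctl, f"url {url}".encode())    # clone now acts as n/ctl
        # Opening body initiates the request; a failed request
        # surfaces here as an OSError.
        with open(os.path.join(conn, "body"), "rb", buffering=0) as body:
            # Response headers appear as files once body is open.
            with open(os.path.join(conn, "contenttype"), "rb") as f:
                ctype = f.read().decode().strip()
            chunks = []
            while True:
                chunk = body.read(8192)         # data is read as it arrives
                if not chunk:
                    break
                chunks.append(chunk)
        return ctype, b"".join(chunks)
    finally:
        os.close(ctl)
```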
The following is a list of attributes that can be set for a connection
prior to initiating the request:
url, baseurl
    Set the request URL and the base URL against which relative
    URLs are interpreted, as described above.
useragent
    Sets a custom user agent string to be used with the request.
contenttype
    Sets the MIME content type of the postbody.
request
    Usually, the HTTP method is POST when the postbody file is
    opened first, and GET otherwise. The request attribute overrides
    this to send arbitrary HTTP requests.
headers
    Adds arbitrary HTTP headers to be sent with the request.
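As a sketch of how these attributes combine, the following Python function posts data through a connection. The function name post is hypothetical. The code assumes one attribute per write to the connection's ctl file, and that postbody is written and closed before body is opened; neither ordering detail is spelled out above.

```python
import os

def post(url, data, content_type, mtpt="/mnt/web"):
    """POST data (bytes) to url through a webfs tree; return the response body."""
    ctl = os.open(os.path.join(mtpt, "clone"), os.O_RDWR)
    try:
        n = os.read(ctl, 32).decode().strip()
        conn = os.path.join(mtpt, n)
        # Attributes are set before the request is initiated.
        os.write(ctl, f"url {url}".encode())
        os.write(ctl, f"contenttype {content_type}".encode())
        # Opening postbody first selects POST unless the request
        # attribute has been set to something else.
        pfd = os.open(os.path.join(conn, "postbody"), os.O_WRONLY)
        try:
            view = memoryview(data)
            while view:                          # handle short writes
                view = view[os.write(pfd, view):]
        finally:
            os.close(pfd)
        with open(os.path.join(conn, "body"), "rb", buffering=0) as body:
            return body.read()
    finally:
        os.close(ctl)
```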