.TH INTRO O
.SH NAME
intro \- introduction to the Octopus File Protocol, Op
.SH DESCRIPTION
The octopus mounts file systems across high-latency network links.
Links with round-trip times from 50 to 120 milliseconds are common.
Such links connect octopus
.I terminals
to a central computer or
.IR PC .
A
.I terminal
in the octopus is a machine providing devices and other services to the
.IR PC .
The
.I PC
provides a central name space to
.IR terminals .
Both the
.I PC
and
.I terminals
run
.I "file servers"
to provide services to be mounted on the other end of the link.
The
.I Styx
protocol (described in section 5) requires too many RPCs to be comfortable for
interactive usage across such links.
This protocol, along with
.IR ofs (4)
and
.IR oxport (4),
provides a means to bridge Styx
.I islands
so that fewer RPCs are required between them.
This section describes the protocol, and how it maps to Styx requests and file system calls.
.PP
The
.IR "Octopus File Protocol" ,
.IR Op ,
is a network file system protocol used in the octopus for messages between
.I clients
and
.IR servers ,
when high-latency links connect clients to servers.
In
.I Op
a process called a
.I client
talks to a process called a
.IR server .
The
.I server
is a process that provides one hierarchical file system, or
.I "file tree"
that may be accessed by remote
.I client
processes.
The server responds to requests from
.I clients
to create, remove, put, and get files.
The prototypical server is one that exports a subtree of its own name space.
Part of the tree may correspond to
.I Styx
servers that synthesize files on demand, perhaps from information held in data structures,
or by interfacing to an external device or to the native operating system
underneath the octopus (hence underneath Inferno) at a particular computer.
.PP
Usually, two (network) connections are set up between the
.I PC
and a
.I terminal
in the octopus.
In one of them, the
.I PC
is a
.I client
(for terminal devices) and the
.I terminal
is a
.IR server .
In the other, the roles are exchanged.
Note that even for the octopus implementation in Inferno, Styx is used within the central computer,
and also within any terminal using Inferno.
However, servers in the terminal speaking to clients in the central computer
do so using
.IR Op ,
and the same happens for exporting the central computer name space to terminal devices.
.PP
There may be a single client or multiple clients sharing the same connection to an
.I Op
server, but all of the clients must operate on behalf of the same user.
.PP
Op follows the design of 9P (or Styx), including its convention for packaging messages
for transmission over the connection.
In Op, a client transmits
.I requests
.RI ( T-messages )
to a server, which subsequently returns
.I replies
.RI ( R-messages )
to the client.
The combined acts of transmitting (receiving) a request of a particular type,
and receiving (transmitting) a reply for that request is called a
.I transaction
or an
.I RPC
of that type.
.PP
Note that there may be more than one reply message for a given request.
In particular,
.IR get (O)
may ask the server to reply with several messages.
In this case, the transaction finishes when all replies have been received (transmitted).
.PP
Each message consists of a sequence of bytes.
Two-, four-, and eight-byte fields hold unsigned integers represented
in little-endian order (least significant byte first).
Data items of larger or variable lengths are represented by a two-byte field specifying a count,
.IR n ,
followed by
.I n
bytes of data.
Text strings are represented this way, with the text itself stored as a UTF-8 encoded
sequence of Unicode characters (see
.IR utf (6)).
Text strings in Op messages are not null-terminated:
.I n
counts the bytes of UTF-8 data, which include no final zero byte.
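.PP
As a rough illustration of the encoding rules above, the following Python sketch
packs little-endian integers and counted UTF-8 strings.
Python is used only for illustration; the names put_uint, put_string and get_string
are hypothetical and are not part of
.IR op (2).
.EX
```python
def put_uint(value, size):
    """Encode an unsigned integer in size bytes, least significant byte first."""
    if size not in (2, 4, 8):
        raise ValueError("Op integer fields are 2, 4 or 8 bytes wide")
    return value.to_bytes(size, "little", signed=False)


def put_string(text):
    """Encode a text string as a two-byte count followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("string too long for a two-byte count")
    # The count covers the UTF-8 bytes only; no zero byte is appended.
    return put_uint(len(data), 2) + data


def get_string(buf, pos=0):
    """Decode a counted string at pos; return (text, next position)."""
    n = int.from_bytes(buf[pos:pos + 2], "little")
    end = pos + 2 + n
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[pos + 2:end].decode("utf-8"), end
```
.EE
For instance, put_string applied to the name /a/b yields the count bytes 4 and 0
followed by the four bytes of the name.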
The
.SM NUL
character is illegal in all text strings in Op, and is therefore
excluded from file names, user names, and so on.
.PP
Each Op message begins with a four-byte size field specifying the length in bytes
of the complete message including the four bytes of the size field itself.
The next byte is the message type, one of the constants in the module
.IR op (2).
The next two bytes are an identifying
.IR tag ,
described below.
The remaining bytes are parameters of different sizes.
In the message descriptions, the number of bytes in a field is given in brackets after the field name.
The notation
.IR parameter [ n ]
where
.I n
is not a constant represents a variable-length parameter:
.IR n [2]
followed by
.I n
bytes of data forming the
.IR parameter .
The notation
.IR string [ s ]
(using a literal
.I s
character) is shorthand for
.IR s [2]
followed by
.I s
bytes of UTF-8 text.
Messages are transported in byte form to allow for machine independence.
.SH MESSAGES
.PP
The following messages are defined in the current version of the protocol.
The manual pages that follow in this section document them.
Refer to
.IR op (2)
for a module providing a Limbo interface.
.ta \w'\fLTsession 'u
.IP
.ne 2v
.IR size [4]
.B Rerror
.IR tag [2]
.IR ename [ s ]
.IP
.ne 2v
.IR size [4]
.B Tattach
.IR tag [2]
.IR uname [ s ]
.IR path [ s ]
.br
.IR size [4]
.B Rattach
.IR tag [2]
.IP
.ne 2v
.IR size [4]
.B Tflush
.IR tag [2]
.IR oldtag [2]
.br
.IR size [4]
.B Rflush
.IR tag [2]
.IP
.ne 2v
.IR size [4]
.B Tput
.IR tag [2]
.IR path [ s ]
.IR fd [2]
.IR mode [2]
.IR stat [ n ]
.IR offset [8]
.IR count [4]
.IR data [ count ]
.br
.IR size [4]
.B Rput
.IR tag [2]
.IR fd [2]
.IR count [4]
.IR qid [13]
.IR mtime [4]
.IP
.ne 2v
.IR size [4]
.B Tget
.IR tag [2]
.IR path [ s ]
.IR fd [2]
.IR mode [2]
.IR nmsgs [2]
.IR offset [8]
.IR count [4]
.br
.IR size [4]
.B Rget
.IR tag [2]
.IR fd [2]
.IR mode [2]
.IR stat [ n ]
.IR count [4]
.IR data [ count ]
.IP
.ne 2v
.IR size [4]
.B Tremove
.IR tag [2]
.IR path [ s ]
.br
.IR size [4]
.B Rremove
.IR tag [2]
.PP
Each T-message has a
.I tag
field, chosen and used by the client to identify the message.
The reply to the message will have the same tag.
When a
.I Tget
request demands more than one reply, all replies must have the same
.I tag
field (and are considered as a single reply, made of multiple messages).
Clients must arrange that no two outstanding messages on the same connection have the same tag.
.PP
The type of an R-message will either be one greater than the type of the corresponding T-message or
.BR Rerror ,
indicating that the request failed.
In the latter case, the
.I ename
field contains a string describing the reason for failure.
.PP
Each RPC is considered to be atomic with respect to its execution in the server.
There is a limit on the amount of data that may be sent in Op in a single request (or reply).
No single message may carry more than
.B MAXDATA
bytes in the
.B data
field, as defined in
.B op.m
(this puts a limit on the maximum message size, assuming a reasonable maximum size for
.B stat
in messages carrying it).
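.PP
The message layout and the reply rules above can be sketched as follows.
This is only an illustration in Python: the numeric type codes are invented for the example
(the real constants are those of
.IR op (2)),
and the names counted, pack, unpack_header and reply_matches are hypothetical.
.EX
```python
# Hypothetical type codes, for illustration only; the real values
# are defined in op(2) and are not reproduced here.
TREMOVE = 20
RREMOVE = TREMOVE + 1   # an R-message type is one greater than its T-message
RERROR = 99


def counted(text):
    """Encode string[s]: a two-byte count followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return len(data).to_bytes(2, "little") + data


def pack(mtype, tag, params=b""):
    """Frame a message: size[4] type[1] tag[2] followed by the parameters."""
    size = 4 + 1 + 2 + len(params)      # size counts its own four bytes
    return (size.to_bytes(4, "little") + bytes([mtype])
            + tag.to_bytes(2, "little") + params)


def unpack_header(msg):
    """Return (size, type, tag, params) for one complete message."""
    size = int.from_bytes(msg[0:4], "little")
    if size < 7 or size != len(msg):
        raise ValueError("bad message size")
    return size, msg[4], int.from_bytes(msg[5:7], "little"), msg[7:]


def reply_matches(request, reply):
    """Check that a reply carries the request tag and an acceptable type."""
    _, ttype, ttag, _ = unpack_header(request)
    _, rtype, rtag, _ = unpack_header(reply)
    return rtag == ttag and rtype in (ttype + 1, RERROR)
```
.EE
Under these assumptions, a Tremove for /a/b with tag 7 occupies 13 bytes:
seven bytes of header, two bytes of count, and four bytes of path.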
The limit applies to each message, not to each transaction: a
.I Tget
request may be answered with multiple messages, as described in
.IR get (O).
.PP
The
.B attach
request identifies the user to the server.
Permission checking and authentication must take place prior to this transaction.
The server must not respond to any other request before accepting an Attach RPC.
.PP
Files and directories can be created, and their contents and metadata updated, by means of
.B put
messages.
They are removed by means of
.B remove
requests.
File contents and metadata may be obtained by means of
.B get
requests.
.PP
The
.B flush
request aborts a previous, outstanding request, and with it the ongoing transaction.
.PP
Everything else is similar to 9P or STYX; in particular, file metadata is exactly that used by STYX.
.SH NAMES AND DESCRIPTORS
Most T-messages request that an operation be made for a file.
Usually, the file is identified by the
.I path
field of the T-message.
The
.I path
field contains a string with a file name or path (rooted at the server's root directory).
The path follows the UNIX (or Inferno or Plan 9) convention for file names.
For example,
.B /a/b
means the file
.B b
inside the directory
.B a
inside the root of the server's file tree.
Only absolute paths are meaningful for Op.
Servers should refuse to accept relative paths.
Clients should never send them inside a request.
For example, the name for the root directory of the file tree in the server must be
.B /
(as might be expected).
.PP
However, as said in
.IR put (O)
and
.IR get (O),
both Tput and Tget may identify the file using the
.B fd
field, which contains a small integer that represents a
.I "file descriptor"
to the file.
This descriptor is to be considered a
.I cache
of the
.I path
mentioned in the
.B path
field.
When a valid descriptor is sent in a Tget (or a Tput) the server ignores the
.I path
and uses
.I fd
to identify the file to be used for the operation.
If the
.I fd
is invalid, the file server uses
.I path
instead.
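.PP
The choice between descriptor and path described above might be sketched like this
on the server side.
The class DescriptorTable, its open_files table and its resolve method are hypothetical names
used only for this Python illustration, which assumes that any descriptor the server
has not allocated counts as invalid.
.EX
```python
class DescriptorTable:
    """Server-side map from Op file descriptors to files (a sketch)."""

    def __init__(self):
        # fd -> path of the file the descriptor was allocated for
        self.open_files = {}

    def resolve(self, fd, path):
        """Return the path of the file a Tget or Tput should operate on."""
        if fd in self.open_files:
            # Valid descriptor: the path field is ignored.
            return self.open_files[fd]
        if not path.startswith("/"):
            # Only absolute paths are meaningful for Op.
            raise ValueError("relative paths are not accepted")
        # Invalid descriptor: fall back to the path field.
        return path
```
.EE
With descriptor 3 allocated for /a/b, resolving descriptor 3 returns /a/b whatever path is sent,
while an unallocated descriptor makes the server use the path from the request.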
The special value
.B NOFD
(~0) makes this field void and represents a null descriptor.
.PP
.I "File descriptors"
are numbers chosen by the server.
They are allocated upon request.
A client may specify in a Tget or Tput request that more requests of the same type will follow.
In that case, the server must allocate a valid (unique) descriptor and send it back
to the client in the R-message.
The client may use the received descriptor for further requests, and the server must use it
to operate on the file.
When the client issues the last request (or the server sends the last reply)
the descriptor is deallocated and
.B NOFD
is sent as
.B fd
in the reply.
Note that the client must issue one last request to cause the descriptor to be deallocated.
You may refer to
.IR get (O)
for an example.
.PP
When the
.I Op
server relays to Styx file servers (like
.IR oxport (4)
does), it must assign a fid (or a file descriptor) for each descriptor allocated for
.I Op
as described above.
This means that a Styx server may still know when a client reaching the server across an
.I Op
link ceases to use the file.
However, note that
.I Op
file descriptors are not
.I fids
and that a close (or clunk) on a file may cause an
.I Op
descriptor to be closed, even if other clients still have the file open.
Note also that descriptors are unique for read or write access.
That is,
.I Op
.I fds
are allocated either for Put RPCs or for Get RPCs.
A file being used both to read and to write would use two different
.I Op
file descriptors.
.SH SEE ALSO
.IR intro (2),
.IR styx (2).
.SH BUGS
Still a child, hence doing nasty things and evolving quickly.