Expanding your Grid

Plan 9 was designed as a distributed system. After you install the distribution from the cd, you have a self-sufficient one-machine system, a standalone terminal. We will consider this as "Level 0" - how do you proceed from here to a network of Plan 9 machines and provide Plan 9 services to other clients? Note that this guide does not imply a strict dependence on the previous level; it is entirely possible to set up a Plan 9 DHCP/PXE boot server (described in level 7) without performing all the steps described in previous levels. This is a rough guide which proceeds in order of increasing number of machines used and increasing elaboration and customization of configuration.

LEVEL 1: UPGRADING THE INSTALL TO A CPU SERVER

The traditional first step is following the [Configuring a Standalone CPU Server] wiki page. The transformation from terminal to cpu server provides many additional capabilities:

 * Remote access via cpu(1)
 * Multiple simultaneous users if desired
 * Exportfs(4) and other services such as ftpd(8)
 * Authentication with authsrv(6) for securing network services
 * Integration with other operating systems via Drawterm

Even if you are only planning on making use of a single Plan 9 system, it is highly recommended to configure it as a cpu server. You probably use other operating systems as well as Plan 9, and configuring your Plan 9 system as a cpu server will make it vastly more useful to you by enabling it to share resources easily with non-Plan 9 systems.

It should also be noted that the division between terminals and cpus is mostly a matter of conventional behavior. It is possible to configure all services to run on a machine using the terminal kernel, but there is no particular advantage to this. The cpu kernel is easy to compile and can run a local gui and be used as a combined terminal/cpu server.

LEVEL 2: CPU AND DRAWTERM CONNECTIONS BETWEEN MULTIPLE MACHINES

The core functionality of a cpu server is to provide cpu(1) service to Plan 9 machines and Drawterm clients from other operating systems. The basic use of cpu(1) is similar to ssh in unix or remote desktop in Windows. You connect to the remote machine and have full use of its resources from your terminal. However, Plan 9 cpu(1) also makes the resources of the terminal available to the cpu at /mnt/term. This sharing of namespace is also the method of control, because the cpu accesses the terminal's devices (screen, keyboard, mouse) by binding them over its own /dev. If you are a new Plan 9 user, you are not expected to understand this.

Drawterm from other operating systems behaves the same way. When you [drawterm] to a Plan 9 cpu server, your local files will be available at /mnt/term. This means you can freely copy files between Plan 9 and your other os without the use of any additional protocols. In other words, when working with drawterm, your environment is actually a composite of your local os and the Plan 9 system - technically it is a three-node grid, because the Drawterm program acts as an ultra-minimal independent Plan 9 terminal system, connecting your host os to the Plan 9 cpu server.

For many home users, this style of small grid matches their needs. A single Plan 9 cpu/file/auth server functions both as its own terminal and provides [drawterm] access to integrate with other operating systems. Some users like to add another light terminal-only Plan 9 system as well. Recently (2013) Raspberry Pis have become popular for this purpose. Another option with surprising benefits is using virtual machines for a cpu server. Because of Plan 9's network transparency, it can export all of its services and its normal working environment through the network.
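As a brief illustration, here is roughly what a session might look like, assuming a cpu server announced on the network as mycpu and an auth server named myauth (both placeholder names). From a Plan 9 terminal:

	term% cpu -h mycpu
	cpu% cp /mnt/term/usr/glenda/notes /usr/glenda/notes

The second command copies a file from the terminal to the cpu server purely through the shared namespace at /mnt/term. From another operating system, Drawterm connects with something like:

	drawterm -a myauth -c mycpu -u glenda

and the host system's files appear at /mnt/term in the same way.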
LEVEL 3: A SEPARATE FILE SERVER AND TCP BOOT CPUS

A standard full-sized Plan 9 installation makes use of a separate file server which is used to tcp boot at least one cpu server and possibly additional cpus and terminals. A tcp-booted system loads its kernel and then attaches to a root file server from the network. Some of the strengths of the Plan 9 design become more apparent when multiple machines share the same root fs.

 * Sharing data is automatic when the root fs is shared
 * Configuration and administration for each machine is located in one place
 * Hardware use is optimized by assigning systems to the best-suited role (cpu/file/term)
 * New machines require almost no configuration to join the grid

If you have already configured a standalone cpu server, it can also act as a file server if you instruct its fossil(4) to listen on the standard port 564 for 9p file service. Choosing "tcp" at the "root is from" option during bootup allows you to select a network file server. You can use plan9.ini(8) to create a menu with options for local vs tcp booting; sketches of both changes follow this section. A single file server can boot any number of cpu servers and terminals.

The first time you work at a terminal and make use of a cpu server while both terminal and cpu are sharing a root fs from a network file server is usually an "AHA!" moment for understanding the full Plan 9 design. In Plan 9 "everything is a file", the 9p protocol makes all filesystems network transparent, and applications and system services such as the sam(1) editor and the plumber(4) message-passing service are designed with distributed architecture in mind. A terminal and cpu both sharing a root fs and controlled by the user at the terminal can provide a unified namespace in which you can easily forget exactly what software is running on which physical machine.
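On the file server side, the essential addition is an announcement on the standard 9p port. In the fossil configuration (edited with fossil/conf, see fossil(4) and fossilcons(8) for the surrounding fsys and srv lines) that is a line such as:

	listen tcp!*!564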
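On the client side, a plan9.ini(8) boot menu for choosing between local and tcp root might look like the following sketch. The disk, partition, and kernel names (sdC0, 9fat, 9pccpuf) are only examples and must match your own hardware and kernel:

	[menu]
	menuitem=local, boot from local fossil
	menuitem=tcp, boot from network file server
	menudefault=local

	[local]
	bootfile=sdC0!9fat!9pccpuf
	bootargs=local!#S/sdC0/fossil

	[tcp]
	bootfile=sdC0!9fat!9pccpuf
	bootargs=tcp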
LEVEL 4: SHARING /SRV /PROC AND /DEV BETWEEN MULTIPLE CPUS

The use of synthetic (non-disk) filesystems to provide system services and the network-transparent 9p protocol allow Plan 9 to create tightly coupled grids of machines. This is the point at which Plan 9 surpasses traditional unix - in traditional unix, many resources are not available through the file abstraction, nfs does not provide access to synthetic file systems, and multiple interfaces and abstractions such as sockets and TTYs must be managed in addition to the nfs protocol.

In Plan 9 a grid of cpus simply import(4) resources from other machines, and processes will automatically make use of those resources if they appear at the right path in the namespace(4). The most important file trees for sharing between cpus are /srv, /proc, and some files from /dev and /mnt. In general, all 9p services running on a machine post a file descriptor in /srv, so sharing /srv allows machines to make new attaches to services on remote machines exactly as if they were local. The /proc filesystem provides management of and information about running processes, so importing /proc allows remote machines to control each other's processes. If machines need to make use of each other's input and output devices (the cpu(1) command does this), access is possible via import of /dev. Cpus can run local processes on the display of remote machines by attaching to the remote rio(4) fileserver and then binding in the correct /mnt and /dev files.
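A sketch of this kind of sharing, assuming two cpu servers with the placeholder names cpu1 and cpu2 which run the standard services and share an auth server; these commands are run on cpu1:

	# inspect and control a process running on cpu2 (1234 is a made-up pid)
	import cpu2 /proc /n/cpu2/proc
	cat /n/cpu2/proc/1234/status
	echo kill >/n/cpu2/proc/1234/ctl

	# attach to a 9p service posted in cpu2's /srv as if it were local
	# (hubfs is just an example of a posted service)
	import cpu2 /srv /n/cpu2/srv
	mount /n/cpu2/srv/hubfs /n/hub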
LEVEL 5: SEPARATE ARCHIVAL DATA SERVERS AND MULTIPLE FILE SERVERS

This guide has been referring to "the file server" without making a distinction between systems backed by venti(8) and those without. It is possible and recommended to use venti(8) even for a small single-machine setup. As grids become larger or the size of data grows, it is useful to make the venti server separate from the file server. Multiple fossil(4) servers can all use the same venti(8) server. Because of data deduplication, multiple independent root filesystems may often be stored with only a slight increase in the storage capacity used.

Administering a grid should include a system for tracking the rootscores of the daily fossil snapshots and backing up the venti arenas. A venti-backed fossil by default takes one archival snapshot per day, and the reference to this snapshot is contained in a single vac: score. See vac(1). Because fossil(4) is really just a temporary buffer for venti(8) data blocks and a means of working with them as a writable fs, a fossil can be almost instantaneously reset to use a different rootscore using flfmt -v.

To keep a full grid of machines backed up, all that is necessary is to keep a backup of the venti(8) arenas partitions and a record of the fossil(4) rootscores of each machine. The rootscores can be recovered from the raw partition data, but it is more convenient to track them independently for faster and easier recovery. The simplest and best system for keeping a working backup is keeping a second active venti server and using venti/wrarena to progressively back up between them. This makes your backup available on demand simply by formatting a new fossil using a saved rootscore and setting the backup venti as the target. If the data blocks have all been replicated, the same rootscores will be available in both. See venti-backup(8).

LEVEL 6: MULTI-OS GRIDS USING U9FS, 9PFUSE, INFERNO, PLAN9PORT, 9VX

The 9p protocol was created for Plan 9 but is now supported by software and libraries in many other operating systems. It is possible to provide 9p access to both files and execution resources on non-Plan 9 systems. For instance, Inferno speaks 9p (also called "styx" within Inferno) and can run commands on the host operating system with its "os" command. Thus an instance of Inferno running on Windows can bring those resources into the namespace of a grid. Another example is using plan9port, the unix version of the rc shell, and 9pfuse to import a [hubfs] from a Plan 9 machine and attach a local shell to the hubs. This provides persistent unix rc access as a mountable fs to grid nodes.

Please see [connecting to other OSes] and [connecting from other OSes].

LEVEL 7: STANDALONE AUTH SERVER, PLAN 9 DHCP, PXE BOOT, DNS SERVICE, /NET IMPORTS, /NET.ALT

Plan 9 provides mechanisms to manage system roles and bootup from a DHCP/PXE boot server. At the size of grid used by an institution such as Bell Labs itself or a university research department, it is useful to separate system roles as much as possible and automate their assignment by having Plan 9 itself assign system roles and ips to Plan 9 machines via pxe boot and Plan 9-specific dhcp fields. This requires a well-controlled local network and an ndb(8) configured to know the ethernet addresses of client machines and which kernel to serve each of them.

One of the most common specialized roles in a mid- to large-sized grid is the standalone auth-only server. Because auth is so important and can be a single point of failure for a grid, as well as for security reasons, it is often a good idea to make the auth server an independent standalone box which runs nothing at all except auth services, is hardened and secured as much as possible against failure, and presents the minimal attack surface. In an institution with semi-trusted users such as a university, the auth server should be in a physically separate and secure location from user terminals.

A grid of this size will probably also have use for DNS service. For personal users on home networks, variables such as the authentication domain are often set to arbitrary strings. For larger grids which will probably connect to public networks at some nodes, the ndb(8) and authsrv(6) configuration will usually be coordinated with the publicly assigned domain names. It is also at this point (the public/private interface) that machines may be connected to multiple networks using /net.alt (simply a conventional place for binding another network interface or connection) and may make use of one of Plan 9's most famous applications of network transparency - the import(4) of /net from another machine. If a process replaces the local /net with a remote /net, it will transparently use the remote /net for outgoing and incoming connections.
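A sketch of the ndb(8) entries involved in pxe booting, with invented names and addresses; the boot server also runs ip/dhcpd and ip/tftpd (normally started from its cpurc), and the pxe loader then fetches a per-machine plan9.ini from the /cfg/pxe/ directory named after the client's ethernet address:

	ipnet=mygrid ip=192.168.1.0 ipmask=255.255.255.0
		ipgw=192.168.1.1
		dns=192.168.1.1
		auth=myauth
		authdom=mygrid.example
		fs=myfs

	sys=newcpu ether=0012345678ab ip=192.168.1.20
		bootf=/386/9pxeload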
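The /net import itself is a one-line operation. As a sketch, assuming a machine with the placeholder name gateway which is connected to the outside network and runs the standard cpu server services:

	# make the gateway's interfaces visible alongside the local ones
	import gateway /net /net.alt

	# or replace the local /net, so processes in this namespace
	# dial and announce through the gateway transparently
	import gateway /net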
LEVEL 8: RESEARCH AND COMPUTATION GRIDS WITH CUSTOM CONTROL LAYERS AND UNIQUE CAPABILITIES

In its role as a research operating system, the capabilities of Plan 9 as a distributed system are often extended by specific projects for specific purposes or to match specific hardware. Plan 9 does not include any built-in capability for things like task dispatch and load balancing between nodes. The Plan 9 approach is to provide the cleanest possible set of supporting abstractions for the creation of whatever type of high-level clustering you wish to create. Some examples of research grids with custom software and capabilities:

 * The main Bell Labs Murray Hill installation includes additional software and extensive location-specific configuration.
 * The [Laboratorio de Sistemas | http://lsub.org] and project leader Francisco J. Ballesteros (Nemo) created Plan B and the Octopus, derived from Plan 9 and Inferno respectively.
 * The [XCPU project | http://xcpu.sourceforge.net] is clustering software for Plan 9 and other operating systems created at the Los Alamos National Laboratory.
 * The [Plan 9 on IBM Blue Gene | http://doc.cat-v.org/plan_9/blue_gene] project utilizes special-purpose tools to let Plan 9 control the architecture of the IBM Blue Gene L.
 * The [ANTS | http://ants.9gridchan.org] Advanced Namespace ToolS are a 9gridchan project for creating failure-tolerant grids and persistent LAN/WAN user environments.

These are examples of projects which are built on 9p and the Plan 9 design and customize or extend the operating system for additional clustering, task management, and specific purposes. The flexibility of Plan 9 is one of its great virtues. Most Plan 9 users customize their setups to a greater or lesser extent with their own scripts or changes to the default configuration. Even if you aren't aspiring to build a 20-node "Manta Ray" swarm to challenge Nemo's Octopus, studying these larger custom systems may help you find useful customizations for your own system, and the Plan 9 modular design means that some of the software tools used by these projects are independently useful.

LEVEL 9: POWER SET GRIDS?

Because existing grids already have the label "9grid", it is theorized that the as-yet-unreached Ninth Level of Gridding corresponds to the power set operation. The previously described eight levels of gridding already encompass an infinite Continuum of possibilities. To surpass the existing level requires finding a way to construct the power set of this already infinite set. Level Nine grids are therefore transfinite, not merely infinite, and it is an open question whether current physical reality could accommodate such structures. At the present moment, research indicates such hypothetical Level Nine grids would need to post mountable file descriptors from quasars and pulsars and store information by entropic dissipation through black hole event horizons. See astro(7) and scat(7).