.TL
Rebuilding fossil from venti arenas
.AU
Steve Simon
.AI
steve \fBat\fI quintile \fBdot\fI net
.SH
Machine prerequisites
.LP
The machine used must have an ethernet card (though no active network is required).
The loopback ether device cannot be used as it is not currently built into the 9pccd kernel.
A spare disk partition is also needed, which must be larger than the total size of the venti index to be built.
.LP
It is very useful to have hardcopies of the following manual pages:
prep(8), plan9.ini(8), venti(8), ventiaux(8), fossilcons(8), fossil(8), and fs(3).
.SH
Example scenario
.LP
In this example the fossil and index disks are not damaged but are being replaced;
this is actually slightly more complex than just rebuilding a server,
as the partitions and SCSI target numbers have to be changed.
.LP
The changes described below result from the replacement of the two venti index disks
with a single index, and the creation of a secondary fossil filesystem that is not backed up;
by convention this is called \fBother\fR.
.LP
.TS
center box tab(@) ;
l s s s s s s
l l l l l l l
l|l|l|l|l|l|l
l|l|l|l|l|l|l
l|l|l|l|l|l|l
l|l|l|l|l|l|l
l s s s s s s
l|l|l|l|l|l|l .
Old layout
=
sd00@sd01@sd02@sd04@sd06@sd08@sd010
_
9fat@isect0@isect1@@arenas0
nvram
fossil
swap
_
New layout
=
@@@9fat@arenas0@isect0@arenas0
@@@nvram@@other@mirror
@@@fossil
@@@swap
.TE
.NH
Boot from CD
.LP
The machine to be modified should be booted from CD.
.LP
Note: The x86 bootable ISO image distributed by Bell Labs expects the CD drive
to be attached to the secondary master IDE interface.
.NH 2
Get the last valid Venti archive score
.LP
To rebuild from a venti archive a score is needed.
This must be a score as printed on the fossil console when the nightly \fBsnap -a\fR occurs,
or a score (as here) extracted from a fossil archive by fossil/last.
.LP
A score as produced by the \fBvac\fR command on the fossil console
or the \fBvac(1)\fR command line tool could be used;
however, the directory structure in the rebuilt fossil will NOT CONTAIN the top level
\fI/active\fR, \fI/archive\fR, and \fI/snapshot\fR directories, so this is not recommended.
.DS
.CW
cpu% fossil/last /dev/sd00/fossil > /tmp/last.vac
.DE
.NH 2
Dump all VAC scores
.LP
If you don't have a recent VAC score with which to reinitialise your fossil,
you can extract all of them using /sys/src/cmd/venti/dumpvacroots.
.IP
If you have an old boot CD you may need to compile /sys/src/cmd/venti/8.printarenas
and edit dumpvacroots, setting the IP address of your venti server.
Newer CDs have printarenas precompiled and dumpvacroots expects to use
the \fIventi=\fR environment variable.
.LP
Dumpvacroots will print the scores of all the recent venti archives in date order;
you most probably want to use the last one printed (i.e. the most recent).
Dumpvacroots will take ten or fifteen minutes to run.
.DS
.CW
cpu% echo $venti
tcp!192.168.0.5!17034
cpu% cd /sys/src/cmd/venti
cpu% ./dumpvacroots | tail
vac:823732...
vac:5628943...
.DE
.NH
Modify the venti config
.LP
Venti's configuration must be changed to reflect the new disk layout.
.LP
Venti's configuration is conventionally stored in a block at the start of the
\fIarenas0\fR partition rather than in a file in the filesystem.
This allows the system to boot directly from fossil/venti.
.DS
.CW
cpu% venti/conf /dev/sd06/arenas0
index main
isect /dev/sd01/isect0
isect /dev/sd02/isect1
arenas /dev/sd06/arenas0
.SM
\fI# Dump old venti layout\fR
.DE
.DS
.CW
cpu% venti/conf -w /dev/sd06/arenas0 << EOF
index main
isect /dev/sd08/isect0
arenas /dev/sd06/arenas0
EOF
.SM
\fI# Write new venti layout\fR
.DE
.NH
Initialise the fossil/nvram/9fat disk
.LP
The \fI9fat\fR partition will contain the low level boot loader
and the machine configuration file (plan9.ini).
\fINvram\fR holds the machine's key, allowing it to boot unattended.
\fIFossil\fR is the write buffer for the filesystem,
holding snapshots and modified files not yet archived to venti.
.LP
By convention the \fI9fat\fR partition is the first partition on the disk;
putting it further than 8½Gb into the disk can cause problems with booting,
as the cylinder/head/sector addressing used by most BIOSes cannot address further
than this into the disk - see the section on LBA in 9load(8).
This partition need only be about 100Mb in length.
.LP
The \fInvram\fR partition requires only a single 512 byte sector.
.LP
The \fIfossil\fR partition need be only big enough to hold the biggest file
you will need to write to the system, and will also limit the number of bytes
you can write per day.
The latter is not strictly true, as multiple archival snapshots may be taken per day;
however, it is a reasonable rule of thumb.
Fossil is typically between 2Gb and 8Gb.
.DS
.CW
cpu% disk/mbr -m /386/mbr /dev/sd04/data
cpu% disk/fdisk -baw /dev/sd04/data
cpu% disk/prep /dev/sd04/plan9
.SM
# see manual for usage of prep(8)
.DE
.NH
Initialise isect/other disk
.LP
Venti performance can be improved if the venti indexes are split across
several physical disks; however, this has not been done here.
The total size of all the index slices needs to be only about five percent
of the venti arenas.
.DS
.CW
cpu% disk/mbr -m /386/mbr /dev/sd08/data
cpu% disk/fdisk -baw /dev/sd08/data
cpu% disk/prep /dev/sd08/plan9
.SM
# see manual for usage of prep(8)
.DE
.NH
Format each isect slice
.LP
Each slice must be branded with its name - usually the same name as the partition's name.
This will take about 10 minutes per slice.
Only one isect slice is used in this example.
.DS
.CW
cpu% venti/fmtisect isect0 /dev/sd08/isect0
.DE
.NH
Combine all isect slices into an index
.LP
All the index slices must now be combined into a single index,
and populated with references into the venti archive.
.DS
.CW
cpu% venti/fmtindex /dev/sd06/arenas0
.DE
.NH
Rebuild the index from the index slices
.LP
Here \fBother's\fR partition is used as temporary space for the index rebuild;
alternatively another disk could have been added for the duration of the rebuild.
The partition used must be bigger than the combined size of all the index slices.
This process takes about 15 minutes.
.DS
.CW
cpu% venti/buildindex /dev/sd06/arenas0 /dev/sd08/other
.DE
.NH
Start ethernet
.LP
Fossil and venti communicate via TCP/IP, so the ethernet device must be initialised.
.DS
.CW
cpu% ip/ipconfig ether /net/ether0 add 192.168.0.5 255.255.255.0
.DE
.NH
Start venti
.LP
The -h option is required to start the http server built into venti.
This is necessary only if you want to run dumpvacroots (see Dump all VAC scores).
.DS
.CW
cpu% venti/venti -h tcp!192.168.0.5!8000 -c /dev/sd06/arenas0
.DE
.NH
Load fossil's config
.LP
Fossil's configuration is conventionally stored in a block at the start of the
\fIfossil\fR partition rather than in a file in the filesystem.
Like \fIventi\fR this allows the system to boot from its own disks,
rather than fossil starting after the kernel has booted from another filesystem
(kfs(4) or via a network connection, for example).
.DS
.CW
cpu% fossil/conf -w /dev/sd04/fossil << EOF
fsys main config /dev/sd04/fossil
fsys other config /dev/sd08/other
fsys main open -c 14848
fsys other open -c 14848
fsys main snaptime -s 15 -a 0400 -t 3600
listen tcp!*!564
EOF
.DE
.NH
Initialise fossil data from venti
.LP
Here the vac score saved earlier is used, first removing the leading \fBvac:\fR string.
.LP
The file tree is not actually loaded into fossil;
merely a reference to the top of the tree is inserted, so this takes only a second.
.DS
.CW
cpu% score=`{sed 's/^vac://' /tmp/last.vac}
cpu% fossil/flfmt -h 192.168.0.5 -v $score /dev/sd04/fossil
.DE
.NH
Format other
.LP
During the rebuild of venti's indices \fBother\fR was overwritten;
it now needs to be formatted for fossil.
.DS
.CW
cpu% fossil/flfmt /dev/sd08/other
.DE
.NH
Format and initialise the 9fat partition
.LP
Load a kernel, both bootstrap loaders, and plan9.ini into the 9fat partition.
.DS
.CW
cpu% disk/format -b /386/pbslba -d -r 2 /dev/sd04/9fat
        /386/9load /386/9pcf /tmp/plan9.ini
.SM
\fI# This line was wrapped in formatting for this document\fR
.DE
.NH
nvram partition
.LP
As the disk containing the nvram partition is now at target 4,
it is necessary to tell the kernel where to find it by adding the following to plan9.ini.
.DS
.CW
nvroff=0
nvrlen=512
nvram=#S/sd04/nvram
.DE
.LP
If these environment variables are also set in the current shell then
\fIwrkey\fR can be used to set up the nvram;
alternatively \fIkeyfs\fR will generate similar prompts if it discovers
an invalid nvram partition when the machine first boots.
.DS
.CW
cpu% auth/wrkey
auth id: bootes
auth dom: plan9.mydomain.dom
password: xyzzy1
secstore password: xyzzy2
.DE
.LP
If bootes's secstore is populated with a key for sources.cs.bell-labs.com
then these keys may be read into factotum via /rc/bin/cpurc.
.DS
.CW
# This example is taken from a running system
cpu% grep factotum /bin/cpurc
auth/secstore -n -G factotum >> /mnt/factotum/ctl
cpu% grep outside /mnt/factotum/ctl
key proto=p9sk1 dom=outside.plan9.bell-labs.com user=stevesimon !password?
.DE
.NH
Reboot
.bp
.SH
Appendix A
.PP
Converting Venti to a mirrored pair.
.LP
As the Venti arenas are the only pieces of the system which cannot easily be regenerated,
it is prudent to protect them by mirroring with fs(3).
Mirrored partitions must be the same size, though the disks on which they reside need not be.
Continuing the example above we mirror the entire venti disk /dev/sd06/data onto /dev/sd010/data.
To hold the fs(3) configuration a separate fscfg partition must be created;
this is most easily done by stealing a sector from the swap partition on /dev/sd04/swap.
.NH 1
Reboot onto the CDROM
.LP
Though the mirrored disk can be copied live as detailed in fs(3),
other parts of the configuration require a reboot,
so it is safest to make the changes below whilst booted from a standalone CDROM.
.NH 1
Create the fscfg partition
.LP
Use disk/prep to change the partition table for /dev/sd04/plan9,
reducing the size of swap by one 512 byte sector
and creating a new fscfg partition in this space.
.NH 1
Update plan9.ini
.LP
Edit plan9.ini, replacing all references to /dev/sd06/arenas0 with /dev/fs/arenas0.
Add a variable pointing at the fscfg partition.
The boot process initialises the fs(3) driver if it sees this definition in plan9.ini.
Note the spelling of \fBfsconfig\fR.
.DS
.CW
fsconfig=/dev/sd04/fscfg
.DE
.NH 1
Create a fscfg file
.LP
Ensure any
.B mirror
lines list the fastest disk(s) first, as reads are always performed
from the first disk listed (assuming it returns no errors).
.DS
.CW
term% cat /tmp/fscfg.txt
fsdev:
mirror arenas0 /dev/sd06/arenas0 /dev/sd010/arenas0
.DE
.NH 1
Install fscfg
.LP
Put the fscfg info into /dev/sd04/fscfg;
there is no utility to do this but dd(1) will suffice.
.DS
.CW
cpu% dd -if /tmp/fscfg.txt -of /dev/sd04/fscfg -count 1
.DE
.NH 1
Edit venti config
.LP
Use venti/conf to read and write the configuration,
replacing all references to /dev/sd06/arenas0 with /dev/fs/arenas0.
.NH 1
Copy the disks
.DS
.CW
cpu% dd -if /dev/sd06/data -of /dev/sd010/data -bs 1024k
.DE
.NH 1
Reboot
.bp
.SH
Appendix B
.PP
On Venti and fossil cache sizes, by Russ Cox
.LP
.I
Suppose I have a fossil buffer of 1 Gb, 50 Gb of venti arenas, 0.75 Gb of ram,
and I want the machine to be basically a file server,
but still be able to run rio and a few other things without running out of memory.
How do I use the memory I have in the most efficient way?
.R
.LP
First decide how much memory you want for interactive use.
Suppose this is 256MB.
You probably want to set kernelpercent down to something small given how much memory you have.
Suppose you set it to 20%.
Then that leaves you 614MB.
Suppose you keep 102MB for yourself, leaving 512MB for fossil+venti.
.LP
Now the question is how to partition the 512MB.
If the Venti is used primarily for backing the fossil,
then it makes sense to give fossil most of the memory,
since fossil does its own caching of Venti reads/writes,
and reading even from the Venti cache is noticeably slower
than satisfying requests entirely from the fossil cache.
.LP
I would give 8MB to each of Venti's uses and leave the rest for fossil:
.DS
.CW
venti -B 8M -C 8M -I 8M
open -c 62464
.DE
62464 is (512-8*3)*1024*1024/8192, assuming you have an 8k block size.
It is probably wrong that -c takes a block count instead of bytes like the others.
.LP
I've been running with the config suggested in the wiki,
8M for each venti guy and also 8M (the default 1000 blocks) for fossil.
I have been meaning to switch to some small amount of cache for Venti
and more cache for fossil.
I think that will help things a bit.
.DS
.CW
venti -B 1M -C 1M -I 1M
open -c 3712
.DE
seems like a much better use of the 32MB.
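.LP
The block counts passed to \fBopen -c\fR can be checked with a short calculation.
The following is only a sketch, assuming the 8k block size mentioned above
and using hoc(1) for the arithmetic;
the amount subtracted in each case is the total given to venti's three caches.
.DS
.CW
cpu% hoc -e '(512-3*8)*1024*1024/8192'
62464
cpu% hoc -e '(32-3*1)*1024*1024/8192'
3712
.DE
The same expression can be reused for other memory budgets
by changing the total and the per-cache size.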