A different proposal for XO upgrade.

Mon Jun 25 17:51:19 EDT 2007

As food for discussion, here's a counter proposal for an XO upgrade
mechanism, focusing on the network side.
---------
Design goals:
 - minimize round-trips necessary for successful upgrade
 - minimize size of upgrades

Software versions are assigned sequential integers.  Version 1, 2,
etc.  We assume that every machine should discover an available
upgrade within a day, on average.  The constants below can be tweaked
to adjust this interval.

Upgrades consist of a number of messages U_0 through U_N with a
maximum size Umax.  The maximum size is chosen either so that a) each
U_i fits in a single network packet, or b) each U_i can fits into a
reserved storage area on the XO (to allow it to be shared with
neighbors).  Upgrade messages are independent.  It may be worthwhile
for upgrade messages to be applied in any order, but for simplicity
we'll constrain the order of application to be sequential.  U_i is
authenticated and self-labeling: it contains a signature as well as a
field specifying the version # of the upgrade as well as i, the
sequence number of the upgrade message.

Part 1: discovering an upgrade.
  Every hour, the XO looks to see how many mesh neighbors it has,
  counting itself.  Call this number N.  N is always at least 1.
  It then generates a random number in [0, N*24).  If the number is
  0, does a DNS lookup on update.laptop.org to check for a new
  version.  (Note that the DNS lookup may be cached by the mesh portal
  or school server.)

  XOs which have version V of the base system also listen to the
  multicast address xxxx:yyyy:zzzz::<V+1>

Part 2: announcing the upgrade.
  If you are the person to discover a new version V', you are the
  Upgrade Leader.  While the version on your laptop (V) is less than V' do
  the following:
    a) Set i=0.  Stop listening to the multicast address.
    b) download U_i for (V+1) via http from update.laptop.org (again,
       these queries are often cached by a school server).  Apply it
       to our local copy (using COW techniques to prevent the
       application from being globally visible until the upgrade is complete)
    c) Broadcast U_i to the multicast address xxxx:yyyy:zzzz::<V+1>.
    d) If U_i is not the last upgrade message, increment i, and repeat
       from step b)
    e) Otherwise, atomically make our updates visible to the system,
       increment V, and restart part 2 if necessary.
    f) Start listening to the multicast address xxxx:yyyy:zzzz::<V'+1>

Part 3: casual bystanders.
  If you hear a piece of the upgrade U_i on the multicast address
  xxxx:yyyy:zzzz::<V+1>:
    a) If there is a piece U_j between U_0 and U_i which you have not
       already heard, ignore this packet.
    b) Otherwise, apply U_i and (re)set the Upgrade Timer to expire
       some random time in the future (some 10s of minutes).
    c) If U_i was the last upgrade message, atomically make the
       updates visible, delete the Upgrade timer, increment our local
       version, and rebind our multicast listening address to
       xxxx:yyyy:zzzz::<V+1> for our new version V.
    d) If the Upgrade Timer expires, become a new Upgrade Leader:
       set i = the smallest i such that we don't have U_i,
       and jump to Part 2 step b.

----------------------
Structure of upgrade messages:

<Version # from>
<Version # to>
<Sequence #>
<final message flag?>
<msg length>
1 or more of:
  <filename to patch>
  <pre-patch hash>
  <post-patch hash>
  <bsdiff format diff>
<signature>

We might keep a git-style manifest for system, patching this like any
other file.  This can be used like fsck to sanity-check/verify the
end-result.

Further, although the 'from' version and 'to' version are separated by
exactly 1 in typical usage, a sequence of upgrade messages can be used
to perform any desired amount of up/down grade.  I would propose
keeping 'up-by-one' and 'down-by-one' upgrade message sequences on
update.laptop.org, so that the user can downgrade to any chosen
revision.  We might either store 'skip-by-N' sequences on the server
(for N=1,2,4,8,16,32,...) or generate upgrade messages on the fly if
we want faster/further upgrades/downgrades.

---------------------
Notes:
  a) the school server can trigger updates on all machines by becoming
     an Upgrade Leader and broadcasting bits.
  b) Upgrades are distributed very efficiently over the air if
     mesh broadcast is working (ie, not too many nodes drop packets).  It
     could be even more efficient if we were allowed to cache entire
     upgrade message sequences on the XO and/or apply upgrade messages
     out-of-order.
  c) The Upgrade Timer timeout should probably depend on the number of
     messages in the upgrade and the # of the message you lost.
  d) Although I've described the process above as an automatic
     upgrade, once the sequence of upgrades has been applied, the actual
     atomic swap of current-tree to upgraded-tree could be deferred until
     some point in the future.
  e) Limiting the size of the upgrade messages U_i is primarily done
     by limiting the # of files patched by a particular message.
     However, we can also split bsdiff patches: bsdiff messages
     consist of a control block specifying a # of operations to
     perform on the input file to create the output file.  We can
     split the control block into pieces to create multiple smaller bsdiffs
     from a monolithic diff.  This is where it would be tricky to
     allow upgrade messages to be applied in any order.
  f) If we distribute any gzipped files (do we?) we should be sure to
     use the --rsyncable option to gzip to ensure efficient binary
     diffs.
-- 
                         ( http://cscott.net/ )