XO in-field upgrades

C. Scott Ananian cscott at laptop.org
Sun Jun 24 13:24:28 EDT 2007


On 6/24/07, Ivan Krstić <krstic at solarsail.hcs.harvard.edu> wrote:
> I should have a concrete spec ready for discussion later today.

I will wait with bated breath. =)

Some concrete concerns -- I've got some answers to these, but I'll try
to just present the questions at this point:
  a) Robustness -- what can break our upgrade mechanism?  Do we have a
fallback?  What dependencies does the upgrade mechanism have?
  b) Current vserver code requires restarting the containers when
files inside the container are modified by the root context.  There is
also a relinking process necessary.  Have we thought through these
interactions?
  c) How do XOs know there is an upgrade available?  Does this
query/notification work both with and without a school server?
  d) We can't afford to have all the XOs contact Cambridge directly to
get updates.  Likewise, if the XOs get all their updates from the
school server, the machines closest to the mesh portal will be passing
the same bits around redundantly.  Can we use mesh topology to
improve distribution?
  e) How small/fast can we make an upgrade, to patch (say) a specific
0-day in one of our applications?  If we can fit a patch into a single
network packet, our distribution success probability should increase
dramatically.
  f) Does our mechanism scale gracefully for both small (0-day
security) and large (FC7-to-FC8) upgrades?
  g) Related to the above three: I've been told that the XOs can't
possibly afford the space to store upgrades in order to be able to
provide them to neighbors.  I'm not sure I agree with this, but this
concern should be addressed.
  h) What happens if we need to patch the kernel to fix a security
problem?  What if we need to patch firmware?
  i) What's the development process like?  Can we easily create and
test upgrades?
  j) Is the "base system" upgrade mechanism related in any way to the
activity upgrade mechanism?  If so, activity upgrades have a whole
'nuther set of concerns.  If not, how can we avoid reinventing the
same wheels (esp. with regard to distributing an activity upgrade
efficiently to a classroom full of machines)?
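
To make question (e) concrete, here is a rough sketch of a size check
for a candidate patch.  The payload budget is an assumption (a
1500-byte MTU minus 20 bytes of IPv4 header and 8 bytes of UDP
header); the real limit on the mesh may well differ, and the function
name is made up for illustration.

```python
import zlib

# Assumption: 1500-byte MTU, minus a 20-byte IPv4 header and an
# 8-byte UDP header.  The actual mesh framing may leave less room.
ASSUMED_MAX_PAYLOAD = 1500 - 20 - 8


def fits_in_one_packet(patch: bytes,
                       max_payload: int = ASSUMED_MAX_PAYLOAD) -> bool:
    """Return True if the zlib-compressed patch fits in one payload."""
    compressed = zlib.compress(patch, 9)
    return len(compressed) <= max_payload
```

This ignores signatures and any metadata the upgrade format would
need, so it gives an upper bound on what fits, not a design.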

Since I can't resist a few stabs at answers:
 a) I'd like to see two independent upgrade mechanisms available, one
of them as simple as practical and hard-coded into firmware.
 d) If upgrades are broadcast to the mesh from the portal, the flood
fill will ensure the bits efficiently reach most of the machines.
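
A toy model of the flood fill in (d), assuming an idealized mesh with
no packet loss or collisions where every machine rebroadcasts an
upgrade exactly once.  The topology and the names below are invented
for illustration.  It compares the number of transmissions against
sending each machine its own copy hop by hop.

```python
from collections import deque


def flood(mesh, portal):
    """Breadth-first flood from the portal; returns hop count per node."""
    hops = {portal: 0}
    queue = deque([portal])
    while queue:
        node = queue.popleft()
        for neighbor in mesh.get(node, ()):
            if neighbor not in hops:  # each node accepts the bits once
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def transmissions(hops):
    """Return (broadcast, unicast) transmission counts."""
    broadcast = len(hops)         # every reached node transmits once
    unicast = sum(hops.values())  # each copy is relayed hop by hop
    return broadcast, unicast


# Hypothetical five-node mesh; "portal" is the mesh portal.
mesh = {
    "portal": ["a", "b"],
    "a": ["portal", "c"],
    "b": ["portal", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
hops = flood(mesh, "portal")
broadcast, unicast = transmissions(hops)
```

In this model the gap between the two counts grows as the mesh gets
deeper, since unicast pays for every hop of every copy.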

I hope that listing these issues is helpful.
 --scott

-- 
                         ( http://cscott.net/ )

