System update spec proposal

C. Scott Ananian cscott at cscott.net
Tue Jun 26 18:50:37 EDT 2007


On 6/26/07, Christopher Blizzard <blizzard at redhat.com> wrote:
> A note about the history of using rsync.  We used rsync as the basis for
> a lot of the Stateless Linux work that we did a few years ago.  That
> approach (although using LVM snapshots instead of CoW snapshots)
> basically did exactly what you've proposed here.  And we used to kill
> servers all the time with only a handful of clients.  Other people
> report that it's easy to take out other servers using rsync.  It's
> pretty fragile and it doesn't scale well to entire filesystem updates.
> That's just based on our experience of building systems like what you're
> suggesting here and how we got to where we are today.

I can try to get some benchmark numbers to validate this one way or
the other.  My understanding is that rsync is a memory hog because it
builds the complete list of filenames to sync before doing anything;
'killing servers' would then be the servers running out of memory.
Rsync 3.0 claims to fix this problem, and it may also be mitigated by
the relatively small scale of our use: my laptop's debian/unstable
build has 1,345,731 files.  Rsync documents using about 100 bytes per
file, so that's ~134M of core required per client.  It's not hard to
see that 10 clients or so would tax a machine with 1G of main memory.
In contrast, XO build 465 has 23,327 files: ~2.3M of memory per
client.  100 kids simultaneously updating equals ~233M of memory,
which is well within our specs for the school server.  Almost two
orders of magnitude fewer files for the XO vs. a 'standard'
distribution ought to fix the scaling problem, even without moving to
rsync 3.0.
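
The arithmetic above can be sketched as a quick back-of-the-envelope
script.  This assumes rsync's documented ~100 bytes of server memory
per file in the transfer list; `server_memory` is a hypothetical
helper for illustration, not anything in rsync itself:

```python
# Back-of-the-envelope estimate of rsync server memory use,
# assuming ~100 bytes of file-list memory per file (the figure
# given in the rsync documentation).
BYTES_PER_FILE = 100

def server_memory(files_per_client: int, clients: int) -> int:
    """Total bytes of file-list memory for simultaneous clients."""
    return files_per_client * BYTES_PER_FILE * clients

# Full debian/unstable tree, 1,345,731 files, 10 clients:
debian = server_memory(1_345_731, 10)
print(f"debian, 10 clients: {debian / 1e9:.2f} GB")  # ~1.35 GB

# XO build 465, 23,327 files, 100 clients:
xo = server_memory(23_327, 100)
print(f"XO, 100 clients:    {xo / 1e6:.0f} MB")      # ~233 MB
```

So ten desktop-scale clients already exceed a 1G server, while a
hundred XO-scale clients stay comfortably within it.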
 --scott

-- 
                         ( http://cscott.net/ )
