An olpcfs experience report

C. Scott Ananian cscott at laptop.org
Fri Apr 25 12:38:59 EDT 2008


On Fri, Apr 25, 2008 at 12:30 PM, Joshua N Pritikin <jpritikin at pobox.com> wrote:
> On Fri, Apr 25, 2008 at 12:12:28PM -0400, C. Scott Ananian wrote:
>  > In the current olpcfs1 implementation, all metadata is stored in
>  > Berkeley DB; the actual file contents are stored in a simple
>  > content-addressable store.
>
>  You say "content-addressable store," but what does that mean
>  implementation-wise? Does that mean that OLPCFS hashes the content of
>  the file into a metadata tag when the file is closed?

Yes.  The code is not very difficult; you should read it.
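
For readers who have not met the term, here is a minimal sketch of what a
content-addressable store can look like. It is not the olpcfs code: the
SHA-1 choice, the flat directory layout, and the names store_blob and
load_blob are all assumptions made for illustration.

```python
import hashlib
import os


def store_blob(store_dir, data):
    """Write data under its own digest and return that digest.

    Hypothetical helper: the returned digest is what a metadata record
    would keep in order to find the contents again.
    """
    digest = hashlib.sha1(data).hexdigest()  # assumed hash; any strong hash works
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):  # identical contents are stored only once
        with open(path, "wb") as f:
            f.write(data)
    return digest


def load_blob(store_dir, digest):
    """Read back the contents that were stored under digest."""
    with open(os.path.join(store_dir, digest), "rb") as f:
        return f.read()
```

The file name is derived from the contents, so two files with identical
bytes share one blob, and the metadata side only needs to remember the
digest.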

>  In particular, I am wondering what happens when I save a movie made with
>  Record (i.e. save a big file). How much additional overhead does OLPCFS
>  add?

Benchmark it.

>  I presume that the file is not copied an additional time. However, does
>  OLPCFS need to read the file again to calculate the hash? If it does,
>  can it (eventually in version 2) do this in the background?

It could update the hash incrementally as the file is written, or use
any one of a number of other strategies.
Hashing the entire 1 GB NAND filesystem takes about two minutes; most of
that is I/O time, not hashing time.  If you hash as you write, you
don't have to pay the I/O time.
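
A rough sketch of the hash-as-you-write strategy, to make it concrete.
This is an illustration and not what olpcfs1 does: the HashingWriter
class is hypothetical and SHA-1 is an assumed choice of hash.

```python
import hashlib


class HashingWriter:
    """Wrap a writable binary file object and hash every chunk on the way through.

    Hypothetical wrapper: because the digest is updated during write(),
    the file never has to be read back to compute its hash.
    """

    def __init__(self, fileobj):
        self._f = fileobj
        self._h = hashlib.sha1()  # assumed hash function

    def write(self, chunk):
        self._h.update(chunk)  # fold the chunk into the running digest
        return self._f.write(chunk)

    def close(self):
        """Close the underlying file and return the hex digest of all data written."""
        self._f.close()
        return self._h.hexdigest()
```

Writing a large recording through such a wrapper in chunks gives the same
digest as hashing the finished file in one pass, without the second read.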

>  > If you use BDB correctly, it seems to me to be pretty solid and mature
>  > -- a conclusion which concurs with the long list of BDB users.
>
>  Sure, but BDB would be even more trustworthy if I could 'rm -rf' the
>  database and regenerate it from scratch without losing data. If metadata
>  was stored in POSIX xattrs then it would seem like your design permits
>  this mode of operation.

Yes, it does, but I doubt we'd want to waste precious NAND space on
storing this data redundantly.  Perhaps.

>  I have seen enough instances of "foolproof" databases that I think we
>  should minimize reliance on them.

Yes, certainly I refuse to use GMail because there's a database behind
it.  Or Firefox for that matter.  Or Linux, since it maintains many
internal databases, and they're not even standard, well-tested
userspace ones!  And don't get me started on reiserfs.  Or ext2.  Or
ext3!
 --scott

-- 
 ( http://cscott.net/ )


