An olpcfs experience report

C. Scott Ananian cscott at laptop.org
Fri Apr 25 12:12:28 EDT 2008


On Fri, Apr 25, 2008 at 10:24 AM, Joshua N Pritikin <jpritikin at pobox.com> wrote:
>  I wish I had more time to study the implementation.
>
>  What is stored in Berkeley DB? Only the indexes? The dependency on
>  Berkeley DB makes me nervous. I recall the Subversion Berkeley DB
>  backend suffered from the occasional need to recover the database in a
>  non-automatic way.

The current implementation is primarily intended as a
proof-of-concept.  Many of the data structures would be more-or-less
similar in any implementation, but things like the choice of Berkeley
DB are certainly not fixed.

In the current olpcfs1 implementation, all metadata is stored in
Berkeley DB; the actual file contents are stored in a simple
content-addressable store.
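
To make the split concrete, here is a minimal sketch of a
content-addressable store in Python.  It is not the olpcfs1 code: the
ContentStore class, the use of SHA-256, and the one-file-per-digest
layout are all assumptions chosen for illustration, and a plain dict
stands in for the metadata database.

```python
import hashlib
import os
import tempfile


class ContentStore:
    """Toy content-addressable store (hypothetical, not olpcfs1's layout).

    Each blob is saved in a file named after the SHA-256 digest of its bytes,
    so identical contents are stored only once.
    """

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, data: bytes) -> str:
        """Store data and return its key (the hex digest of the contents)."""
        key = hashlib.sha256(data).hexdigest()
        path = os.path.join(self.root, key)
        if not os.path.exists(path):
            # Write to a temporary file first, then rename it into place,
            # so a crash never leaves a half-written blob under a valid key.
            fd, tmp = tempfile.mkstemp(dir=self.root)
            with os.fdopen(fd, "wb") as f:
                f.write(data)
            os.replace(tmp, path)
        return key

    def get(self, key: str) -> bytes:
        """Return the bytes previously stored under key."""
        with open(os.path.join(self.root, key), "rb") as f:
            return f.read()


# Metadata lives elsewhere (Berkeley DB in olpcfs1; a dict in this sketch)
# and refers to file contents only by key.
store = ContentStore(tempfile.mkdtemp())
metadata = {}
metadata["notes.txt"] = {"content": store.put(b"hello world\n")}
metadata["copy.txt"] = {"content": store.put(b"hello world\n")}
```

Both metadata entries end up pointing at the same key, so the store
holds a single copy of the bytes.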

My experience with Berkeley DB is that it is very sensitive to
programming errors.  If you mishandle it, it *will* corrupt your
database, and that will likely require manual intervention.  But if
you use the API properly, BDB seems extremely robust and
fault-tolerant.

My original commits for olpcfs1 used a very complicated multi-index
scheme on some of the databases.  BDB is pretty intolerant of your
mucking around in the secondary indices, especially during iteration.
I certainly had some frustrating BDB experiences during this period.
Upgrading to the latest versions of the BDB bindings made these
problems more transparent to me (since I got failures more directly
correlated with the places I was doing the Bad Things), and eventually
(for other reasons) I switched to a much simpler single-index scheme.
This minimized my ability to inadvertently do things which violated
BDB's API, and BDB has been much happier since.  Reverting to the
'stock' versions of the BDB bindings didn't change BDB's happiness.
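
The discipline that kept me out of trouble can be sketched without
any BDB calls at all.  The following is only an analogy in plain
Python: the Table class and its method names are made up for this
example, and dicts stand in for the primary database and a derived
secondary index.  The two rules it illustrates are (1) never write to
the secondary index directly, only through the primary, and (2) take
a snapshot of the keys before deleting while iterating.

```python
class Table:
    """Hypothetical primary table with one derived secondary index.

    Plain dicts stand in for the databases; this does not use the BDB API.
    """

    def __init__(self):
        self.primary = {}    # key -> owner
        self.by_owner = {}   # owner -> set of keys (secondary index)

    def put(self, key, owner):
        """Insert or update a record; the index is updated as a side effect."""
        self.delete(key)
        self.primary[key] = owner
        self.by_owner.setdefault(owner, set()).add(key)

    def delete(self, key):
        """Remove a record and its index entry; a missing key is a no-op."""
        if key not in self.primary:
            return
        owner = self.primary.pop(key)
        keys = self.by_owner[owner]
        keys.discard(key)
        if not keys:
            del self.by_owner[owner]

    def delete_owner(self, owner):
        """Delete every record belonging to owner."""
        # Snapshot the keys first: deleting while iterating over the live
        # index is the kind of thing that gets you into trouble.
        for key in list(self.by_owner.get(owner, ())):
            self.delete(key)


t = Table()
t.put("a", "x")
t.put("b", "x")
t.put("c", "y")
t.delete_owner("x")
```

Iterating over the live set in delete_owner instead of the snapshot
makes Python raise a RuntimeError as soon as the set changes size,
which is at least a loud failure.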

So, in summary: I fully understand where the BDB horror stories come
from, but in my (limited) experience the blame is primarily on the
users of the APIs, and secondarily on the BDB documentation, which
doesn't always spell out its intolerance for mucking around with the
secondary indices.  If you use BDB correctly, it seems to me to be
pretty solid and mature -- a conclusion which is consistent with the
long list of BDB users.
 --scott

-- 
 ( http://cscott.net/ )


