[PATCH] Maintain a metadata copy outside the index (was Re: Datastore & backup - request for help)

Martin Langhoff martin.langhoff at gmail.com
Wed May 21 19:13:23 EDT 2008

On Thu, May 22, 2008 at 6:14 AM, Tomeu Vizoso <tomeu at tomeuvizoso.net> wrote:
> the patch attached maintains a copy of the metadata of each object
> outside the xapian index. How it works:

Fantastic. Except that... erm... arhm... you forgot the patch ;-)

> - at every create and update, a json file is created next to the object's file,
> - it's also deleted along with the object,
> - at startup, if the file <datastore_path>/.metadata.exported doesn't
> exist, check how many objects need to get their metadata exported
> (0.8s for 3000 entries)

That's pretty good.
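
For reference, the scheme described so far could be sketched roughly as below. This is only an illustration, not the patch itself: the `.metadata` suffix, the flat directory layout and all function names are assumptions. The `.metadata.exported` marker name is the one from the description above, and the stdlib `json` module stands in for simplejson/cjson.

```python
import json
import os
import tempfile

MARKER = ".metadata.exported"   # marker file name from the description above
SUFFIX = ".metadata"            # hypothetical suffix for the per-object JSON file


def metadata_path(object_path):
    """Path of the JSON file kept next to the object's file."""
    return object_path + SUFFIX


def write_metadata(object_path, metadata):
    """Called on every create/update: write the JSON copy atomically."""
    target = metadata_path(object_path)
    directory = os.path.dirname(target) or "."
    # Write to a hidden temp file first so a crash never leaves a half-written copy.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    with os.fdopen(fd, "w") as f:
        json.dump(metadata, f)
    os.replace(tmp, target)
    return target


def delete_object(object_path):
    """Remove the object and its metadata copy together."""
    for path in (object_path, metadata_path(object_path)):
        if os.path.exists(path):
            os.remove(path)


def pending_exports(datastore_path):
    """At startup: list objects that still lack a metadata copy.

    Returns an empty list if the marker file says the export already ran.
    """
    if os.path.exists(os.path.join(datastore_path, MARKER)):
        return []
    pending = []
    for name in sorted(os.listdir(datastore_path)):
        path = os.path.join(datastore_path, name)
        if name.startswith(".") or name.endswith(SUFFIX) or not os.path.isfile(path):
            continue
        if not os.path.exists(metadata_path(path)):
            pending.append(path)
    return pending
```

Writing via a temp file plus rename is my addition; it keeps the copy usable for recovery even if we lose power mid-write.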

> - in an idle callback, process each of those objects one per iteration
> (3ms per entry with simplejson, 2ms with cjson).

Exporting a few hundred per iteration is probably more efficient ;-)
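
Something along these lines is what I have in mind. It is a rough sketch with made-up names (`make_idle_exporter`, `export_one`) and a batch size of 200 picked out of thin air. It relies only on the usual idle-callback convention that returning True keeps the callback scheduled and returning False removes it.

```python
def make_idle_exporter(pending, export_one, batch_size=200):
    """Build an idle callback that exports `batch_size` objects per call.

    `pending` is any iterable of objects still to export and `export_one`
    is whatever writes a single metadata copy (both hypothetical here).
    """
    queue = list(pending)

    def on_idle():
        # Take the next batch off the front of the queue.
        batch = queue[:batch_size]
        del queue[:batch_size]
        for item in batch:
            export_one(item)
        # True: call me again on the next idle iteration. False: all done.
        return bool(queue)

    return on_idle
```

The callback would then be handed to the main loop's idle_add. The right batch size is a trade-off between per-iteration overhead and keeping the UI responsive, so it wants measuring on real hardware.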

> In my tests this has worked quite well, but I have one concern: can
> something bad happen if we have 20k files in the same dir (for a
> journal with 10k entries)?

OK, we can split the metadata files off into a subdir (each directory will then only hold 10K files).

If there's a cost to large dirs in jffs2 then we can use hashed dirs,
and that change will be needed for both the main datastore storage
_and_ the metadata files.
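
By hashed dirs I mean something like the sketch below. The function name, the choice of SHA-1 and the two-hex-character bucket width are all assumptions for illustration; any stable hash of the object id would do.

```python
import hashlib
import os


def hashed_path(root, uid, depth=1, width=2):
    """Map an object id to root/<bucket>/.../<uid>.

    With depth=1 and width=2 there are 256 buckets, so 20k files
    end up as roughly 80 files per directory.
    """
    digest = hashlib.sha1(uid.encode("utf-8")).hexdigest()
    # Take `depth` consecutive slices of the digest as directory names.
    buckets = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return os.path.join(root, *buckets, uid)
```

The same function can place both the object's file and its metadata copy, so the two stay next to each other.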

> One side effect of this is that when (if) we agree on a new on-disk
> data structure for the DS, it will be easier to convert than if we had
> to extract all the metadata from the index.

Yes. And as you said earlier, easy recovery if xapian goes to la-la land.


 martin.langhoff at gmail.com
 martin at laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
