[PATCH] Maintain a metadata copy outside the index (was Re: Datastore & backup - request for help)
tomeu at tomeuvizoso.net
Thu May 22 04:53:38 EDT 2008
On Thu, May 22, 2008 at 5:45 AM, Jameson Chema Quinn
<jquinn at cs.oberlin.edu> wrote:
> Yay, I am happy about this patch (when there is a patch :)
>> > - at every create and update, a json file is created next to the
>> > object's file,
> I definitely think it should be in the same directory as the object file,
> with a related name. It might even be worth using the macintosh ._name
> naming convention.
> (Note that when we have directories as bundles, bundle-level metadata can
> live in a ._. file. If all bundles had some kind of manifest, then any
> subfiles which are used separately could grow their own metadata in
> ._subfile ; as long as that file were not in the manifest, it would not be
> packed up when exporting the bundle to foreign storage.)
In the proposed solution, the storage dir is still not very friendly
to browsing: files are not named after their titles, and the structure
is flat, with no search or filtering. AFAIK, these OS X conventions
are aimed at making life easier for people who browse the actual
files with ordinary tools.
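
To make the per-object metadata copy discussed above concrete, here is a
rough sketch, not the actual patch. It assumes the "._name" convention
suggested above, and it uses the standard library json module rather than
simplejson or cjson. The function names are hypothetical.

```python
import json
import os
import tempfile


def metadata_path(object_path):
    """Return the sidecar path: same directory, '._' prefixed name (assumed)."""
    directory, name = os.path.split(object_path)
    return os.path.join(directory, '._' + name)


def write_metadata(object_path, metadata):
    """Write metadata as JSON next to the object, replacing any old copy."""
    target = metadata_path(object_path)
    # Write to a temp file first so a crash never leaves a half-written copy.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(target) or '.')
    with os.fdopen(fd, 'w') as f:
        json.dump(metadata, f)
    os.replace(tmp, target)
    return target


def delete_object(object_path):
    """Remove the object and its metadata copy together."""
    for path in (object_path, metadata_path(object_path)):
        if os.path.exists(path):
            os.remove(path)
```

write_metadata() would be called on every create and update, and
delete_object() covers the "deleted along with the object" step.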
>> > - it's also deleted along with the object,
>> > - at startup, if the file <datastore_path>/.metadata.exported doesn't
>> > exist, check how many objects need to get their metadata exported
>> > (0.8s for 3000 entries)
>> That's pretty good.
>> > - in an idle callback, process each of those objects one per iteration
>> > (3ms per entry with simplejson, 2ms with cjson).
>> Exporting a few 100 per iteration probably is more efficient ;-)
> This brings up the issue of TamTam imperfect timing - it would be great if
> there were some way to turn off all unnecessary background CPU use for cases
> like TamTam. If so, I'd say 12*3ms is about the right size for a background
> click every second or two.
Remember that this export process will happen only on the first boot
after an upgrade, and will hopefully not take long.
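
As an illustration of the one-time export described above, here is a
minimal sketch under stated assumptions. The marker name comes from the
proposal, and the default batch size of 12 is the figure suggested in
this thread. The index dict (uid to metadata), the "<uid>.metadata" file
name and both function names are hypothetical.

```python
import json
import os

MARKER = '.metadata.exported'  # marker name taken from the proposal


def pending_objects(datastore_path, index):
    """Return uids whose metadata file is missing; [] if the marker exists."""
    if os.path.exists(os.path.join(datastore_path, MARKER)):
        return []
    return [uid for uid in index
            if not os.path.exists(
                os.path.join(datastore_path, uid + '.metadata'))]


def export_batch(datastore_path, index, pending, batch_size=12):
    """Export up to batch_size entries; return True while work remains."""
    for _ in range(min(batch_size, len(pending))):
        uid = pending.pop()
        path = os.path.join(datastore_path, uid + '.metadata')
        with open(path, 'w') as f:
            json.dump(index[uid], f)
    if pending:
        return True
    # All done: drop the marker so later startups skip the scan.
    open(os.path.join(datastore_path, MARKER), 'w').close()
    return False
```

The True/False return value matches what an idle callback expects, so
export_batch() could be handed to something like gobject.idle_add() and
would stop being called once the marker has been written.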
>> > In my tests this has worked quite well, but I have one concern: can
>> > something bad happen if we have 20k files in the same dir (for a
>> > journal with 10k entries)?
>> Ok, we can split it into a subdir (which will only have 10K files then).
>> If there's a cost to large dirs in jffs2 then we can use hashed dirs,
>> and that change will be needed for both the main datastore storage
>> _and_ the metadata files.
But using hashed dirs will make browsing the actual files more
cumbersome for the occasional observer.
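
For reference, one common way to build hashed dirs looks roughly like
this. It is only a sketch: the choice of MD5, the two-character prefix
and the function name are assumptions, not something agreed on in this
thread.

```python
import hashlib
import os


def hashed_path(root, uid, name=None):
    """Map a uid to root/<first two hex digits of its MD5>/<name or uid>."""
    bucket = hashlib.md5(uid.encode('utf-8')).hexdigest()[:2]
    return os.path.join(root, bucket, name or uid)
```

A two-character prefix gives 256 subdirs, so 20k files would average
fewer than 100 per dir, but finding a given entry by hand then means
computing its hash first, which is the browsing cost mentioned above.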