[Code-review] [PATCH] Maintain a metadata copy outside the index.

Tomeu Vizoso tomeu at tomeuvizoso.net
Fri Jun 20 06:00:26 EDT 2008


On Thu, Jun 19, 2008 at 5:51 PM, Michael Stone <michael at laptop.org> wrote:
> On Thu, Jun 19, 2008 at 01:09:33PM +0200, Tomeu Vizoso wrote:
>> > What happens if our _store_metadata() function dies with something like
>> > ESPACE?
>>
>> The laptop won't boot anymore? ;) What do you suggest to do here?
>
> The big things I can think of are:
>
>  a) not to kill Sugar. Would an exception here bring down the rest of
>  the system?

No, the exception will be contained inside the event handler, so:

- if happens in the initial dump, will abort it and will try again on next boot,
- if happens during a normal create() or update(), that operation will
be aborted and the activity will receive the exception across dbus.

>  b) to inform an in-flight backup process that we're out of space so
>  that it can make its own decision about whether to continue.

Martin, any ideas?

> The thing I'm getting at is that we may well be invoking the backup
> software _because_ the system is in a crazy state and we want to try to
> save our data. It would be really good if this were feasible to achieve.
> (I'm not quite sure whether it is.)

In principle, all metadata will be in json files after the first boot
after update, so we're quite well in this regard. Unless flash got
full because of the update, that I hope won't happen.

> Incidentally, why do we hardcode saving the .metadata files into the
> 'base' DS directory? Sure, it's a sensible default setting, but if I had
> a USB key (er, mass storage device) mounted and handy, I'd be really
> interested in being able to easily dump my DS onto it. In that
> circumstance, it would seem to me to make the most sense for the backup
> software to instruct the DS to write metadata straight into a directory
> on the USB key. Thoughts?

Well, metadata is being put next to the files they relate to. I'm
supposing here that we only are interested in the metadata if we have
the files and vice versa.

> New patch
> ---------
>
>> +try:
>> +    import cjson
>> +except ImportError:
>> +    import simplejson
>
> ...
>
>>      # File Management API
>> +    def _encode_json(self, metadata, file_path):
>> +        if has_cjson:
>> +            f = open(file_path, 'w')
>> +            f.write(cjson.encode(metadata))
>> +            f.close()
>> +        else:
>> +            simplejson.dump(metadata, open(file_path, 'w'))
>> +
>
> What do you think of the style of the following helper? (Notice that it
> takes the file object instead of the path.)
>
> def write_json_to_file(filelike, close=True):
>    try:
>        filelike.write(cjson.encode(metadata))
>        filelike.close()
>    except NameError:
>        simplejson.dump(metadata, filelike)

How is it to be called? What's the close param for? What if the write
fails, do we know when the file will be closed? NameError could be
raised by any other reason? Is dump() going to close it?

> Also, what permissions are going to be assigned to the .metadata files?
> Are these permissions correct?

644, looks like. Which permissions should it have? Same as the data file (604)?

>> +        # Check that all entries have their metadata in the file system.
>> +        if not os.path.exists(os.path.join(self.base, '.metadata.exported')):
>
> Could you explain why testing for the absence of this flag file is the
> appropriate way to determine whether we need to do a full metadata
> export? I think it's correct but I'd still like to hear the reasoning
> spelled out.

This patch does two things:

- maintain an external representation of the metadata in a file next
to the data file,
- generate that metadata file for all the existing entries after update.

Checking whether any entry needs its metadata to be exported is an
expensive operation that is saved by just testing for that file, that
is created when we succeed in exporting all the existing entries.

>> +            uids_to_export = []
>> +            uids = self.indexmanager.get_all_ids()
>> +
>> +            t = time.time()
>
> This time measurement seems to be unused. (Incidentally, have you
> pylinted the resulting code?)

Right, will give it a try now.

Thanks,

Tomeu


More information about the Code-review mailing list