[Code-review] [PATCH] Maintain a metadata copy outside the index.

Michael Stone michael at laptop.org
Thu Jun 19 11:51:28 EDT 2008


On Thu, Jun 19, 2008 at 01:09:33PM +0200, Tomeu Vizoso wrote:
> No, will have to be added to the spec, I guess.

> Well, ideally we would choose the one that fulfills best our needs and
> just use that. Using cjson will save us some milliseconds that may be
> nice to save but won't change much anything. Any comments?

Probably best to require both libraries for testing. If we can
demonstrate that nothing needs simplejson, then we can do a new
release of the RPM without much hassle.

> > I take it that we're not worried about the effect of concurrent writers
> > on our stored metadata?
> 
> The one stored in xapian or in json files? Anyway, both storages are
> to be written only by the DS' main thread.

At present. :)

> > What happens if our _store_metadata() function dies with something like
> > ESPACE?
> 
> The laptop won't boot anymore? ;) What do you suggest to do here?

The big things I can think of are:

  a) not to kill Sugar. Would an exception here bring down the rest of
  the system?

  b) to inform an in-flight backup process that we're out of space so
  that it can make its own decision about whether to continue.

The thing I'm getting at is that we may well be invoking the backup
software _because_ the system is in a crazy state and we want to try to
save our data. It would be really good if this were feasible to achieve.
(I'm not quite sure whether it is.)

Incidentally, why do we hardcode saving the .metadata files into the
'base' DS directory? Sure, it's a sensible default setting, but if I had
a USB key (er, mass storage device) mounted and handy, I'd be really
interested in being able to easily dump my DS onto it. In that
circumstance, it would seem to me to make the most sense for the backup
software to instruct the DS to write metadata straight into a directory
on the USB key. Thoughts?

New patch
---------

> +try:
> +    import cjson
> +except ImportError:
> +    import simplejson

...

>      # File Management API
> +    def _encode_json(self, metadata, file_path):
> +        if has_cjson:
> +            f = open(file_path, 'w')
> +            f.write(cjson.encode(metadata))
> +            f.close()
> +        else:
> +            simplejson.dump(metadata, open(file_path, 'w'))
> +

What do you think of the style of the following helper? (Notice that it
takes the file object instead of the path.)

def write_json_to_file(filelike, close=True):
    try:
        filelike.write(cjson.encode(metadata))
        filelike.close()
    except NameError:
        simplejson.dump(metadata, filelike)

Also, what permissions are going to be assigned to the .metadata files?
Are these permissions correct?    

> +        # Check that all entries have their metadata in the file system.
> +        if not os.path.exists(os.path.join(self.base, '.metadata.exported')):

Could you explain why testing for the absence of this flag file is the
appropriate way to determine whether we need to do a full metadata
export? I think it's correct but I'd still like to hear the reasoning
spelled out.

> +            uids_to_export = []
> +            uids = self.indexmanager.get_all_ids()
> +
> +            t = time.time()

This time measurement seems to be unused. (Incidentally, have you
pylinted the resulting code?)

Michael


More information about the Code-review mailing list