[Server-devel] [PATCH] ds_backup - major rework

Martin Langhoff martin.langhoff at gmail.com
Mon Jun 16 17:45:56 EDT 2008


On Mon, Jun 16, 2008 at 5:27 PM, Michael Stone <michael at laptop.org> wrote:
> On Mon, Jun 16, 2008 at 01:46:25PM -0400, martin.langhoff at gmail.com wrote:
>> From: Martin Langhoff <martin at laptop.org>
>> +    # setup datastore connection
>> +    bus = dbus.SessionBus()
>> +    obj = bus.get_object(DS_DBUS_SERVICE, DS_DBUS_PATH)
>> +    datastore = dbus.Interface(obj, DS_DBUS_INTERFACE)
>
> Might it be appropriate to cache the resulting DS D-Bus object/interface
> for later use?

This whole chunk is no longer needed now that we have Tomeu's metadata
patch. I will be removing it shortly...
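
(For reference, had we kept it, the caching could be a module-level
singleton; a quick sketch reusing the DS_DBUS_* constants from the
patch:)

    import dbus

    _datastore = None

    def get_datastore():
        # create the DS proxy once, hand back the cached
        # interface on subsequent calls
        global _datastore
        if _datastore is None:
            bus = dbus.SessionBus()
            obj = bus.get_object(DS_DBUS_SERVICE, DS_DBUS_PATH)
            _datastore = dbus.Interface(obj, DS_DBUS_INTERFACE)
        return _datastore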

>> +    # name the backup file
>> +    # and open a tmpfile in the same dir
>> +    # to ensure an atomic replace
>> +    md_path = os.path.join(ds_path,
>> +                           'metadata.json')
>> +    (md_fd, md_tmppath) = tempfile.mkstemp(suffix='.json',
>> +                                           prefix='.metadata-',
>> +                                           dir=ds_path)
>
> Where is the resulting file, which lives on NAND, deleted?

Gets moved into place, overwriting the old file.
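
That's the whole point of mkstemp'ing in the same dir: rename(2) is
atomic within one filesystem. Roughly (a sketch of the pattern, not
the literal patch code; metadata_json stands in for the real writes):

    import os
    import tempfile

    # write to a tmpfile next to the destination, then rename
    # over the old file -- atomic on the same filesystem
    (md_fd, md_tmppath) = tempfile.mkstemp(suffix='.json',
                                           prefix='.metadata-',
                                           dir=ds_path)
    try:
        md_fh = os.fdopen(md_fd, 'w')
        md_fh.write(metadata_json)
        md_fh.close()
        os.rename(md_tmppath, md_path)
    except:
        os.unlink(md_tmppath)   # don't leave stale tmpfiles on NAND
        raise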

>> +
>> +    # preview contains binary data we
>> +    # don't actually want...
>> +    drop_properties = ['preview']
>> +
>> +    query = {}
>
> Why do we want to ditch previews? Just because we can?

Here I was just following suit with the old code.

>>      entries, count = datastore.find(query, [], byte_arrays=True)
>>      print 'Writing metadata and file indexes for %d entries.' % len(entries)
>>      for entry in entries:
>> -        for prop in external_properties:
>> +        for prop in drop_properties:
>>              if prop in entry:
>>                  del entry[prop]
> ...
>> +        md_fh.write(json.write(_sanitize_dbus_dict(entry))+'\n')
>> +    md_fh.close()
>>
> ...
>> +    os.rename(md_tmppath, md_path)
>
> Are we expecting concurrent writers here? If so, then we need to be
> aware that our cleanup_stale_metadata is racing with them. Writers who
> lose the race with the winning writer's cleanup attempt will die when
> their rename fails as their source path was unlinked by the winner.
> Writers who lose the race with the winning writer but who beat its
> cleanup attempt will overwrite the metadata to be transmitted. Is this
> correct?

Yes and no.

 - Yes, it's racy.
 - Martin's been smart to use a lock - in /var/lock or
.sugar/default/lock? (minimal sketch below)
 - This race goes away when we stop dumping the metadata altogether,
which I will soon do.
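
The lockfile approach is simple enough (a sketch only; the lock path
is the open question above, so the one below is hypothetical):

    import fcntl
    import sys

    # exclusive, non-blocking lock: a concurrent writer fails
    # fast instead of racing us
    lock_fh = open('/var/lock/ds_backup.lock', 'w')
    try:
        fcntl.flock(lock_fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except IOError:
        print 'Another ds_backup is running, bailing out.'
        sys.exit(1)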

>> +def cleanup_stale_metadata(ds_path):
>> +    files = glob.glob(os.path.join(ds_path, '.metadata-*.json'))
>> +    for f in files:
>> +        os.unlink(f)
>> +    return len(files)
>
> Are we concerned about the cleanup code failing, for example because
> permissions were changed on the ds_path dir? If it permanently failed
> then I think our backup script would also permanently fail, which
> doesn't seem quite right. (If something's broken with my permissions,
> then I _want_ my backup script to take what data it can get and move it
> off the system, right?)

Good point. This part of the code is removed, but I haven't made it a
point to ensure that the code can back up from a read-only homedir.
Actually, we aren't well set up for a read-only root fs at all (should
an error kick us into RO): /tmp is not a tmpfs!
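
If we wanted the take-what-you-can behaviour, something like this up
front would do (just a sketch, not in the patch; note the fallback
loses the same-filesystem atomic rename):

    import os
    import tempfile

    # if the datastore dir is read-only, fall back to a scratch
    # dir so we can still stage data to push off the system
    if os.access(ds_path, os.W_OK):
        scratch_dir = ds_path
    else:
        scratch_dir = tempfile.mkdtemp(prefix='ds_backup-')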

...

> Why are we removing the
> progress-tracking code? Does it not work?

Might be cute for debugging, but scripts meant for cron should have
_no_ output, and where they do have output (--verbose), it should be
loggable. IIRC, the rsync --progress option is meant for a tty, as it
spits out control chars.
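
Something cron-friendly looks more like this (an illustration; src,
dest, verbose and the log path are placeholders, the rsync flags are
standard):

    import os

    # quiet by default; under --verbose, append rsync's plain
    # (non-tty) output to a logfile instead of spamming stdout
    cmd = 'rsync -az %s %s' % (src, dest)
    if verbose:
        cmd += ' -v >> /var/log/ds_backup.log 2>&1'
    else:
        cmd += ' -q'
    rc = os.system(cmd)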

cheers,



m
-- 
 martin.langhoff at gmail.com
 martin at laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

