Sample large datastore

Tomeu Vizoso tomeu at tomeuvizoso.net
Mon May 19 03:20:28 EDT 2008


On Mon, May 19, 2008 at 7:01 AM, Martin Langhoff
<martin.langhoff at gmail.com> wrote:
> Working on ds-backup, I am concerned about performance of extracting
> all of the datastore metadata. Both Tomeu and Ivan have warned me
> about performance and memory impact. I want to repro the problem asap.
>
> If a full metadata dump can be acceptably fast, a lot of complications
> and error conditions in the ds-backup code just disappear, thanks to
> rsync. I think it should be fast - our storage space limits the amount
> of data we are dealing with in a normal case, and the machine is fast
> enough that we should be able to burn through a few thousand entries
> in less than 2s. But I need a good test case.
>
> (So far, with small datasets I haven't been able to go over 1s, and
> the python startup dominates time anyway).
>
> Has anyone got a good sized datastore that I can grab? I am happy to
> respect privacy of any personal datastore sample... but you must cope
> with my browsing around a bit. :-) My test machines are on 703 - happy
> to up/down/side-grade in case that matters.

I never had a _real_ DS with the thousands of entries that a kid can
create in a fairly short time, but I can give you a script that
creates something more or less close to it. We could take a DS with
dozens or a few hundred real entries and duplicate those until you
have the desired size.
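
Something along these lines, roughly (just a sketch, assuming the
sugar.datastore Python API as shipped around build 703 and run from
inside a Sugar session; multiply_entries and the default copy count
are only illustrative, and find()/write() defaults may differ a bit
on your image):

#!/usr/bin/env python
# Sketch: duplicate the existing datastore entries so the DS grows to
# roughly (copies + 1) times its current size.  Assumes the
# sugar.datastore wrapper and a running datastore on the session bus.

from sugar.datastore import datastore

def multiply_entries(copies=10):
    # find() returns (entries, total_count); depending on the DS
    # version you may want to pass explicit limit/properties args.
    entries, count = datastore.find({})
    for i in range(copies):
        for entry in entries:
            clone = datastore.create()
            # Copy the metadata, letting the DS assign a fresh uid.
            keys = entry.metadata.keys()
            for key in keys:
                if key != 'uid':
                    clone.metadata[key] = entry.metadata[key]
            if 'title' in keys:
                clone.metadata['title'] = '%s (copy %d)' % (
                    entry.metadata['title'], i)
            # Point at the original payload; write() copies the file
            # since we don't transfer ownership of it.
            clone.file_path = entry.get_file_path()
            datastore.write(clone)

if __name__ == '__main__':
    multiply_entries()

Running that once with the default copies=10 over a DS with a hundred
or so real entries should already push you past the thousand-entry
mark.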

Do you think this would be enough?

Regards,

Tomeu



