Datastore profiling

Tomeu Vizoso tomeu at tomeuvizoso.net
Wed Aug 13 05:41:38 EDT 2008


On Sat, Aug 9, 2008 at 3:30 PM, riccardo <riccardo.lucchese at gmail.com> wrote:
> Hi,
>
>
> Testcase:
> Fill the datastore with `many small' objects.

Which user-perceived action are you investigating?

> The test was automated with the script at:
> http://dev.laptop.org/~rlucchese/datastore_fill/fill_ds

Why the need to restart sugar?

> For this test only the _TEST_SMALL_FILES flag was set in `fill_ds' and
> the following cmdline was used:
> $ fill_ds -s sources/ -b 20000000

In most cases, the files that activities put into the DS are moved
instead of copied, because of the long time that takes writing to
jffs2 due to compression. That will change the profiling results a
lot.

For that, call datastore.write() with transfer_ownership=True.

> The `sources' dir I used can be found at:
> http://dev.laptop.org/~rlucchese/datastore_fill/sources/
>
> Before running the script all entries in the datastore were deleted.
>
>
>
>  + Macro investigation of cpu usage
> A picker graph taken during the test can be found at:
> http://dev.laptop.org/~rlucchese/datastore_fill/small_files__b20000000.picker.svg
> (http://dev.laptop.org/~rlucchese/datastore_fill/small_files__b20000000.picker)
>
> The following tab shows cpu usage of the 3 processes that took more cpu
> time during the sampling time:
> ($ grapher -c2 -n -r cpu -i small_files__b20000000.picker)
>
>  tot%   ps%     cmdline
>  -----------------------
>        55.0    python /usr/bin/datastore-service
>  86.7   31.7    python /usr/bin/sugar-activity
> journalactivity.JournalActivity -
>
> During the test the `favorites view' was shown and I think I was told
> the journal shouldn't take cpu time to update itself in this case. What
> is it doing than ?

It is updating the "what" combobox in the toolbar, see
SearchToolbar.refresh_filters().

This of course could be optimized in the same way that the list view
update is, by only refreshing once the journal comes visible again. To
check if this is worth doing, I'd do manually the user action you are
trying to make faster with and without refresh_filters() being called
and would check if there's a significant difference.

>  + Test log
> http://dev.laptop.org/~rlucchese/datastore_fill/small_files__b20000000.log
>
> It took 123+ seconds to fill the datastore with 484 files (total size
> 20041624 bytes; files pushed to the datastore where text_s.txt,
> text_s.odt and img_s.jpg from the sources dir linked above). The 123s
> value is very stable when repeating the test.
>
>
>
>  + cProfile statistics (KCG format) for the datastore:
> http://dev.laptop.org/~rlucchese/datastore_fill/small_files__b20000000.cProfile
>
> * functions ordered by self-time:
>  23.8   posix.write
>  16.8   send_message of _dbus_bindings.Connection
>  9.6   _database_gen_allterms_iter of xapian.py
>  4.2   flush of secore/indexerconnection.py
>  2.6   add of secore/indexerconnection.py
>  2.3   cPickle.dumps
>  2.    perform of secore/fieldactions.py
>  1.7   _act_index_freetext of secore/fieldactions.py
>  1.6   index of datastore/xapianindex.py
>  ----------------------------------------------------
>  64.6%
>
> I/O performance and dbus are predominant in results.
> As I said, I'm not sure the journal should be so `active' during the
> test; in the case it would be more interesting to repeat the test after
> a fix.

Not sure yet if the journal optimization should be done (depends on
the result of the test suggested above) but we surely should repeat
the test with transfer_ownership=True. Automatic compression in jffs2
is going to be much slower than anything else.

>  + Sysprof statistics:
> http://dev.laptop.org/~rlucchese/datastore_fill/small_files__b20000000.sysprof
>
> I can't see anything relevant for libxapian and it's bindings (xapian
> dbg packages were installed)

Perhaps because of the small amount of metadata? PROPS_WITH_PREVIEW in
http://dev.laptop.org/git?p=projects/datastore;a=blob;f=tests/test_sugar.py
would get you quite close to the real case.

> Quite a bit of time goes to librsvg functions rendering svg icons to
> GtkPixbufs. Attaching strace to the journal activity and grepping "open"
> on its output gives:
> <
> ...
> open("/usr/share/icons/sugar/scalable/mimetypes/text-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/image-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/audio-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/video-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/text-uri-list.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/text-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/image-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/audio-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> open("/usr/share/icons/sugar/scalable/mimetypes/video-x-generic.svg",
> O_RDONLY|O_LARGEFILE) = 11
> ...
>>
>
> Does this come from sugar.mime or any sugar component ?
> If yes, would it be possible to share one icon-cache between all sugar
> modules ?

We already have a cache for svg icons, but in this case s.g.combobox
is using a gtk.CellRendererPixbuf which doesn't use s.g.icon. We
should probably do our CellRenderer that uses s.g.icon, this would
give us colored icons, badges, etc. Good catch!

Thanks,

Tomeu



More information about the Devel mailing list