sugar start-up profiling
riccardo
riccardo.lucchese at gmail.com
Tue Jul 29 10:15:07 EDT 2008
Problem: sugar starts up slowly on the xo
Tests were ran on an xo after a clean joyride-2181 install plus
sucrose's activities.
A graph of the ending part of the boot process can be found at:
http://dev.laptop.org/~rlucchese/boot/boot.stats.svg
(http://dev.laptop.org/~rlucchese/boot/boot.stats)
It was obtained by running:
$ picker -t 75 -o /tmp/boot.stats &
$ grapher -i boot.stats -c10
The sampling period begins in `start()' in haldaemon's init script and
lasts 75 seconds.
The following tab shows cpu usage of the 10 processes that took more cpu
time during the sampling time:
$ grapher -i boot.stats -c13 -r cpu
tot% ps% cmdline
-----------------------
27.7 /bin/sh /usr/bin/sugar
44.6 16.8 jffs2_gcd_mtd0
51.7 7.1 /usr/bin/env python /usr/bin/sugar-activity journal...
56.2 4.5 python /usr/bin/datastore-service
60.6 4.4 picker -t75 -o /tmp/boot.stats
64.3 3.8 python /usr/sbin/rainbow-daemon --daemon
68.0 3.6 xinit /usr/bin/olpc-session -- /usr/bin/X -fp built-ins...
71.3 3.3 /bin/sh /usr/bin/olpc-session
74.1 2.8 python /usr/bin/sugar-shell-service
76.9 2.8 python /usr/bin/sugar-presence-service
79.7 2.8 python /usr/sbin/rainbow-daemon --daemon
81.2 1.5 hald
82.6 1.4 ohmd
Renicing jffs2_gcd_mtd0 to 19 in haldaemon's init script slightly
speeds-up the last
part of the boot:
$ cat /home/olpc/.boot_time.prev (clean joyride)
56.27
$ cat /home/olpc/.boot_time
54.14
Note however that when renicing jffs2_gcd_mtd0, boot timings `became
less deterministic, sometimes being slower than the `clean install'
case. Stopwatch timings (from boot loader to fully redrawn shell):
clean install : 1m34.0s
renice trick : 1m30.2s +[0. to 4.5 sec]
Would it be possible to make DS, sugar-presence-service and
sugar-shell-service system services and thus decoupling them from the
shell's start-up?
The next tab shows the 10 processes that hit more major faults during
the sampling time:
$ grapher -i boot.stats -m6 -r faults
minflt majflt cmdline
-----------------------
15363 154 /bin/sh /usr/bin/sugar
3122 57 python /usr/sbin/rainbow-daemon --daemon
1829 51 xinit /usr/bin/olpc-session -- /usr/bin/X -fp built-ins -wr
-aut
3576 38 python /usr/bin/datastore-service
498 37 ohmd
1270 16 hald
A strace of `dbus-launch ... sugar-shell' in /usr/bin/sugar can be found
at:
http://dev.laptop.org/~rlucchese/boot/dbus-launch_sugar-shell.strace
A script was run that cat'ed all files (5700+) that sugar tries to open
on boot. (http://dev.laptop.org/~rlucchese/boot/cat_sugar_startup_files)
$ time sh cat_sugar_startup_files
real 0m16.9s
user 0m1.0s
sys 0m12.2s
Timings vary a lot when repeating the test (+-~30%).
`top' shows that the remaining time goes to io-wait.
Why don't those files get cached when repeating the test ?
cProfile statistics (KCG format) for sugar-shell (start-up only):
http://dev.laptop.org/~rlucchese/boot/cProfile-shell
Functions ordered by self-time (%):
32.2 gtk.main()
19.6 send_message_with_reply_and_block of dbus
12.8 block of dbus.lowlevel.PendingCall
7.3 grab of sugar._sugarext.KeyGrabber
3. _add_weight of sugar/shell/view/home/grid.py
2.7 render_cairo of rsvg.Handle
1.12 __getitem__ of sugar/shell/view/home/grid.py
---------------------------------------------------
78.7 %
Perhaps some dbus calls can be made asynchronous; this is taking 30+% of
start-up time.
It's curious that `grab` takes so much time: I don't see anything
obvious looking at the code.
Note that the layout for the favorite's view was set to the ring type;
why is the shell computing weights in this case?
High level tasks/functions ordered by total-time:
67.3 gtk.main()
33.6 __init__ of sugar/view/Shell.py
16.8 _start_journal_idle of sugar/view/Shell.py
11.6 __init__ shell/view/keyhandler.py
7.5 add sugar/shell/view/home/spreadlayout.py
Note that 45% of Shell.__init__ and the 89% of _start_journal_idle
went to blocking dbus calls.
The next tab shows start-up timings extracted from the statistics
gathered with picker (the sampler may have missed processes lasting
shorter than the sampling period):
http://dev.laptop.org/~rlucchese/boot/sugar_xo_boot_timings
$ grapher -i boot.stats -r timings,cpu | `a bit of selection
tot% ps% start length cmdline
--------------------------------------
16.9 16.8 0.00 74.99 jffs2_gcd_mtd0
..
36.5 3.8 12.44 62.55 python /usr/sbin/rainbow-daemon --daemon
..
37.8 0.1 18.88 56.11 /usr/sbin/olpc-dm
..
38.0 0.0 19.32 55.67 /bin/sh /usr/bin/startx /usr/bin/...
..
41.7 3.6 19.87 55.12 xinit /usr/bin/olpc-session -- ...
..
46.0 3.3 24.27 50.72 /bin/sh /usr/bin/olpc-session
..
74.0 27.7 28.24 46.75 /bin/sh /usr/bin/sugar
..
75.1 0.4 49.15 25.84 matchbox-window-manager -use_titlebar...
..
77.9 2.8 49.25 25.74 python /usr/bin/sugar-presence-service
..
81.2 2.8 53.86 21.13 python /usr/bin/sugar-shell-service
..
85.7 4.5 58.89 16.10 python /usr/bin/datastore-service
..
92.8 7.1 62.86 12.13 usr/bin/env python /usr/... (journal)
Accurate shell's start-up timings were taken by instrumentation:
---first launch (normal boot)---
/usr/bin/sugar : (1217317352.992) -- 0.
entering main() in shell/main.py : (1217317368.162) -- 15.2
before entering gtk's mainloop shell/main.py: (1217317375.464) -- 22.5
---second launch (ctrl+alt+erase)---
/usr/bin/sugar : (1217317725.747) -- 0.
entering main() in shell/main.py : (1217317734.939) -- 9.2
before entering gtk's mainloop shell/main.py: (1217317741.023) -- 15.3
The datastore service and the journal will be profiled in separate
sessions.
Thanks,
riccardo
More information about the Devel
mailing list