performance work
Jordan Crouse
jordan at cosmicpenguin.net
Mon Dec 22 17:36:36 EST 2008
Greg Smith wrote:
> Hi Jordan,
>
> Looks like we made a little more progress on graphics benchmarking. See
> Neil's results below.
>
> I updated the feature page with the test results so far:
> http://wiki.laptop.org/go/Feature_roadmap/General_UI_sluggishness
>
> What's next?
>
> Do we know enough now to target a particular section of the code for
> optimization?
>
I ran the raw data through a script, and came up with a nice little
summary of where we stand. My first general observation is that the
numbers are skewed due to system activity - recall that X runs in user
space, so it is subject to preemption by the kernel. I think that the
obviously high numbers in many of the results are due to NAND or
wireless interrupts (example):
6: 2261923 (5.25 ms)
7: 16690761 (38.73 ms)
8: 2306919 (5.35 ms)
You might want to re-acquire the numbers with wireless turned off and
the system in a very quiet state. If you want to be extra careful, you
can run the benchmarks in an empty X server (no sugar) and save the
results to a ramfs backed directory to avoid NAND. You probably don't
have to get _that_ extreme, but I don't want you to spend much time
trying to investigate a path only to find out that the numbers are wrong
due to a few write() calls. In the results below, I tried to mitigate the
damage somewhat by removing the highest and lowest value.
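
The trimming step can be sketched roughly as follows. This is only an
illustration, not the script that produced the summary; the function name
trimmed_mean is hypothetical, and it assumes one highest and one lowest
sample are dropped per test before averaging.

```python
def trimmed_mean(samples):
    """Average the samples after dropping one highest and one lowest value.

    Assumption: exactly one sample is trimmed from each end, which is one
    reading of "removing the highest and lowest value".
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples to trim both ends")
    ordered = sorted(samples)
    kept = ordered[1:-1]  # drop the minimum and the maximum
    return sum(kept) / len(kept)


# The three example timings quoted above, in milliseconds.
example_ms = [5.25, 38.73, 5.35]
print(trimmed_mean(example_ms))
```

With only the three samples shown, the 38.73 ms outlier and the 5.25 ms
minimum are both discarded, so the result is just the remaining sample.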
The list below is sorted by delta between accel and un-accel, with the
"worst" tests on top (i.e., the ones where accel is actually hurting
you) - these are good candidates to be looked at. There are three
reasons why unaccel would be faster than accel: 1) a bug in the accel
code, 2) the accel path requires reading from video memory (which is
very slow), and 3) the accel path doesn't punt to unaccel early enough.
The first two on the list (textpath-xlib and texturedtext-xlib) toss up
a huge red flag - I am guessing we are probably seeing a bug in the driver.
All of the upsample and downsample entries are interesting, because the
driver should be kicking back to the unaccelerated path - I'm guessing
that 3) might be in effect here - though 73 ms is a long time.
Most of the operations between 1ms and -1ms are probably going down the
unaccelerated path. Most everything in there probably should be
unaccelerated, with the possible exception of the 'over' operations -
those are the easiest for the GPU to accelerate and the most heavily
used, so you probably want to take a look at those.
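
One rough way to bucket the rows of the table at the end of this message
is sketched below. The +/-1 ms band comes from the paragraph above; the
function name classify and the bucket labels are hypothetical, and the
three rows are copied from the table purely as examples.

```python
def classify(accel_ms, noaccel_ms, band_ms=1.0):
    """Bucket one test by its accel/unaccel delta (accel minus noaccel).

    Assumption: anything within +/- band_ms is treated as taking the same
    (probably unaccelerated) path in both runs.
    """
    delta = accel_ms - noaccel_ms
    if delta > band_ms:
        return "accel slower"
    if delta < -band_ms:
        return "accel faster"
    return "no real difference"


# (test name, accel ms, noaccel ms), taken from the results table.
rows = [
    ("over-640x480-opaque", 20.19, 20.12),
    ("lines-xlib-lines", 275.59, 481.84),
    ("textpath-xlib-textpath", 1562.60, 1345.12),
]

# Sort by delta, largest first, so the tests where accel hurts come out on top.
for name, accel, noaccel in sorted(rows, key=lambda r: r[1] - r[2], reverse=True):
    print(f"{name:<40} {accel - noaccel:9.2f}  {classify(accel, noaccel)}")
```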
As before, I encourage you to investigate which operations are heavily
used - if you don't use textured text very much, then optimizing it
would be heavy on the geek points, but not very useful in the long haul.
Jordan
(times in ms; Delta = Accel - Noaccel)

Test                                          Accel  Noaccel    Delta
---------------------------------------------------------------------
textpath-xlib-textpath                      1562.60  1345.12   217.48
texturedtext-xlib-texturedtext               315.61   140.54   175.07
downsample-nearest-xlib-512x512-redsquar     106.37    33.25    73.12
downsample-bilinear-xlib-512x512-redsqua      96.57    35.22    61.35
downsample-bilinear-xlib-512x512-primros      83.36    34.81    48.56
downsample-nearest-xlib-512x512-lenna         78.18    29.83    48.35
downsample-bilinear-xlib-512x512-lenna        83.91    36.32    47.59
downsample-nearest-xlib-512x512-primrose      77.49    30.06    47.43
upsample-nearest-xlib-48x48-todo              86.23    60.14    26.09
upsample-bilinear-xlib-48x48-brokenlock      242.52   216.49    26.03
upsample-bilinear-xlib-48x48-script          237.69   211.70    25.98
upsample-bilinear-xlib-48x48-mail            234.40   208.43    25.97
upsample-bilinear-xlib-48x48-todo            239.85   213.94    25.91
upsample-nearest-xlib-48x48-script            81.67    57.02    24.65
upsample-nearest-xlib-48x48-mail              78.99    54.42    24.57
upsample-nearest-xlib-48x48-brokenlock        86.18    61.73    24.45
upsample-nearest-48x48-script                 61.95    57.46     4.49
downsample-bilinear-512x512-redsquare         11.24     7.77     3.47
solidtext-xlib-solidtext                      11.70     9.51     2.19
textpath-textpath                           1081.14  1079.37     1.78
texturedtext-texturedtext                    112.33   111.79     0.54
upsample-bilinear-48x48-todo                 224.06   223.68     0.37
upsample-nearest-48x48-brokenlock             64.46    64.16     0.30
upsample-bilinear-48x48-brokenlock           226.51   226.25     0.26
downsample-nearest-512x512-redsquare           2.43     2.23     0.19
gradients-linear-gradients-linear            107.39   107.30     0.09
over-640x480-empty                            15.68    15.61     0.07
over-640x480-opaque                           20.19    20.12     0.07
add-640x480-opaque                            20.77    20.73     0.04
upsample-nearest-48x48-todo                   60.75    60.71     0.04
add-640x480-transparentshapes                 20.79    20.78     0.02
add-640x480-shapes                            20.76    20.74     0.02
multiple-clip-rectangles-multiple clip r       1.23     1.22     0.01
over-clipped-640x480-empty                     0.95     0.94     0.01
over-640x480-text                             23.51    23.51     0.01
downsample-bilinear-512x512-primrose           7.08     7.08     0.00
multiple-clip-rectangles-xlib-multiple c       0.15     0.15     0.00
over-clipped-640x480-opaque                    1.22     1.22     0.00
downsample-bilinear-512x512-lenna              7.03     7.04    -0.01
over-clipped-640x480-shapes                    1.23     1.24    -0.01
downsample-nearest-512x512-primrose            2.03     2.05    -0.02
downsample-nearest-512x512-lenna               2.03     2.05    -0.02
over-640x480-transparentshapes                58.66    58.68    -0.02
over-640x480-shapes                           18.56    18.59    -0.03
upsample-nearest-48x48-mail                   54.71    54.77    -0.07
add-640x480-text                              20.70    20.77    -0.08
solidtext-solidtext                           42.83    42.94    -0.10
add-640x480-empty                             20.66    20.80    -0.13
upsample-bilinear-48x48-mail                 217.81   219.44    -1.63
over-clipped-xlib-640x480-opaque               4.55     6.26    -1.71
upsample-bilinear-48x48-script               220.89   222.80    -1.92
over-clipped-xlib-640x480-empty                3.67     6.04    -2.38
lines-lines                                  426.79   429.16    -2.38
over-clipped-xlib-640x480-shapes               4.00     6.52    -2.51
curves-curves                                224.55   236.08   -11.53
over-xlib-640x480-empty                       29.88    48.30   -18.42
curves-xlib-curves                           245.46   264.19   -18.73
gradients-linear-xlib-gradients-linear       132.35   151.62   -19.26
over-xlib-640x480-opaque                      29.92    53.04   -23.12
add-xlib-640x480-transparentshapes            29.98    53.53   -23.54
add-xlib-640x480-opaque                       29.97    53.54   -23.57
add-xlib-640x480-empty                        29.93    53.61   -23.67
add-xlib-640x480-shapes                       30.05    53.77   -23.72
add-xlib-640x480-text                         29.75    53.59   -23.84
over-xlib-640x480-shapes                      29.77    54.93   -25.16
over-xlib-640x480-text                        29.83    57.75   -27.92
over-xlib-640x480-transparentshapes           29.76    91.67   -61.91
lines-xlib-lines                             275.59   481.84  -206.25