More 16 vs 24 bpp profiling

Tue Sep 11 11:29:44 EDT 2007

On 11/09/07 13:05 +0200, Stefano Fedrigo wrote:
> I've done some more profiling on the 16 vs. 24 bpp issue.
> This time I used this test:
> https://dev.laptop.org/git?p=sugar;a=blob;f=tests/graphics/hipposcalability.py
> 
> A simple speed test: I measured the time required to scroll down and back
> up once through the whole generated list.  Not extremely accurate, but I
> repeated the test a few times with consistent results (+/- 0.5 secs).
> Mean times, in seconds:
> 
> xserver 1.4
> 16 bpp: 37.9
> 24 bpp: 40.7
> 
> xserver 1.3
> 16 bpp: 46.4
> 24 bpp: 50.1
> 
> At 24 bpp we're a little slower.  1.3 is roughly 20% slower than 1.4.  The
> pixman migration patch makes the difference: 1.3 spends most of that 20% in
> memcpy().
> 
> The oprofile reports are from xserver 1.4.  I don't see much difference
> between 16 and 24, except that at 24 bpp, less time is spent in pixman and more
> in amd_drv.  At 16 bpp pixman_fill() takes twice the time.
> 
> Unfortunately without a working callgraph it's not very clear to me what's
> happening in amd_drv.  At 24bpp gp_wait_until_idle() takes twice the time...

What can we do to fix this?  I would really like to know who is calling
gp_wait_until_idle().

Also, I think we're spending way too much time in
gp_color_bitmap_to_screen_blt() - is there any way we can get more in-depth
profiling in that one function?

Jordan
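
For reference, the relative slowdowns implied by the mean times quoted above
can be worked out with a few lines of Python.  This is only a sketch: the
numbers are copied from Stefano's message, while the dictionary layout and the
percent_slower() helper are illustrative names, not part of any existing tool.

```python
# Mean scroll times in seconds, copied from the figures quoted above.
mean_times = {
    "1.4": {16: 37.9, 24: 40.7},
    "1.3": {16: 46.4, 24: 50.1},
}


def percent_slower(slow, fast):
    """Return how much longer `slow` takes than `fast`, in percent."""
    return (slow - fast) / fast * 100.0


# Cost of 24 bpp relative to 16 bpp on each server version.
depth_penalty = {
    version: percent_slower(times[24], times[16])
    for version, times in mean_times.items()
}

# Cost of xserver 1.3 relative to 1.4 at each colour depth.
version_penalty = {
    bpp: percent_slower(mean_times["1.3"][bpp], mean_times["1.4"][bpp])
    for bpp in (16, 24)
}

for version, pct in depth_penalty.items():
    print(f"xserver {version}: 24 bpp is {pct:.1f}% slower than 16 bpp")
for bpp, pct in version_penalty.items():
    print(f"{bpp} bpp: xserver 1.3 is {pct:.1f}% slower than 1.4")
```

Measured against the faster configuration, 24 bpp costs about 7 to 8% on both
servers, and 1.3 takes about 22 to 23% longer than 1.4.  The 16 vs. 24 bpp gaps
(2.8 and 3.7 secs) are larger than the reported +- 0.5 sec spread, so they are
unlikely to be measurement noise alone.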