Where olpc machine spending time when using web browser

Adam Jackson ajackson at redhat.com
Tue Mar 13 11:52:44 EDT 2007


On Mon, 2007-03-12 at 18:59 -0400, William Cohen wrote:

> # opreport -t 1 -l /usr/bin/Xorg
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        image name               symbol name
> 6514     68.1096  libfb.so                 fbFetchTransformed

Wow.  I think that's the first time I've ever seen this actually show up
on a profile.  I didn't think anything used Render's transformations,
since they're so painfully slow.

> 613       6.4095  libfb.so                 fbFetchPixel_x8r8g8b8

Just as an aside, gcc generates some intensely dumb code:

00000d80 <fbFetchPixel_x8r8g8b8>:
     d80:       55                      push   %ebp
     d81:       8b 04 90                mov    (%eax,%edx,4),%eax
     d84:       89 e5                   mov    %esp,%ebp
     d86:       5d                      pop    %ebp
     d87:       0d 00 00 00 ff          or     $0xff000000,%eax
     d8c:       c3                      ret    
     d8d:       8d 76 00                lea    0x0(%esi),%esi

Yay for frame pointers on leaf functions!  I guess we can build X with
-momit-leaf-frame-pointer; it can't hurt, but I have no idea if it'll
help measurably.  It certainly makes the code prettier:

00000ce0 <fbFetchPixel_x8r8g8b8>:
     ce0:       8b 04 90                mov    (%eax,%edx,4),%eax
     ce3:       0d 00 00 00 ff          or     $0xff000000,%eax
     ce8:       c3                      ret    
     ce9:       8d b4 26 00 00 00 00    lea    0x0(%esi),%esi
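
For anyone reading along without the source handy, the C behind those
two listings is roughly the following.  This is a sketch reconstructed
from the disassembly, not copied from libfb; the name
fetch_pixel_x8r8g8b8 and its signature are made up for illustration.
The load is the mov, the alpha fill is the or, and everything else in
the first listing is frame-pointer bookkeeping.

```c
#include <stdint.h>

/* Hypothetical stand-in for fbFetchPixel_x8r8g8b8: read one 32-bit
 * pixel and force the unused top byte (the "x" in x8r8g8b8) to 0xff,
 * i.e. fully opaque alpha. */
uint32_t fetch_pixel_x8r8g8b8(const uint32_t *bits, int offset)
{
    return bits[offset] | 0xff000000u;
}
```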

>    398  6.1099 :                        x1 = MOD (x1, pict->pDrawable->width);
>    383  5.8796 :                        x2 = MOD (x2, pict->pDrawable->width);
>    336  5.1581 :                        y1 = MOD (y1, pict->pDrawable->height);
>    355  5.4498 :                        y2 = MOD (y2, pict->pDrawable->height);

It feels like we ought to be able to MMX this pattern.
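
Short of MMX, a scalar sketch of what might be possible here, assuming
MOD is a wrap-into-range modulus (I haven't checked the macro, so treat
that as an assumption).  wrap_mod and wrap_fast are hypothetical names.
The idea is just to skip the division when the coordinate is already
inside the drawable, which may well be the common case:

```c
/* Floor-style modulus: result is always in [0, size).  Assumes size > 0. */
int wrap_mod(int v, int size)
{
    int r = v % size;              /* C's % can return a negative value */
    return r < 0 ? r + size : r;
}

/* Same result, but avoids the divide for in-range coordinates.  The
 * unsigned compare checks v >= 0 and v < size in one test. */
int wrap_fast(int v, int size)
{
    if ((unsigned)v < (unsigned)size)
        return v;
    return wrap_mod(v, size);
}
```

Whether this helps depends on how often the coordinates are actually
out of range, which the profile above doesn't tell us.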

>                :                        ft = FbGet8(tl,0) * idistx + FbGet8(tr,0) * distx;
>                :                        fb = FbGet8(bl,0) * idistx + FbGet8(br,0) * distx;
>    536  8.2284 :                        r = (((ft * idisty + fb * disty) >> 16) & 0xff);
>                :                        ft = FbGet8(tl,8) * idistx + FbGet8(tr,8) * distx;
>                :                        fb = FbGet8(bl,8) * idistx + FbGet8(br,8) * distx;
>    482  7.3994 :                        r |= (((ft * idisty + fb * disty) >> 8) & 0xff00);
>                :                        ft = FbGet8(tl,16) * idistx + FbGet8(tr,16) * distx;
>                :                        fb = FbGet8(bl,16) * idistx + FbGet8(br,16) * distx;
>    514  7.8907 :                        r |= (((ft * idisty + fb * disty)) & 0xff0000);
>                :                        ft = FbGet8(tl,24) * idistx + FbGet8(tr,24) * distx;
>                :                        fb = FbGet8(bl,24) * idistx + FbGet8(br,24) * distx;
>                :                        r |= (((ft * idisty + fb * disty) << 8) & 0xff000000);
>    512  7.8600 :                        buffer[i] = r;

And maybe this one.
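
To make the shape of that loop body easier to see, here is the same
per-channel bilinear blend as a standalone function.  It's a sketch:
bilinear_argb and get8 are hypothetical names, and it assumes distx and
disty are 8-bit fractions with idistx = 256 - distx and
idisty = 256 - disty, which is what the quoted arithmetic implies.
Shifting down by 16 and back up by the channel offset gives the same
bits as the four hand-unrolled shift-and-mask lines.

```c
#include <stdint.h>

/* Extract the 8-bit channel that starts at bit `shift`. */
static uint32_t get8(uint32_t v, int shift)
{
    return (v >> shift) & 0xff;
}

/* Blend four packed 32-bit pixels (top-left, top-right, bottom-left,
 * bottom-right).  distx and disty are weights in 0..256 toward the
 * right and bottom neighbours respectively. */
uint32_t bilinear_argb(uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
                       uint32_t distx, uint32_t disty)
{
    uint32_t idistx = 256 - distx;
    uint32_t idisty = 256 - disty;
    uint32_t out = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        /* Horizontal blend of the top and bottom pixel pairs. */
        uint32_t ft = get8(tl, shift) * idistx + get8(tr, shift) * distx;
        uint32_t fb = get8(bl, shift) * idistx + get8(br, shift) * distx;
        /* Vertical blend; the weights multiply out to 2^16. */
        uint32_t c = (ft * idisty + fb * disty) >> 16;
        out |= c << shift;
    }
    return out;
}
```

The four channels are independent and fit in 16-bit intermediates
after the first multiply, which is why this looks like a candidate for
packed arithmetic.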

But the more serious question is why we're hitting this path at all.  I
can't think of anything in the described firefox use profile that would
require filters or transformations.  The only place I can see the word
'bilinear' mentioned at all in firefox is

modules/libpr0n/decoders/icon/gtk/nsIconChannel.cpp:
    nsIconChannel::InitWithGnome()

but that seems unlikely.  More importantly, it doesn't look like
we'll ever hit fbFetchTransformed without an actual transformation
matrix.  Could someone who knows firefox chime in here?

- ajax



