Where olpc machine spending time when using web browser
Adam Jackson
ajackson at redhat.com
Tue Mar 13 11:52:44 EDT 2007
On Mon, 2007-03-12 at 18:59 -0400, William Cohen wrote:
> # opreport -t 1 -l /usr/bin/Xorg
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples % image name symbol name
> 6514 68.1096 libfb.so fbFetchTransformed
Wow. I think that's the first time I've ever seen this actually show up
on a profile. I didn't think anything used Render's transformations,
because they're so painfully slow.
> 613 6.4095 libfb.so fbFetchPixel_x8r8g8b8
Just as an aside, gcc generates some intensely dumb code:
00000d80 <fbFetchPixel_x8r8g8b8>:
d80: 55 push %ebp
d81: 8b 04 90 mov (%eax,%edx,4),%eax
d84: 89 e5 mov %esp,%ebp
d86: 5d pop %ebp
d87: 0d 00 00 00 ff or $0xff000000,%eax
d8c: c3 ret
d8d: 8d 76 00 lea 0x0(%esi),%esi
Yay for frame pointers on leaf functions! I guess we can build X with
-momit-leaf-frame-pointer; it can't hurt, but I have no idea if it'll
help measurably. It certainly makes the code prettier:
00000ce0 <fbFetchPixel_x8r8g8b8>:
ce0: 8b 04 90 mov (%eax,%edx,4),%eax
ce3: 0d 00 00 00 ff or $0xff000000,%eax
ce8: c3 ret
ce9: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
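For reference, both listings compute the same thing; only the frame-pointer
push/mov/pop differs. A minimal C sketch of that computation, reconstructed
from the disassembly rather than copied from the fb source (the function name
and signature here are hypothetical; the two arguments are what arrive in
%eax and %edx):

```c
#include <stdint.h>

/* Hypothetical stand-in for fbFetchPixel_x8r8g8b8: load one 32-bit pixel
 * and force the unused top byte to 0xff, i.e. fully opaque alpha. */
static uint32_t fetch_pixel_x8r8g8b8(const uint32_t *bits, int offset)
{
    /* mov (%eax,%edx,4),%eax ; or $0xff000000,%eax */
    return bits[offset] | 0xff000000u;
}
```

So the useful work is one load and one OR, which is why the three
frame-pointer instructions stand out so much in the first listing.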
> 398 6.1099 : x1 = MOD (x1, pict->pDrawable->width);
> 383 5.8796 : x2 = MOD (x2, pict->pDrawable->width);
> 336 5.1581 : y1 = MOD (y1, pict->pDrawable->height);
> 355 5.4498 : y2 = MOD (y2, pict->pDrawable->height);
It feels like we ought to be able to MMX this pattern.
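To make the pattern concrete, here is a rough sketch of what each of those
four lines does, assuming MOD is a floor-style modulo that always lands in
[0, b) so negative coordinates wrap around for repeating pictures. The helper
name wrap_mod is mine, not something from the fb source:

```c
/* Hypothetical helper mirroring what MOD(a, b) is assumed to do:
 * wrap a (possibly negative) coordinate into the range [0, b), b > 0. */
static int wrap_mod(int a, int b)
{
    if (a < 0)
        return (b - ((-a - 1) % b)) - 1;  /* e.g. a = -1 maps to b - 1 */
    return a % b;
}
```

Each pixel pays for four of these (x1, x2, y1, y2), and each one is an
integer remainder plus a branch, which is presumably where those samples go.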
> : ft = FbGet8(tl,0) * idistx + FbGet8(tr,0) * distx;
> : fb = FbGet8(bl,0) * idistx + FbGet8(br,0) * distx;
> 536 8.2284 : r = (((ft * idisty + fb * disty) >> 16) & 0xff);
> : ft = FbGet8(tl,8) * idistx + FbGet8(tr,8) * distx;
> : fb = FbGet8(bl,8) * idistx + FbGet8(br,8) * distx;
> 482 7.3994 : r |= (((ft * idisty + fb * disty) >> 8) & 0xff00);
> : ft = FbGet8(tl,16) * idistx + FbGet8(tr,16) * distx;
> : fb = FbGet8(bl,16) * idistx + FbGet8(br,16) * distx;
> 514 7.8907 : r |= (((ft * idisty + fb * disty)) & 0xff0000);
> : ft = FbGet8(tl,24) * idistx + FbGet8(tr,24) * distx;
> : fb = FbGet8(bl,24) * idistx + FbGet8(br,24) * distx;
> : r |= (((ft * idisty + fb * disty) << 8) & 0xff000000);
> 512 7.8600 : buffer[i] = r;
And maybe this one.
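For anyone squinting at the quoted lines: it's the same bilinear blend done
four times, once per 8-bit channel, which is what makes it look like a SIMD
candidate. A self-contained sketch of the arithmetic follows, assuming distx
and disty are 8-bit weights in [0, 256) and idistx/idisty are 256 minus those.
The names get8 and bilinear_8888 are made up for illustration; get8 stands in
for FbGet8:

```c
#include <stdint.h>

/* Stand-in for FbGet8: extract the 8-bit channel starting at bit `shift`. */
static uint32_t get8(uint32_t v, int shift)
{
    return (v >> shift) & 0xffu;
}

/* Blend the four neighbouring pixels (top-left, top-right, bottom-left,
 * bottom-right) with horizontal weight distx and vertical weight disty. */
static uint32_t bilinear_8888(uint32_t tl, uint32_t tr,
                              uint32_t bl, uint32_t br,
                              uint32_t distx, uint32_t disty)
{
    uint32_t idistx = 256 - distx;
    uint32_t idisty = 256 - disty;
    uint32_t r = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        /* Horizontal blend of the top and bottom pixel pairs. */
        uint32_t ft = get8(tl, shift) * idistx + get8(tr, shift) * distx;
        uint32_t fb = get8(bl, shift) * idistx + get8(br, shift) * distx;
        /* Vertical blend; the two 8-bit weights leave a 16-bit scale. */
        uint32_t c = ((ft * idisty + fb * disty) >> 16) & 0xffu;
        r |= c << shift;
    }
    return r;
}
```

The quoted code unrolls this loop and folds the final shift into each mask
(>> 16, >> 8, none, << 8), but the per-channel math is identical.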
But the more serious question is why we're hitting this path at all. I
can't think of anything in the described firefox use profile that would
require filters or transformations. The only place I can see the word
'bilinear' mentioned at all in firefox is
modules/libpr0n/decoders/icon/gtk/nsIconChannel.cpp:
nsIconChannel::InitWithGnome()
but that seems unlikely. And more importantly it doesn't look like
we'll ever hit fbFetchTransformed without an actual transformation
matrix. Someone who knows firefox want to chime in here?
- ajax