Oprofile, swap
M. Edward (Ed) Borasky
znmeb at cesmail.net
Tue Dec 18 10:41:37 EST 2007

John Richard Moser wrote:
> I just got my XO laptop (hopefully some kid gets the other one soon...
> or not), and noticed it's slow and kind of buggy. I think I'll get a
> $25 4GB SD card for a SWAP area...
>
> I should run oprofile too, and have it write to the SD card. I
> understand what an interpreted language like Python does to the CPU but
> it shouldn't be this bad... it's only going to be like 100 times slower?
> An actual interpreter will...
>
> - Put pressure on the data cache as its code grows
> - ... but keep the actual interpreter (code) in cache better
> - Use a relatively large chunk of data for a look-up table
> - ... or use some convoluted and hard to maintain code
> - ... or optimally, a look-up table to start the decoding process, if
> like a CPU bytecode interpreter (Java, CIL) it has an insn + address
> mode + data (not QUITE optimal for Python, but maybe since simple
> addition and call happens)
> - Wind up doing what can easily become a multi-hundred-cycle decoding
> process for each executed bytecode insn
>
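To make the look-up-table idea above concrete, here is a minimal sketch of a table-driven interpreter loop. The opcodes, the handler names and the sample program are all invented for illustration; this is not how CPython's real bytecode or its evaluation loop is laid out, just the general shape of "one table look-up per executed instruction".

```python
# Toy stack-machine interpreter. The opcode numbers and handlers are
# hypothetical and exist only to illustrate table-driven dispatch.
PUSH, ADD, MUL, HALT = range(4)


def op_push(stack, arg):
    stack.append(arg)


def op_add(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def op_mul(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


# The look-up table: the opcode number is used directly as the index.
DISPATCH = [op_push, op_add, op_mul]


def run(program):
    """Execute a list of (opcode, argument) pairs and return the stack."""
    stack = []
    for opcode, arg in program:
        if opcode == HALT:
            break
        DISPATCH[opcode](stack, arg)  # one table look-up per instruction
    return stack


# (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (HALT, None)]
result = run(program)
print(result)
```

Every instruction pays for the table look-up and an indirect call before any useful work happens, which is the per-instruction decoding overhead the list above is describing.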
> Python rewrites to bytecode (good, interpreting text is slow! Multiple
> parsing!) but a lot of the main function calls in the API should be C,
> not Python (taking some of the pressure off). This means Python should
> be doing a lot of logic in native space, rather than interpreting a lot
> (unlike Java, which had its whole library written in Java...)
>
> I suggest taking a look at PyPy for Python, which will dynamically
> recompile Python to native code and likely give some good performance
> benefits. I
> really can't stand JIT compilation and would prefer something that takes
> advantage of Mono's own facilities, to centralize the effort in the JIT
> at least (Mono has nice stuff), but IronPython is under the Microsoft
> Permissive License, which is not OSI approved.
>
> As for real solutions, I want to profile things and see where they're
> hanging. I may need a Python profiler too, to get a look inside the
> Python code and see if some functions there are also bad; oprofile will
> tell me if Python itself is spending an ungodly amount of time in its
> decoder functions but that's it.
>
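For the Python-level half of that, the standard library's cProfile and pstats modules are one option. The following is only a sketch: `slow_sum` is a made-up stand-in for whatever function is suspected of being slow, and `profile_call` is a hypothetical helper name, not anything from the OLPC code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical stand-in for a function suspected of being slow.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and show at most the ten costliest entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()


result, report = profile_call(slow_sum, 100000)
print(report)
```

The report only shows time attributed to Python-level functions, so it complements a system-wide profiler rather than replacing it.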
I don't have my physical unit yet, but I too am interested in profiling
and performance tuning. Unfortunately I have no Python tuning experience
so I can't be of much help at the moment. I do have a "virtual ship2",
running on a 2.2 GHz Athlon64 X2, but that of course is cheating. :)

Oprofile is a bit tough to work with -- it makes you install a whole
bunch of GUI libraries just to get at the low-level profiling stuff. And
the kernel needs to be rebuilt with the right options -- I don't know if
the OLPC kernel does so. So for now, I think you'll probably be better
off with lower-level command-line tools.

I know "top" is there, but as far as I'm concerned the one must-have
package is "sysstat". "sysstat" is a work of pure genius -- it started
out as a Linux re-implementation of "sar" and "iostat", but it is much
more than that now. Once I get my physical unit, I'll be looking at
things in some detail.

I'm guessing that adding swap isn't going to help you. If you're memory
bound, the solution is to stop activities that you aren't using, not to
force the kernel to move stuff in and out of RAM. "top" will tell you.
Open a terminal window and type "top". At the top of the display you'll
see memory used, free, cached, etc. There's a keystroke that will sort
processes by their resident set size. Type "h" to get a help menu. If I
get a chance tonight, I'll fire up a bunch of activities in my virtual
XO and see what it does when it runs out of RAM.
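
The same memory figures that "top" shows in its header come from /proc/meminfo on Linux, so they can also be pulled out with a few lines of Python. This is a rough sketch: the function names are my own, the SAMPLE numbers are made up and are not measurements from an XO, and "used by programs" is approximated as total minus free, buffers and cache.

```python
def parse_meminfo(text):
    """Return {field name: value in kB} from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_summary(info):
    """Approximate how much RAM programs are holding (all values in kB)."""
    total = info["MemTotal"]
    free = info["MemFree"]
    buffers = info.get("Buffers", 0)
    cached = info.get("Cached", 0)
    # Buffers and page cache can be reclaimed, so do not count them as used.
    used = total - free - buffers - cached
    return {"total": total, "free": free, "cached": cached,
            "used_by_programs": used}


# Made-up sample values, for illustration only.
SAMPLE = """MemTotal: 238000 kB
MemFree: 12000 kB
Buffers: 6000 kB
Cached: 70000 kB
SwapTotal: 0 kB
"""

summary = memory_summary(parse_meminfo(SAMPLE))
print(summary)
```

On a real system, replace SAMPLE with the contents of /proc/meminfo, for example `open("/proc/meminfo").read()`.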