[OLPC-devel] Re: pygtk performance issue
Ian Bicking
ianb at colorstudy.com
Wed Sep 6 12:44:03 EDT 2006
Mitch Bradley wrote:
> As an example of what one might do: The search path resolution
> mechanism could notice, while looking for file X in a list of
> directories, that some of those directories don't exist. It could then
> prune those entries from the path list.
If you want to go down this route, the trimming should happen in site.py
(which, in turn, is probably where the superfluous entries were added).
Python's site.py is a mess IMHO, and a custom OLPC version seems quite
reasonable.
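
To make the pruning idea concrete, here is a minimal sketch of what a customized site.py might do once at startup. The function name prune_missing_entries is made up for illustration, and this is not what the stock site.py does; it only assumes that entries which do not exist on disk can never satisfy an import.

```python
import os
import sys


def prune_missing_entries(path_list):
    """Return a copy of path_list without entries that do not exist on disk.

    The empty string means "current directory" to the import system, so it
    is always kept. Zip files on the path are ordinary files and therefore
    survive the os.path.exists() check.
    """
    return [entry for entry in path_list
            if entry == "" or os.path.exists(entry)]


# Applied once, in place, so every later import sees the shorter list.
sys.path[:] = prune_missing_entries(sys.path)
```

Doing this once up front avoids repeating the failed lookups on every import, at the cost of missing a directory that is created after startup.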
Some people on some platforms have reported improvements when using zip
files instead of directories of modules, because the available files are
listed all in one place. Also, Python doesn't expect a zip file to
change, but will rescan directories frequently during subsequent
imports. There's a memory overhead for zip files, in part just because
the file listing is kept in memory, and I don't know whether (though I
doubt) .so files can be imported out of a zip file.
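
For reference, importing from a zip file needs nothing more than putting the archive on sys.path. The sketch below builds a throwaway archive first so that it is self-contained; the module name olpc_zip_demo and the archive name modules.zip are invented for the example, and it is written for a current Python rather than 2.4.

```python
import os
import sys
import tempfile
import zipfile

# Build a small archive holding one pure-Python module.
tmp_dir = tempfile.mkdtemp()
archive = os.path.join(tmp_dir, "modules.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("olpc_zip_demo.py", "GREETING = 'imported from zip'\n")

# A zip file on sys.path is searched much like a directory of modules,
# but its file listing is read once and kept in memory.
sys.path.insert(0, archive)
import olpc_zip_demo

print(olpc_zip_demo.GREETING)
print(olpc_zip_demo.__file__)  # a path inside the archive
```

Whether this is actually faster than a directory on the XO hardware would need measuring; the sketch only shows the mechanism.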
Also, packages and relative imports cause some overhead, because Python
must always search for nearby modules before looking for global modules.
But there's not much to be done about this; in Python 2.5 you could
turn on explicit relative imports, but that has to be done
module-by-module, and the result is not Python 2.4 compatible. But I
think the Python 2.5 stat caching improvements might help here (even
without explicit relative imports), where simply trimming sys.path won't.
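
As an illustration of the per-module opt-in, the sketch below writes a tiny package to a temporary directory and imports it. The package name olpc_pkg_demo and its modules are invented for the example. In Python 2.5 the __future__ line is what stops plain "import string" from first looking inside the package; on Python 3 that behaviour is the default and the line is a harmless no-op.

```python
import os
import sys
import tempfile

# Lay out a throwaway package on disk.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "olpc_pkg_demo")
os.mkdir(pkg_dir)

sources = {
    "__init__.py": "",
    "helpers.py": "VALUE = 42\n",
    "main.py": (
        "from __future__ import absolute_import  # opt-in, per module\n"
        "from . import helpers  # explicit relative: sibling module\n"
        "import string          # absolute: not searched for in the package\n"
        "RESULT = helpers.VALUE\n"
    ),
}
for name, text in sources.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)

sys.path.insert(0, root)
import olpc_pkg_demo.main

print(olpc_pkg_demo.main.RESULT)
```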
There was also the issue that Mike Hearn brought up about the linking
overhead of importing _gtk.so in particular. Here I must profess that I
am largely ignorant of the issues, but I'll speculate anyway. Perhaps
just splitting up _gtk.so into separate modules would be helpful,
particularly if there are really different kinds of functionality so
that some might never be needed by an application (just switching to
lazy imports only stretches out the performance problem, and I'm not
sure that's a good solution here). But even then, the memory overhead
is quite substantial (8 MB?), so his suggestion to fork an
already-gtk-initialized Python process might be the only way to address
both issues reliably. (My own intuition is that PyGTK is representing
every GTK object with a corresponding Python object, which causes a
duplication that may not be necessary, but even if that is true it's
part of the basic architecture of PyGTK so it's not going to be easy to
change).
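
The forking idea could look roughly like the sketch below. It is only an assumption about the shape of such a launcher, not an existing OLPC or PyGTK facility: spawn_from_warm_parent and app_main are made-up names, and the decimal module stands in for the expensive "import gtk" so that the example runs without GTK installed. It needs a POSIX system, since it relies on os.fork().

```python
import os
import sys

# Stand-in for the expensive import (e.g. "import gtk"); the parent pays
# the import and linking cost once.
import decimal


def spawn_from_warm_parent(app_main):
    """Fork the already-initialized process and run app_main in the child.

    Returns the child's exit code, or -1 if it did not exit normally.
    """
    pid = os.fork()
    if pid == 0:
        # Child: everything the parent imported is already loaded, and the
        # memory pages are shared with the parent until written to.
        status = 1
        try:
            status = app_main() or 0
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -1


def app_main():
    # The child sees the module without importing it again.
    return 0 if "decimal" in sys.modules else 1


exit_code = spawn_from_warm_parent(app_main)
print(exit_code)
```

A real launcher would presumably stay resident and fork on request instead of waiting for each child, but the sharing of the already-loaded modules works the same way.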
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org