9.1 Proposal: Top five performance problems

Tomeu Vizoso tomeu at tomeuvizoso.net
Mon Oct 27 11:38:16 EDT 2008

On Fri, Oct 24, 2008 at 9:13 PM, Mitch Bradley <wmb at laptop.org> wrote:
> Michael Stone wrote:
>> I did some basic profiling of my new rainbow code last night and
>> discovered that, in the best case with the current codebase on XO, it
>> costs about 0.5s/"1 exec(python)". Approximately 80% of the 0.5s was
>> spent importing modules.
>> I hope to dig deeper in the near future, but I am concerned at my lack
>> of inspiration about how to deal with this problem. (Other than by
>> rewriting into a different language.) I still do not consider the
>> mod_python approach used in the 767-era rainbow to be a viable long-term
>> solution.
> Well, there is a tedious solution that would probably be effective.  Go
> through the list of modules with a fine-toothed comb and find out what
> is actually used from each module.  I'll bet that there are quite a few
> modules from which only a few simple functions are used.  Collecting
> those functions into one lightweight (no unnecessary stuff) module might
> collapse the dependency graph.
> As I said, this can be tedious, but it's the sort of thing I've done
> many times during my career, and it has usually paid off.  If nothing
> else, you end up learning a lot about how things work, which tends to
> make you eventually become fearless.  Hah! I know how that works, and
> it's not nearly as complicated as you think!
> A lot of complexity ends up being solutions to low-value problems that
> don't apply in your case.  As a case in point, a long time ago I needed
> to incorporate a stripped-down stdio package in some app that needed to
> be tiny.  The basic character I/O ended up pulling in a train load of
> networking libraries.  It turned out that "isatty()" was the culprit -
> it had to check whether the file descriptor matched every conceivable
> kind of I/O object.  I just made a stub version of isatty() and all the
> spurious dependencies disappeared.

The problem I see with this approach is the initialization that runs
when a module from the Python base libraries is imported. You may know
that only one or two functions are used from a module and decide to
move them into our collector module, but we don't know which parts of
the original module's initialization code those functions expect to
have been executed. Or even worse, which initialization code from
another imported module those functions expect.
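
To make the concern concrete, here is a minimal sketch. The module
contents and the names (_registry, _init, guess_type) are made up for
illustration and are not taken from any real library. The "original"
module fills a table when it is imported; the "collector" copy takes
only the function and silently loses that setup.

```python
import types

# Hypothetical stand-in for a base-library module: it fills a table
# as a side effect of being imported.
ORIGINAL_SOURCE = '''
_registry = {}

def _init():
    _registry["txt"] = "text/plain"

_init()  # runs at import time

def guess_type(ext):
    return _registry[ext]
'''

# Hypothetical collector module: the function was copied over,
# but the import-time initialization was left behind.
COLLECTOR_SOURCE = '''
_registry = {}

def guess_type(ext):
    return _registry[ext]
'''


def load(name, source):
    """Build a module object from source text, as an import would."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


original = load("original", ORIGINAL_SOURCE)
collector = load("collector", COLLECTOR_SOURCE)

original_result = original.guess_type("txt")

try:
    collector.guess_type("txt")
    collector_failed = False
except KeyError:
    # The copied function depends on state that was never set up.
    collector_failed = True

print(original_result, collector_failed)
```

In this toy case the failure is loud and immediate. In a real module
the missing initialization could just as well produce a wrong result
much later, which is what makes the copying approach risky.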

I intend to look at activity startup in greater detail and see which
are the worst offenders and what can be done about them. Other people
before us have been bitten by this problem and have gotten their
patches integrated into Python; we should be able to do the same.
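
As a rough sketch of how the worst offenders could be found, something
like the following times the first import of each candidate module.
The candidate list is an arbitrary example, not the set of modules an
activity actually loads, and the helper name time_imports is my own.

```python
import importlib
import sys
import time


def time_imports(module_names):
    """Return (seconds, name) pairs, slowest first.

    Only modules that are not already in sys.modules are measured,
    since a cached import costs almost nothing. Each figure includes
    the time spent importing that module's own dependencies.
    """
    timings = []
    for name in module_names:
        if name in sys.modules:
            continue  # already imported, timing would be meaningless
        start = time.perf_counter()
        importlib.import_module(name)
        timings.append((time.perf_counter() - start, name))
    return sorted(timings, reverse=True)


# Arbitrary example candidates; replace with the modules under study.
candidates = ["json", "decimal", "xml.dom.minidom"]
report = time_imports(candidates)

for seconds, name in report:
    print(f"{seconds * 1000:8.2f} ms  {name}")
```

Because imports are cached, this has to run in a fresh interpreter and
the order of the candidates affects who gets blamed for shared
dependencies. Recent Python versions (3.7 and later) also offer
"python -X importtime", which prints a per-module breakdown without
any extra code.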


