Compiler optimization for Geode/Floating Point pipeline
echerlin at gmail.com
Tue Oct 30 00:05:44 EDT 2007
I don't know about the internals of current activities in Python, but
I know where you can find reams of array code in C, FORTRAN, and other
languages in 3D graphics libraries (4 x 4 matrix multiplication
especially), multimedia compression and decompression, cryptography
(particularly RSA exponentiation, but also elliptic curve
cryptography, ssh key exchange, and others), audio and image
processing (lots of convolutions), font rendering (quadratic and cubic
splines), and other computationally intensive domains. The more we do
on this front now, the better for the older students when they come to
modeling, numeric integration, linear programming, linear algebra,
computational molecular biology, and so on and on and on.
I believe the GNU compilers use a common intermediate format and code
generator for a variety of languages. Ah, here we go.
http://gcc.gnu.org/ "The GNU Compiler Collection includes front ends
for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries
for these languages (libstdc++, libgcj,...)." There are over 1,000
hits on Geode in their archive. For example,
Embedded AMD CPU with MMX and 3dNOW! instruction set support.
While picking a specific cpu-type will schedule things appropriately
for that particular chip, the compiler will not generate any code that
does not run on the i386 without the -march=cpu-type option being used.
Generate instructions for the machine type cpu-type. The choices
for cpu-type are the same as for -mtune. Moreover, specifying
-march=cpu-type implies -mtune=cpu-type.
Interesting. It appears that much of the necessary work has been done,
and we could get fairly good results if we just set up a small compile
farm and comb through our current code base. With plenty of regression
testing, of course. I have machine time to spare, and would be happy
to recruit at the local LUGs. Or we could talk to distributed.net and
BOINC. Much more performance would remain to be wrung out by rewriting
small sections of some kinds of code, and presumably the true wizards
could find more code generation optimizations than we have yet seen.
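As a rough sketch of the kind of regression check I have in mind, here is a 4 x 4 matrix multiply in C plus a tolerance comparison. The function names (mat4_mul, mat4_close) and the file name in the comment are made up for illustration, and the build lines assume a GCC release that accepts -march=geode. The idea is to build the same kernel with and without the CPU-specific flags and confirm the results still agree.

```c
/* Hypothetical build lines (assuming the installed GCC accepts -march=geode):
 *   gcc -O2 -c mat4.c -o mat4_generic.o
 *   gcc -O2 -march=geode -c mat4.c -o mat4_geode.o
 */
#include <assert.h>
#include <stddef.h>

/* out = a * b for row-major 4x4 matrices; out must not alias a or b. */
void mat4_mul(const float a[16], const float b[16], float out[16])
{
    assert(out != a && out != b);
    for (size_t r = 0; r < 4; ++r) {
        for (size_t c = 0; c < 4; ++c) {
            float acc = 0.0f;
            /* Inner loop is four multiply-accumulate steps. */
            for (size_t k = 0; k < 4; ++k)
                acc += a[r * 4 + k] * b[k * 4 + c];
            out[r * 4 + c] = acc;
        }
    }
}

/* Regression check: returns 1 if every element of got is within tol of
 * want, 0 otherwise (a NaN difference counts as a mismatch). */
int mat4_close(const float got[16], const float want[16], float tol)
{
    for (size_t i = 0; i < 16; ++i) {
        float d = got[i] - want[i];
        if (d < 0.0f)
            d = -d;
        if (!(d <= tol))
            return 0;
    }
    return 1;
}
```

A test harness would run mat4_mul from each build on the same inputs and pass the two results to mat4_close with a small tolerance, since different instruction scheduling can change floating-point rounding slightly.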
Does anybody know how we handle signal processing in the Tamtam
software synth? The Fast Fourier Transforms for going from time domain
to frequency domain in the data visualization activity are a prime
target. It's all multiply-accumulate except for the bit twiddling.
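To make the multiply-accumulate point concrete, here is a minimal sketch in C of the two building blocks of a radix-2 FFT: the butterfly (multiplies feeding adds) and the index bit reversal (the bit twiddling). The type and function names (cpx, butterfly, bit_reverse) are illustrative only; I am not claiming this is how Tamtam or any existing activity does it.

```c
#include <assert.h>

typedef struct { float re, im; } cpx;

/* One radix-2 butterfly with twiddle factor w:
 *   a' = a + w*b
 *   b' = a - w*b
 * The complex product w*b is four multiplies and two adds. */
void butterfly(cpx *a, cpx *b, cpx w)
{
    float tr = b->re * w.re - b->im * w.im;
    float ti = b->re * w.im + b->im * w.re;
    b->re = a->re - tr;
    b->im = a->im - ti;
    a->re += tr;
    a->im += ti;
}

/* Reverse the low `bits` bits of x, as used to reorder FFT input indices. */
unsigned bit_reverse(unsigned x, unsigned bits)
{
    unsigned r = 0;
    assert(bits <= sizeof(unsigned) * 8);
    for (unsigned i = 0; i < bits; ++i) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```

A full transform applies the butterfly log2(n) times over the data, so the quality of the compiler's scheduling of those multiplies and adds is where most of the time goes.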
What is the theoretical performance of the Geode LX for various kinds
of arithmetic? What performance are we getting?
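One crude way to answer the second question would be a small timing loop around a multiply-accumulate kernel, something like the C sketch below. The names (mac, mac_mflops), the buffer size, and the use of clock() are my own assumptions; a serious benchmark would need more care about cache effects and timer resolution, and this says nothing about what the Geode can do in theory.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Dot product: one multiply and one add per element. */
float mac(const float *x, const float *y, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * y[i];
    return acc;
}

/* Rough throughput estimate in MFLOPS for the kernel above.
 * Returns 0.0 if the run was too short for clock() to measure. */
double mac_mflops(unsigned reps)
{
    enum { N = 1024 };
    static float x[N], y[N];
    volatile float sink = 0.0f;   /* keeps the results from being discarded */

    assert(reps > 0);
    for (size_t i = 0; i < N; ++i) {
        x[i] = 1.0f;
        y[i] = 0.5f;
    }

    clock_t t0 = clock();
    for (unsigned r = 0; r < reps; ++r) {
        x[0] = (float)r;          /* vary the input so the call is not hoisted */
        sink += mac(x, y, N);
    }
    clock_t t1 = clock();

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;
    /* Two floating-point operations per element per repetition. */
    return 2.0 * N * (double)reps / secs / 1e6;
}
```

Running this from builds made with different -march/-mtune settings would give a first comparison against whatever the data sheet claims.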
On 10/29/07, Brian Carnes <bmcarnes_olpc at oddren.com> wrote:
> The developers page on the wiki
> (http://wiki.laptop.org/go/Developers_program) mentions:
> "compiler optimization: if you are a compiler wizard, we understand that
> the Geode lacks a specific back end code scheduler, which limits
> performance, particularly FP performance. We'd love to see work go on in
> this area which would help everyone."
> What aspects of this issue/request for help are still open? I'll go take
> a look at the OLPC build system tonight to see what is being used (late
> versions of GCC do have some Geode -mtune/-march modes), but would love to
> be hooked into whatever project is addressing this
> ...Or start my own project if I'm the first to step to the plate on this
> issue. If anyone knows any particularly lengthy floating-point dominated
> operations in the current software, let me know and we'll use them as our
> metric for improvement.
Earth Treasury: End Poverty at a Profit
Sustainable MBA student
Presidio School of Management