[OLPC-devel] [PATCH 0/4] Compile kernel with -fwhole-program --combine
David Woodhouse
dwmw2 at infradead.org
Thu Aug 24 11:24:28 EDT 2006
`-combine'
If you are compiling multiple source files, this option tells the
driver to pass all the source files to the compiler at once (for
those languages for which the compiler can handle this). This
will allow intermodule analysis (IMA) to be performed by the
compiler. Currently the only language for which this is supported
is C. If you pass source files for multiple languages to the
driver, using this option, the driver will invoke the compiler(s)
that support IMA once each, passing each compiler all the source
files appropriate for it. For those languages that do not support
IMA this option will be ignored, and the compiler will be invoked
once for each source file in that language. If you use this
option in conjunction with `-save-temps', the compiler will
generate multiple pre-processed files (one for each source file),
but only one (combined) `.o' or `.s' file.
`-fwhole-program'
Assume that the current compilation unit represents the whole
program being compiled. All public functions and variables with the
exception of `main' and those merged by attribute
`externally_visible' become static functions and in effect get
more aggressively optimized by interprocedural optimizers. While
this option is equivalent to proper use of `static' keyword for
programs consisting of single file, in combination with option
`--combine' this flag can be used to compile most of smaller scale
C programs since the functions and variables become local for the
whole combined compilation unit, not for the single source file
itself.
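
To make the quoted text concrete, here is a minimal sketch of one file from such a combined build. The names KEEP_GLOBAL, scale and entry_point are made up for illustration, and the command line in the comment assumes a GCC that accepts both options described above.

```c
/*
 * Illustration only. Assumed build, with a GCC that supports both options:
 *     gcc -O2 -combine -fwhole-program a.c b.c
 * The file also compiles as ordinary C without them.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define KEEP_GLOBAL __attribute__((externally_visible))
#else
#define KEEP_GLOBAL /* attribute not available; expands to nothing */
#endif

/*
 * No attribute: under -fwhole-program this is treated as if it had been
 * declared static, so the optimizer is free to inline it into its callers
 * and discard the out-of-line copy.
 */
int scale(int x)
{
	return x * 3;
}

/* Attribute present: this one stays a global symbol in the object file. */
KEEP_GLOBAL int entry_point(int x)
{
	return scale(x) + 1;
}
```

Comparing `nm` output for an object built with and without the two options should show whether scale survives as a global symbol.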
Using a combination of these two compiler options for building kernel
code leads to some useful optimisation -- especially with modules which
are made up of a bunch of incestuous C files, where none of the global
symbols actually _need_ to be visible outside the directory they reside
in. File systems are a prime example of this -- on PPC64 I see a
reduction in size of ext3.ko by 2.6%, jffs2.ko by 5%, cifs.ko by 8% and
befs.ko by a scary 14%. Strangely, udf.ko seems to have _grown_ by 6.6%
-- that'll probably be another optimisation bug like GCC PR28755.
The same benefits can be extended to the vmlinux too, although there are
caveats with making _everything_ static. However, it's relatively simple
to make EXPORT_SYMBOL() automatically set the 'externally_visible'
attribute on the symbol in question, and to introduce a new '__global'
tag which does the same for those symbols which aren't exported to
modules but which _are_ needed as a global symbol in vmlinux.
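
A rough sketch of what that could look like follows. This is not the actual patch: the EXPORT_SYMBOL() here is a deliberately cut-down stand-in that only redeclares the symbol with the attribute and omits the symbol-table entry the real macro emits. The names boot_flag, fs_helper and sketch_register are hypothetical.

```c
/* Sketch under assumptions; not the real kernel definitions. */
#if defined(__GNUC__) && !defined(__clang__)
#define __global __attribute__((externally_visible))
#else
#define __global /* attribute not available; expands to nothing */
#endif

/*
 * Cut-down stand-in: redeclare the symbol with the attribute so that
 * -fwhole-program leaves it global. The real macro also adds an entry
 * to the kernel symbol table, which is left out here.
 */
#define EXPORT_SYMBOL(sym) extern __typeof__(sym) sym __global

/* Not exported to modules, but needed as a global symbol in vmlinux. */
__global int boot_flag = 1;

/* Untagged: would effectively become static in the combined unit. */
int fs_helper(int x)
{
	return x + 1;
}

/* Exported: EXPORT_SYMBOL() keeps it visible without a separate tag. */
int sketch_register(int id)
{
	return fs_helper(id) * 2;
}
EXPORT_SYMBOL(sketch_register);
```

The point of hanging the attribute off EXPORT_SYMBOL() is that existing exports need no per-symbol annotation; only the non-exported globals need the explicit tag.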
Size results from a test build on ppc64 are shown at
http://david.woodhou.se/combine/sizes.csv -- the format is
<old size>,<new size>,<delta>,<percentage * 100>,<object name>
The same data, with objects whose size didn't change omitted and
sorted by percentage, is at http://david.woodhou.se/combine/sizes-sorted.csv
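
For anyone wanting to post-process those files, a line in the format given above could be read along these lines. This is a sketch that assumes all four numeric fields are plain integers; the struct and function names are invented for the example.

```c
#include <stdio.h>

/* One row: <old size>,<new size>,<delta>,<percentage * 100>,<object name> */
struct size_row {
	long old_size;
	long new_size;
	long delta;
	long pct_x100;    /* percentage scaled by 100, assumed to be an integer */
	char object[128];
};

/* Parse one line. Returns 1 if all five fields were read, 0 otherwise. */
int parse_size_row(const char *line, struct size_row *row)
{
	int n = sscanf(line, "%ld,%ld,%ld,%ld,%127[^,\r\n]",
		       &row->old_size, &row->new_size, &row->delta,
		       &row->pct_x100, row->object);

	return n == 5;
}

/* Undo the scaling to get the percentage as a floating-point value. */
double row_percent(const struct size_row *row)
{
	return row->pct_x100 / 100.0;
}
```

Feeding it an invented line such as "1000,950,-50,-500,example.o" (not taken from the real files, and the sign convention there is a guess) fills in the five fields, with row_percent() giving -5.0.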
There are a bunch of GCC bugs which make this interesting:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27898
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27889
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28706
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28712
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28744
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28755
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28779
Fixes (or workarounds) for some of these are at
http://david.woodhou.se/combine/gcc-patches/
(Actual patches will follow, to linux-kernel at vger.kernel.org only)
--
dwmw2