No surprise on memory
Benjamin M. Schwartz
bmschwar at fas.harvard.edu
Mon Dec 15 23:21:18 EST 2008
I recently learned a few very important things about Linux memory
management (I'm speaking about how it's supposed to work, irrespective of
any bugs). Operating systems experts already know all of this, but I did not.
1. Malloc lies. It will happily return a pointer to an allocation larger
than the entire amount of physical memory, just hoping that you won't use
it. This is called "overcommit".
2. Even without swap, the system will never actually run out of memory.
Instead, as some program attempts to make use of the memory that it has
already allocated, the kernel will start discarding all clean pages that
are not currently in use, since they can be re-read from disk later. At
this point, the system has so little
remaining free memory that only the specific pages of binaries that are
currently in use can be held in memory. The system is essentially running
executables directly from disk, which is so slow that it would take ages
to finally run out of memory. Bernie helpfully compared this type of
thrashing to Zeno's paradox.
3. To avoid getting stuck in this situation, the kernel has an "OOM
killer". This is a misnomer. The OOM killer picks a process to kill
_before_ OOM is reached. It does this either because the system is
already low on memory and is paging lots of stuff out to disk, or because
the system is overcommitted by an unacceptably large ratio.
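One rough way to see points 1 and 3 on a running system is to compare
what the kernel has promised with what it actually has. The Python sketch
below parses text in the format of /proc/meminfo and reports the ratio of
Committed_AS to MemTotal. The function names and the sample figures are
made up for illustration, and the ratio is only a crude indicator, not
the test the OOM killer itself applies.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def overcommit_factor(fields):
    """Return Committed_AS / MemTotal.

    A value above 1.0 means more memory has been promised to processes
    than physically exists.
    """
    return fields["Committed_AS"] / fields["MemTotal"]


# Made-up sample figures, not measurements from a real machine.
SAMPLE = """MemTotal:         240000 kB
MemFree:           12000 kB
SwapTotal:             0 kB
Committed_AS:     360000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:  # Linux only
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the illustrative sample
    factor = overcommit_factor(parse_meminfo(text))
    print("overcommit factor: %.2f" % factor)
```

A factor well above 1.0 on a machine without swap is the situation
described in point 1: the promises cannot all be kept at once.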
I found this very surprising, and in some ways I still do. I read many
justifications of these decisions, but I was curious to test it for
myself. I was happy, therefore, to learn about
/proc/sys/vm/overcommit_memory and /proc/sys/vm/overcommit_ratio. These
knobs control the kernel's overcommit policy.
By setting overcommit_memory to "2" and overcommit_ratio to "95", it is
possible to approximate the behavior that a naive C programmer might
expect from the kernel. In this mode, malloc will only return a non-null
pointer if the allocation can actually be fulfilled in physical memory.
Also, this setting of overcommit_ratio ensures that 5% of memory is
reserved to the kernel.
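To make the arithmetic concrete: in mode 2 the kernel refuses new
allocations once the total committed address space would exceed roughly
swap plus overcommit_ratio percent of physical RAM. The Python sketch
below reads the two knobs and computes that limit. The helper names and
the 256 MiB, no-swap machine are assumptions chosen for illustration, and
the formula ignores details such as huge pages.

```python
# Meaning of each overcommit_memory value, paraphrased.
MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting: no overcommit beyond the commit limit",
}


def commit_limit_kb(ram_kb, swap_kb, ratio_percent):
    """Approximate mode-2 commit limit: swap + ratio% of physical RAM."""
    return swap_kb + ram_kb * ratio_percent // 100


def read_knob(name):
    """Read an integer from /proc/sys/vm/<name>, or None if unavailable."""
    try:
        with open("/proc/sys/vm/" + name) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_knob("overcommit_memory")
    ratio = read_knob("overcommit_ratio")
    print("mode:", MODES.get(mode, "unknown"), "ratio:", ratio)
    # Hypothetical machine: 256 MiB of RAM, no swap, ratio of 95.
    print("commit limit:", commit_limit_kb(256 * 1024, 0, 95), "kB")
```

Changing the knobs requires root, for example by writing the new values
to the same two files under /proc/sys/vm.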
I tried running build 767 on an XO in this mode, and the bottom line is that the
conventional wisdom is correct. I set the parameters and restarted X, and
Sugar came up fine. Every view displayed correctly, including the Journal
and the mesh view with buddies from Gabble. Some activities, like
Calculate, run fine, but the big ones, like Record and Browse, are
semi-functional at best, and at worst cause Sugar to lock up entirely.
This is not too surprising.
I'm no expert, but making the system work well without overcommit would
probably require extensive modifications to the Python interpreter, the
fd.o libraries (dbus, gstreamer, telepathy, etc.), Gecko, and maybe even
X. All of these would need to allocate only as much memory as they need,
and react appropriately when malloc returns NULL. In other words, 'tain't
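As a small illustration of what "react appropriately" could look like at
the application level: Python reports a failed allocation as MemoryError,
so code can catch it and retry with a smaller request. This is only a
sketch. The function names are hypothetical, the allocator argument
exists just so the failure path can be exercised without exhausting real
memory, and none of this is how Sugar or its libraries behave today.

```python
def allocate_or_none(nbytes, allocator=bytearray):
    """Try to allocate nbytes; return None instead of crashing on failure."""
    try:
        return allocator(nbytes)
    except MemoryError:
        return None


def load_with_fallback(sizes, allocator=bytearray):
    """Try each candidate buffer size in order; return the first that fits."""
    for size in sizes:
        buf = allocate_or_none(size, allocator)
        if buf is not None:
            return buf
    raise MemoryError("no candidate size could be allocated")


if __name__ == "__main__":
    # Ask for a large buffer first, then settle for smaller ones.
    buf = load_with_fallback([64 * 1024 * 1024, 8 * 1024 * 1024, 1024 * 1024])
    print("got a buffer of", len(buf), "bytes")
```

With overcommit left on, the except branch is rarely reached for
requests of ordinary size, which is part of why so little existing code
has a tested fallback path.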
Conclusion: no magic get-out-of-jail-free card.