Infrequent heap corruption, XO-4, Fedora 20

James Cameron quozl at laptop.org
Wed Feb 4 03:10:48 EST 2015


Following up a thread from last September.

This problem has just become more interesting, because it hit during
an activity startup.

I'm quite used to seeing it with yum, but seeing it without yum now
points us at the kernel, glibc, or Python.

http://dev.laptop.org/ticket/12837#comment:4 has the details of the
most recent event.

On Wed, Sep 10, 2014 at 01:56:27PM +1000, James Cameron wrote:
> G'day Peter,
> 
> Thanks for any ideas you may have.
> 
> The problem also reproduces on the OLPC Fedora 20 image for XO-4:
> 
> http://build.laptop.org/14.1.0/os1/xo-4/41001o4.zd (552 MB)
> 
> *** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
> ======= Backtrace: =========
> /lib/libc.so.6(+0x6c8b4)[0xb6c828b4]
> /lib/libc.so.6(+0x754e8)[0xb6c8b4e8]
> ======= Memory map: ========
> [...]
> 
> The error varies in detail, but always suggests corruption of heap or
> pointers to heap.
> 
> The triggering conditions are interactive use of yum, yum update, or
> yum used by olpc-os-builder.  The latter is a simple reproducer for me.
> 
> I'm reproducing it on an XO-4 with 2 GB of RAM, no swap, 8 GB eMMC,
> and an 8 GB USB flash drive.
> 
> While memory demand by yum is large by comparison to other programs,
> the available memory at the time of failure is ample.  There are no
> kernel out of memory (OOM) events.  It seems more likely to occur when
> the filesystem cache is under heavy demand.
> 
> The method to recreate the problem was:
> 
> 1.  install the system image 41001o4.zd using fs-update and then boot,
> 
> 2.  configure wireless network,
> 
> 3.  "yum install -y git olpc-os-builder"
> 
> 4.  clone the master branch of
> git://dev.laptop.org/projects/olpc-os-builder
> (last verified with b87e6ee)
> 
> 5.  run "./osbuilder.py examples/olpc-os-14.1.0-xo4.ini" repeatedly
> until the error occurs (usually within about five attempts).
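
Step 5 can be automated.  What follows is only a minimal sketch and not
part of the original report: the helper names find_heap_error and
run_until_failure are hypothetical, and it assumes glibc honours the
LIBC_FATAL_STDERR_ environment variable so that the abort message is
written to stderr rather than to the controlling terminal.  It reruns a
command until output matching the glibc error format quoted above
appears.

```python
from __future__ import print_function

import os
import re
import subprocess
import sys

# Matches lines such as:
# *** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
GLIBC_ERROR = re.compile(
    r"\*\*\* Error in `(?P<prog>[^']+)': (?P<what>.+): "
    r"(?P<addr>0x[0-9a-f]+) \*\*\*"
)


def find_heap_error(text):
    """Return the parsed glibc error as a dict, or None if absent."""
    match = GLIBC_ERROR.search(text)
    return match.groupdict() if match else None


def run_until_failure(cmd, max_attempts=20):
    """Run cmd repeatedly; return (attempt, error) on the first hit."""
    # Assumption: with this variable set, glibc reports fatal errors
    # on stderr, where they can be captured.
    env = dict(os.environ, LIBC_FATAL_STDERR_="1")
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            env=env,
        )
        output, _ = proc.communicate()
        error = find_heap_error(output)
        if error is not None:
            return attempt, error
    return None


# Only runs when a command is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    result = run_until_failure(sys.argv[1:])
    if result is None:
        print("no heap error seen")
    else:
        print("attempt %d: %s at %s" % (
            result[0], result[1]["what"], result[1]["addr"]))
```

For example, "python repro.py ./osbuilder.py examples/olpc-os-14.1.0-xo4.ini",
where repro.py is a hypothetical file name for the sketch.  The attempt
limit of 20 is arbitrary.
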
> 
> 
> I've also tried running under valgrind, but that fails with an
> illegal instruction.  It is quite likely I'm not using valgrind
> correctly.
> http://dev.laptop.org/~quozl/z/1XRYtO.txt
> 
> The workaround at the moment is to build our Fedora 20 images on
> Fedora 18.  Fedora 18 shows no sign of the problem.  I'm worried that
> a low-probability heap corruptor may cause instability of applications
> in the field.
> 
> The exact same kernel is being used for Fedora 18 and Fedora 20.
> 
> On Tue, Sep 09, 2014 at 03:55:24PM +0100, Peter Robinson wrote:
> > What version of OOB are you using, and what config files? I can try
> > and recreate the problem here on other devices.
> 
> -- 
> James Cameron
> http://quozl.linux.org.au/

-- 
James Cameron
http://quozl.linux.org.au/


