Infrequent heap corruption, XO-4, Fedora 20
jon.nettleton at gmail.com
Wed Feb 4 06:14:02 EST 2015
It is a problem with the v4 version of the galcore driver. We have
replicated it on a couple of platforms.
On Wed, Feb 4, 2015 at 11:26 AM, Peter Robinson <pbrobinson at gmail.com> wrote:
> On Wed, Feb 4, 2015 at 8:10 AM, James Cameron <quozl at laptop.org> wrote:
> > Following up a thread from last September.
> > This problem has just become more interesting, because it hit during
> > an activity startup.
> > I'm quite used to seeing it with yum. But seeing it without yum now
> > points us at the kernel, glibc or python.
> We've not seen this in the wider F-20 Fedora ARM distro so my bet
> would be on the kernel.
> > http://dev.laptop.org/ticket/12837#comment:4 has the details of the
> > most recent event.
> > On Wed, Sep 10, 2014 at 01:56:27PM +1000, James Cameron wrote:
> >> G'day Peter,
> >> Thanks for any ideas you may have.
> >> The problem also reproduces on OLPC Fedora 20 image for XO-4:
> >> http://build.laptop.org/14.1.0/os1/xo-4/41001o4.zd (552 MB)
> >> *** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
> >> ======= Backtrace: =========
> >> /lib/libc.so.6(+0x6c8b4)[0xb6c828b4]
> >> /lib/libc.so.6(+0x754e8)[0xb6c8b4e8]
> >> ======= Memory map: ========
> >> [...]
> >> The error varies in detail, but always suggests corruption of heap or
> >> pointers to heap.
> >> The triggering conditions are interactive use of yum, yum update, or
> >> yum used by olpc-os-builder. The latter is a simple reproducer for me.
> >> I'm reproducing it on an XO-4 with 2 GB of RAM, no swap, 8 GB eMMC,
> >> and an 8 GB USB flash drive.
> >> While memory demand by yum is large by comparison to other programs,
> >> the available memory at the time of failure is ample. There are no
> >> kernel out of memory (OOM) events. It seems more likely to occur when
> >> the filesystem cache is under heavy demand.
> >> The method to recreate the problem was:
> >> 1. install the system image 41001o4.zd using fs-update and then boot,
> >> 2. configure wireless network,
> >> 3. "yum install -y git olpc-os-builder"
> >> 4. clone the master branch of
> >> git://dev.laptop.org/projects/olpc-os-builder
> >> (last verified with b87e6ee)
> >> 5. run "./osbuilder.py examples/olpc-os-14.1.0-xo4.ini" repeatedly
> >> until the error occurs (usually within about five attempts).
> >> I've also tried running under valgrind, but that fails with an
> >> illegal instruction. It is quite likely I'm not using valgrind correctly.
> >> http://dev.laptop.org/~quozl/z/1XRYtO.txt
> >> The workaround at the moment is to build our Fedora 20 images on
> >> Fedora 18. Fedora 18 shows no sign of the problem. I'm worried that
> >> a low probability heap corruptor may cause instability of applications
> >> in the field.
> >> The exact same kernel is being used for Fedora 18 and Fedora 20.
> >> On Tue, Sep 09, 2014 at 03:55:24PM +0100, Peter Robinson wrote:
> >> > What version of OOB are you using, and what config files? I can try
> >> > and recreate the problem here on other devices.
> >> --
> >> James Cameron
> >> http://quozl.linux.org.au/
> > --
> > James Cameron
> > http://quozl.linux.org.au/
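Step 5 of the quoted recipe ("run repeatedly until the error occurs") can be automated. What follows is only a minimal sketch, not something taken from the thread: the function name run_until_corruption and the attempt limit are made up for illustration, and the marker string is simply the start of the glibc message quoted above. It also assumes that the glibc on the image honours the LIBC_FATAL_STDERR_ environment variable, which as far as I know is needed for the abort message to go to stderr rather than the controlling terminal.

```python
import os
import subprocess

# Start of the glibc abort message seen in the quoted report.
GLIBC_MARKER = b"*** Error in"


def run_until_corruption(cmd, max_attempts=10, marker=GLIBC_MARKER):
    """Run cmd up to max_attempts times.

    Returns (attempt, stderr_text) for the first run whose stderr
    contains marker, or (None, "") if no run produced it.
    The name and defaults are hypothetical, chosen for this sketch.
    """
    # Assumption: with this variable set, glibc writes its fatal
    # message to stderr instead of /dev/tty, so we can capture it.
    env = dict(os.environ, LIBC_FATAL_STDERR_="1")
    with open(os.devnull, "wb") as devnull:
        for attempt in range(1, max_attempts + 1):
            proc = subprocess.Popen(
                cmd, stdout=devnull, stderr=subprocess.PIPE, env=env
            )
            _, err = proc.communicate()
            if marker in err:
                return attempt, err.decode("utf-8", "replace")
    return None, ""


# Example call, using the command from step 5 of the recipe:
# attempt, log = run_until_corruption(
#     ["./osbuilder.py", "examples/olpc-os-14.1.0-xo4.ini"])
```

The sketch only looks for the message text. A run that dies some other way (for example a plain SIGSEGV with no glibc message) would not be counted, so checking proc.returncode as well may be worthwhile.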