Infrequent heap corruption, XO-4, Fedora 20

Jon Nettleton jon.nettleton at gmail.com
Wed Feb 4 06:14:02 EST 2015


It is a problem with the v4 version of the galcore driver.  We have
replicated it on a couple of platforms.

On Wed, Feb 4, 2015 at 11:26 AM, Peter Robinson <pbrobinson at gmail.com>
wrote:

> On Wed, Feb 4, 2015 at 8:10 AM, James Cameron <quozl at laptop.org> wrote:
> > Following up a thread from last September.
> >
> > This problem has just become more interesting, because it hit during
> > an activity startup.
> >
> > I'm quite used to seeing it with yum.  But seeing it without yum now
> > points us at kernel, glibc or python.
>
> We've not seen this in the wider F-20 Fedora ARM distro so my bet
> would be on the kernel.
>
> Peter
>
> > http://dev.laptop.org/ticket/12837#comment:4 has the details of the
> > most recent event.
> >
> > On Wed, Sep 10, 2014 at 01:56:27PM +1000, James Cameron wrote:
> >> G'day Peter,
> >>
> >> Thanks for any ideas you may have.
> >>
> >> The problem also reproduces on OLPC Fedora 20 image for XO-4:
> >>
> >> http://build.laptop.org/14.1.0/os1/xo-4/41001o4.zd (552 MB)
> >>
> >> *** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
> >> ======= Backtrace: =========
> >> /lib/libc.so.6(+0x6c8b4)[0xb6c828b4]
> >> /lib/libc.so.6(+0x754e8)[0xb6c8b4e8]
> >> ======= Memory map: ========
> >> [...]
> >>
> >> The error varies in detail, but always suggests corruption of heap or
> >> pointers to heap.
> >>
> >> The triggering conditions are interactive use of yum, yum update, or
> >> yum used by olpc-os-builder.  The latter is a simple reproducer for me.
> >>
> >> I'm reproducing it on an XO-4, with 2GB of RAM, no swap, 8 GB eMMC, 8
> >> GB USB flash drive.
> >>
> >> While memory demand by yum is large by comparison to other programs,
> >> the available memory at the time of failure is ample.  There are no
> >> kernel out of memory (OOM) events.  It seems more likely to occur when
> >> the filesystem cache is under heavy demand.
> >>
> >> The method to recreate the problem was:
> >>
> >> 1.  install the system image 41001o4.zd using fs-update and then boot,
> >>
> >> 2.  configure wireless network,
> >>
> >> 3.  "yum install -y git olpc-os-builder"
> >>
> >> 4.  clone the master branch of
> >> git://dev.laptop.org/projects/olpc-os-builder
> >> (last verified with b87e6ee)
> >>
> >> 5.  run "./osbuilder.py examples/olpc-os-14.1.0-xo4.ini" repeatedly
> >> until the error occurs (usually within about five attempts).
> >>
> >>
> >> I've also tried running under valgrind, but that fails with an
> >> illegal instruction error.  It is quite likely I'm not using
> >> valgrind correctly.
> >> http://dev.laptop.org/~quozl/z/1XRYtO.txt
> >>
> >> The workaround at the moment is to build our Fedora 20 images on
> >> Fedora 18.  Fedora 18 shows no sign of the problem.  I'm worried that
> >> a low probability heap corruptor may cause instability of applications
> >> in the field.
> >>
> >> The exact same kernel is being used for Fedora 18 and Fedora 20.
> >>
> >> On Tue, Sep 09, 2014 at 03:55:24PM +0100, Peter Robinson wrote:
> >> > What version of OOB are you using, and what config files? I can try
> >> > and recreate the problem here on other devices.
> >>
> >> --
> >> James Cameron
> >> http://quozl.linux.org.au/
> >
> > --
> > James Cameron
> > http://quozl.linux.org.au/
>
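Step 5 of the quoted reproduction recipe (run the build repeatedly until the error shows up) could be automated roughly as follows. This is only a sketch, not something from the thread: the helper name run_until_failure and the max_attempts parameter are made up for illustration, and it assumes Python 3. It sets LIBC_FATAL_STDERR_ so that glibc writes its "*** Error in" report to stderr instead of the controlling terminal, which lets the wrapper capture it.

```python
import os
import subprocess

# Start of the report glibc prints on detected heap corruption,
# as seen in the quoted log ("*** Error in `/usr/bin/python': ...").
GLIBC_MARKER = "*** Error in"


def run_until_failure(cmd, max_attempts=10):
    """Run cmd up to max_attempts times.

    Returns (attempt_number, captured_output) for the first run whose
    output contains the glibc marker, or None if no run showed it.
    The name and parameters are hypothetical, chosen for this sketch.
    """
    env = dict(os.environ)
    # Ask glibc to send fatal error reports to stderr rather than /dev/tty.
    env["LIBC_FATAL_STDERR_"] = "1"
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured stream
            universal_newlines=True,
            env=env,
        )
        output, _ = proc.communicate()
        if GLIBC_MARKER in output:
            return attempt, output
    return None
```

Calling run_until_failure(["./osbuilder.py", "examples/olpc-os-14.1.0-xo4.ini"]) from the olpc-os-builder checkout would mirror step 5. The report suggests a failure usually appears within about five attempts, so the default of ten is an arbitrary margin.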