Infrequent heap corruption, XO-4, Fedora 20

Peter Robinson pbrobinson at gmail.com
Wed Feb 4 05:26:15 EST 2015


On Wed, Feb 4, 2015 at 8:10 AM, James Cameron <quozl at laptop.org> wrote:
> Following up a thread from last September.
>
> This problem has just become more interesting, because it hit during
> an activity startup.
>
> I'm quite used to seeing it with yum.  But seeing it without yum now
> points us at the kernel, glibc or python.

We've not seen this in the wider F-20 Fedora ARM distro so my bet
would be on the kernel.

Peter

> http://dev.laptop.org/ticket/12837#comment:4 has the details of the
> most recent event.
>
> On Wed, Sep 10, 2014 at 01:56:27PM +1000, James Cameron wrote:
>> G'day Peter,
>>
>> Thanks for any ideas you may have.
>>
>> The problem also reproduces on OLPC Fedora 20 image for XO-4:
>>
>> http://build.laptop.org/14.1.0/os1/xo-4/41001o4.zd (552 MB)
>>
>> *** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
>> ======= Backtrace: =========
>> /lib/libc.so.6(+0x6c8b4)[0xb6c828b4]
>> /lib/libc.so.6(+0x754e8)[0xb6c8b4e8]
>> ======= Memory map: ========
>> [...]
>>
>> The error varies in detail, but always suggests corruption of heap or
>> pointers to heap.
>>
>> The triggering conditions are interactive use of yum, yum update, or
>> yum used by olpc-os-builder.  The latter is a simple reproducer for me.
>>
>> I'm reproducing it on an XO-4, with 2GB of RAM, no swap, 8 GB eMMC, 8
>> GB USB flash drive.
>>
>> While memory demand by yum is large by comparison to other programs,
>> the available memory at the time of failure is ample.  There are no
>> kernel out of memory (OOM) events.  It seems more likely to occur when
>> the filesystem cache is under heavy demand.
>>
>> The method to recreate the problem was:
>>
>> 1.  install the system image 41001o4.zd using fs-update and then boot,
>>
>> 2.  configure wireless network,
>>
>> 3.  "yum install -y git olpc-os-builder"
>>
>> 4.  clone the master branch of
>> git://dev.laptop.org/projects/olpc-os-builder
>> (last verified with b87e6ee)
>>
>> 5.  run "./osbuilder.py examples/olpc-os-14.1.0-xo4.ini" repeatedly
>> until the error occurs (usually within about five attempts).
>>
>>
>> I've also tried running under valgrind, but that fails with an illegal
>> instruction.  It is quite likely I'm not using valgrind correctly.
>> http://dev.laptop.org/~quozl/z/1XRYtO.txt
>>
>> The workaround at the moment is to build our Fedora 20 images on
>> Fedora 18.  Fedora 18 shows no sign of the problem.  I'm worried that
>> a low probability heap corruptor may cause instability of applications
>> in the field.
>>
>> The exact same kernel is being used for Fedora 18 and Fedora 20.
>>
>> On Tue, Sep 09, 2014 at 03:55:24PM +0100, Peter Robinson wrote:
>> > What version of OOB are you using, and what config files? I can try
>> > and recreate the problem here on other devices.
>>
>> --
>> James Cameron
>> http://quozl.linux.org.au/
>
> --
> James Cameron
> http://quozl.linux.org.au/
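
Step 5 of the quoted reproduction recipe ("run ... repeatedly until the
error occurs") could be automated along the following lines.  This is only
a sketch: run_until_crash is a hypothetical helper, not part of
olpc-os-builder, and it assumes that the glibc on the image honours the
LIBC_FATAL_STDERR_ environment variable (otherwise the "*** Error in"
report goes to the controlling terminal rather than stderr) and that the
failing process dies on a signal after reporting the corruption.

```python
import os
import subprocess
import sys

# Hypothetical helper for illustration; not part of olpc-os-builder.
def run_until_crash(cmd, max_attempts=20, marker=b"*** Error in"):
    """Run cmd repeatedly.

    Return (attempt_number, captured_stderr) for the first run that looks
    like a crash, or None if every attempt completed without one.
    """
    env = dict(os.environ)
    # Assumption: glibc writes its fatal diagnostics to stderr when this
    # variable is set, instead of writing them to the tty.
    env["LIBC_FATAL_STDERR_"] = "1"
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, env=env)
        _, err = proc.communicate()
        # A negative return code means the child was killed by a signal,
        # which is what an abort() after a heap check failure looks like.
        if proc.returncode < 0 or marker in err:
            return attempt, err
    return None

# Example call, using the command from step 5:
# run_until_crash(["./osbuilder.py", "examples/olpc-os-14.1.0-xo4.ini"])
```

Note that stderr is captured rather than shown, so the saved output of the
failing attempt is what you would attach to the ticket.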

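The two backtrace frames quoted above have the form
object(+offset)[address].  Subtracting the offset from the address gives
the address at which the object was loaded, and both frames agree on
0xb6c16000 for libc.  A small sketch of that arithmetic follows;
parse_frame is a hypothetical name, and the regular expression only
covers the frame format seen in this report.

```python
import re

# Matches frames such as "/lib/libc.so.6(+0x6c8b4)[0xb6c828b4]".
FRAME = re.compile(
    r"^(?P<obj>[^(\s]+)\(\+0x(?P<off>[0-9a-fA-F]+)\)\[0x(?P<addr>[0-9a-fA-F]+)\]$"
)

# Hypothetical helper for illustration only.
def parse_frame(line):
    """Return (object_path, offset, load_base) or None if line is not a frame."""
    m = FRAME.match(line.strip())
    if m is None:
        return None
    offset = int(m.group("off"), 16)
    address = int(m.group("addr"), 16)
    # The load base is the runtime address minus the offset in the object.
    return m.group("obj"), offset, address - offset

for text in ("/lib/libc.so.6(+0x6c8b4)[0xb6c828b4]",
             "/lib/libc.so.6(+0x754e8)[0xb6c8b4e8]"):
    obj, offset, base = parse_frame(text)
    print("%s offset=0x%x base=0x%x" % (obj, offset, base))
```

With the matching glibc debuginfo installed, the offsets are the values one
would try passing to a tool such as addr2line to name the functions
involved.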