Reason for the "one dot" hang found!

Gary Martin garycmartin at googlemail.com
Thu Jun 10 22:01:29 EDT 2010


On 10 Jun 2010, at 20:53, Daniel Drake <dsd at laptop.org> wrote:

> On 10 June 2010 10:58, Bernie Innocenti <bernie at codewiz.org> wrote:
>> Hello,
>> 
>> with the serial cable Richard gave me, I figured out what's causing a
>> rare lockup during boot which has been plaguing the XO-1 since we
>> moved to F11.
>> 
>> The /etc/rc.sysinit script contains this line:
>> 
>>  # Sync waiting for storage.
>>  { rmmod scsi_wait_scan ; modprobe scsi_wait_scan ; rmmod  scsi_wait_scan ; } >/dev/null 2>&1
>> 
>> It gets executed while udev is loading modules in parallel. Apparently,
>> something in the kernel ends up dead-locking on module load:
>> 
>> 
>>   1 tty1     Ss+    0:02 /sbin/init
>>  945 ?        Ss     0:00 /bin/sh -e -c ?runlevel --set S >/dev/null || true???/
>>  950 ?        S      0:00  \_ /bin/bash /etc/rc.d/rc.sysinit
>> 1597 ?        D      0:00      \_ modprobe scsi_wait_scan
> 
> I strongly doubt this is the issue. This is a very simple module.
> 
> Note your other blocked process:
> 
>> 1035 ?        D<     0:00 /sbin/modprobe -b pci:v000011ABd00004102sv000011ABsd00
> 
> This one also has a lower process ID, suggesting that it was run first.
> 
> I suspect there is a crash/hang within this module, and at this point,
> attempting to load any other module (scsi_wait_scan or otherwise) will
> hang, whether due to contention on a lock, corruption, a dead kernel
> thread, or something like that.
> 
> My suggested next steps in diagnosis:
> 1. Identify which device is pci:v000011ABd00004102
> Anyone can do this on any XO-1 with: lspci -vd 11ab:4102
> I'm pretty sure it's a part of the CAFE chip but I don't have an XO to check.

00:0c.2 Multimedia video controller: Marvell Technology Group Ltd. Unknown device 4102 (rev 10) (pro-if 01)
    Subsystem: Marvell Technology Group Ltd. Unknown device 4100
    Flags: bus master, 66MHz, medium, latency 32, IRQ 11
    Memory at fe028000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: cafe1000-ccic

This is from an XO-1 running build 767, Sugar 0.82.1, firmware Q2E18.
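
For anyone repeating step 1 against a different blocked modprobe, the
vendor:device pair that lspci -d expects can be read straight out of the
modalias string. A minimal sketch, assuming the usual PCI modalias layout
(8 hex digits after "v" and after "d", with 16-bit IDs zero-padded); the
function name modalias_to_lspci_id is made up for illustration and is not
an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: convert a PCI modalias into "vendor:device"
# in the lowercase hex form accepted by "lspci -d".
modalias_to_lspci_id() {
    # Keep the low 4 hex digits of the vendor (after "v") and of the
    # device (after "d"); print nothing if the input is not a PCI
    # modalias with zero-padded 16-bit IDs.
    printf '%s\n' "$1" \
        | sed -n 's/^pci:v0000\([0-9A-Fa-f]\{4\}\)d0000\([0-9A-Fa-f]\{4\}\).*$/\1:\2/p' \
        | tr 'A-F' 'a-f'
}

# The modalias from the blocked process in the listing above
# (truncated there, but the vendor and device fields are complete).
modalias_to_lspci_id 'pci:v000011ABd00004102sv000011ABsd00'
# prints: 11ab:4102
```

The result can then be passed on, e.g.
lspci -vd "$(modalias_to_lspci_id "$alias")", where $alias holds the
modalias string taken from the process list.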

Regards,
--Gary

> 2. Look at dmesg at point of crash
> Considering that you got a process tree I guess you can also run some
> other commands at point of hang?
> Run "dmesg" and capture output.
> 
> 3. Capture kernel task dump at point of crash
> echo t > /proc/sysrq-trigger
> The task dump will appear in kernel logs (dmesg).
> 
> Daniel
> _______________________________________________
> Devel mailing list
> Devel at lists.laptop.org
> http://lists.laptop.org/listinfo/devel
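
Steps 2 and 3 above can be rolled into one small function to run from the
serial console at the point of the hang. This is only a sketch: the
function name, the output directory and the file names are assumptions
made for illustration, and writing to /proc/sysrq-trigger needs root and
a kernel with magic SysRq support enabled.

```shell
#!/bin/sh
# Hypothetical helper: save the kernel log, request a task dump, then
# save the kernel log again so the dump ends up in the second file.
capture_hang_state() {
    out_dir=$1                          # where to store the logs
    trigger=${2:-/proc/sysrq-trigger}   # overridable, e.g. for a dry run

    mkdir -p "$out_dir" || return 1

    # Step 2: kernel log as it stands at the point of the hang.
    dmesg > "$out_dir/dmesg-before.txt" 2>&1 || true

    # Step 3: ask the kernel to dump all tasks into its log.
    echo t > "$trigger" || return 1

    # The task dump appears in the kernel log, so capture it again.
    dmesg > "$out_dir/dmesg-tasks.txt" 2>&1 || true
}
```

Example: capture_hang_state /tmp/hang-debug. If the root filesystem may
not be writable at that point in boot, pointing the first argument at a
tmpfs mount is probably safer.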


