Reason for the "one dot" hang found!
Gary Martin
garycmartin at googlemail.com
Thu Jun 10 22:01:29 EDT 2010
On 10 Jun 2010, at 20:53, Daniel Drake <dsd at laptop.org> wrote:
> On 10 June 2010 10:58, Bernie Innocenti <bernie at codewiz.org> wrote:
>> Hello,
>>
>> with the serial cable Richard gave me, I figured out what's causing a
>> rare lockup during boot which has been plaguing the XO-1 since we
>> moved to F11.
>>
>> The /etc/rc.sysinit script contains this line:
>>
>> # Sync waiting for storage.
>> { rmmod scsi_wait_scan ; modprobe scsi_wait_scan ; rmmod scsi_wait_scan ; } >/dev/null 2>&1
>>
>> It gets executed while udev is loading modules in parallel. Apparently,
>> something in the kernel ends up dead-locking on module load:
>>
>>
>> 1 tty1 Ss+ 0:02 /sbin/init
>> 945 ? Ss 0:00 /bin/sh -e -c ?runlevel --set S >/dev/null || true???/
>> 950 ? S 0:00 \_ /bin/bash /etc/rc.d/rc.sysinit
>> 1597 ? D 0:00 \_ modprobe scsi_wait_scan
>
> I strongly doubt this is the issue. This is a very simple module.
>
> Note your other blocked process:
>
>> 1035 ? D< 0:00 /sbin/modprobe -b pci:v000011ABd00004102sv000011ABsd00
>
> This one also has a lower process ID, suggesting that it was run first.
>
> I suspect there is a crash/hang within this module, and at this point,
> attempting to load any other module (scsi_wait_scan or otherwise) will
> hang, due to contention on a lock, corruption, a dead kernel thread,
> or something like that.
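
To make the "which blocked process came first" comparison repeatable, here
is a minimal sketch that pulls the uninterruptible (state D) processes out
of a ps listing and sorts them by PID. It assumes the input has the PID in
the first column and the state in the second, as produced by
"ps -eo pid,stat,args"; the function name blocked_procs is made up for
illustration. A lower PID only suggests an earlier start, it does not
prove it.

```shell
#!/bin/sh
# blocked_procs: read "PID STAT COMMAND..." lines on stdin and print only
# the processes whose state starts with D (uninterruptible sleep),
# lowest PID first. The header line is dropped because "STAT" does not
# start with D.
blocked_procs() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^D/' | sort -n
}
```

Typical use at the point of the hang would be:
ps -eo pid,stat,args | blocked_procs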
>
> My suggested next steps in diagnosis:
> 1. Identify which device is pci:v000011ABd00004102
> Anyone can do this on any XO-1 with: lspci -vd 11ab:4102
> I'm pretty sure it's a part of the CAFE chip but I don't have an XO to check.

00:0c.2 Multimedia video controller: Marvell Technology Group Ltd. Unknown device 4102 (rev 10) (pro-if 01)
        Subsystem: Marvell Technology Group Ltd. Unknown device 4100
        Flags: bus master, 66MHz, medium, latency 32, IRQ 11
        Memory at fe028000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: cafe1000-ccic

This is from an XO-1 running build 767, Sugar 0.82.1, firmware Q2E18.

Regards,
--Gary

> 2. Look at dmesg at point of crash
> Considering that you got a process tree I guess you can also run some
> other commands at point of hang?
> Run "dmesg" and capture output.
>
> 3. Capture kernel task dump at point of crash
> echo t > /proc/sysrq-trigger
> The task dump will appear in kernel logs (dmesg).
>
> Daniel
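
For step 1, the vendor:device pair that lspci expects can be read straight
off the modalias string in the ps output. A rough sketch, assuming the
usual "pci:v<8 hex digits>d<8 hex digits>..." modalias layout with the
upper four digits of each field zero; the function name
pci_modalias_to_id is hypothetical.

```shell
#!/bin/sh
# pci_modalias_to_id: turn a PCI modalias such as
#   pci:v000011ABd00004102sv000011ABsd00...
# into the lowercase vendor:device form accepted by "lspci -d".
# Prints nothing if the argument does not look like a PCI modalias.
pci_modalias_to_id() {
    printf '%s\n' "$1" |
        sed -n 's/^pci:v0000\([0-9A-Fa-f]\{4\}\)d0000\([0-9A-Fa-f]\{4\}\).*$/\1:\2/p' |
        tr 'A-F' 'a-f'
}
```

With the modalias from the blocked modprobe, this gives the same ID used
in the lspci command above:
lspci -vd "$(pci_modalias_to_id pci:v000011ABd00004102sv000011ABsd00)"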
> _______________________________________________
> Devel mailing list
> Devel at lists.laptop.org
> http://lists.laptop.org/listinfo/devel
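
Steps 2 and 3 can be combined into one small helper so the dump is saved
before anything else scrolls the log. This is only a sketch: it needs
root, the function name capture_task_dump is invented here, and the
optional second argument (an alternate trigger path) exists purely so the
function can be tried without touching the real /proc/sysrq-trigger. The
kernel log buffer is finite, so a long task dump may push out older
messages.

```shell
#!/bin/sh
# capture_task_dump OUT_FILE [TRIGGER]
# Ask the kernel for a task dump (sysrq "t"), then save the kernel log,
# which now includes the dump, to OUT_FILE.
# TRIGGER defaults to /proc/sysrq-trigger; overriding it is an
# illustration-only convenience for dry runs.
capture_task_dump() {
    out=$1
    trigger=${2:-/proc/sysrq-trigger}
    echo t > "$trigger" || return 1
    dmesg > "$out"
}
```

On a hung XO-1 with a serial console this might be run as:
capture_task_dump /tmp/task-dump.txt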