Weird WLAN problem after stupid upgrade attempt
Tom Seago
tom at tomseago.com
Fri Jan 4 05:37:10 EST 2008
I have managed to turn what was a working G1G1 machine into a machine
which can no longer see its wireless card on the USB bus. I thought
I caused this to happen with the following software gyrations, but it
is quite possible that I am the victim of an unfortunately timed and
entirely coincidental hardware failure. If anyone has any pointers to
help me figure out which of those two is the case - I will be
eternally grateful (or at least grateful for a really really long time
indistinguishable from eternity :) ).
The stupid part is how I started on this little odyssey of mine. I
stupidly tried to just run "olpc-update joyride-1492" from the command
line without having a developer key. The update process appeared to
run fine (which one could argue it should NOT have done if it wasn't
going to work later), but when the machine rebooted I got the "boot
failed" screen. Realizing that I probably _needed_ that whole
developer key thing, I attempted to give up on this ill-fated upgrade
by doing the "circle key" boot from the alternate OS image.
When doing the circle key reboot, the machine rebooted fine, except I
had no networking. No lights, no nothing. After some irrelevant
silliness on my part where I didn't check the $PATH variable but
started freaking out that ifconfig couldn't be found, I decided what I
needed was a fresh install to get everything back to the way it was so
I could continue to sit in my corner with the shipping version of the
software and avoid the joyride that I was clearly not qualified to be
on.
Thus, I found the wiki page about downloading the os653.img and fs.zip
files, I threw those on a USB stick, and did the "all buttons, I
really really mean it" reboot. The reflash proceeded without a hitch.
The machine restarted, had lost the things I had downloaded and its
nickname (as expected), but alas - still no network.
At this point, I dug in to the exact situation as best I could. I
noticed an "eth0 no private ioctls" message during startup, which led
me to this old ticket: http://dev.laptop.org/ticket/1969 . That
ticket describes exactly what I have going on, insofar as the
wireless card has apparently disappeared.
Having a second, working G1G1 machine, I was able to determine that
normally there are 3 devices listed in /proc/bus/usb/devices on a
working machine. On my broken machine I only have 2. Also, the
usb8xxx and 802.11 modules aren't loaded - but that's not surprising
if the usb device wasn't found as the kernel loaded.
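For anyone who wants to repeat that comparison, here is a minimal Python sketch of the two checks. It assumes the usual usbfs layout, where each device record in /proc/bus/usb/devices starts with a "T:" line, and the usual /proc/modules layout, where the module name is the first field on each line. The function names are made up for illustration.

```python
def count_usb_devices(devices_text):
    """Count device records in text laid out like /proc/bus/usb/devices.

    Assumption: every device record (root hubs included) begins with a
    topology line that starts with "T:".
    """
    return sum(1 for line in devices_text.splitlines()
               if line.startswith("T:"))


def missing_modules(modules_text, wanted):
    """Return the names in `wanted` that are absent from text laid out
    like /proc/modules (module name is the first whitespace-separated
    field on each line)."""
    loaded = {line.split()[0] for line in modules_text.splitlines()
              if line.strip()}
    return [name for name in wanted if name not in loaded]
```

On the laptop itself, one would read the two files and pass their contents in, for example count_usb_devices(open("/proc/bus/usb/devices").read()) and missing_modules(open("/proc/modules").read(), ["usb8xxx"]). If the machines behave as described above, the first call would give 3 on the good machine and 2 on the broken one.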
Looking in /var/log/messages from my good machine I see the following
during a _good_working_ boot.
Jan 4 17:59:56 localhost kernel: [ 19.120404] hub 1-0:1.0: USB hub found
Jan 4 17:59:56 localhost kernel: [ 19.120730] hub 1-0:1.0: 4 ports detected
Jan 4 17:59:56 localhost kernel: [ 19.244457] ohci_hcd 0000:00:0f.4: OHCI Host Controller
Jan 4 17:59:56 localhost kernel: [ 19.269021] ohci_hcd 0000:00:0f.4: new USB bus registered, assigned bus number 2
Jan 4 17:59:56 localhost kernel: [ 19.285507] ohci_hcd 0000:00:0f.4: irq 10, io mem 0xfe01a000
Jan 4 17:59:56 localhost kernel: [ 19.404162] usb usb2: configuration #1 chosen from 1 choice
Jan 4 17:59:56 localhost kernel: [ 19.428806] hub 2-0:1.0: USB hub found
Jan 4 17:59:56 localhost kernel: [ 19.444807] hub 2-0:1.0: 4 ports detected
<-- Same up to here -->
Jan 4 17:59:56 localhost kernel: [ 19.460720] hub_port_wait_reset: portstatus=503 portchange=10
Jan 4 17:59:56 localhost kernel: [ 19.533567] usb 1-1: new high speed USB device using ehci_hcd and address 2
Jan 4 17:59:56 localhost kernel: [ 19.580477] Initializing USB Mass Storage driver...
Jan 4 17:59:56 localhost kernel: [ 19.610342] hub_port_wait_reset: portstatus=503 portchange=10
Jan 4 17:59:56 localhost kernel: [ 19.720163] usb 1-1: configuration #1 chosen from 1 choice
Jan 4 17:59:56 localhost kernel: [ 19.752491] usbcore: registered new interface driver usb-storage
Jan 4 17:59:56 localhost kernel: [ 19.768864] USB Mass Storage support registered.
Jan 4 17:59:56 localhost kernel: [ 19.799984] usbcore: registered new interface driver libusual
From the nasty bad broken machine, in this same area of kernel
messages I see the following:
Jan 4 18:00:12 localhost kernel: [ 19.035283] hub 1-0:1.0: USB hub found
Jan 4 18:00:12 localhost kernel: [ 19.035610] hub 1-0:1.0: 4 ports detected
Jan 4 18:00:12 localhost kernel: [ 19.165439] ohci_hcd 0000:00:0f.4: OHCI Host Controller
Jan 4 18:00:12 localhost kernel: [ 19.188479] ohci_hcd 0000:00:0f.4: new USB bus registered, assigned bus number 2
Jan 4 18:00:12 localhost kernel: [ 19.204928] ohci_hcd 0000:00:0f.4: irq 10, io mem 0xfe01a000
Jan 4 18:00:12 localhost kernel: [ 19.319481] usb usb2: configuration #1 chosen from 1 choice
Jan 4 18:00:12 localhost kernel: [ 19.344145] hub 2-0:1.0: USB hub found
Jan 4 18:00:12 localhost kernel: [ 19.360104] hub 2-0:1.0: 4 ports detected
<-- Same up to here -->
Jan 4 18:00:12 localhost kernel: [ 19.398669] hub_port_wait_reset: portstatus=501 portchange=10
Jan 4 18:00:12 localhost kernel: [ 19.478184] hub_port_wait_reset: portstatus=100 portchange=1
Jan 4 18:00:12 localhost kernel: [ 19.493707] hub_port_wait_reset: device went away!
Jan 4 18:00:12 localhost kernel: [ 19.509760] Initializing USB Mass Storage driver...
Jan 4 18:00:12 localhost kernel: [ 19.747667] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 19.824322] usb 2-1: new full speed USB device using ohci_hcd and address 2
Jan 4 18:00:12 localhost kernel: [ 19.927321] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 20.007386] usb 2-1: device descriptor read/64, error -62
Jan 4 18:00:12 localhost kernel: [ 20.217309] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 20.298350] usb 2-1: device descriptor read/64, error -62
Jan 4 18:00:12 localhost kernel: [ 20.507786] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 20.587363] usb 2-1: new full speed USB device using ohci_hcd and address 3
Jan 4 18:00:12 localhost kernel: [ 20.687441] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 20.767505] usb 2-1: device descriptor read/64, error -62
Jan 4 18:00:12 localhost kernel: [ 20.987681] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 21.068233] usb 2-1: device descriptor read/64, error -62
Jan 4 18:00:12 localhost kernel: [ 21.278157] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 21.357247] usb 2-1: new full speed USB device using ohci_hcd and address 4
Jan 4 18:00:12 localhost kernel: [ 21.795150] usb 2-1: device not accepting address 4, error -62
Jan 4 18:00:12 localhost kernel: [ 21.887425] hub_port_wait_reset: portstatus=103 portchange=10
Jan 4 18:00:12 localhost kernel: [ 21.967003] usb 2-1: new full speed USB device using ohci_hcd and address 5
Jan 4 18:00:12 localhost kernel: [ 22.406370] usb 2-1: device not accepting address 5, error -62
<-- Basically the same after here, except for the no-workie part -->
Jan 4 18:00:12 localhost kernel: [ 22.423247] usbcore: registered new interface driver usb-storage
Jan 4 18:00:12 localhost kernel: [ 22.439931] USB Mass Storage support registered.
Jan 4 18:00:12 localhost kernel: [ 22.464870] usbcore: registered new interface driver libusual
As you can see from the above, on the broken machine hub_port_wait_reset
reports "device went away!" and is followed by a string of error -62
failures that the working machine never shows - making me suspect a
hardware issue.
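To make the portstatus and portchange numbers in the logs above a bit less opaque, here is a rough Python sketch that decodes them. It assumes the kernel prints both values in hex and that the bits follow the hub port status and port change fields of the USB 2.0 specification. The helper names are invented for illustration, and errno_name only reports whatever the local platform calls the code (on Linux, 62 is ETIME).

```python
import errno

# Assumed bit layout: USB 2.0 hub port status field.
PORT_STATUS_BITS = {
    0x0001: "connection",
    0x0002: "enable",
    0x0004: "suspend",
    0x0008: "over-current",
    0x0010: "reset",
    0x0100: "power",
    0x0200: "low-speed",
    0x0400: "high-speed",
}

# Assumed bit layout: USB 2.0 hub port change field.
PORT_CHANGE_BITS = {
    0x0001: "connection-changed",
    0x0002: "enable-changed",
    0x0004: "suspend-changed",
    0x0008: "over-current-changed",
    0x0010: "reset-complete",
}


def decode(value_hex, names):
    """Return the labels of the bits set in a hex string, lowest bit first."""
    value = int(value_hex, 16)
    return [label for bit, label in sorted(names.items()) if value & bit]


def errno_name(code):
    """Map a kernel-style negative error code to this platform's errno name."""
    return errno.errorcode.get(abs(code), "unknown")
```

Under those assumptions, decode("503", PORT_STATUS_BITS) from the good machine gives connection, enable, power and high-speed, while the broken machine's 501 lacks enable, its 100 is power only (nothing connected), and its 103 is a connected, enabled port without the high-speed bit. That reading would be consistent with the device dropping off during the first reset and then failing to enumerate at full speed, but it is only an interpretation of the log, not a diagnosis.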
Another thing I have done is run the POST diagnostics by holding the
left rocker button during boot. I did this on both machines at the
same time to diff the results. Both say that usb port 0 is in use -
good. But the working machine did scroll some wlan diagnostic
information up the screen at the end of the video tests that the
broken machine did not do. The broken machine did not report an error
- but it clearly did not run the same wlan test.
After some consultation on IRC and general flinging anything I could
against the issue I did try upgrading to the '07 firmware just for
grins. No different results (which is what the release notes would
lead me to suspect). I also tried going back to the 650 software
instead of 653, but again, no different results.
So my apologies to everyone on this developers' list who isn't really
terribly interested in my specific, probably hardware-related,
silliness. Given that it sure _feels_ like I managed to effectively
brick my machine by just running update scripts that shouldn't brick
my machine, I thought I would ask for opinions, guidance, or general
flames about what I'm overlooking.
I have submitted my developer key request by now, and tomorrow should
be able to get to the Ok prompt. Until then, I don't know that I
can do anything else. If no one has any good ideas for me to try, it
looks like I may have to RMA this puppy next week.
That makes me sad, but I knew there was a reason I ordered 2 of these
suckers to begin with....
(-: Tom ;-)