8.2.1 WPA testing

Hal Murray hmurray at megapathdsl.net
Mon Mar 2 04:56:55 EST 2009


Interesting.  Thanks.

I'm currently using build-34.  I think I've tried 800, 801, and a few other 
build-3? without noticing any differences.  (But I didn't look as carefully 
with the others.)  I'm not sure how they relate to Sugar 8.2.0 or 8.2.1.

I'm seeing a different pattern, perhaps because I'm looking in a different 
place and/or have a different AP.

I have a Linksys WRT54GL running stock firmware with WPA enabled.  I have 2 
XOs.  Both know the password and they should just come up and connect to the 
AP.  They did that reliably with the old software I was using.  I'm not sure 
what that was.  Probably os759.img from last Sept.  (I'll try things if that 
will help.)

Here is what I see on booting...

The automatic stuff goes through 4 steps.  (See below.)  It tries but doesn't 
connect to my AP.  It ends up on Mesh Network 1.  It does connect when I poke 
the AP icon in the Neighborhood view, but I have to wait until the normal 
startup stuff finishes.

I'm not sure the pattern of automatic-fails then poke-works holds 100%, but 
it's very close to that.  I've never seen it connect automatically.  I have 
seen the first poke fail several times, but I think they were all because I 
poked the AP icon before the automatic stuff had actually finished.

The automatic but doesn't-work case goes through 4 steps:
  1) Looking for a School Mesh Portal...
  2) Trying to connect to my AP (ch 6)
  3) Looking for an XO Mesh Portal...
  4) Connect to a Simple Mesh  (Mesh Network 1)

Step 1 takes 80 seconds
Step 2 takes 45 seconds
Step 3 takes 70 seconds
Step 4 takes 10 seconds
Those were timed by hand so they are rough.
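
As a rough check on those numbers, the four steps add up to about three and a half minutes before the automatic sequence settles on the Simple Mesh. A minimal Python sketch of the arithmetic (the dictionary keys are labels made up for this example, not names used by the software):

```python
# Hand-timed durations from the list above, in seconds (rough values).
step_seconds = {
    "school mesh portal search": 80,
    "connect to AP": 45,
    "XO mesh portal search": 70,
    "simple mesh setup": 10,
}

total = sum(step_seconds.values())
minutes, seconds = divmod(total, 60)
print(f"total: {total} s ({minutes} min {seconds} s)")
```

That is roughly how long one has to wait after boot before poking the AP icon is likely to work.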

Steps 1 and 3 try all 3 channels (1, 6, 11).  I can see the blinking move 
through the 3 mesh icons in the Neighborhood view and/or watch the pop-up 
from the network icon in the Frame.

Sometimes, when it's trying to connect to my AP in step 2, the pop-up shows 
the wrong channel.  I assume that's a UI bug.

I've seen several cases where it didn't set up the Simple Mesh after booting.  
There was no network icon in the lower right when the Frame was visible.  
Maybe I'm confusing booting with suspend/resume.  See below.  At any rate, it 
doesn't do that very often for me.


Returning from suspend (lid closed or power button) seems to discard all 
network state and start over, but the dance is different from booting.   The 
automatic part goes through the first 2 steps above.  It doesn't work and it 
doesn't end up connected to a mesh.  The LEDs are out and there is no network 
icon in the lower right of the Frame.  Poking the AP sometimes works and 
sometimes doesn't.  If it doesn't, poking again normally works.

After suspend and poke to get connected, the LED pattern is different.  The 
left LED is on.  The right one doesn't blink like I expect when I generate 
network traffic.

Sometimes it assigns a new link-local IP Address for msh0.  Sometimes it 
doesn't.  (Or maybe I'm confused.)

-----------

Can somebody give me a lesson in the LEDs in the lower left when looking at 
the screen?  Where are they documented?

The left one is a ball on a stick.  The right one is a dot inside ( and ).

I expected the left one to be WiFi on/active, and the right one to blink when 
packets are being transmitted or received.  That seems to match the normal 
after-booting case.  The left one flickers occasionally (minute or so).  I'm 
guessing that is when it is resetting to peek at other channels or recover 
from a timeout.


Once, I saw the left LED stay on when the system was suspended after I poked 
the power button.

---------------

More possibly interesting observations:

After suspend/resume I see things like this:

Mar  1 13:26:49 localhost dhclient: DHCPREQUEST on eth0 to 192.168.1.8 port 67
Mar  1 13:26:49 localhost dhclient: DHCPACK from 192.168.1.8
Mar  1 13:26:49 localhost NetworkManager: dhcp_state_changed: assertion `req != NULL' failed
Mar  1 13:26:49 localhost NetworkManager: dhcp_state_changed: assertion `req != NULL' failed
Mar  1 13:26:49 localhost dhclient: bound to 192.168.1.105 -- renewal in 1644 seconds.

Before the suspend/resume, it doesn't have the pair of messages from 
NetworkManager.


I'm getting timeouts.
Mar  1 12:16:19 localhost kernel: [42935.718952] NETDEV WATCHDOG: eth0: transmit timed out
Mar  1 12:16:19 localhost kernel: [42935.724037] libertas: tx watch dog timeout

They happen on both the normal and returned from suspend cases.  Sometimes 
there are clumps where they happen every 2 minutes.  I see one where there 
are 9 in a row at 2 min 10 second intervals.

Sometimes it will go for an hour or two without any.  They happen even when 
there isn't much traffic.  I don't think more traffic makes more of them, but 
I don't have any numbers to back that up.
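
To put numbers on the timeout spacing, something like the following could be run over the log. It is only a sketch: it assumes each watchdog message sits on a single line in /var/log/messages and that the bracketed number is the kernel's seconds-since-boot stamp. The function names are made up for this example.

```python
import re

# Matches the kernel stamp in front of a NETDEV WATCHDOG message, e.g.
# "... kernel: [42935.718952] NETDEV WATCHDOG: eth0: transmit timed out"
WATCHDOG_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+NETDEV WATCHDOG")


def watchdog_times(lines):
    """Return the kernel stamps (seconds) of every NETDEV WATCHDOG line."""
    times = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def watchdog_intervals(lines):
    """Return the gaps in seconds between consecutive watchdog timeouts."""
    times = watchdog_times(lines)
    # Round to milliseconds so float noise does not clutter the output.
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
```

Calling `watchdog_intervals(open("/var/log/messages"))` would give a list of gaps; a clump at 2 min 10 s would show up as a run of values near 130. A negative gap would mean the stamps restarted, most likely because of a reboot in between.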

------------

How does the system know which channel my AP is listening to?  I don't see 
anything in the file with the password stuff.

Does it occasionally switch to another channel for a short while to collect 
APs?  (and hence disrupt normal activity using the current channel)  I've 
seen ping times of 100 to 300 ms that might be caused by something like that 
and the left LED blinking...

----------

Things get "interesting" if there is a second XO around.

I've verified that mesh mode works.  Or at least it works some of the time.

Sometimes, when booting or suspend/resuming one XO, the second XO that was 
working but not currently doing anything will stop working.  Frequently it 
recovers after about a minute.  I've seen it recover quickly.  I've seen it 
say it recovered but the network didn't work.

Here is a typical chunk from /var/log/messages:
Mar  1 02:15:45 localhost NetworkManager: <info>  eth0: link timed out.
Mar  1 02:16:47 localhost NetworkManager: <info>  msh0: Got association; scheduling association handler
Mar  1 02:16:47 localhost NetworkManager: <info>  msh0: got association event from driver.
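
The "about a minute" recovery time can be read off these stamps: 02:15:45 to 02:16:47 is 62 seconds. A minimal sketch for pulling the same gap out of a longer log follows. It assumes that pairing an eth0 "link timed out" with the next "Got association" is a fair measure of the outage, which may not be how NetworkManager sees it. The year is not in the log, so 2009 is supplied as a default, and the function names are made up for this example.

```python
from datetime import datetime


def parse_stamp(line, year=2009):
    """Parse the leading syslog stamp, e.g. 'Mar  1 02:15:45' (year is assumed)."""
    stamp = line[:15]
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def recovery_gaps(lines):
    """Seconds from each 'link timed out' to the next 'Got association'."""
    gaps = []
    lost_at = None
    for line in lines:
        if "NetworkManager" not in line:
            continue
        if "link timed out" in line:
            lost_at = parse_stamp(line)
        elif "Got association" in line and lost_at is not None:
            gaps.append((parse_stamp(line) - lost_at).total_seconds())
            lost_at = None  # wait for the next timeout before pairing again
    return gaps
```

Run over the three lines above, this returns `[62.0]`.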


I've seen several cases where the network just stopped working.  I suspect 
they all involve booting or suspend/resuming the other XO, but while I was 
watching one system, I wasn't keeping an eye on the other system to see when 
it stopped.


I tried booting both systems at the same time.  Mostly, it didn't work right 
and I didn't get any clean reproducible notes.

----------

I've seen it ask me for a password a couple of times.  Until Chris' message, 
I had been writing them off as cases where I poked the AP too early.

I've been ignoring what happens if I poke too early.  Things are complicated 
enough without it.

----------


I like to think I'm reasonably good at blundering into quirks and finding a 
recipe for an interesting test case.  I'm feeling slightly frustrated with 
this stuff.  There are too many loose ends that I can't fit into a pattern.

I'll be glad to try things if anybody has suggestions/requests.


I don't know my way around this area.  What should I read to get up to speed?
(I used to be a network geek, but that was many years ago, long before WiFi.)

Has anything interesting changed in the networking area in the last month or 
two?

How solid is the kernel support for WiFi in general and/or this driver?

What sources are involved?

I assume NetworkManager is interesting.  Does it do everything or does it 
need some helper packages?

What package implements the Neighborhood view and networking pop-ups?

The browser on my XO says the sources are here.  Is that up to date?
  http://dev.laptop.org/git
It says NetworkManager hasn't changed in 8 months.


-- 
These are my opinions, not necessarily my employer's.  I hate spam.




