[Server-devel] Large groups of XO-1 do not work with access points

James Cameron quozl at laptop.org
Fri Feb 7 20:16:06 EST 2014

There seems to be a lot of speculation, so I'll add more technical
details on what Terry and I have been investigating.

1.  sometimes, an active scan by the XO-1 does not have the access
point listed in the scan results, despite the XO-1 transmitting an
acknowledgement to the access point,

2.  an active scan by the XO-1 is done only twice during boot before
Sugar starts, and is then repeated every 30 seconds,

3.  if these two active scans do not contain the access point, Sugar
waits 10 seconds before it decides that we don't have any access point
available, and commits to using mesh.

Therefore all it takes is for two active scans to miss the access
point.  This is easy to reproduce by running "sudo iwlist eth0 scan"
and looking at the "Last beacon" time for the access point.
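
A single scan can be reduced to a verdict with something like the
sketch below.  It is only an illustration: the function name
classify_scan and the words fresh, stale and missed are made up for
this example, the 1000 ms cut-off is borrowed from the percentage
calculation at the end of this message, and it assumes iwlist prints
the "Last beacon" line somewhere after the "Address" line of the same
cell.  It reads the scan output on standard input, so it can also be
tried on saved output.

```shell
#!/bin/bash
# classify_scan MAC: read "iwlist eth0 scan" output on stdin and print
#   fresh N  - access point listed, last beacon N ms ago (N <= 1000)
#   stale N  - access point listed, but the result is older than 1000 ms
#   missed   - access point not listed at all
# The function name and the verdict words are illustrative only.
classify_scan() {
    awk -v mac="$1" '
        # each cell starts with an Address line; remember whether it is ours
        /Address:/ { cell = (index(toupper($0), toupper(mac)) > 0) }
        /Last beacon/ && cell {
            age = $4                # e.g. "120ms"
            sub(/ms/, "", age)
            verdict = (age + 0 > 1000) ? "stale" : "fresh"
            print verdict, age + 0
            found = 1
            exit
        }
        END { if (!found) print "missed" }
    '
}

# Usage on an XO-1 (needs root; the MAC address here is a placeholder):
#   sudo iwlist eth0 scan 2>/dev/null | classify_scan 00:11:22:33:44:55
```

A "stale" or "missed" verdict corresponds to the failure in step 1.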


The probability of failure in step 1 has considerable variance across
the test populations.  Here are some determinants:

a.  the probability varies by access point, even the same model access
point with the same firmware.  We see an extreme variation across
access points.  I see 5%, 23% and 32% with unused XO-1.  Terry sees
worse with his used XO-1 stock.

b.  the probability is higher if mesh is enabled in the firmware; my
32% fail rate drops down to 5.8% by turning off mesh using lbs_mesh,
and making no other changes.

c.  the probability is higher if many XO-1 are present and connected.

d.  the probability is higher if antennas or coax are broken (because
the two antennas are used at different times).

e.  the probability is much higher if there are other access points
present on the same channel at some distance.

f.  the probability is unchanged with or without encryption, with or
without power limits on the access point, and with or without 802.11n.
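
To put the per-scan rates in terms of the boot sequence, here is a
rough calculation of how often both boot scans would miss.  It is a
sketch under a strong assumption that has not been tested: that the
two scans fail independently, so the chance of a double miss is the
per-scan rate squared.  The function name both_miss_pct is made up
for this example.

```shell
#!/bin/bash
# both_miss_pct P: given a per-scan miss rate P (in percent), print the
# chance (in percent) that two scans in a row both miss, ASSUMING the two
# scans fail independently.  The function name is illustrative only.
both_miss_pct() {
    awk -v p="$1" 'BEGIN { printf "%.2f\n", p * p / 100 }'
}

# per-scan rates quoted in items a and b above
for rate in 5 23 32 5.8; do
    echo "per-scan miss ${rate}% -> both boot scans miss $(both_miss_pct "$rate")%"
done
```

With the rates above this gives about 0.25%, 5.29%, 10.24% and 0.34%.
If misses are correlated, for example because the same interference
affects both scans, the real figure may be higher.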

I'm interested to know if anybody has any ideas as to what else to
vary in the experiments.

The test method is to place "sudo iwlist eth0 scan" in a loop, with a
five second repeat cycle, and count the number of scans where a
previous scan result was used, that is, where the "Last beacon" age
is over 1000 ms.  Here's an example test:


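# MA must be set to the MAC address of the access point under test,
# written as iwlist prints it.
while true; do
    # scan only when the clock is on a five second boundary
    T0=$(date +%s)
    if [[ $(( $T0 % 5 )) != 0 ]]; then
        sleep 0.1
        continue
    fi
    # print the "Last beacon" age in ms for the access point, or "missed"
    # if the access point is absent from the scan results
    R0=$(sudo iwlist eth0 scan 2>/dev/null | awk "BEGIN{x=0;m=1} /$MA/{x=1;m=0} /Last beacon/{gsub(\"ms\",\"\"); if (x) print \$4} /IE: Unknown/{x=0} END{if(m) print \"missed\"}")
    if [[ "$R0" == "missed" ]]; then
        echo missed
        sleep 3
        continue
    fi
    echo $T0 $R0
    echo $T0 $R0 >> scan.log
    sleep 3
done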


To generate the percentage failure:

awk 'BEGIN {p=0;f=0} { if ($2 > 1000) { f++ } else { p++ } } END { if (p + f) print p, f, f * 100 / (p + f) }' scan.log
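
Since it takes two missed scans to end up on mesh, it may also be
worth counting how often two logged scans in a row fail.  The sketch
below does that for a log in the "time age" format written by the
loop above.  The function name count_double_misses is made up for
this example, and the logged scans are about five seconds apart, so
the result is only a rough guide to what the two boot scans would do.

```shell
#!/bin/bash
# count_double_misses LOG: print "pairs doubles pct", where pairs is the
# number of adjacent scan pairs in LOG, doubles is how many of those pairs
# failed twice in a row, and pct is doubles as a percentage of pairs.
# A scan counts as failed when field 2 is "missed" or exceeds 1000 ms.
# The function name is illustrative only.
count_double_misses() {
    awk '
        {
            fail = ($2 == "missed" || $2 + 0 > 1000)
            if (NR > 1) {
                pairs++
                if (fail && prev) doubles++
            }
            prev = fail
        }
        END {
            pct = pairs ? doubles * 100 / pairs : 0
            printf "%d %d %.1f\n", pairs, doubles, pct
        }
    ' "$1"
}

# Usage:
#   count_double_misses scan.log
```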

James Cameron
