[Testing] FYI: all 41 still on the mesh, 19 of which chat successfully

Kimberley Quirk kim.quirk at gmail.com
Mon Feb 2 15:12:57 EST 2009


To me it sounds a little worse than what we had with 8.2.0. I really  
thought we had all 40 laptops chatting and everyone saw every other  
one... for a short time (1-2 hours). Even before I left on Thurs night  
last week, I had 40 laptops and every one of them saw exactly 40  
laptops when I typed olpc-xos. So something seems a little messed up  
in RF land.

Since I had seen that all laptops COULD see each other, I'm more  
likely to blame today's behavior on:

1 - something else going on in RF that we can't 'see' but is affecting  
the tests
2 - perhaps right after a cleaninstall and first boot, things work ok,  
but after successive reboots, things degrade

I have no good idea or theory why #2 might be the case... and it would  
take some serious effort to do a new cleaninstall and re-register  
them, etc. But you actually have control over a test like that and you  
don't have much control over the RF in the area.

I don't think there is anything crazy about the RF at 1CC right now...  
at least when I was there it was quiet and all other laptops we  
could see/touch were turned off. So I don't think we could rule out RF.

Maybe someone like Michael and Chris could look at the check-ins  
between 8.2.0 and 8.2.1 and see if anything was changed related to  
presence or telepathy. If not, it seems like these items and chat  
really should work as well as 8.2.0. Then we have to worry about  
whether any of the wireless driver or firmware changes could cause  
these problems.

Kim


On Jan 31, 2009, at 3:08 PM, Holt wrote:

> Machines all appeared properly registered to  
> "schoolserver.xs051.org" yet shared chat wasn't workable much at all  
> 12+ hrs later.  And "olpc-xos" returned very low numbers all around,  
> so I powered off every machine together, then started afresh:
>
> 40 machines were powered on, waiting 15-20 seconds between each  
> boot.  1 machine absolutely refused to show the shared chat bubble  
> even after reboot etc, but eventually 39 machines successfully  
> connected to a new shared chat -- hardly flawless but the vast  
> majority of chat messages get through.
>
> "olpc-xos" however returned widely varying results:
>   * 9 machines showed a number between 35 and 40
>   * 21 machines showed a number between 30 and 34
>   * 7 machines showed a number between 20 and 29
>   * 3 machines showed a number between 10 and 19
>
> 2 hours later, not a single "olpc-xos" return value appears to have  
> changed.
>
> Chat continued to work tolerably, with:
>   * 17 machines forcing me to the Home View at the very 1st keypress  
> -- then all was fine, after I rejoined the Chat still in memory  
> (from Home View)
>   * 5 machines had the Chat activity fail with errors; acknowledging  
> & ignoring error did not work -- then all was fine, after I stopped  
> Chat and rejoined the shared chat (from Network View)
>   * not every chat msg arriving on every machine, but perhaps this  
> lossiness is the norm...
>
>
> Kimberley Quirk wrote:
>> This is good information. One thought I had was that if the 12  
>> troublemakers were not correctly registered to the school server,  
>> then they would never share an activity properly.  I know I tried  
>> to register each laptop to the school server... but it could be the  
>> case that I registered a bunch of them while they were connected to  
>> the wrong AP, which might cause this kind of problem.
>>
>> To figure this out, you can look at the control panel of the  
>> suspect laptops, then click on network. They should all show the  
>> same server registration as the good laptops, something like  
>> "schoolserver.xs051.org".
>>
>> If any of them are different, then you probably need to remove the  
>> school registration and try registration again. Hopefully Reuben  
>> can help with this.
>>
>> The second scenario you mention sounds like the case when too many  
>> laptops (more than 10) are connected to a simple mesh (like mesh  
>> channel 1), and then there is no way they will share anything. This  
>> can happen when you reboot many laptops at basically the same time.  
>> If 20 laptops are all booting at about the same time, some (many?)  
>> of them will not connect to the desired AP; they will time out  
>> trying that and default to mesh channel 1.
>>
>> And, there is the possibility that too many reboots in a short  
>> period of time and not enough time for things to really settle  
>> might result in the presence service being confused.
>>
>> When things are working well, all of these conditions will be met:
>>
>> 1 - each laptop will show the same number of other laptops when you  
>> type: olpc-xos
>> 2 - each laptop will be registered to the same school server:  
>> control panel, network
>> 3 - starting/sharing chat on one laptop, you will see the chat icon  
>> on all other laptops and they can connect
>> 4 - typing in chat for each laptop can be seen by all other laptops  
>> over a period of a few hours (not 24 hours later)
>>
>> I think you need to get to this level of performance before  
>> releasing 8.2.1. What you describe below could be an important  
>> regression. Another test would be to get all the laptops up and  
>> registered to the school server and start the chat, testing each  
>> new laptop against the ones already in the chat by typing a bit...  
>> before adding the next laptop. If there is a regression, perhaps it  
>> shows up after the 10th laptop (or some such thing).
>>
>> I added the Testing group in case there are other thoughts  
>> (probably should have been on this thread from the beginning).
>>
>> Keep us posted,
>> Kim
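
A minimal sketch of how the "olpc-xos" numbers discussed in this thread could be tallied, assuming the per-laptop counts have been read off each machine by hand and typed into a list. The function names (bucket_counts, all_agree) and the sample values are hypothetical; only the bucket ranges and bucket sizes come from the report above. It does not parse olpc-xos output.

```python
from collections import Counter

# Bucket ranges taken from the olpc-xos tallies reported in this thread.
BUCKETS = [(35, 40), (30, 34), (20, 29), (10, 19)]


def bucket_counts(counts):
    """Group per-laptop neighbor counts into the ranges used in the report."""
    tally = Counter()
    for n in counts:
        for low, high in BUCKETS:
            if low <= n <= high:
                tally[f"{low}-{high}"] += 1
                break
        else:
            # Anything outside the reported ranges is kept visible, not dropped.
            tally["other"] += 1
    return dict(tally)


def all_agree(counts, expected):
    """Condition 1 from the checklist: every laptop reports the same number."""
    return len(counts) > 0 and all(n == expected for n in counts)


# Hypothetical values, chosen only so the bucket sizes match the report
# (9, 21, 7 and 3 machines); the real per-machine numbers were not posted.
sample = [38] * 9 + [32] * 21 + [25] * 7 + [15] * 3

print(bucket_counts(sample))
print(all_agree(sample, 40))
```

Running the same tally after each reboot cycle (or after each laptop joins, in the incremental test) would make it easier to see whether the counts degrade over time or stay fixed.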


