The reason we see icons flashing here and there in the mesh view.. i.e. "xmas tree effect"

Michail Bletsas mbletsas at laptop.org
Fri Dec 14 06:38:40 EST 2007


And what worries me even more is that we will further cripple the laptop 
by always turning off local-link collaboration every time we are able to 
contact *any* jabber server. I really don't want to have the MPP fiasco 
repeated....

Jabber servers on the local network need to be able to identify 
themselves (even with mDNS or a predefined anycast address), and then and 
only then can we turn mDNS off.

M.





John Watlington <wad at laptop.org> 
12/14/2007 12:43 AM

To
"Giannis Galanis" <galanis at laptop.org>
cc
John Watlington <wad at laptop.org>, "Michail Bletsas" <mbletsas at laptop.org>, 
"Kim Quirk" <Kim at laptop.org>, "Ricardo Carrano" 
<carrano at ricardocarrano.com>, guillaume.desmottes at collabora.co.uk, "Robert 
McQueen" <robert.mcqueen at collabora.co.uk>, "Simon McVittie" 
<simon.mcvittie at collabora.co.uk>, devel <devel at laptop.org>
Subject
Re: The reason we see icons flashing here and there in the mesh view.. 
i.e. "xmas tree effect"







What worries me most about this is the revelation that we continue to
rely on mDNS when connected to internet infrastructure.  When in the
presence of a school server (or connected to a jabber server), mDNS
should be shut down.  Otherwise we risk a network meltdown....

wad

On Dec 13, 2007, at 11:18 PM, Giannis Galanis wrote:

> I ran several tests related to the xmas tree effect we see in the 
> mesh view.
>
> The effect is that sometimes XOs disappear and reappear in the same 
> or a different position, or simply disappear. More often, it happens 
> to many XOs simultaneously.
>
> The results I have clearly indicate that this is an issue in the 
> Avahi daemon, which is used by the Salut telepathy service. The 
> sugar interface displays the information it receives from salut 
> very reliably. This means that when a host disappears from 
> avahi's host list, it vanishes instantly from the mesh view, and 
> the same happens when a new host arrives.
>
> The Avahi daemon runs below Salut and keeps receiving information 
> from other hosts in the network which also run the Avahi daemon.
> It keeps a local cache of the recent hosts.
> At regular intervals (of 1-2 mins, I think), it checks whether the 
> hosts in the cache are alive. If not, they are recorded as "failed".
> The above check can be invoked continuously with "avahi-browse -t -r 
> _presence._tcp" (instead of waiting for 1-2 mins).
> After a certain timeout, a failed entry (dead host) will disappear 
> from the cache, and instantly it will disappear from the mesh view.
>
> This timeout is pretty long (several minutes), so a host (XO) has 
> the chance to become alive again with no effect on the mesh view.
> This can occur when:
> a. the XO's avahi packets don't get through due to high mesh 
> traffic. In this case the other XOs might see it as either alive 
> or dead, according to the conditions.
> b. the XO deliberately moved to another channel, or otherwise 
> disconnected. In that case, all other XOs will see it as dead.
> From a client's point of view, the two cases are treated almost the 
> same.
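
A rough model of the cache behaviour described above, as a Python sketch. This is not Avahi's code or API: the class name, the state names and both timeout values (90 s between liveness checks, 600 s before expiry) are assumptions picked only to fit the "1-2 mins" and "several minutes" figures quoted.

```python
from dataclasses import dataclass, field

CHECK_INTERVAL = 90.0   # assumed: "1-2 mins" between liveness checks
EXPIRY_TIMEOUT = 600.0  # assumed: "several minutes" before a failed entry is dropped


@dataclass
class HostCache:
    """Toy model of the described cache: host -> time it was last heard from."""
    last_seen: dict = field(default_factory=dict)

    def heard_from(self, host, now):
        # Any packet from the host makes it alive again.
        self.last_seen[host] = now

    def state(self, host, now):
        if host not in self.last_seen:
            return "absent"
        silent = now - self.last_seen[host]
        if silent >= EXPIRY_TIMEOUT:
            return "expired"
        if silent >= CHECK_INTERVAL:
            return "failed"
        return "alive"

    def purge(self, now):
        """Drop expired entries; return the hosts that leave the mesh view."""
        gone = [h for h in self.last_seen if self.state(h, now) == "expired"]
        for h in gone:
            del self.last_seen[h]
        return gone
```

In this model a host that stays silent for five minutes is "failed" but still in the cache, so its icon stays put; if it is heard from again before the expiry timeout it simply becomes "alive" again, and a receiving XO cannot tell case a. from case b.
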
>
> THE TEST:
> 6 XOs connected to channel 11, with forwarding tables blinded only 
> to themselves, so no other element in the mesh can interfere.
>
> The cache list was scanned continuously on all XOs using a script.
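
The script used in the test was not posted. A minimal sketch of what such a scan could look like follows, in Python. It assumes avahi-browse's parsable output mode (`-p`), where `+` lines report a service that was found and `=` lines report one that was resolved, with `;`-separated fields and the service name in the fourth field. Treating "found but not resolved" as the "dead" hosts is an assumption, as are the function names and the 5-second polling period.

```python
import subprocess
import time


def parse_browse(output):
    """Split parsable avahi-browse output into (found, resolved) name sets."""
    found, resolved = set(), set()
    for line in output.splitlines():
        fields = line.split(";")
        if len(fields) < 6:
            continue  # not a service record
        event, name = fields[0], fields[3]
        if event == "+":
            found.add(name)
        elif event == "=":
            resolved.add(name)
    return found, resolved


def scan_once():
    """Run one browse pass; return (resolved names, found-but-unresolved names)."""
    result = subprocess.run(
        ["avahi-browse", "-t", "-r", "-p", "_presence._tcp"],
        capture_output=True, text=True, check=False,
    )
    found, resolved = parse_browse(result.stdout)
    return sorted(resolved), sorted(found - resolved)


def watch(period=5.0):
    """Poll forever, printing one timestamped line per scan (assumed period)."""
    while True:
        alive, unresolved = scan_once()
        print(time.strftime("%H:%M:%S"), "alive:", alive, "unresolved:", unresolved)
        time.sleep(period)
```

Calling `watch()` on each XO and comparing the timestamped lines would show when a host drops out of one laptop's list but not another's.
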
>
> If all XOs remained idle, they all showed reliably in each other's 
> mesh view. Every 5-10 mins an XO showed as dead in some other XOs' 
> scans, but this was shortly recovered, and there was no visual 
> effect in the mesh view.
>
> If you switched an XO manually to another channel, again it showed 
> as "dead" in all the others. If you reconnected to channel 11, there 
> was again no effect in the mesh view.
> If you never reconnected, in about 10-15 minutes the entry was 
> deleted, and the corresponding XO icon disappeared from the view.
>
> Therefore, it is common and expected for XOs to show as "dead" in 
> the Avahi cache for some time.
>
> THE BUG:
> IF a new XO appears(a message is received through Avahi),
> WHILE there are 1 or more XOs in the cache that are reported as "dead"
> THEN Avahi "crashes" temporarily and the cache CLEARS.
>
> At this point ALL XOs that are listed as dead instantly disappear 
> from the mesh view.
> But, of course, some of the "dead" XOs are expected to re-appear 
> shortly, especially those that are still on the same mesh channel 
> but merely failed to transmit their avahi packets due to traffic load.
>
> Note that if there is only 1 XO that looks dead, but returns, 
> everything is normal.
> But if there are 2, 3, ... XOs that look dead, when 1 returns, then:
> a. all (the dead ones) disappear from the view
> b. the 1 that returned will reappear right after, probably in a 
> different position, i.e. it will "jump"
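
The reported behaviour can be written down as a small Python model. It describes only what the tests observed, not Avahi's actual code; the function name and the "alive"/"dead" labels are made up for illustration.

```python
def on_host_arrival(cache, newcomer):
    """Model of the reported behaviour, not of Avahi's real code.

    cache maps host name -> "alive" or "dead".  Returns the hosts whose
    icons vanish from the mesh view as a side effect of the arrival.
    """
    dead = [h for h, s in cache.items() if s == "dead" and h != newcomer]
    returning = cache.get(newcomer) == "dead"
    if not dead:
        # No other dead entries: the arrival is handled normally.
        cache[newcomer] = "alive"
        return []
    # Reported bug: every dead entry is flushed from the cache ...
    vanished = dead + ([newcomer] if returning else [])
    for h in vanished:
        del cache[h]
    # ... and the newcomer is re-added as a fresh entry, which is why a
    # returning host's icon "jumps" to a new position.
    cache[newcomer] = "alive"
    return vanished
```

With one dead host that returns, nothing vanishes; with three dead hosts of which one returns, all three are dropped and only the returning one comes back.
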
>
> The avahi-browse command scans the network in real time (i.e. sends 
> requests for all hosts in its cache list) and runs for several 
> seconds. If the above situation occurs, it freezes (this is what I 
> meant by "crashes"). When it is restarted, the cache is cleared of 
> previously dead hosts.
>
> A typical situation in which the "xmas tree effect" occurs:
> 20 XOs are running salut on channel 1. This includes XOs connected 
> to the medialab AP, schoolserver, linklocal.
> XOs leave the channel continuously.
> Concurrently, some connected XOs appear dead for 1 minute or so, 
> and reappear after a short time.
>
> Assume that at some point 5 XOs have either really left, or "seem 
> dead" anyway
>
> At some point 2 of these XOs are reconnected at the same time to 
> the mesh channel by someone in the office.
> The 2 XOs will "jump" to a different position, whereas the other 3 
> will simply vanish
>
> The way I see it, there is a very clear/narrow/specific bug in 
> the avahi daemon's handling of the cache
> when new hosts + dead hosts coexist.
>
> I hope the tests have cleared the picture a lot
>
> yani
> _______________________________________________
> Devel mailing list
> Devel at lists.laptop.org
> http://lists.laptop.org/listinfo/devel
