Salut/avahi/meshview issues

Sjoerd Simons sjoerd at luon.net
Fri Feb 1 15:42:59 EST 2008


On Thu, Jan 31, 2008 at 07:32:36AM -0500, Giannis Galanis wrote:
> I believe our current salut/avahi issues are described in the following
> points:
> 
> 1. I was under the impression that when a peer switches channels it sends a
> "goodbye" signal, and that only peers removed abnormally (after crashes,
> power-offs via the button, etc.) would be slow to disappear from mesh
> views. The 10 min TTL is not unreasonable, but it should only be used
> for a routine check. Peers that leave or arrive should inform the mesh
> instantly. In that case the 10 min TTL will only affect mesh points
> with noisy links, whose "goodbye" signals get lost. And these
> connections are lower priority anyway. We could also send 2 or 3 "goodbye"
> signals to "ensure" delivery.

I don't think avahi gets a chance to send goodbye packets. More specifically, I
don't think NM or any other mechanism actually tells avahi "we're going to leave
the network, please say goodbye" and then gives it a chance to actually send the
necessary goodbyes.
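
For context, a minimal sketch of why a missed goodbye matters, assuming standard
mDNS semantics (RFC 6762), where a goodbye is a record re-sent with TTL 0 and
receivers drop it about one second later. `PeerCache` and the peer names are
hypothetical and only illustrate the idea; this is not avahi's or salut's code.

```python
class PeerCache:
    """Toy mDNS-style cache: maps peer name -> absolute expiry time (seconds)."""

    def __init__(self):
        self.expiry = {}

    def on_record(self, name, ttl, now):
        if ttl == 0:
            # Goodbye: keep a known record one more second, then drop it
            # (the behaviour RFC 6762 section 10.1 recommends).
            if name in self.expiry:
                self.expiry[name] = now + 1
        else:
            # Normal announcement: (re)start the TTL countdown.
            self.expiry[name] = now + ttl

    def visible(self, now):
        """Names of peers whose records have not yet expired."""
        return sorted(n for n, t in self.expiry.items() if t > now)


cache = PeerCache()
cache.on_record("xo-a", ttl=600, now=0)   # announced with a 10 min TTL
cache.on_record("xo-b", ttl=600, now=0)
cache.on_record("xo-a", ttl=0, now=30)    # xo-a manages to say goodbye
print(cache.visible(now=32))              # xo-a is gone; silent xo-b remains
print(cache.visible(now=601))             # xo-b only vanishes after its TTL
```

A peer that leaves without getting the chance to send the TTL 0 record behaves
like `xo-b` here: it stays visible until the full TTL runs out.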

> 2. We should definitely decrease the timeout window between a lost peer
> being detected and its actual disappearance from the mesh view. This used
> to be 10 min and is now 20 min, but really, in my experience, if a peer has
> been away for more than 1-2 min it isn't coming back.

In the code it's actually 12 min plus the time it takes avahi to conclude a node
has gone. So this used to be around 14 minutes at most, but with the TTL upped to
10 min it will be around 22 minutes. It might be interesting to see whether, with
the latest patches, the number of false-negatives has gone down enough that we can
remove, or at least decrease, the slack time we add after a node has gone in
avahi.
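
A rough sketch of the arithmetic above: the worst case is the fixed slack plus
the time avahi needs to notice the node is gone, approximated here by the record
TTL. The 2 minute figure for the old detection time is inferred from the
14 minute total, not read from the code, and the function name is made up for
illustration.

```python
SLACK_MIN = 12  # slack added after avahi reports the node gone (from the text)


def worst_case_removal_min(avahi_detect_min, slack_min=SLACK_MIN):
    """Upper bound, in minutes, before a vanished peer leaves the mesh view."""
    return slack_min + avahi_detect_min


old = worst_case_removal_min(2)    # assumed old detection time of about 2 min
new = worst_case_removal_min(10)   # detection time approximated by the 10 min TTL
print(old, new)                    # 14 22
```

Dropping the slack entirely (`slack_min=0`) would leave only avahi's own
detection time, which is the change suggested above if false-negatives are now
rare enough.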

> 3. Should we make the above TTL and timeout user-configurable, or at least
> customizable? Will there be a problem if two XOs have different TTLs? I would
> assume not. The idea is that it is a waste of resources to try
> to work out the ideal TTL and timeout values by asking the Collabora
> team to fix them again and again, when we could run the tests here in 1cc and
> find out ourselves which values suit us best. Is it easy to implement such a patch?

> 4. The 5501 bug (xmas tree effect). This is a very specific bug in the
> protocol, and I believe it will be sorted soon.

This one is fixed, right?

> 5. Why are avahi/salut/mesh view not communicating well? I hope we will have
> some answers on that as well.

I'm not sure. If salut and the mesh view fail to communicate, the same problem
should show up with gabble.

  Sjoerd
-- 
"Consider a spherical bear, in simple harmonic motion..."
		-- Professor in the UCB physics department


