Salut and Suspend/Resume issues

Ricardo Carrano carrano at ricardocarrano.com
Tue Feb 19 14:48:16 EST 2008


Yanni,

A timeout is a single value, not a range. The effects it causes may show up
over a period (a range).

I believe everyone will agree that 30 minutes is a long time to wait and
(as Polychronis added) defeats the whole idea of a presence service.

But, what I want to stress is that we are dealing with different issues
here.

I don't believe the 30-minute delay or the xmas tree effect is related to
suspend/resume. Those seem like bugs somewhere in the software stack that
supports presence, while the suspend/resume issues are clearly a side effect
of the multicast traffic not being "heard" by a suspended XO.



On Feb 19, 2008 3:00 PM, Giannis Galanis <galanis at laptop.org> wrote:

> The list expires in 10min-30min.
>
> But we can't wait 30min before suspending; it is way too long.
>
>
> On Feb 19, 2008 11:37 AM, Ricardo Carrano <carrano at ricardocarrano.com>
> wrote:
>
> > Yanni,
> >
> > As I posted in the bug, I believe that you are observing the entries in
> > the avahi cache expiring.
> >
> > So, your first scenario would happen when the suspend time is longer
> > than the time it takes for all entries to expire.
> > The second scenario would happen when the suspend time is not long
> > enough for all cached entries to go away.
>
>
> Oh, I see what you mean. But I think both cases occur when the suspend time
> is longer than the time to expire.
> The first is a UI effect, and might have no relation to salut, but to the mesh
> view in general.
> The second is an avahi effect: the avahi cache is changed.
> Both occur in long suspends.
>
> >
> > And the third scenario seems related to previous reports you've made on
> > the Xmas tree effect, so not related to suspend/resume.
>
>
> The xmas tree effect appears when XOs leave the connection while others
> return.
> Suspend/resume enhances this effect dramatically, because in 1-2min
> everyone goes away, and they return at random times according to when they
> resume.
>
> In my suspend-salut tests, the xmas tree effect (although NOT related to
> suspend/resume) affects salut a lot more than the other 2 scenarios.
>
> My point is that we must fix it anyway. But especially now!!
>
>
> >
> > What do you think?
> >
>
> I have 2 questions that will help (me) understand a lot about the
> situation:
>
> 1. When an XO resumes, does it send any notification via avahi that it is
> back? Because if it doesn't, then other XOs that have cleared it from their
> lists will never search for it.
>
> 2. Every XO scans the network every 10min, in multicast packets, to check
> whether its avahi peers are alive. Do these packets include the address of the
> peers/targets? I think they do, unless I am very confused. Couldn't we
> wake/resume the target XO when it receives these specific packets?
>
> We need to do some sniffing.
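
As a starting point for the sniffing mentioned above, here is a minimal
Python sketch that joins the standard mDNS multicast group (224.0.0.251,
UDP port 5353) and prints the DNS header counts of each packet it sees. The
function names are illustrative, not part of avahi or salut, and it assumes
the local mDNS stack allows the port to be shared. It only tells queries
from responses; decoding the names inside a packet would need a full DNS
parser.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"  # standard mDNS IPv4 multicast group
MDNS_PORT = 5353            # standard mDNS UDP port


def parse_dns_header(packet):
    """Decode the fixed 12-byte DNS header that starts every mDNS packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than a DNS header")
    _ident, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    return {
        "is_response": bool(flags & 0x8000),  # QR bit: 0 = query, 1 = response
        "questions": qd,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


def open_mdns_listener():
    """Join the mDNS group on the default interface (needs network access)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MDNS_PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(MDNS_GROUP),
                       socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def sniff(count=10):
    """Print the sender and header summary of the next `count` packets."""
    sock = open_mdns_listener()
    try:
        for _ in range(count):
            data, (addr, _port) = sock.recvfrom(9000)
            print(addr, parse_dns_header(data))
    finally:
        sock.close()
```

Calling sniff() on a machine in the mesh while another XO suspends and
resumes would show whether any packet arrives from the resuming XO.
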
>
>
>
> >
> > On Feb 19, 2008 1:13 PM, Giannis Galanis <galanis at laptop.org> wrote:
> >
> > >
> > >
> > > On Feb 19, 2008 10:13 AM, Ricardo Carrano <carrano at ricardocarrano.com>
> > > wrote:
> > >
> > > >
> > > > > I was asking whether it would help to have the wireless module wake us
> > > > > on multicast packets instead of only unicast.  Are you saying that it
> > > > > would?
> > > >
> > > >
> > > > It seems so, though it would, as John points out, make resumes far
> > > > more frequent. It seems we have to find a creative way out of this tough
> > > > choice (automated suspend vs mesh) or face it.
> > > >
> > > >
> > > > >
> > > > >
> > > > >   > Avahi entries will expire after some time. Suspend will prevent
> > > > >   > it from updating its cache.
> > > > >
> > > > > Yanni's bug report (#6467) suggests that Avahi entries often expire
> > > > > immediately upon resume:
> > > > >
> > > > >   After the XO resumes (probably after being suspended for several
> > > > >   minutes) all the icons in the mesh view vanish, except the mesh
> > > > >   circles.
> > > >
> > > >
> > > > I read this as the avahi cache expiring its entries. Yanni, can
> > > > you put timeframes on this?
> > > > You could check how long it takes for an entry to expire (TO) and then
> > > > check whether:
> > > > Suspend time > TO -> all entries vanish
> > > > Suspend time << TO -> no entries vanish
> > > > Suspend time ~ TO -> some entries vanish
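
The three cases above can be sketched in Python under one simplifying
assumption: every cached entry shares a single timeout TO, and entries
differ only in how long ago they were last refreshed when the suspend
began. The function and the peer names are hypothetical; this is a toy
model of the hypothesis, not how avahi implements its cache.

```python
def surviving_entries(ages_at_suspend, suspend_s, timeout_s):
    """Return the names of cache entries still valid on resume.

    ages_at_suspend maps a peer name to the seconds elapsed since its
    entry was last refreshed at the moment the suspend began. An entry
    survives if its age plus the suspend duration stays below the timeout.
    """
    return sorted(
        name
        for name, age in ages_at_suspend.items()
        if age + suspend_s < timeout_s
    )


# Hypothetical cache: three peers refreshed at different times, TO = 600 s.
cache = {"xo-a": 10, "xo-b": 300, "xo-c": 590}
print(surviving_entries(cache, 700, 600))  # suspend time > TO
print(surviving_entries(cache, 5, 600))    # suspend time << TO
print(surviving_entries(cache, 400, 600))  # suspend time ~ TO
```

Measuring real suspend durations against this model would show whether
cache expiry alone accounts for the vanishing icons.
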
> > > >
> > >
> > > There are 2 cases where icons vanish due to suspend.
> > >
> > > 1st: The moment you resume (it generally happens after long suspends),
> > > all icons (APs/XOs) vanish instantly. This bug (#6467) suggests that sugar
> > > has a problem with suspend/resume.
> > > The icons slowly reappear. I assume that if the avahi peer list is
> > > intact, all XOs return.
> > >
> > > 2nd: The avahi list sometimes loses some or all of the peers at resume.
> > > This is also under #6467, but it seems technically different. One possible
> > > explanation could be that during suspend the XO resumes several times, but I
> > > didn't notice it! And within these time frames it realized that the other
> > > suspended XOs are gone, so it cleared its cache. Now when I resumed it
> > > myself, I observed that the cache is clean!!
> > >
> > > Now, regarding the timeouts of avahi. This is a 3rd thing:
> > > When an XO leaves the channel we have 4 states:
> > >    mm:ss
> > > 1. 00:00  XO leaves the channel (manually, or it is suspended)
> > > 2. 10:00  Avahi notices the XO left, and reports it as "failed"
> > > 3. 30:00  Icon disappears in the mesh view
> > > 4. 60:00  Avahi cache is cleared
> > > Additionally there is a bug (#5501) according to which, if a NEW XO
> > > arrives between states 2 and 3, then instantly ALL "failed" avahi peers are
> > > cleared and the corresponding icons vanish.
> > >
> > > So, the 3rd case is the following:
> > >
> > > Assume a mesh has e.g. 20 XOs, and I use my XO so it doesn't suspend,
> > > but the other 19 are suspended.
> > > If after >10min a new XO arrives, then all 19 XOs instantly vanish
> > > from the mesh.
> > >
> > > So the TO time is between 10 and 30min... but closer to 10min if many XOs
> > > suspend/resume.
> > > So if resume time << 10min everything is fine!!
> > >
> > >
> > >
> > > What I don't know is whether an XO, when it resumes, sends any avahi
> > > packet to notify its presence/return. Because if it doesn't, then the XO
> > > won't exist in the others' cache lists, so the others won't search for it.
> > > Sjoerd, can you answer this?
> > >
> > > This would explain why after resume some XOs take too long to see
> > > each other again.
> > > If you combine this with the "2nd" case, you will see that in the
> > > natural case, where XOs are resumed at random points in time by their
> > > users, they will all clear their caches, unless they resume concurrently.
> > > So in the end, all will have empty caches!!
> > >
> > >
> > >
> > >
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > - Chris.
> > > > > --
> > > > > Chris Ball   <cjb at laptop.org>
> > > > >
> > > >
> > > >
> > >
> >
>