Bad interaction between sleep timeouts and Salut?
Tomeu Vizoso
tomeu at tomeuvizoso.net
Thu Sep 16 04:24:26 EDT 2010
On Thu, Sep 16, 2010 at 05:26, James Cameron <quozl at laptop.org> wrote:
> On Wed, Sep 15, 2010 at 08:04:25PM -0400, Martin Langhoff wrote:
>> I am curious -- what is the frequency of "presence broadcasts" when
>> Salut is used? Where is it set?
>
> Three minutes. Perhaps KEEPALIVE_TIMEOUT in
> gibber-r-multicast-causal-transport.c
So Salut is an implementation of XEP-0174: Serverless Messaging:
http://xmpp.org/extensions/xep-0174.html
This section of the document gives an overview of the different
components used by this protocol:
http://xmpp.org/extensions/xep-0174.html#howitworks
As you can see, basic presence is done with multicast DNS/DNS-SD and
Salut uses Avahi for that.
Gibber is a library internal to Salut that implements the other part
of the protocol: XMPP message passing between nodes.
With that in mind, what I would do next is try to isolate the
problems to either Avahi or Gibber. That will significantly reduce
the amount of code we need to care about and will be useful in further
debugging and tuning. avahi-browse and avahi-discover are useful tools
for monitoring the state of Avahi, so if you can reproduce the same
issues there, then we can rule out a big chunk of code.
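
To make the avahi-browse check a bit more systematic, here is a rough
sketch (Python, not part of Salut) that replays a captured
"avahi-browse -p _presence._tcp" log and reports which presence
services are still announced at the end. _presence._tcp is the service
type from XEP-0174. The field layout of the parsable output
(event;interface;protocol;name;type;domain) is my assumption, and the
function name visible_services is made up for this example.

```python
# Sketch only: summarise a captured `avahi-browse -p` log.
# Assumed line format: event;interface;protocol;name;type;domain
# where event is "+" (service appeared) or "-" (service went away).
# No unescaping of special characters in names is attempted.
import sys


def visible_services(lines, service_type="_presence._tcp"):
    """Return the set of (interface, protocol, name) still announced."""
    visible = set()
    for line in lines:
        fields = line.rstrip("\n").split(";")
        if len(fields) < 6 or fields[4] != service_type:
            continue  # not a browse event for the type we care about
        event, key = fields[0], tuple(fields[1:4])
        if event == "+":
            visible.add(key)
        elif event == "-":
            visible.discard(key)
    return visible


if __name__ == "__main__":
    # Usage: python this_script.py < browse.log
    for entry in sorted(visible_services(sys.stdin)):
        print(*entry)
```

Capture the log on two or three machines across a few suspend/resume
cycles and compare the results. If buddies come and go here in the same
way as in the neighbourhood view, that would point at Avahi rather than
Gibber.
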
If that's the case, then these two functions in Salut would be most
relevant. The first announces a DNS-SD service for the buddy and the
other does the same for shared activities:
http://git.collabora.co.uk/?p=telepathy-salut.git;a=blob;f=src/salut-avahi-self.c;h=dab2c78247c84bfba7b693251e6ac3703648eb13;hb=HEAD#l220
http://git.collabora.co.uk/?p=telepathy-salut.git;a=blob;f=src/salut-avahi-olpc-activity.c;h=2a3681aca7568d1c8bf4ce97b161f79c11ef6165;hb=HEAD#l157
Assuming the problem is at the Avahi level, we should make sure that
the system comes out of suspend when the radio receives multicast
traffic directed to us, so that Avahi can properly update its
internal state. The same applies to activities using Clique
(multi-user chat rooms on serverless XMPP). A couple more references:
http://files.multicastdns.org/draft-cheshire-dnsext-multicastdns.txt
http://telepathy.freedesktop.org/wiki/Clique
With Gabble we should see similar effects if the machine is not
resuming when the Jabber server sends us updates.
Regards,
Tomeu
>> With current 10.1.2 using salut over ad-hoc the neighbourhood view
>> never stabilises, and it's very apparent that it's because nodes go to
>> sleep too quickly (before they see others, before they are seen). And
>> when awake, they take a long time to regain a good view of what's out
>> there.
>>
>> If we can experiment with the timeouts (knowing where the knobs are)
>> maybe we can find out...
>>
>> (Tomeu, I bother you because I suspect you'd know... hope it's not an
>> annoyance...)
>
> http://git.collabora.co.uk/?p=telepathy-salut.git;a=blob;f=lib/gibber/gibber-r-multicast-causal-transport.c;h=3cdb2a8d3b88de473ee473da3af15e280d355c26;hb=HEAD
>
> Salut over ad-hoc seems to use multicast packets, according to my
> tcpdump, so this source file appears relevant.
>
> The timers are specified there as constants, without configurable
> settings. So if you don't mind fiddling with them and recompiling, it
> should be possible to improve the situation, at the expense of increased
> wireless traffic.
>
> Might also detect resume, evaluate time interval lost, and expire the
> timer sooner.
>
> --
> James Cameron
> http://quozl.linux.org.au/
>
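
James's suggestion above (detect resume, work out how much time was
lost, expire the timer sooner) could look roughly like the sketch
below. It is Python rather than Gibber's C and is not taken from the
Salut sources. It assumes a Linux kernel and Python recent enough to
provide CLOCK_BOOTTIME, which keeps counting during suspend while
CLOCK_MONOTONIC does not. The 180 second value is only the "three
minutes" guessed at above, and all function names are hypothetical.

```python
# Sketch only: charge time spent suspended against a keepalive timer.
import time

KEEPALIVE_TIMEOUT = 180.0  # seconds; assumed from "three minutes" above


def sample_clocks():
    """Return (monotonic, boottime) readings; Linux-specific."""
    return (time.clock_gettime(time.CLOCK_MONOTONIC),
            time.clock_gettime(time.CLOCK_BOOTTIME))


def suspended_seconds(mono_then, boot_then, mono_now, boot_now):
    """Time spent suspended between two clock samples.

    The boottime clock advances across suspend and the monotonic clock
    does not, so the difference of the two deltas is the time lost.
    """
    return max(0.0, (boot_now - boot_then) - (mono_now - mono_then))


def remaining_after_resume(remaining, suspended):
    """Remaining timer time after a resume; 0.0 means fire right away."""
    return max(0.0, remaining - suspended)
```

A timer callback would call sample_clocks() each time it runs, compare
with the previous sample, and re-arm itself with the shortened interval
(or send the keepalive immediately when the result is 0.0).
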