Thank you all for your replies. They clear up the picture a lot.
<br><br>To summarize:<br><br>1. We need to fix the timeout for icons to disappear. Can we try Guillaume's patch? Also, we need to be able to determine which icons are currently not available (but still appearing). I believe that the failed entries in _presence._tcp form a complete list. Is this correct?
<br><br>2. We need to be able to restart the PS. As you say, this is not possible at the moment, but if we restart Sugar, will the PS restart as well?<br><br>3. We need to force Gabble to run. We have several instances of 4193 (almost all XOs connected to a schoolserver, AP are running Salut). Or at least we need to force an attempt to connect to the Jabber server.
<br><br>4. Is the process of trying to connect to the Jabber server done by telepathy-gabble, or by the presence service?<br><br><div><span class="gmail_quote">On 11/6/07, <b class="gmail_sendername">Simon McVittie</b> <<a href="mailto:simon.mcvittie@collabora.co.uk" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
simon.mcvittie@collabora.co.uk</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">-----BEGIN PGP SIGNED MESSAGE-----<br>Hash: SHA1
<br><br>In reply to your previous mail, "iff" means "if and only if". It's often<br>used by mathematicians.<br><br>On Tue, 06 Nov 2007 at 03:23:39 -0500, Giannis Galanis wrote:<br>> What does proper notification mean? Which are the cases that it happens?
<br><br>If Salut is explicitly asked to disconnect, it will tell Avahi to "delete"<br>all its mDNS records (this actually consists of re-sending all the<br>records it was advertising, with the Time To Live set to 0 seconds).
<br>This is sometimes referred to as a "goodbye" packet. See<br><a href="http://files.multicastdns.org/draft-cheshire-dnsext-multicastdns.txt" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
http://files.multicastdns.org/draft-cheshire-dnsext-multicastdns.txt
</a><br>section 11.2 "Goodbye Packets".<br><br>The only time we'll currently do this is when switching off Salut because<br>Gabble has connected successfully.<br><br>> Probably this is not if an XO moves slowly to a place with poor
<br>> connectivity.<br><br>This is never done in response to network conditions - we can't know that<br>we've lost network connectivity until it's too late.<br><br>If the Time To Live on our mDNS records expires, that should have the same
<br>effect; however, as Sjoerd explained, we currently ignore that, because<br>the 1CC mesh network is apparently unstable enough that the TTL<br>sometimes expires even for laptops that are actually present.<br><br>> In the case of a temporary (short) disruption of connectivity, how much
<br>> time does it generally take for it to return? You mentioned that in the past<br>> XOs were appearing and disappearing constantly. This implies that the<br>> common drop of connectivity is on the scale of a few seconds.
<br><br>You tell me! :-) I don't have enough XOs to replicate the conditions of<br>a large mesh network like 1CC, so I can't comment on packet loss rates.<br>Perhaps Dan Williams (who used to maintain Presence Service) could help
<br>you.<br><br>> If it is lost<br>> for more than a few minutes, then it is not bad for the XO to leave and<br>> return. So I believe that 1h or even 10min are too long timeouts.<br><br>I believe we're currently using Avahi's default timeouts, which are
<br>those recommended in the mDNS draft (linked above). If I'm right about<br>that, then we're using 120 second TTLs for the SRV and A records.<br><br>Assuming Salut and Avahi follow the draft's recommendations, this means
<br>that for the records representing activities, buddies and laptops, if we<br>haven't seen an announcement of a particular record, we will:<br><br>- - re-query after 96 - 98.4 seconds;<br>- - if no reply, re-query after 102 -
104.4 seconds;<br>- - if no reply, re-query after 114 - 116.4 seconds;<br>- - if no reply, assume the record has vanished after 120 seconds.<br><br>(In each of the ranges given for the re-queries, the exact time is<br>chosen at random, to avoid simultaneous queries from everyone in the
<br>network.)<br><br>The timeout is reset as soon as we see any announcement of a record.<br><br>The only ones whose disappearance matters are the SRV and A records - if<br>a TXT record fails to disappear when it should, we don't really care.
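<br><br>The re-query windows listed above can be reproduced with a little arithmetic. This is only a sketch, assuming the percentages implied by the figures in this message (80%, 85% and 95% of the TTL, each with up to 2% of random delay). The function names and parameters are made up for illustration and are not part of Avahi or Salut.

```python
import random

# Percentages of the TTL implied by the figures quoted in the message
# (assumption: taken from those figures, not read from the Avahi source).
REQUERY_PERCENTAGES = (80, 85, 95)
JITTER_PERCENT = 2


def requery_windows(ttl_seconds, percentages=REQUERY_PERCENTAGES,
                    jitter=JITTER_PERCENT):
    """Return (earliest, latest) re-query times in seconds for each attempt."""
    return [(ttl_seconds * p / 100, ttl_seconds * (p + jitter) / 100)
            for p in percentages]


def pick_requery_times(ttl_seconds, rng=random):
    """Pick one random time inside each window, to spread out the queries."""
    return [rng.uniform(earliest, latest)
            for earliest, latest in requery_windows(ttl_seconds)]


if __name__ == "__main__":
    for earliest, latest in requery_windows(120):
        print("re-query between", earliest, "and", latest, "seconds")
    print("record assumed gone after", 120, "seconds")
```

With a 120 second TTL this yields the same three ranges as the list above. If no reply arrives in any of them, the record expires at 120 seconds.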
<br>TXT records have a substantially longer timeout (the draft recommends 75<br>minutes).<br><br>> There are a couple more things I would like to address:<br>><br>> 1. Is there a way to restart the presence service? In that way we can
<br>> resolve a weird state. Will killing and restarting the process work?<br><br>Only if client code that accesses the PS is amended to cope with this<br>(I just filed #4681 to represent this). Until #4681 is closed, if the PS
<br>was restarted, nothing would work - use Ctrl+Alt+Backspace to restart all of<br>Sugar. Please see the bug for more details or to reply.<br><br>> 2. At what point in the source code, the presence service<br>> i. will
try to connect to the jabber server?<br>> ii. run gabble?<br><br>I'll answer (ii.) first. Gabble is automatically run by the session bus<br>(dbus-daemon) via service activation, the first time the Presence Service
<br>uses it, if it isn't already running. So there is no explicit code in the PS<br>to run Gabble.<br><br>OK, now (i.):<br><br>When Network Manager indicates that we have a valid IP address, we run<br>the _init_connection method of the ServerPlugin instance. If the Gabble
<br>connection fails, we schedule a timer (currently 5 seconds) and retry<br>running _init_connection when the timer runs out. (classes<br>TelepathyPlugin and ServerPlugin, methods _init_connection,<br>_reconnect_cb, _could_connect, _handle_connection_status_change.)
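<br><br>The retry behaviour described above can be sketched roughly as follows. This is not the real ServerPlugin code: the class, the `connect` and `schedule` callables and the `on_ip_address_available` method are hypothetical stand-ins, and only the 5 second delay and the idea of re-running `_init_connection` come from the description in this message.

```python
RECONNECT_DELAY_SECONDS = 5  # delay quoted in the message


class ReconnectingPlugin:
    """Hypothetical stand-in for the plugin; not the actual PS source."""

    def __init__(self, connect, schedule):
        self._connect = connect    # callable returning True on success
        self._schedule = schedule  # callable(delay_seconds, callback)
        self.connected = False
        self.attempts = 0

    def on_ip_address_available(self):
        # Stands in for the "we have a valid IP address" notification.
        self._init_connection()

    def _init_connection(self):
        if self.connected:
            return  # an existing, connected connection is reused
        self.attempts += 1
        if self._connect():
            self.connected = True
        else:
            # Connection failed: try again when the timer runs out.
            self._schedule(RECONNECT_DELAY_SECONDS, self._init_connection)


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # fail twice, then succeed
    pending = []
    plugin = ReconnectingPlugin(lambda: next(outcomes),
                                lambda delay, cb: pending.append((delay, cb)))
    plugin.on_ip_address_available()
    while pending:
        delay, callback = pending.pop(0)
        callback()  # a real main loop would wait `delay` seconds first
    print("connected after", plugin.attempts, "attempts")
```

The scheduler is passed in so that the sketch runs without a main loop. In a real service the timer would be provided by whatever event loop the process already uses.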
<br><br>What _init_connection does is: If there's already a Gabble connection and it's<br>connected, it'll be used. (class ServerPlugin, method<br>_find_existing_connection). Otherwise we make a new connection (method
<br>_make_new_connection).<br><br>ServerPlugin (src/server_plugin.py) inherits from TelepathyPlugin<br>(src/telepathy_plugin.py) so some of the methods I mentioned are defined<br>in TelepathyPlugin, some in ServerPlugin, and some are defined in
<br>TelepathyPlugin but overridden in ServerPlugin.<br><br>> iii. what type of communication is taking place between NM and PS<br><br>D-Bus messages, on the system bus.<br><br>> iv. the internet connectivity is detected by NM and sent to PS, or detected
<br>> by PS<br><br>Internet connectivity isn't really detected, as such. The PS listens for<br>signals from Network Manager that tell it that the IP address has<br>changed. Whenever we have an IP address, we tell Gabble to connect to
<br>the XMPP server; the nearest thing we have to "detecting Internet connectivity"<br>is that if we have it, Gabble will succeed.<br><br>In response to Gabble succeeding with a connection, PS calls the<br>Disconnect method on the Salut connection.
<br>-----BEGIN PGP SIGNATURE-----<br>Version: GnuPG v1.4.6 (GNU/Linux)<br>Comment: OpenPGP key: <a href="http://www.pseudorandom.co.uk/2003/contact/" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
http://www.pseudorandom.co.uk/2003/contact/</a> or <a href="http://pgp.net" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">
pgp.net</a><br><br>iD8DBQFHMEgwWSc8zVUw7HYRAoH2AKC71yprDPK/KPOyGAwez12odisbfQCgjMdY<br>1Fg4j1GS02m7HlnrhZBOe5Y=<br>=g6CY<br>-----END PGP SIGNATURE-----<br></blockquote></div><br>