[Server-devel] ejabberd's mnesia breaking - (Re: Almost-released: XS-0.5.2)
Martin Langhoff
martin.langhoff at gmail.com
Tue Mar 17 23:44:56 EDT 2009
On Wed, Mar 18, 2009 at 3:55 PM, Martin Langhoff
<martin.langhoff at gmail.com> wrote:
> I am mostly happy, but there is a nasty issue with upgrades of
> ejabberd. Investigating. Had not been an issue before, and may be
> related to a big erlang update that slipped in from fedora updates.
Some notes on what seems to be happening:
- On an XS-0.5 that has had ejabberd configured, upgrading to 0.5.2
via anaconda... (I strongly suspect yum driven updates don't see this
problem.)
- Upgrade log (attached) shows that we upgrade ejabberd-xs, then
erlang, then xs-config
- When xs-config is upgraded, it re-runs the config "preprocessor" to
expand @@SERVERNAME@@ on any new/updated config files. During that
stage, it touches (but doesn't actually change) the ejabberd-xs.cfg
file and calls 'service ejabberd condrestart' -- and right there we
see an erlang kernel crash.
- *** ejabberd should not be running ***
- the db in /var/lib/ejabberd/spool looks fairly b0rken
- downgrading erlang in case mnesia format had somehow changed -- the
database is still unreadable (and ejabberd 2.0.x series all have the
same db format, I've been upgrading and downgrading all the time on my
dev boxen without a single problem).
- downgrade erlang, init a new DB, feed it some data, upgrade erlang:
no DB corruption
The main question is: why is ejabberd running under anaconda? There is
a very good chance that *that* is the source of the corruption.
- All the %post scripts for the ejabberd-xs pkg and for xs-config
issue 'condrestart' which checks /var/lock/subsys/$svcname to see if
the service is running, and only restarts if it is running. Using
condrestart is the recommended practice in rpm packaging...
- I checked carefully that we don't erroneously say 'start', and that
condrestart is well implemented in the ejabberd init script...
- During init, /var/lock/subsys is cleared, so even after a hard
poweroff the init scripts should not be confused about the state of
things. I am right now trying to see if anaconda does the same. This
is the only working theory I have...
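To make that theory concrete, here is a minimal sketch in Python (not the actual ejabberd init script, which is shell) of the condrestart decision described above. The only thing taken from the notes is the test on /var/lock/subsys/$svcname; the function names and the temporary directory standing in for /var/lock/subsys are made up for illustration.

```python
import os
import tempfile


def condrestart_would_restart(service, lock_dir="/var/lock/subsys"):
    """Mimic the condrestart test: restart only if the lock file exists."""
    return os.path.exists(os.path.join(lock_dir, service))


def clear_lock_dir(lock_dir):
    """Stand-in for the boot-time cleanup that removes stale lock files."""
    for name in os.listdir(lock_dir):
        os.remove(os.path.join(lock_dir, name))


def demo():
    # Use a scratch directory instead of the real /var/lock/subsys.
    with tempfile.TemporaryDirectory() as lock_dir:
        # Simulate a lock file left behind while the service is NOT running.
        open(os.path.join(lock_dir, "ejabberd"), "w").close()
        stale = condrestart_would_restart("ejabberd", lock_dir)

        # After the cleanup that normal init performs, the check is correct.
        clear_lock_dir(lock_dir)
        clean = condrestart_would_restart("ejabberd", lock_dir)
    return stale, clean


if __name__ == "__main__":
    print(demo())
```

If the installer environment does not do the equivalent of clear_lock_dir(), a leftover lock file would be enough for condrestart to bring ejabberd up during the upgrade even though it was never running there.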
Any other suggestions? I don't think this is something I can pin on
ejabberd or erlang. Looks more like a gotcha with anaconda.
cheers,
m
--
martin.langhoff at gmail.com
martin at laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: upgrade.log
Type: application/octet-stream
Size: 4871 bytes
Desc: not available
Url : http://lists.laptop.org/pipermail/server-devel/attachments/20090318/090d1c81/attachment-0001.obj