mwlib: reworking re2c files to use ctypes
martin at laptop.org
Fri Jan 28 07:55:45 EST 2011
Hi Ralf, Volker,
writing to you as you seem to be active maintainers of mwlib and the
OLPC ships an early version of mwlib in its WikiBrowse (aka
Wikiserver) activity, and it's a tool of major importance. (Thanks for
your code! Having a nice wikislice on the many XOs that have little or
no connectivity makes a huge impact out there.)
The compiled .so files are a bit of a problem currently for us. We
ship "activities" (user-installable program bundles) that are usually
pure python, and (if prepared carefully) can be installed in several
releases of our OSs, which in turn are based on various Fedora releases.
Binaries are not recommended inside of those bundles, but if they link
to generic libs with stable API/ABI, things are generally ok.
The re2c binaries from mwlib, unfortunately, interface with Python
using swig, which means that they end up linking directly to
libpython. We use Python extensively, so we update somewhat
aggressively to the latest version in Fedora. So what happens is that
those .so files end up being tied to specific Python versions.
There is a different, better way to do this -- to create standalone
.so files, and to use them from Python using ctypes. That way, we can
distribute precompiled .so files that are significantly more portable
(they are still arch and glibc ABI specific).
Would that be of interest to you? Has anyone thought about this, or
worked on this?
If yes, I have done some initial hacking on this you might be
interested in. I have attached a WIP patch against an earlier version
of your _expander.re; it drops a lot of the glue:
 mwlib/Makefile     |    4 +-
 mwlib/_expander.re |   75 ++++++++++-----------------------------------------
 2 files changed, 17 insertions(+), 62 deletions(-)
It is not finished, definitely work-in-progress. Once it works, you can just use
    _expander = ctypes.CDLL('_expander.so')
And the same for _uscan.re.
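To illustrate the general ctypes pattern, here is a minimal sketch. It does not use the real _expander.so interface, since the functions that library would export are not shown in this message. As a stand-in it loads the system C library and calls strlen; the choice of libc, strlen, and the helper name c_strlen are assumptions made purely for illustration.

```python
import ctypes
import ctypes.util

# Stand-in for ctypes.CDLL('_expander.so'): load the system C library,
# because the symbols exported by _expander.so are not known here.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: call the C strlen() on a bytes object."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"{{template}}"))
```

The point is that the shared library itself never links against libpython; only the small Python wrapper knows about Python, so the same .so can be loaded from different Python versions.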
I now see that in your latest code you are actually not using
_expander.re anymore. How does the Python-based tokenizer perform,
compared to the re2c tokenizer? We care a lot about keeping things
martin at laptop.org -- Software Architect - OLPC
- ask interesting questions
- don't get distracted with shiny stuff - working code first
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 3137 bytes
Desc: not available