[Wikireader] Welcome new members Tabitha Roder and Håkon Lie; Wiki-to-html conversion and compression
Chris Ball
cjb at laptop.org
Mon Feb 9 17:23:36 EST 2009
Hi,
> 7z and bz compression were quite similar (whereas if we were
> storing many revisions of each article it would be a huge
> difference). I believe we're using a variant on standard bzip that
> allows the per-article decompression we need; Chris can explain in
> more detail.
The file is standard bzip2, but the tools aren't. In short, we create
an index from article titles into individual bzip2 blocks, and then
have a tool (based on a modified bzip2recover program) that is able
to decompress arbitrary individual blocks of bzip2 archives. So, when we
ask for an article title, the index translates it into a block number <n>,
and only that block is decompressed and returned.
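
To make that concrete, here is a rough Python sketch of the block-extraction
idea. It is not our actual tool, just an illustration of the same trick
bzip2recover uses: scan the bit stream for the 48-bit block magic, cut out one
block, and wrap it in a fresh stream header and footer so a stock decompressor
will accept it. The function names (to_bits, block_offsets, extract_block) and
the example title index are made up for this sketch, and it assumes a
single-stream .bz2 file small enough to hold in memory.

```python
import bz2

# 48-bit markers defined by the bzip2 format, written as bit strings.
BLOCK_MAGIC = format(0x314159265359, "048b")  # start of every compressed block
EOS_MAGIC = format(0x177245385090, "048b")    # end-of-stream marker


def to_bits(data):
    """Render bytes as a string of '0'/'1'; blocks are bit-aligned, not byte-aligned."""
    return "".join(format(byte, "08b") for byte in data)


def block_offsets(bits):
    """Return the bit offset of every block magic in the stream.

    Like bzip2recover, this trusts that the 48-bit magic does not occur
    by chance inside compressed data.
    """
    offsets = []
    pos = bits.find(BLOCK_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = bits.find(BLOCK_MAGIC, pos + 1)
    return offsets


def extract_block(data, n):
    """Decompress only block number n (0-based) of a single-stream bzip2 file."""
    bits = to_bits(data)
    starts = block_offsets(bits)
    # A block runs until the next block magic, or until the end-of-stream marker.
    end = starts[n + 1] if n + 1 < len(starts) else bits.rfind(EOS_MAGIC)
    block = bits[starts[n]:end]
    block_crc = block[48:80]  # the 32-bit block CRC follows the magic
    # Rebuild a one-block stream: original 4-byte header ("BZh" + level),
    # the block itself, the end marker, and the stream CRC. For a stream
    # with a single block, the stream CRC equals that block's CRC.
    stream = to_bits(data[:4]) + block + EOS_MAGIC + block_crc
    stream += "0" * (-len(stream) % 8)  # pad to a whole number of bytes
    raw = int(stream, 2).to_bytes(len(stream) // 8, "big")
    return bz2.decompress(raw)


# Hypothetical index from article title to block number, built ahead of time.
title_index = {"Example article": 0}
```

With an index like title_index, a lookup becomes
extract_block(archive_bytes, title_index[title]). A real tool would read only
the needed byte range from disk rather than converting the whole file to bits.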
If we were to move to a different compression scheme, we would need
a similar tool for that scheme.
Thanks,
- Chris.
--
Chris Ball <cjb at laptop.org>