[Wikireader] Wikireaders Update from Wikimania

emmanuel at engelhart.org emmanuel at engelhart.org
Mon Jul 21 07:43:22 EDT 2008


 On Sat 19/07/08 11:39, "Samuel Klein" meta.sj at gmail.com wrote:
> Tim offered to generate dumps of article subsets for offline use (and
> encourages us to think about using HTML, which may be optimizable to
> take up a similar amt of space to the XML dumps when compressed)

He may find some interesting material here:
http://kiwix.svn.sourceforge.net/viewvc/kiwix/

My ideas about compressing HTML:
* Take a look at XBL to alias standard (X)HTML tags/attributes (only for xulrunner-based readers)
* Study LZMA in depth to find the best parameters and the best size for article chunks
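
To make the second idea concrete, here is a rough sketch of how such a study could start. It is only an illustration using Python's standard lzma module, not the zeno code: the function names (compressed_size, survey), the chunk sizes, and the sample articles are all made up for the example.

```python
import lzma


def compressed_size(articles, chunk_size, preset=6):
    """Total LZMA-compressed size in bytes when `articles` (a list of
    bytes objects) are concatenated `chunk_size` at a time and each
    chunk is compressed independently."""
    total = 0
    for start in range(0, len(articles), chunk_size):
        chunk = b"".join(articles[start:start + chunk_size])
        total += len(lzma.compress(chunk, preset=preset))
    return total


def survey(articles, chunk_sizes=(1, 4, 16), presets=(1, 6, 9)):
    """Return {(chunk_size, preset): total_bytes} for every combination.
    The default values are arbitrary starting points, not recommendations."""
    return {
        (c, p): compressed_size(articles, c, p)
        for c in chunk_sizes
        for p in presets
    }


if __name__ == "__main__":
    # Hypothetical sample data: small HTML pages sharing some boilerplate.
    articles = [
        ("<html><body><h1>Article %d</h1><p>%s</p></body></html>"
         % (i, "Some shared boilerplate text. " * 20)).encode("utf-8")
        for i in range(32)
    ]
    for (chunk, preset), size in sorted(survey(articles).items()):
        print("chunk=%2d preset=%d -> %d bytes" % (chunk, preset, size))
```

Bigger chunks let the compressor reuse markup shared between articles, but a reader then has to decompress a whole chunk to show one article, so the real measurement should be done on actual dumps and weighed against access time.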

> He says
> these libraries are recently being used by kiwix -- Emmanuel, perhaps
> you can comment... 

Definitely true. We will adopt zeno as our standard storage format in the future.
Tommi already did a great job on the zenoreader & indexer.
The next improvements (lzma + article chunk compression) are really promising.
I see only one concern: work remains to be done on the Windows portability of the zenolib, but I'm pretty confident that point will be resolved too.

Regards

Emmanuel


