#10842 NORM Not Tri: Wikiserver parser fails on full enwiki dump

Thu Apr 28 09:18:42 EDT 2011

#10842: Wikiserver parser fails on full enwiki dump
           Reporter:  rmo         |       Owner:  cjb          
               Type:  defect      |      Status:  new          
           Priority:  normal      |   Milestone:  Not Triaged  
          Component:  wikiserver  |     Version:  not specified
         Resolution:              |    Keywords:               
        Next_action:  never set   |    Verified:  0            
Deployment_affected:              |   Blockedby:               
           Blocking:              |  

Comment(by martin.langhoff):

 Welcome to the fray, rmo! Agreed, the old mwlib we use is problematic.

 If you are diving into wikiserver, the best approach, I suspect, is to
 look at replacing all the old, crufty and horrendously patched mwlib with
 a current mwlib from upstream.

 Current upstream mwlib seems to have evolved into a pretty flexible
 toolkit, so what we had to patch before can probably be done in a much
 more elegant and maintainable way. Upstream mwlib also keeps up with
 changes to the Wikipedia dump formats.

 I am not sure whether mwlib's compression + index + search scheme is good
 enough for us. Hopefully it is, and we can leave behind the unmaintained
 code we use now.

 Thanks for looking into this!

