#10842 NORM Not Tri: Wikiserver parser fails on full enwiki dump

Zarro Boogs per Child bugtracker at laptop.org
Thu Apr 28 05:42:47 EDT 2011


#10842: Wikiserver parser fails on full enwiki dump
------------------------+---------------------------------------------------
 Reporter:  rmo         |                 Owner:  cjb          
     Type:  defect      |                Status:  new          
 Priority:  normal      |             Milestone:  Not Triaged  
Component:  wikiserver  |               Version:  not specified
 Keywords:              |           Next_action:  never set    
 Verified:  0           |   Deployment_affected:               
Blockedby:              |              Blocking:               
------------------------+---------------------------------------------------
 I stumbled on a few strange parser bugs while testing ''server.py'' with a
 complete, recent, English Wikipedia dump.

 The first problem was that Python's maximum recursion depth was reached on
 just about every article of moderate complexity. I worked around this by
 blacklisting a few templates that often came up just before the overflow [1].
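
 For illustration only, here is a minimal sketch of the idea behind that
 workaround: skip blacklisted templates and stop expanding at a fixed depth
 rather than overflowing the stack. Every name in it (expand, TEMPLATE_RE,
 max_depth, the templates dict) is hypothetical; it is not the expander that
 server.py or mwlib actually uses, and it only handles simple {{Name}} and
 {{Name|args}} calls.

```python
import re

# Hypothetical names throughout; this is not mwlib's or server.py's API.
# Matches innermost {{Name}} or {{Name|args}} calls only.
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)(?:\|[^{}]*)?\}\}")


def expand(text, templates, blacklist=frozenset(), max_depth=20, _depth=0):
    """Expand template calls, skipping blacklisted names and capping depth."""
    if _depth >= max_depth:
        # Leave the remaining markup unexpanded instead of recursing further.
        return text

    def substitute(match):
        name = "Template:" + match.group(1).strip()
        if name in blacklist or name not in templates:
            return ""  # drop blacklisted or unknown templates
        return expand(templates[name], templates, blacklist,
                      max_depth, _depth + 1)

    return TEMPLATE_RE.sub(substitute, text)


if __name__ == "__main__":
    pages = {"Template:A": "a{{B}}", "Template:B": "b",
             "Template:Loop": "{{Loop}}"}
    print(expand("x{{A}}y", pages))
    print(expand("x{{A}}y", pages, blacklist={"Template:A"}))
    print(expand("{{Loop}}", pages, max_depth=5))
```

 A depth cap of this kind leaves a self-referencing template unexpanded
 instead of raising an error, which is cruder than cycle detection but
 cheap to add.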

 It soon became clear, however, that the problem went deeper. Here's a
 fragment of output demonstrating unexpected template
 expansion (to the non-existent ''Template:Main other''):


 {{{
 {{#switch:
   <!--If no or empty "demospace" parameter then detect namespace-->
   {{#if:{{{demospace|}}}
   | {{lc: {{{demospace}}} }}    <!--Use lower case "demospace"-->
   | {{#ifeq:{{NAMESPACE}}|{{ns:0}}
     | main
     | other
     }}
   }}
 | main     = {{{1|}}}
 | other
 | #default = {{{2|}}}
 }}<noinclude>
 {{documentation}}
 <!-- Add categories and interwikis to the /doc subpage, not here! -->
 </noinclude>

 expander.info >> parsing template "u'Template:Main other'"
 expander.warn >> using dummy resolver for fullpagename
 expander.warn >> using dummy resolver for fullpagename
 expander.warn >> using dummy resolver for fullpagename
 expander.warn >> using dummy resolver for fullpagename
 ...

 }}}


 It seems the omission of the exclamation point article from the
 `en_US_g1g1' is fortuitous; when present, it is frequently and
 absurdly embedded in other articles (e.g. wp:Cloud) that nest a common
 {{!}} template.

 This strange behavior revealed another series of bugs: (1)
 there's no way, even with the edit directory enabled, to delete
 an article; (2) removing all of the text from an article fails
 silently; and (3) changes to an article are not reflected when the
 article is invoked through a template.

 How much would this situation improve with upstream changes to mwlib?


 [1] I blacklisted ''Template:#time'', ''Template:Cite book'', and
 ''Template:Key press/core'', but note that the blacklist isn't currently
 loaded. A small patch fixes that:

 {{{
 diff --git a/server.py b/server.py
 index 02d01a9..84287c1 100755
 --- a/server.py
 +++ b/server.py
 @@ -467,7 +467,7 @@ class WikiRequestHandler(SimpleHTTPRequestHandler):
          self.lang  = conf['lang']
          self.flang = conf['flang']
          self.templateprefix = conf['templateprefix']
 -        self.templateblacklist = set()
 +        self.templateblacklist = conf['templateblacklist']
          self.imgbasepath = self.flang + '/images/'
          self.wpheader = conf['wpheader']
          self.wpfooter = conf['wpfooter']
 }}}
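
 One caveat with the patch as written: conf['templateblacklist'] would
 raise a KeyError if the key is ever absent. A slightly more defensive
 variant might look like the sketch below. The helper name
 load_template_blacklist is hypothetical, and it assumes conf is a plain
 dict whose 'templateblacklist' value, when present, is an iterable of
 template names.

```python
def load_template_blacklist(conf):
    """Return the configured blacklist as a set, or an empty set if unset.

    Hypothetical helper: assumes conf is a plain dict and that the
    'templateblacklist' value, if any, is an iterable of template names.
    """
    return set(conf.get('templateblacklist') or ())


if __name__ == "__main__":
    print(load_template_blacklist({}))
    print(load_template_blacklist(
        {'templateblacklist': ['Template:Cite book']}))
```

 In the handler this would read
 self.templateblacklist = load_template_blacklist(conf).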

-- 
Ticket URL: <http://dev.laptop.org/ticket/10842>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system

