#9836 BLOC 1.5-sof: BUG() in if_sdio_handle_cmd()

Zarro Boogs per Child bugtracker at laptop.org
Wed Jan 6 08:19:18 EST 2010


#9836: BUG() in if_sdio_handle_cmd()
--------------------------------+-------------------------------------------
           Reporter:  dsaxena   |       Owner:  dsaxena            
               Type:  defect    |      Status:  new                
           Priority:  blocker   |   Milestone:  1.5-software-update
          Component:  kernel    |     Version:  1.5-B3             
         Resolution:            |    Keywords:                     
        Next_action:  diagnose  |    Verified:  0                  
Deployment_affected:            |   Blockedby:                     
           Blocking:            |  
--------------------------------+-------------------------------------------

Comment(by dsd):

 OK, I can reproduce this after all and I have an explanation.

 The libertas driver keeps 2 buffers for responses and alternates between
 them for incoming responses. That way the driver can be processing 1
 response when another one arrives.

 When a command response arrives, it is put into the "other" buffer and
 then the main worker thread is notified that there is a new response. The
 main worker thread processes the response, then clears resp_len to mark
 the buffer as free.

 The driver makes no guarantee (and does not attempt to verify) that the
 previous response has been fully processed before the next one arrives.
 Dan Williams says that this will not happen at
 http://lists.infradead.org/pipermail/libertas-dev/2009-December/002921.html:

 {{{
 The driver only handles one command at a time because the firmware spec
 also says that the firmware only handles one command at a time.
 }}}

 And if the firmware behaves in this way, it seems fine. Commands are only
 sent by the worker thread, and responses are processed first, so we're
 never going to send 2 commands at once.

 What happens in the failure case is:
  1. command response interrupt arrives, gets saved into slot 1
  2. interrupt handler notifies main thread that new response is available
 in slot 1
  3. before the thread has had time to do anything, another command
 response interrupt arrives, response gets saved into slot 0
  4. interrupt handler notifies main thread that new response is available
 in slot 0
  5. main thread wakes up, processes slot 0, clears resp_len[0] to mark it
 as completed
  6. command response interrupt arrives; it should be saved in slot 1, but
 because the command response received in step 1 was never processed (and
 hence resp_len[1] was never cleared), we hit this BUG()
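
 The sequence above can be replayed in a small user-space model. This is
 only a sketch under stated assumptions, not the driver's code: struct
 resp_model, model_receive(), model_process() and replay_failure() are
 hypothetical names, resp_idx is an assumed name for "the slot most recently
 handed to the worker", the starting index of 0 and the 8-byte response
 length are arbitrary, and the BUG() is modelled as a -1 return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of the two response slots (not driver code). */
struct resp_model {
	size_t resp_len[2]; /* 0 means the slot is free */
	int resp_idx;       /* assumed name: slot last handed to the worker */
};

/*
 * Interrupt side: store a response of len bytes in the "other" slot.
 * Returns the slot used, or -1 where the real driver would hit BUG().
 */
static int model_receive(struct resp_model *m, size_t len)
{
	int slot = m->resp_idx ? 0 : 1;

	if (m->resp_len[slot] != 0)
		return -1;      /* earlier response in this slot was never processed */
	m->resp_len[slot] = len;
	m->resp_idx = slot;     /* notify worker: newest response is in this slot */
	return slot;
}

/* Worker side: handle only the most recently notified slot, then free it. */
static int model_process(struct resp_model *m)
{
	int slot = m->resp_idx;

	m->resp_len[slot] = 0;  /* mark the buffer as free */
	return slot;
}

/* Replays steps 1-6; returns the result of the final receive. */
static int replay_failure(void)
{
	struct resp_model m = { {0, 0}, 0 };
	int slot;

	slot = model_receive(&m, 8);  /* steps 1-2: saved into slot 1 */
	assert(slot == 1);
	slot = model_receive(&m, 8);  /* steps 3-4: saved into slot 0 */
	assert(slot == 0);
	slot = model_process(&m);     /* step 5: worker clears slot 0 only */
	assert(slot == 0);
	return model_receive(&m, 8);  /* step 6: slot 1 is still marked busy */
}
```

 In this model replay_failure() returns -1 at step 6, because the worker
 only ever looks at the most recently notified slot and so the response
 stored in step 1 is never cleared. With strictly alternating receive and
 process calls the two slots are reused without any collision.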

 So, according to Dan Williams's explanation, this is a firmware bug. I'll
 now attempt to dump the contents of those 2 spurious command responses, as
 well as confirm which outgoing commands might have caused them.

-- 
Ticket URL: <http://dev.laptop.org/ticket/9836#comment:19>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system

