#7458 BLOC 8.2.0 (: Intermittent lockup during WOL suspend/resume stress

Zarro Boogs per Child bugtracker at laptop.org
Mon Jul 28 15:08:02 EDT 2008


#7458: Intermittent lockup during WOL suspend/resume stress
----------------------------+-----------------------------------------------
   Reporter:  dsaxena       |       Owner:  dsaxena                          
       Type:  defect        |      Status:  assigned                         
   Priority:  blocker       |   Milestone:  8.2.0 (was Update.2)             
  Component:  not assigned  |     Version:  Development build as of this date
 Resolution:                |    Keywords:  joyride-2131:-                   
Next_action:  diagnose      |    Verified:  0                                
  Blockedby:                |    Blocking:  7393                             
----------------------------+-----------------------------------------------

Comment(by dilinger):

 Regarding the EC timeouts (it would be nice to have that in a separate
 bug, but for now):

 We see two different timeouts: one before the command starts
 (timeout-waiting-to-quiesce), and one after sending the command
 (timeout-waiting-for-command-read).  In both cases, we're waiting for the
 IBF flag to be cleared.  There was a theory about the LPC bus floating
 during suspend, which could leave IBF in an unknown state immediately
 after resume.  However, if we pass the IBF-quiesce check, then we can be
 assured that IBF is clear before sending the EC our command byte, so that
 theory cannot explain the second timeout.  So, instead:

 {{{
 15:06 =dilinger>         outb(cmd, 0x6c);
 15:06 =dilinger>         if (wait_on_ibf(0x6c, 0)) {
 15:06 =dilinger>                 printk(KERN_ERR "olpc-ec:  timeout waiting for EC to read "
 15:06 =dilinger>                                 "command!\n");
 ...
 15:08 =dilinger> we stuffed a byte into 0x6c, which presumably sets IBF
 15:09 =dilinger> and then time out waiting for IBF to be clear
 15:09 =smithbone> yeah.. writes to 6c set ibf.
 15:09 =dilinger> we have a mdelay(1) in our IBF poll loop
 15:09 * smithbone wonders if I when I clear the flag if it really gets clear.
 15:10 =smithbone> I can clear then check.
 15:10 =dilinger> is it possible for the EC to clear IBF and then set it again during that 1ms delay?
 15:10 =dilinger> *nod*
 15:10 =smithbone> Oh.. yeah...
 15:10 =smithbone> If I stick the data back in the register.
 15:10 =smithbone> That could totally happen.
 15:11 =smithbone> And fastpath would make it much worse.
 15:11 =dilinger> so, um...
 15:11 =smithbone> Hmmm....
 15:11 =smithbone> heh..
 15:11 =dilinger> is there a way we can prevent that? :)
 15:12 =dilinger> either by the kernel telling the EC that it's waiting for IBF, and the EC not clearing IBF until the kernel has said that it's no longer waiting..
 15:12 =smithbone> Dunno.. Let me think.. ok time for my con call.. Excellent brainstorming.. I think we may be on to it.
 15:12 =dilinger> or by changing the kernel code to not delay 1ms (which still seems racy, and also is going to hurt performance)
 15:13 =smithbone> yeah.. let me ponder a bit.
 15:13 =dilinger> or by having the EC detect that the kernel is busy polling IBF, and not touch it until it's been at least 2ms since the kernel read?
 }}}

-- 
Ticket URL: <http://dev.laptop.org/ticket/7458#comment:39>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system

