#7458 BLOC 8.2.0 (: Intermittent lockup during WOL suspend/resume stress
Zarro Boogs per Child
bugtracker at laptop.org
Mon Jul 28 15:08:02 EDT 2008
#7458: Intermittent lockup during WOL suspend/resume stress
----------------------------+-----------------------------------------------
Reporter: dsaxena | Owner: dsaxena
Type: defect | Status: assigned
Priority: blocker | Milestone: 8.2.0 (was Update.2)
Component: not assigned | Version: Development build as of this date
Resolution: | Keywords: joyride-2131:-
Next_action: diagnose | Verified: 0
Blockedby: | Blocking: 7393
----------------------------+-----------------------------------------------
Comment(by dilinger):
Regarding the EC timeouts (it would be nice to have that in a separate
bug, but for now):
We see two different timeouts: one before the command starts
(timeout-waiting-to-quiesce), and one after sending the command
(timeout-waiting-for-command-read). In both cases, we're waiting for the
IBF flag to be cleared. There was a theory about the LPC bus floating
during suspend, which could leave IBF in an unknown state immediately
after resume. However, if we pass the IBF-quiesce check, then we can be
assured that IBF is clear before sending the EC our command byte. So,
instead:
{{{
15:06 =dilinger> outb(cmd, 0x6c);
15:06 =dilinger> if (wait_on_ibf(0x6c, 0)) {
15:06 =dilinger> printk(KERN_ERR "olpc-ec: timeout waiting for EC to read "
15:06 =dilinger> "command!\n");
...
15:08 =dilinger> we stuffed a byte into 0x6c, which presumably sets IBF
15:09 =dilinger> and then time out waiting for IBF to be clear
15:09 =smithbone> yeah.. writes to 6c set ibf.
15:09 =dilinger> we have a mdelay(1) in our IBF poll loop
15:09 * smithbone wonders if I when I clear the flag if it really gets clear.
15:10 =smithbone> I can clear then check.
15:10 =dilinger> is it possible for the EC to clear IBF and then set it again
                 during that 1ms delay?
15:10 =dilinger> *nod*
15:10 =smithbone> Oh.. yeah...
15:10 =smithbone> If I stick the data back in the register.
15:10 =smithbone> That could totally happen.
15:11 =smithbone> And fastpath would make it much worse.
15:11 =dilinger> so, um...
15:11 =smithbone> Hmmm....
15:11 =smithbone> heh..
15:11 =dilinger> is there a way we can prevent that? :)
15:12 =dilinger> either by the kernel telling the EC that it's waiting for
                 IBF, and the EC not clearing IBF until the kernel has said
                 that it's no longer waiting..
15:12 =smithbone> Dunno.. Let me think.. ok time for my con call.. Excellent
                  brainstorming.. I think we may be on to it.
15:12 =dilinger> or by changing the kernel code to not delay 1ms (which still
                 seems racy, and also is going to hurt performance)
15:13 =smithbone> yeah.. let me ponder a bit.
15:13 =dilinger> or by having the EC detect that the kernel is busy polling
                 IBF, and not touch it until it's been at least 2ms since the
                 kernel read?
}}}
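To make the suspected race concrete, here is a small user-space simulation
(not the olpc-ec driver code). It assumes a hypothetical EC that clears IBF
some time after the host write and then sets it again shortly afterwards,
as in the "stick the data back in the register" case above. The names
ec_model, ibf_is_set and wait_ibf_clear, and all timing values, are made up
for illustration; only the sample-then-delay loop shape follows the log.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical EC model (an assumption, not measured behaviour):
 * the host write sets IBF at t = 0, the EC clears it at clear_us,
 * and the EC sets it again at reset_us. All times are microseconds.
 */
struct ec_model {
    uint32_t clear_us;
    uint32_t reset_us;
};

/* IBF is clear only inside the window [clear_us, reset_us). */
static bool ibf_is_set(const struct ec_model *ec, uint64_t now_us)
{
    return now_us < ec->clear_us || now_us >= ec->reset_us;
}

/*
 * Same shape as the poll loop under discussion: sample IBF, and if it
 * is still set, wait poll_us and sample again. poll_us must be > 0.
 * Returns 0 if a clear IBF was observed, -1 on timeout.
 */
static int wait_ibf_clear(const struct ec_model *ec, uint32_t poll_us,
                          uint32_t timeout_us)
{
    for (uint64_t now = 0; now <= timeout_us; now += poll_us) {
        if (!ibf_is_set(ec, now))
            return 0;
    }
    return -1;
}
```

With clear_us = 300 and reset_us = 500, a 1000 us poll interval samples at
0, 1000, 2000, ... and never lands inside the clear window, so it times out
even though the EC did read the byte. A 100 us interval sees the clear at
300 us. A shorter delay only narrows the window, though; it does not close
it, which matches the "still seems racy" remark above.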
--
Ticket URL: <http://dev.laptop.org/ticket/7458#comment:39>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system