#9910 NORM Not Tri: EC failure to reboot with no-kbc-reboot
Zarro Boogs per Child
bugtracker at laptop.org
Sat Dec 19 04:53:42 EST 2009
#9910: EC failure to reboot with no-kbc-reboot
------------------------------------+---------------------------------------
Reporter: pgf | Owner: rsmith
Type: defect | Status: new
Priority: normal | Milestone: Not Triaged
Component: not assigned | Version: Development source as of this date
Resolution: | Keywords: EC
Next_action: never set | Verified: 0
Deployment_affected: | Blockedby:
Blocking: |
------------------------------------+---------------------------------------
Comment(by rsmith):
There are two issues here.
First, the following:
> {{{
> ok delete-tag TS add-tag TS RUNIN
> }}}
>
> works for a while (between 1 and 30 times, in my experience), but
> eventually i get:
This is a usage error. The above commands involve putting the EC into
reset. Prior to putting the EC into reset, OFW sends the EC a 0xd8 so it
can prepare. The above puts the EC into reset, brings it out of reset,
and then immediately tries to put it back into reset. The EC is a slow
device and needs time to get things set back up, so there must be a delay
between the 1st and 2nd command. The processing time for add-tag must be
just enough that the EC is able to finish most of the time, but not always.
You can duplicate the failure easily with the following, which is the
high-speed version of the commands above.
{{{
ok kbc-off no-kbc-reboot kbc-on many
}}}
Fails on the 1st cycle. The following:
{{{
ok kbc-off no-kbc-reboot kbc-on d# 50 ms many
}}}
runs for a whole bunch of cycles. However, the above sequence exposes a
much uglier problem. If you let it run, it will eventually still
crash, and crash hard: zero EC activity after that. I've eventually
worked out that the EC is either not starting back up or is crashing very
early on.
In the following log I added a while(1) { putchar('x'); } after the 0xD8
command, so the EC just spins waiting for OFW to put it into reset.
Here's the log leading up to a crash:
{{{
xxxxx�!!!REBOOT!!!
20091219-
Ver:1.9.19
WDT:0
Boot no pwr loss
93885:PwrInit2
66Cmd 0xD8
xxxxxxx!!!REBOOT!!!
20091219-
Ver:1.9.19
WDT:0
Boot no pwr loss
93889:PwrInit2
66Cmd 0xD8
xxxxxx
}}}
The key is what's missing on the last line. There should be a !!!REBOOT!!!
when the EC comes out of reset, but it never happens. This means the EC
died very early on. If you use OFW via the serial port, you can manually
reset the EC with the proper indexed IO commands and restore its operation,
but other than that you have to power cycle.
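For anyone who wants to catch this symptom automatically in a captured EC
serial log, here is a rough sketch of a host-side check. It is not part of
the EC or OFW sources. The function name is hypothetical, and it assumes the
capture contains the literal strings "Cmd 0xD8" and "!!!REBOOT!!!" exactly as
they appear in the log above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the last "Cmd 0xD8" in the
 * captured log is NOT followed by a "!!!REBOOT!!!" banner, i.e. the EC
 * was told to prepare for reset but never announced that it came back.
 */
bool ec_hung_after_reset(const char *log)
{
    static const char cmd[] = "Cmd 0xD8";
    static const char banner[] = "!!!REBOOT!!!";
    const char *last_cmd = NULL;
    const char *p = log;

    /* Find the final occurrence of the reset-prepare command. */
    while ((p = strstr(p, cmd)) != NULL) {
        last_cmd = p;
        p += sizeof(cmd) - 1;
    }

    if (last_cmd == NULL)
        return false;   /* no reset was requested, nothing to check */

    /* A healthy cycle prints the reboot banner after the command. */
    return strstr(last_cmd, banner) == NULL;
}
```

Fed the tail of the log above (ending in "66Cmd 0xD8" and a run of x
characters), this reports a hang. Fed a capture that ends just after a
reboot banner, it does not. It only detects the missing banner and says
nothing about why the EC failed to start.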
I played with various things in the EC such as disabling interrupts after
the 0xD8 command but so far the only thing that seems to make any
difference is to delay after the kbc-off command. The following makes
crashes much less frequent but they still occur.
{{{
ok kbc-off d# 50 ms no-kbc-reboot kbc-on d# 50 ms many
}}}
Further investigation on this will have to wait until I'm back from Xmas
and I can put some IO strobes in the early startup code to see if the EC
actually runs any instructions after reset is disabled.
--
Ticket URL: <http://dev.laptop.org/ticket/9910#comment:1>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system