#9910 NORM Not Tri: EC failure to reboot with no-kbc-reboot

Zarro Boogs per Child bugtracker at laptop.org
Sat Dec 19 04:53:42 EST 2009


#9910: EC failure to reboot with no-kbc-reboot
------------------------------------+---------------------------------------
           Reporter:  pgf           |       Owner:  rsmith                            
               Type:  defect        |      Status:  new                               
           Priority:  normal        |   Milestone:  Not Triaged                       
          Component:  not assigned  |     Version:  Development source as of this date
         Resolution:                |    Keywords:  EC                                
        Next_action:  never set     |    Verified:  0                                 
Deployment_affected:                |   Blockedby:                                    
           Blocking:                |  
------------------------------------+---------------------------------------

Comment(by rsmith):

 There are 2 issues here:

 First the following.

 > {{{
 > ok delete-tag TS  add-tag TS RUNIN
 > }}}
 >
 > works for a while (between 1 and 30 times, in my experience), but
 > eventually i get:

 This is a usage error.  The above commands involve putting the EC into
 reset.  Prior to putting the EC into reset, OFW sends the EC a 0xD8 so it
 can prepare.  The above puts the EC into reset, brings it out of reset,
 and then immediately tries to put it back into reset.  The EC is a slow
 device and needs time to get things set back up.  There must be a delay
 between the 1st and 2nd command.  The processing time for add-tag must be
 just long enough that the EC is able to finish most of the time, but not
 always.

 You can duplicate the failure easily with the following, which is the
 high-speed version of the commands above.

 {{{
 ok kbc-off no-kbc-reboot kbc-on many
 }}}

 Fails on the 1st cycle.  The following:

 {{{
 ok kbc-off no-kbc-reboot kbc-on d# 50 ms many
 }}}

 runs for a whole bunch of cycles.  However, the above sequence exposes a
 much uglier problem.  If you let it run, it will eventually still crash,
 and crash hard.  Zero EC activity after that.  I've eventually worked out
 that the EC is either not starting back up or is crashing very early on.

 For the following log I added a while(1) { putchar('x'); } after the 0xD8
 command, so the EC just spins printing 'x' while it waits for OFW to put
 it into reset.  Here's the log leading up to a crash:

 {{{
 xxxxx�!!!REBOOT!!!
 20091219-
 Ver:1.9.19
 WDT:0
 Boot no pwr loss
 93885:PwrInit2
 66Cmd 0xD8
 xxxxxxx!!!REBOOT!!!
 20091219-
 Ver:1.9.19
 WDT:0
 Boot no pwr loss
 93889:PwrInit2
 66Cmd 0xD8
 xxxxxx
 }}}

 The key is what's missing on the last line.  There should be a
 !!!REBOOT!!! when the EC comes out of reset, but it never happens.  This
 means the EC died very early on.  If you use OFW via the serial port you
 can manually reset the EC with the proper indexed IO commands and restore
 its operation, but other than that you have to power cycle.
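
 A hang like this can be spotted mechanically in a captured serial log.
 The following is only a minimal sketch in C, assuming the log has been
 saved into a string.  The function name ec_count_reset_cycles is
 hypothetical and is not part of the EC firmware or OFW; the two marker
 strings are copied from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Markers copied from the serial log in this ticket. */
#define EC_RESET_CMD_MARKER "Cmd 0xD8"
#define EC_BOOT_BANNER      "!!!REBOOT!!!"

/*
 * Hypothetical helper: count completed reset cycles in a captured log.
 * A cycle is a "Cmd 0xD8" line that is followed, somewhere later in the
 * log, by a "!!!REBOOT!!!" banner.  *hung is set to 1 if a "Cmd 0xD8" is
 * found with no banner after it (the EC never came back), otherwise 0.
 */
int ec_count_reset_cycles(const char *log, int *hung)
{
    int cycles = 0;
    const char *p = log;

    assert(log != NULL && hung != NULL);
    *hung = 0;

    while ((p = strstr(p, EC_RESET_CMD_MARKER)) != NULL) {
        /* Look for the boot banner only after this 0xD8 command. */
        p += strlen(EC_RESET_CMD_MARKER);
        const char *banner = strstr(p, EC_BOOT_BANNER);
        if (banner == NULL) {
            *hung = 1;          /* 0xD8 seen, but the EC never rebooted */
            break;
        }
        cycles++;
        p = banner + strlen(EC_BOOT_BANNER);
    }
    return cycles;
}
```

 Run against the log above, this would report one completed cycle and
 then flag the final 0xD8 as hung, since no banner follows it.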

 I played with various things in the EC, such as disabling interrupts after
 the 0xD8 command, but so far the only thing that seems to make any
 difference is adding a delay after the kbc-off command.  The following
 makes crashes much less frequent, but they still occur.

 {{{
 ok kbc-off d# 50 ms no-kbc-reboot kbc-on d# 50 ms many
 }}}

 Further investigation on this will have to wait until I'm back from Xmas
 and I can put some IO strobes in the early startup code to see if the EC
 actually runs any instructions after reset is disabled.

-- 
Ticket URL: <http://dev.laptop.org/ticket/9910#comment:1>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system

