Weekend 2008/8/22

Richard A. Smith richard at laptop.org
Fri Aug 22 18:26:17 EDT 2008


* q2e13 EC code had timing issues and broke the mouse on builds <=711
(690 included).  Richard removed it from the joyride builds and marked it
as bad on the wiki.
* Richard fixed the EC timing issues and released EC code pq2e14.  It is
not in a firmware yet but will be soon.
* The first full Multi-Battery Prototype was sent to firmware developer
Lilian Walter.  3 more will head to 1cc next week.


EC Code:

Richard spent many hours in the lab understanding why what was supposed
to be a small timing change in the EC main loop turned out to break the
mouse on any build that was running the older driver.  He fixed all the
issues and released new EC code for use in the upcoming q2e14
firmware release.  Although q2e14 EC code timing is now in much better
shape, there are a lot of timing changes to the main loop.  It now runs
quite a bit faster than before and many other parts have been
streamlined.  Richard is cautious and would like to see a lot of testing
on these changes.  See below for the gory details.

Multi Battery charger:

Lilian received the first of our prototypes.  Unfortunately, there does
not appear to be enough packing in our shipping packaging.  Shipping a
full unit with the power supply installed seems to result in a power
supply that manages to rip itself off of its supports, destroying
the supply in the process.
Lilian has a new supply and we are beefing up the shipping material as
well as having Flextronics add some additional supports to the
mounting of the power supply onto its brackets.  3 units will be shipped
to 1cc next week when some larger boxes are found to deal with the
additional packing.


Gory details:

q2e13 mouse failure analysis or why "Bugs are Good".

Jim is fond of saying that "Bugs are Good" and the q2e13 mouse
failure on older builds is a wonderful example.

One of the speedups Richard made to the EC command code in q2e13 was to
take advantage of the fact that many EC commands send or receive an
argument and that EC commands are typically run in spurts.  The host can
feed data at a much faster rate than the EC can receive it.  So the
code was tweaked such that the command loop was run 4 times every cycle.
In the idle case the code path is only a single if() and switch().  The
additional overhead in the idle case _should_ have been minimal.  When
it broke stuff Richard set out to understand why.
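
A minimal sketch of the "run the command loop 4 times per cycle" idea
follows.  It is not the actual EC source: the host byte queue, the
function names and the command bytes are all hypothetical stand-ins,
and the number of passes is a parameter only so the effect is easy to
see.

```c
#include <stddef.h>
#include <stdint.h>

#define CMD_PASSES_PER_CYCLE 4   /* from the text: 4 passes per cycle */

/* Hypothetical queue standing in for the real host interface. */
static const uint8_t *host_bytes;
static size_t host_len, host_pos;
static unsigned bytes_consumed;

/* Pretend the host has just written a burst of bytes. */
static void host_feed(const uint8_t *buf, size_t len)
{
    host_bytes = buf;
    host_len = len;
    host_pos = 0;
    bytes_consumed = 0;
}

/* One pass of the command handler.  When nothing is pending the
 * cost is a single if() test. */
static void ec_command_poll(void)
{
    if (host_pos < host_len) {
        switch (host_bytes[host_pos++]) {
        default:   /* real command/argument handling would go here */
            bytes_consumed++;
            break;
        }
    }
}

/* One main loop cycle.  Returns how many host bytes have been
 * consumed so far, so a caller can compare 1 pass against 4. */
static unsigned ec_main_loop_cycle(unsigned passes)
{
    unsigned i;

    for (i = 0; i < passes; i++)
        ec_command_poll();
    /* ... the rest of the main loop work would run here ... */
    return bytes_consumed;
}
```

With a command byte followed by one argument byte queued up, a single
pass per cycle consumes only the command, while 4 passes consume both
bytes in the same cycle.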

EC code in q2e12 or earlier has the following loop time behavior:

2-3 ms average loop time
Worst loop 9ms
~ 24 ms to process the 6 byte mouse packet.  The update rate of the
touch pad is 12 ms which means that on average the touch pad does not
get to report about 1/2 the point data it could.

In q2e13 this changed to:

A very solid 4 ms per loop
Worst loop 9ms
30-32 ms to process the mouse packet.

The extra 1-2 ms of loop processing time added 6 to 8 ms of lag to
the touchpad processing time, and the net result is that only about 1/3
of the amount of data that the touchpad could have reported actually
made it up to the kernel.
This triggered one of the checks in the kernel for funky mouse packets
and the kernel driver threw away most of the mouse data.  :(

The burning question was why something that should have taken just a
few hundred microseconds was taking so long.  There were a couple of
answers.  One big answer was the compiler optimizer itself, or
the lack thereof.  Many areas of the EC code use a bank of function
pointers.  Function pointers break the Keil compiler's ability to do
overlay analysis.  Overlay analysis is level 2, just above dead code
elimination, which is level 1.  So the optimizer was capped at level 1.
If you have ever seen the raw output from a C compiler you know how ugly
it can be.  By using per-file #pragmas the optimizer was turned up to
level 9 (emphasis speed) for all the files that don't have function
pointers.  The resulting code was both much faster and a few kbytes
smaller.
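
For illustration, a per-file setting along these lines might look like
the sketch below.  It assumes the Keil C51 OPTIMIZE directive and the
__C51__ predefined macro as documented by Keil; the guard is only there
so the file still builds with other compilers.  The helper function is
hypothetical and just stands in for code that makes no calls through
function pointers.

```c
/* Per-file optimizer setting: level 9, favoring speed.
 * Guarded so that non-Keil compilers simply skip the directive. */
#ifdef __C51__
#pragma OPTIMIZE (9, SPEED)
#endif

/* Hypothetical helper.  This file contains no function pointers,
 * so nothing here gets in the way of the compiler's overlay
 * analysis. */
unsigned char sum_bytes(const unsigned char *buf, unsigned char len)
{
    unsigned char sum = 0;

    while (len--)
        sum += *buf++;
    return sum;
}
```

Files that hold the function pointer tables would not get this
directive and would stay at the lower level.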

The next win was simply restructuring the code a bit.  By re-organizing
the way things were called in the main loop a lot of function call
overhead was eliminated.

The 3rd big item was the IO read/write calls.  The EC has a Bit IO table
which allows the actual GPIO pins to be remapped without the calling
functions having to change.  This routine took 50x the time of a
simple IO port read.  The IO polling in the main loop was modified to
use a much simpler (but still re-mappable) scheme which was much faster.
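
The post does not show either scheme, so the following is only a guess
at their general shape, written to run on a host with a simulated port
array.  The signal names, port numbers and bit positions are made up.
The first reader goes through a run-time table; the second fixes the
mapping at compile time, so a remap means editing two macros rather
than the callers.

```c
#include <stdint.h>

/* Simulated GPIO ports; on a real EC these would be hardware
 * registers. */
static volatile uint8_t gpio_port[4];

/* Table-driven scheme: each logical signal maps to a port and bit. */
struct bit_io {
    uint8_t port;
    uint8_t bit;
};

enum { SIG_LID, SIG_POWER_BUTTON, SIG_COUNT };  /* hypothetical */

static const struct bit_io bit_io_table[SIG_COUNT] = {
    { 1, 3 },   /* SIG_LID */
    { 2, 0 },   /* SIG_POWER_BUTTON */
};

/* Generic read: table lookup, indexed port read, variable shift. */
static uint8_t bit_io_read(uint8_t sig)
{
    const struct bit_io *e = &bit_io_table[sig];

    return (uint8_t)((gpio_port[e->port] >> e->bit) & 1u);
}

/* Simpler scheme: the mapping is known at compile time, so the
 * read reduces to one port access with a constant shift and mask. */
#define LID_PORT 1
#define LID_BIT  3
#define READ_LID() ((uint8_t)((gpio_port[LID_PORT] >> LID_BIT) & 1u))
```

Both forms return the same value for the same pin; the difference is
how much work the compiler has to emit for each read.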

And then finally, what on earth was taking 9 ms to complete?  The battery
gas gauge code had 2 functions, both of which take a really long time to
run.  They were executed back to back.  There was not any reason
they could not have been broken up.  So they were, and now the main loop
alternates between the 2 calls.  The functions themselves can also be
re-written to be much faster, but that's a much more invasive change and
since they only run every 150 ms or so it was not a priority.
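
Splitting the two slow calls across loop passes can be sketched as
below.  The function names are hypothetical and the two gas gauge steps
are reduced to counters; the point is only that each service call runs
one slow step instead of both.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two slow gas gauge functions. */
static unsigned gauge_a_runs, gauge_b_runs;

static void gauge_step_a(void) { gauge_a_runs++; }
static void gauge_step_b(void) { gauge_b_runs++; }

/* Which step runs on the next service call. */
static bool run_b_next;

/* Called from the main loop when gas gauge work is due.  Runs one
 * slow step per call, so no single loop pass pays for both. */
static void gauge_service(void)
{
    if (run_b_next)
        gauge_step_b();
    else
        gauge_step_a();
    run_b_next = !run_b_next;
}
```

The worst-case loop then carries the cost of the slower of the two
steps rather than their sum.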

The end result:

200-400 us average loop time
Worst loop 5 ms
11.5 - 12.1 ms to process the mouse packet

So q2e14 (when released) will _almost_ keep up with the touchpad packet 
stream from advanced mode.
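
The "keeps up" comparison across the three EC versions is simple
arithmetic, sketched here under a deliberately crude assumption: the EC
handles one mouse packet at a time and any touchpad report that arrives
while it is busy is lost.  The function name is made up for the
example.

```c
/* Fraction of touchpad reports that get through when the pad
 * produces one report every period_ms and the EC needs packet_ms
 * to process each packet.  Capped at 1.0 (everything gets through). */
static double report_fraction(double period_ms, double packet_ms)
{
    double f = period_ms / packet_ms;

    return f > 1.0 ? 1.0 : f;
}
```

With the 12 ms touchpad update rate, 24 ms per packet gives 0.5
(q2e12), 31 ms gives roughly 0.39 (q2e13, the "about 1/3" above), and
12.1 ms gives roughly 0.99 (the new code at its slowest).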

I am, however, a bit concerned about the optimizer running on code where
previously it was disabled.  There may still be a few timing related
bugs.  But I'll take fixing timing bugs caused by code that runs too
fast any day.

I'll be out of contact most of the weekend.

-- 
Richard Smith  <richard at laptop.org>
One Laptop Per Child


