#9694 NORM 1.5-fir: Openchrome lies about memory speed (was: Memory speed slower in OFW than Phoenix)

Zarro Boogs per Child bugtracker at laptop.org
Tue Nov 24 16:02:01 EST 2009


#9694: Openchrome lies about memory speed
-------------------------------------------+--------------------------------
           Reporter:  wmb at firmworks.com    |       Owner:  wmb at firmworks.com   
               Type:  defect               |      Status:  assigned            
           Priority:  normal               |   Milestone:  1.5-firmware        
          Component:  ofw - open firmware  |     Version:  Development firmware
         Resolution:                       |    Keywords:                      
        Next_action:  reproduce            |    Verified:  0                   
Deployment_affected:                       |   Blockedby:                      
           Blocking:                       |  
-------------------------------------------+--------------------------------

Comment(by wmb at firmworks.com):

 Very interesting, but complicated, data ...

 1) Using OFW loaded from a USB stick on top of Phoenix BIOS, I was able to
 reproduce memory timing numbers whose ratio is about the same as the ratio
 shown in Xorg.log.

 2) Instead of a memory-to-framebuffer copy, I used a simpler "fill
 framebuffer with constant data" loop (after verifying that "fill" gave
 performance ratios similar to those of "memcpy").

 3) The first component of the difference factor has to do with whether or
 not the display controller is fetching data to refresh the panel.  If that
 is happening, the memory fill is about 7% slower.  Turning off the display
 controller uses a different register depending on whether you are coming
 from text mode (a VGA console/VT as with Fedora 12) or from linear
 graphics mode (viafb as with os44).  So maybe the display controller is
 quiet when Xorg is doing the memory test coming from a text console, but
 busy when coming from viafb ... (and this might also be related to the vt
 switch optimization).  With the display controller turned off, the fill
 time decreased from 3.1 ms to 2.9 ms.

 4) The second, and larger, component of the difference factor is related
 to CPU thermal throttling.  The OFW high-precision timing measurement
 words use the timestamp counter.  But the timestamp counter counts at the
 CPU instruction clock rate, so when the CPU slows down due to thermal
 throttling, the timing clock slows down too.  The memory interface,
 however, doesn't slow down.  So when the CPU is throttled, the memory fill
 takes longer in real time because of the slower CPU, but the timestamp
 counter scales too, so you would expect the reported time to stay almost
 the same.  In fact the reported time is *less*, because the non-throttled
 memory is faster relative to the throttled CPU: each fill takes fewer CPU
 clocks.  So when the CPU throttles, the actual time increases from 2.9 ms
 to 6.2 ms, but the reported time decreases from 2.9 ms to 2.4 ms.

 5) Why didn't I see the throttling effect when running native OFW instead
 of OFW-over-Phoenix?  Well, it's because the native OFW, which is a newer
 build, has a power-saving optimization to C2-idle the CPU when sitting at
 the ok prompt.  This makes the CPU run a lot cooler, so throttling doesn't
 need to happen.  The older USB-stick OFW didn't have the idle
 optimization, so it was just sitting there running at full bore, heating
 up the die until it had to throttle.  I can force the native OFW to
 "throttle" by manually dialing the CPU speed down to 400 MHz, and when I
 do that I see exactly the same "fake" 2.4 ms time that I saw with OFW
 under Phoenix.

 6) Why didn't the heat spreader prevent overheating and throttling?
 Because the heat spreader wasn't attached.  Why not?  Because (a) I only
 have one heat spreader and it was on another board that is in a case; (b)
 you can't easily attach the spreader unless the machine is in a case
 (nothing to screw it down to); and (c) when you are switching back and
 forth from Phoenix BIOS, it is convenient to leave the SPI reprogramming
 header attached so you can get back to OFW, but that header interferes
 with the case.

 7) Richard's Phoenix test system is similarly lacking a heat spreader.

 8) Guess which memory speed timing method the openchrome driver uses?
 Hint:

 {{{
 static unsigned
 time_function(vidCopyFunc mf, unsigned char *buf1, unsigned char *buf2)
 {
    unsigned t, t2;

    t = fastrdtsc();

    (*mf) (buf1, buf2, BSIZA, BSIZW, BSIZH, 0);

    t2 = fastrdtsc();
    return ((t < t2) ? t2 - t : 0xFFFFFFFFU - (t - t2 - 1));
 }

 }}}
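
 The arithmetic behind point 4 can be sketched as follows.  This is an
 illustration only, not code from OFW or openchrome: the helper names
 tsc_ticks and reported_ms are hypothetical, the 1 GHz full-speed clock is
 an assumption (the numbers above do not state it), and so is treating the
 thermally throttled state as equivalent to the 400 MHz setting from point
 5.  A counter that ticks at the CPU clock accumulates fewer ticks per real
 millisecond when the CPU is slowed, so a caller that interprets the ticks
 at the full-speed rate under-reports the elapsed time.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks that a CPU-clock-rate timestamp counter accumulates over a real
 * interval of real_ms milliseconds while the CPU runs at tsc_hz. */
static uint64_t tsc_ticks(double real_ms, double tsc_hz)
{
    assert(real_ms >= 0.0 && tsc_hz > 0.0);
    return (uint64_t)(real_ms * 1e-3 * tsc_hz + 0.5);  /* round to nearest */
}

/* Time a caller would report if it converted ticks to milliseconds using
 * a fixed assumed_hz, regardless of the rate the CPU actually ran at. */
static double reported_ms(uint64_t ticks, double assumed_hz)
{
    assert(assumed_hz > 0.0);
    return (double)ticks / assumed_hz * 1e3;
}
```

 Under those assumptions, reported_ms(tsc_ticks(6.2, 400e6), 1000e6) comes
 out near 2.5 ms while 6.2 ms of real time has elapsed, which is in the
 same neighbourhood as the 2.4 ms figure from point 4.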

 Bottom line - the hypothesis isn't proven yet, but this is starting to
 look a lot like a smoking gun.
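
 One way to check the hypothesis would be to time the same kind of fill
 with a clock that does not follow the CPU frequency.  The following is a
 minimal sketch of that idea, not what openchrome or OFW does: the helper
 name time_fill_ns is hypothetical, memset on an ordinary buffer stands in
 for the framebuffer fill, and it assumes a POSIX system whose
 CLOCK_MONOTONIC is not itself derived from a frequency-scaled timestamp
 counter.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime */
#endif
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: fill buf with a constant byte and return the
 * elapsed time in nanoseconds as measured by CLOCK_MONOTONIC.
 * Returns -1 if the clock cannot be read. */
static long long time_fill_ns(unsigned char *buf, size_t len, int value)
{
    struct timespec t0, t1;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1;

    memset(buf, value, len);          /* stand-in for the framebuffer fill */

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1;

    /* Combine the second and nanosecond fields into one signed difference. */
    return (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (long long)(t1.tv_nsec - t0.tv_nsec);
}
```

 If the hypothesis holds, a measurement of this kind should grow when the
 CPU throttles, where the timestamp-counter figure shrinks.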

-- 
Ticket URL: <http://dev.laptop.org/ticket/9694#comment:4>
One Laptop Per Child <http://laptop.org/>
OLPC bug tracking system

