5 sec boot

Sat Oct 4 21:49:24 EDT 2008

Deepak Saxena wrote:
> On Oct 03 2008, at 07:34, Mitch Bradley was caught saying:
>   
>>> Could somebody explain to me whether [the 5 second boot] results are
>>> applicable to the XO, and how far we are from it, please?
>>>
>> Ticket http://dev.laptop.org/ticket/4349 details my and codyl's 
>> experiments with speeding up boot.
>>     
>
> Some other ideas:
>
> - Right now the firmware copies the compressed kernel and initrd 
>   to memory and then the kernel and initrd decompressor has to re-read
>   it from memory and write it back out in decompressed format. 

Actually, OFW decompresses them both, so by the time control is 
transferred to the kernel, both kernel and initrd are already "in the 
clear".  But the principle is the same.  Somebody has to do the 
decompression.  OFW's decompression code is essentially the same as the 
kernel's, derived from the Mark Adler code.

> If 
>   it is stored decompressed to begin with on the filesystem, we 
>   can simply read it into mem from flash and run. This would require
>   a few extra MiB of flash. Granted, reading from flash is 
>   slower than reading from memory, so we may not see a net
>   benefit.  Easy enough to test...
>   

Some rules of thumb for OLPC:

a) Memory copy goes at about 500 MB/sec (so it can be neglected in this 
case).
b) zlib decompression goes at about 3 MB/sec for typical code (2:1 
compression ratio).  Over the complete filesystem, JFFS2's automatic 
zlib compression gets you only about 3:2 space savings, because of 
various overhead factors.
c) Raw FLASH read time maxes out at 20 MB/sec.  But you don't get that 
speed from the filesystem; JFFS2 is good for between 5 and 10 MB/sec.

Considering all the intricacies of JFFS2, my best guess is that it's 
going to be close to a wash whether the kernel + initrd is stored in 
compressed or uncompressed form.

OTOH, if the kernel + initrd were in a separate partition in e.g. romfs 
format, where OFW could just blast them into memory without doing JFFS2 
node processing, we could probably get close to the 20 MB/sec speed.
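
As a rough illustration of what those read rates mean in wall-clock terms, here is a minimal sketch that turns the figures above into load times. The 10 MB image size is a hypothetical placeholder, not a number from this thread, and the model only counts transfer time; it ignores decompression, JFFS2 node processing, and hashing.

```python
# Read rates quoted in the message above, in MB/sec.
JFFS2_READ_SLOW = 5.0
JFFS2_READ_FAST = 10.0
RAW_FLASH_READ = 20.0

# Hypothetical combined kernel + initrd size in MB (an assumption for
# illustration only; the thread does not give a figure).
IMAGE_MB = 10.0


def read_seconds(size_mb, rate_mb_per_sec):
    """Transfer time at a constant rate; ignores seek and setup costs."""
    if rate_mb_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_sec


estimates = {
    "JFFS2, slow end (5 MB/sec)": read_seconds(IMAGE_MB, JFFS2_READ_SLOW),
    "JFFS2, fast end (10 MB/sec)": read_seconds(IMAGE_MB, JFFS2_READ_FAST),
    "raw flash, e.g. romfs (20 MB/sec)": read_seconds(IMAGE_MB, RAW_FLASH_READ),
}

for label, seconds in estimates.items():
    print(f"{label}: {seconds:.2f} s")
```

Under these assumptions a separate raw partition would save somewhere between half a second and a second and a half per 10 MB read, which is significant against a 5 second target.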

Another holdup is the time it takes to do the hash calculation for 
security checking.  That costs a couple of seconds.  Ivan and I worked 
pretty hard to minimize that time, choosing one of the faster hash 
functions.
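
For anyone who wants to get a feel for how much hash choice matters, the sketch below times a few hash functions from Python's standard `hashlib` on a fixed buffer. It is only an illustration: the message does not say which hash was chosen, the candidate list here is an assumption, and throughput on a development machine says little about absolute speed in OFW on the XO.

```python
import hashlib
import time

# Candidate algorithms (an assumption for illustration; the message does
# not name the hash that is actually used for the security check).
CANDIDATES = ("sha1", "sha256", "sha512")


def hash_throughput_mb_per_sec(name, payload, repeats=3):
    """Best-of-N throughput of one hash algorithm over payload, in MB/sec."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, payload).digest()
        best = min(best, time.perf_counter() - start)
    return len(payload) / (1024 * 1024) / best


# 4 MiB of zero bytes stands in for a kernel + initrd image.
payload = bytes(4 * 1024 * 1024)

results = {name: hash_throughput_mb_per_sec(name, payload) for name in CANDIDATES}

# Print fastest first.
for name, rate in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.0f} MB/sec")
```

Dividing the image size by the measured rate gives the time the check would add to boot on the machine where the measurement was taken.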

> - Embedded systems often use a suspend image to speed up boot time. 
>   Basically load an image into memory and then jump into the
>   kernel as if we are resuming from firmware. Another approach,
>   if we can't do a full suspend image, is to use the new 
>   container code and save the runtime of the user session so we
>   can just reload it. Both these methods require flash space...
>   

It would be nice if we could identify a good common userland starting 
point and snapshot that.  My gut feel is that a 20 MB image might be 
sufficient.

> ~Deepak
>
>