Opportunity for speedup

Wade Brainerd wadetb at gmail.com
Thu Feb 19 13:58:33 EST 2009


Oh, and you can feed one of the 565 files through my 'rle.c' program to see
the compression ratio firsthand.

On Thu, Feb 19, 2009 at 1:56 PM, Wade Brainerd <wadetb at gmail.com> wrote:

> RLE (run length encoding) compresses sequences of identical pixels ("runs")
> as value/count pairs.
> So abbbbbbbbbbccc would be stored as 1a 10b 3c.
>
> The decompressor looks like:
>
> while (cur < end)
> {
>    unsigned short count = *cur++;
>    unsigned short value = *cur++;
>    while (count--)
>       *dest++ = value;
> }
>
> This can be faster than memcpy because you are reading significantly less
> memory than you would with memcpy, thus fewer cache misses are incurred.
>
> Because the startup images are mostly spans solid colors, this kind of
> compression works very well.  If that were not the case, say if there were a
> left-to-right gradient in the background, RLE would probably make things
> worse, thus you have to be careful when choosing it.
>
> But the smaller size on disk and in memory would probably improve
> performance in other ways as well.
>
> Best,
> Wade
>
>
> On Thu, Feb 19, 2009 at 1:49 PM, Bobby Powers <bobbypowers at gmail.com>wrote:
>
>> 2009/2/19 Wade Brainerd <wadetb at gmail.com>:
>> > On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian <cscott at laptop.org>
>> wrote:
>> >>
>> >> I'd suggest just uncompressing the various image files and re-timing
>> >> as a start.  The initial implementation was uncompressed, but people
>> >> complained about space usage on the emulator images (which are
>> >> uncompressed).  The current code supports both uncompressed and
>> >> compressed image formats.  For uncompressed images, putting the bits
>> >> on the screen is an mmap and memcpy, so I can't imagine any
>> >> implementation being faster than that (it's possible, of course, that
>> >> what's stealing CPU is the shell's invocation of the client program;
>> >> recoding just that little part in C should be trivial, since it does
>> >> nothing but write to a socket IIRC.)
>> >
>> > I implemented a RLE compressor specifically for these 16bit image files
>> the
>> > last time this question came up.  This can certainly be faster than
>> memcpy
>> > since we are talking memory performance.
>>
>> Can you explain this?  I don't think I have enough knowledge to
>> evaluate your claim.
>>
>> bobby
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.laptop.org/pipermail/devel/attachments/20090219/e5c90923/attachment.html>


More information about the Devel mailing list