jffs zlib tuning
NoiseEHC
NoiseEHC at freemail.hu
Mon Jan 7 13:49:47 EST 2008
> However, I recommend hacking in libz first, making it work with
> gzip, and start porting it to the kernel as the next step. Debugging
> and benchmarking in userspace is *so* much easier.
>
Ah, I tried that. Unfortunately, the zlib in the kernel is a heavily
modified zlib, and I was not able to compile it in user space. So I
have learned to write kernel modules instead (it took just 2 days; it
is so simple compared to Windows drivers that I still cannot believe
it). And there is a comment in the kernel zlib saying that user-space
support was removed...
> Also, I'd be very surprised if such an obvious optimization
> hadn't been tried already in 20+ years of gzip. Try digging
> around: you may find that it's not worth it.
>
The optimization I have in mind is absolutely Geode-specific. First,
the code needs some prefetching, and second, it has a lot of branches.
The Geode appears to have a very simple 1-bit branch predictor (it is
not documented, but it behaves that way), so it can waste 20-40 cycles
on every run (every length/distance code). I know it is hard to beat a
C compiler nowadays, so I am sure that simply rewriting the code in asm
would not speed things up (more than 5 years ago LZO had several asm
implementations for the 486/586/686, but ironically all of them were
slower than the compiler-generated code).
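To illustrate the flavor of tweak I mean, here is a hypothetical
helper (not the actual inffast.c code) using GCC's __builtin_prefetch
and __builtin_expect to hide the input fetch latency and keep a simple
1-bit predictor on the common path of the match copy:

/*
 * Hypothetical sketch only -- not the real inflate_fast() loop.
 * Copies one LZ77 match while prefetching upcoming compressed input
 * and hinting the common (non-overlapping) case for the predictor.
 */
#include <string.h>

static void lz_copy_match(unsigned char *out, unsigned dist,
			  unsigned len, const unsigned char *next_in)
{
	const unsigned char *from = out - dist;

	/* Ask for the cache line holding the next compressed bytes so
	 * the following bit-buffer refill does not stall. */
	__builtin_prefetch(next_in + 64, 0, 0);

	if (__builtin_expect(dist >= len, 1)) {
		/* Common case: source and destination do not overlap. */
		memcpy(out, from, len);
	} else {
		/* Overlapping run: must copy byte by byte. */
		do {
			*out++ = *from++;
		} while (--len);
	}
}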
Now that you have mentioned that jffs2 uses only 4K blocks, it is
possible that the bottleneck is not in inffast.c at all. Do you have
ANY perf/profile data, please?
All I would like to know is whether the bottleneck lies in inffast or
not:
/*
When large enough input and output buffers are supplied to inflate(), for
example, a 16K input buffer and a 64K output buffer, more than 95% of the
inflate execution time is spent in this routine.
*/
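For reference, a quick user-space sketch (using stock zlib, not the
modified kernel copy; the data and loop count are placeholders) that
repeatedly inflates a single 4K block -- the jffs2 case -- so a
profiler such as oprofile or gprof can show how much time really lands
in inflate_fast():

/* Build with: gcc -O2 -pg bench.c -lz */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define CHUNK 4096

int main(void)
{
	static unsigned char plain[CHUNK], packed[2 * CHUNK], out[CHUNK];
	uLongf packed_len = sizeof(packed);
	long i;

	/* Placeholder data; real runs should use jffs2-like content. */
	memset(plain, 'x', sizeof(plain));
	compress(packed, &packed_len, plain, sizeof(plain));

	for (i = 0; i < 100000; i++) {	/* enough work to profile */
		z_stream s;
		memset(&s, 0, sizeof(s));
		inflateInit(&s);
		s.next_in = packed;
		s.avail_in = packed_len;
		s.next_out = out;
		s.avail_out = sizeof(out);
		inflate(&s, Z_FINISH);
		inflateEnd(&s);
	}
	printf("%lu compressed bytes per 4K block\n",
	       (unsigned long)packed_len);
	return 0;
}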