Compression of HTML files
Ian Bicking
ianb at colorstudy.com
Wed Jul 25 15:48:39 EDT 2007
Samuel Klein wrote:
>> Arael8 (Petr?) posted a zip file of HTML files,
>> http://dictionary.110mb.com/files/short-wiki.zip
>>
>> The files are pretty small, almost all from 500 bytes to 1.5K.
>>
>> Here's the compressed sizes:
>>
>> 4.6M raw
>> 1.3M short-wiki.zip
>> 704K short-wiki.tar.gz
>> 496K short-wiki.tar.bz2
>> 3.8M gzipped
>> 3.8M gzipped-1
>> 3.8M bzipped
>
> are the gzipped and bzipped files really the same size?
Actually, don't trust these sizes at all. I realized du is being smart
and showing the actual disk space used on my ext3 filesystem, which is
not what I want. It takes into account, I believe, that I have a block
size of 4K (or maybe 2K), which means any file smaller than 4K takes up
4K anyway. As a result, compressing the files individually can't do
much good by that measure.
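A rough sketch of that rounding, assuming a 4K block size (the guess above; the real value depends on how the filesystem was created). The name allocated_size is made up for illustration, and the 600-byte figure is a hypothetical compressed size, not a measurement:

```python
def allocated_size(size_bytes, block_size=4096):
    """Round a file size up to a whole number of data blocks.

    Assumption: only data blocks are counted; inode and other
    metadata overhead is ignored.
    """
    if size_bytes == 0:
        return 0
    blocks = -(-size_bytes // block_size)  # ceiling division
    return blocks * block_size


# A 1.5K HTML file and a hypothetical 600-byte gzipped copy of it
# both end up occupying one full block.
for size in (1536, 600, 4097):
    print(size, "->", allocated_size(size))
```

Under this model, shrinking a file only helps when it crosses a block boundary, which almost never happens for files that start out under 4K.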
I think du --apparent-size is going to be more accurate (sizes in K):
2729 raw
1265 short-wiki.zip
490 short-wiki.tar.bz2
697 short-wiki.tar.gz
1214 gzipped
1273 gzipped-1
1277 bzipped
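For anyone who wants to check the same two measures without du, here is a minimal sketch in Python. It assumes a Unix system, where os.stat reports st_blocks in 512-byte units; the function name tree_sizes is hypothetical. It counts only file entries, not the directories themselves, so the totals will not match du exactly.

```python
import os


def tree_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for files under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            # Apparent size: the byte length of the file's contents.
            apparent += st.st_size
            # Allocated size: st_blocks is in 512-byte units (Unix only).
            allocated += st.st_blocks * 512
    return apparent, allocated
```

Calling tree_sizes("raw") and tree_sizes("gzipped") on the directories listed above would give the apparent and on-disk totals side by side. Hard-linked files are counted once per link here, which is a simplification.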
This still doesn't take into account the overhead of JFFS2 (which I'm
guessing is lower than ext3's, but still exists), but I'm emailing the
dev list to ask about that.
--
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org