[Http-crcsync] crccache ready for some testing I think

Alex Wulms alex.wulms at scarlet.be
Wed Apr 1 19:00:38 EDT 2009


Hi Rusty,

I have come across another subtle bug in the crccache module that Toby and I 
are writing, and I have traced it to my incomplete understanding of how to use 
your library. Attached is another version of my test program, which I have 
used to improve my understanding of the library. 

Here is the output of the program when it is run:
./test
CRCs data1: 33dd97b6 0122a0a2 301f8cc8 
CRCs data2: 33dd97b6 0122a0a2 3c28960e 
remaining: 14, offset: 0
ndigested: 5, result: -1, searched-in: <ABCDEabcde123g>
remaining: 9, offset: 5
ndigested: 5, result: -2, searched-in: <abcde123g>
remaining: 4, offset: 10
ndigested: 4, result: 0, searched-in: <123g>
flush result: -3
flush result: 1
flush result: 0

Basically, I work with a block size of 5. My 'original' string is a 
13-character string (ABCDEabcde123) and my 'modified' string is a 14-character 
string (the original + 'g').

What I notice is that the crc_read_block function first reports the first two 
5-byte blocks as matches (results -1 and -2), which is as expected.

However, the last invocation of crc_read_block returns result=0 and 
ndigested=4, so the library keeps those 4 bytes in its internal buffer.
When I then invoke the flush function, I first get a reference to block 3, 
which is only a 3-byte block rather than a 5-byte block, and then the next 
result tells me that 1 unmatched character remains (the 'g').

So I guess that, to use your API correctly, I should specify the length of 
the tail block (of the cached data) in the request, so that I can calculate 
the position of the 'g' character in the above case. Is that correct?

I have also done another test, in which the second string is 5 characters 
longer than the 'original' string (that is, the original + 'ggggg'). 
In that case I get the following output:
./test
CRCs data1: 33dd97b6 0122a0a2 301f8cc8 
CRCs data2: 33dd97b6 0122a0a2 19886038 1d2c046e 
remaining: 18, offset: 0
ndigested: 5, result: -1, searched-in: <ABCDEabcde123ggggg>
remaining: 13, offset: 5
ndigested: 5, result: -2, searched-in: <abcde123ggggg>
remaining: 8, offset: 10
ndigested: 8, result: 3, searched-in: <123ggggg>
flush result: 5
flush result: 0

So in that case, crclib does not detect that '123' matches the tail block of 
the original string. It returns it as a literal (mismatch) block, and the 
remaining 5 characters are kept in the internal state and reported to me only 
when I perform the flush. Is this also the intended behaviour?

Please note that I regularly run into these tail cases, because of the moving 
window that I use to collect chunks of data from the origin server and feed 
them to your library, so it is important that I fully understand how to use 
it properly.

Many thanks if you can shed some further light on this so that I can properly 
fix the crccache code.

Thanks and kind regards,
Alex

On Tuesday 31 March 2009, Alex Wulms wrote:
> I have fixed a few more bugs. Everything is checked-in to the repository.
> Please feel free to test it further.
>
> On Tuesday 31 March 2009, Martin Langhoff wrote:
> > On Tue, Mar 31, 2009 at 1:05 AM, Alex Wulms <alex.wulms at scarlet.be> wrote:
> > > With the new code, the compressed size is 5%.
> >
> > niiice.
> >
> > > PS: slashdot sets 'cache private' headers, so normally the cache module
> > > does not want to cache them. I have overridden that via the
> > > configuration parameters
> >
> > For our use case, my thinking was that we would...
> >
> >  - use the standard mod_cache to cache "cacheable" content as per HTTP
> > headers
> >
> >  - use a separate storage to cache "uncacheable" content that is a
> > good candidate for crcsync...
> >
> > is that what you are thinking of as well?
> >
> > cheers,
> >
> >
> >
> > m


-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-csrc
Size: 1706 bytes
Desc: not available
Url : http://lists.laptop.org/pipermail/http-crcsync/attachments/20090402/34ce4bbe/attachment.c 

