[Http-crcsync] crccache ready for some testing I think
Alex Wulms
alex.wulms at scarlet.be
Wed Apr 1 19:00:38 EDT 2009
Hi Rusty,
I have come across another subtle bug in the crccache module that Toby and
I are writing, which I have meanwhile traced to my incomplete understanding
of how to use your library. Attached is another version of my test program,
which I have used to improve my understanding of the library.
Here is the output of the program:
./test
CRCs data1: 33dd97b6 0122a0a2 301f8cc8
CRCs data2: 33dd97b6 0122a0a2 3c28960e
remaining: 14, offset: 0
ndigested: 5, result: -1, searched-in: <ABCDEabcde123g>
remaining: 9, offset: 5
ndigested: 5, result: -2, searched-in: <abcde123g>
remaining: 4, offset: 10
ndigested: 4, result: 0, searched-in: <123g>
flush result: -3
flush result: 1
flush result: 0
Basically, I work with a block size of 5. My 'original' string is a
13-character string (ABCDEabcde123) and my 'modified' string is a
14-character string (the original + 'g').
What I notice is that the crc_read_block function first returns the checksum
matches for the first two 5-byte blocks, which is as expected.
However, the last invocation of crc_read_block returns 'result=0' and
ndigested=4, so the library apparently keeps those 4 bytes in its internal
buffer.
When I then invoke the flush function, I first get a reference to block 3,
which is only a 3-byte block and not a 5-byte block, and then the last result
tells me that there is still 1 remaining non-matched character (the 'g').
So I guess that, to use your API correctly, I should specify the length of
the tail block (of the cached data) in the request, so that I am able to
calculate the position of the 'g' character in the above case. Is that
correct?
I have also done another test, in which I made the second string 5
characters longer than the 'original' string (i.e. the original + 'ggggg').
In that case I get the following output:
./test
CRCs data1: 33dd97b6 0122a0a2 301f8cc8
CRCs data2: 33dd97b6 0122a0a2 19886038 1d2c046e
remaining: 18, offset: 0
ndigested: 5, result: -1, searched-in: <ABCDEabcde123ggggg>
remaining: 13, offset: 5
ndigested: 5, result: -2, searched-in: <abcde123ggggg>
remaining: 8, offset: 10
ndigested: 8, result: 3, searched-in: <123ggggg>
flush result: 5
flush result: 0
So in that case, the crclib does not detect that '123' matches the tail block
of the original request. It returns it as a 'literal/mismatch' block, and the
remaining 5 characters are kept in the internal state and reported to me only
once I perform the flush. Is this also the intended behaviour?
Please note that I regularly run into these tail cases, due to the moving
window that I use to collect the chunks of data from the origin server and
feed them to your library, so it is important that I fully understand how to
use it properly.
Many thanks if you can shed some further light on this so that I can properly
fix the crccache code.
Thanks and kind regards,
Alex
On Tuesday 31 March 2009, Alex Wulms wrote:
> I have fixed a few more bugs. Everything is checked-in to the repository.
> Please feel free to test it further.
>
> On Tuesday 31 March 2009, Martin Langhoff wrote:
> > On Tue, Mar 31, 2009 at 1:05 AM, Alex Wulms <alex.wulms at scarlet.be> wrote:
> > > With the new code, the compressed size is 5%.
> >
> > niiice.
> >
> > > Ps: slashdot sets 'cache private' headers so normally cache module does
> > > not want to cache them. I have overruled it via the configuration
> > > parameters
> >
> > For our use case, my thinking was that we would...
> >
> > - use the standard mod_cache to cache "cacheable" content as per HTTP
> > headers
> >
> > - use a separate storage to cache "uncacheable" content that is a
> > good candidate for crcsync...
> >
> > is that what you are thinking of as well?
> >
> > cheers,
> >
> >
> >
> > m
>
> _______________________________________________
> Http-crcsync mailing list
> Http-crcsync at lists.laptop.org
> http://lists.laptop.org/listinfo/http-crcsync
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-csrc
Size: 1706 bytes
Desc: not available
Url : http://lists.laptop.org/pipermail/http-crcsync/attachments/20090402/34ce4bbe/attachment.c