[Http-crcsync] General comments on crcsync document

Mon Dec 28 10:45:43 EST 2009

One more point: the proposal in the July message still indicated the hash
algorithm in the header, but that was before we had the entire debate about
the algorithm and settled on CRC64-ISO. I propose to remove that part from
the headers.

Example of a request header from the client for a file size (fs) of 45 bytes
and a block size (bs) of 20 bytes for the complete blocks (implying 2 complete
blocks and one 5-byte trailer block); csl stands for checksum list:

If-Block: fs=45, bs=20, csl=aaaabbbbccccdddd
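
To illustrate how a receiver could read such a header, here is a minimal
sketch in C. The struct and function names (if_block, parse_if_block,
complete_blocks, trailer_size) are hypothetical and not taken from the
crccache code; the checksum list is kept as an opaque string because its
encoding is not specified here.

```c
#include <stdio.h>

/* Hypothetical container for the fields of the proposed If-Block header. */
struct if_block {
    unsigned long fs;   /* file size in bytes */
    unsigned long bs;   /* size of each complete block in bytes */
    char csl[256];      /* checksum list, kept as an opaque string */
};

/* Parse a header value such as "fs=45, bs=20, csl=aaaabbbbccccdddd".
 * Returns 0 on success, -1 if the value does not match that layout. */
int parse_if_block(const char *value, struct if_block *out)
{
    if (sscanf(value, "fs=%lu, bs=%lu, csl=%255s",
               &out->fs, &out->bs, out->csl) != 3)
        return -1;
    if (out->bs == 0)
        return -1;      /* a zero block size would make the split undefined */
    return 0;
}

/* Number of complete blocks implied by fs and bs. */
unsigned long complete_blocks(const struct if_block *h)
{
    return h->fs / h->bs;
}

/* Size of the trailer block in bytes; 0 means there is no trailer block. */
unsigned long trailer_size(const struct if_block *h)
{
    return h->fs % h->bs;
}
```

For the header above this yields 2 complete blocks and a 5-byte trailer,
without the server having to guess anything.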

Example of a capability header from the server, preferring a block size that
is a multiple of 8 due to hardware support:

Capability: crcsync, m=8
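
One way a client might honor that preference is sketched below. The function
name align_block_size is hypothetical, and rounding down (rather than up) is
purely an assumption on my side; the proposal only says that the server
prefers a multiple of m.

```c
/* Round a candidate block size down to the nearest multiple of m, but never
 * below m itself. m is the value from the proposed "Capability: crcsync, m=8"
 * header. Rounding down is an assumption, not part of the proposal. */
unsigned long align_block_size(unsigned long candidate, unsigned long m)
{
    if (m == 0)
        return candidate;               /* no preference advertised */
    if (candidate < m)
        return m;                       /* smallest block honoring m */
    return candidate - (candidate % m); /* drop the remainder */
}
```

With m=8, a candidate block size of 20 bytes would become 16 bytes, and a
candidate of 1027 bytes would become 1024 bytes.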

Cheers,
Alex

On Monday 28 December 2009, Alex Wulms wrote:
> All,
>
> I have been reading back the mail archive and realize that a better
> solution was already proposed for this entire topic a while back. See
> http://lists.laptop.org/pipermail/http-crcsync/2009-July/000144.html
>
> I'll update the doc and the code accordingly (I have some time on my hands
> this week)
>
> Cheers,
> Alex
>
> On Monday 28 December 2009, Alex Wulms wrote:
> > I have (finally) been reading up on the latest spec and understand now
> > where the complication comes from.
> >
> > It is because the server, in the current proposal, has to guess the
> > block-size that the client used from the filesize and the number of
> > hashes specified in the request.
> >
> > If we want to use a hash for the trailing block, the logic to use the
> > same block-size on client and server is a little bit tricky due to the
> > fact that the trailing block can have a zero size, implying that the
> > number of complete blocks is sometimes the same as the number of hashes
> > and sometimes one block less.
> >
> > I don't know how to write in one mathematical formula how the client and
> > server should behave, but algorithm-wise I would do it as illustrated in
> > the examples below (inspired by Toby's example).
> >
> > Example 1:
> >
> > The client wants to make 40 full blocks while the file is 40k + 2 bytes
> >
> > Client will use (full-block-size = floor(filesize / #full-blocks) = 1k)
> > bytes for the complete blocks
> > And trailing block size will be (trailing-size = filesize % #full-blocks
> > = 2 bytes)
> >
> > So this will give 40 1K blocks and one 2-byte block: 41 hashes in total
> > in the request.
> >
> > Example 2:
> >
> > The client wants to make (again) 40 full blocks while the file is 40k
> > bytes
> >
> > Client will use (full-block-size = floor(filesize / #full-blocks) = 1k)
> > bytes for the complete blocks
> > And trailing block size will be (trailing-size = filesize % #full-blocks
> > = 0 bytes)
> >
> > So this will give 40 1K blocks but there will not be a trailing block. So
> > in total, there will be 40 hashes in the request.
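
The client-side arithmetic of the two examples above can be sketched in C as
follows. The names (client_layout, struct client_block_layout) are
hypothetical, and I assume 1k = 1024 bytes; the results are the same with
1k = 1000.

```c
/* Hypothetical result of the client-side split described in the examples. */
struct client_block_layout {
    unsigned long full_block_size;  /* bytes per complete block */
    unsigned long trailing_size;    /* bytes in the trailing block, 0 if none */
    unsigned long n_hashes;         /* hashes the client puts in the request */
};

/* Split filesize into n_full_blocks complete blocks plus an optional
 * trailing block. Returns 0 on success, -1 if the split is impossible. */
int client_layout(unsigned long filesize, unsigned long n_full_blocks,
                  struct client_block_layout *out)
{
    if (n_full_blocks == 0 || filesize < n_full_blocks)
        return -1;  /* would divide by zero or give zero-byte blocks */

    out->full_block_size = filesize / n_full_blocks;  /* integer floor */
    out->trailing_size = filesize % n_full_blocks;
    /* one extra hash only when a trailing block actually exists */
    out->n_hashes = n_full_blocks + (out->trailing_size != 0 ? 1 : 0);
    return 0;
}
```

Calling client_layout(40 * 1024 + 2, 40, &l) gives 1024-byte blocks, a 2-byte
trailing block and 41 hashes; client_layout(40 * 1024, 40, &l) gives
1024-byte blocks, no trailing block and 40 hashes.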
> >
> > In both examples, the server has to guess from the number of hashes and
> > from the filesize if there is any trailing block or if there are only
> > complete blocks.
> >
> > So the server would have to say something like
> >
> > if (filesize % #hashes == 0)
> > {
> >   full-block-size = filesize / #hashes;
> >   n-complete-blocks = #hashes;
> >   trailing-block-size = 0;
> > }
> > else
> > {
> >   full-block-size = floor(filesize / (#hashes-1));
> >   n-complete-blocks = #hashes - 1;
> >   trailing-block-size = filesize % (#hashes-1);
> >   assert(trailing-block-size != 0); // client did something weird
> > }
> >
> > This would cover both cases (with and without trailing block).
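
A compilable C rendering of that server-side guess might look like the sketch
below. The names (server_guess_layout, struct guessed_layout) are
hypothetical, and the assert from the pseudocode is turned into an error
return so that a strange request does not abort the server.

```c
/* Hypothetical result of the server-side guess. */
struct guessed_layout {
    unsigned long full_block_size;
    unsigned long n_complete_blocks;
    unsigned long trailing_block_size;
};

/* Guess the block layout from the file size and the number of hashes in the
 * request. Returns 0 on success, -1 if the request looks inconsistent. */
int server_guess_layout(unsigned long filesize, unsigned long n_hashes,
                        struct guessed_layout *out)
{
    if (n_hashes == 0)
        return -1;

    if (filesize % n_hashes == 0) {
        /* Divides evenly: assume only complete blocks. */
        out->full_block_size = filesize / n_hashes;
        out->n_complete_blocks = n_hashes;
        out->trailing_block_size = 0;
        return 0;
    }

    /* n_hashes >= 2 here, because filesize % 1 is always 0. */
    out->full_block_size = filesize / (n_hashes - 1);
    out->n_complete_blocks = n_hashes - 1;
    out->trailing_block_size = filesize % (n_hashes - 1);
    if (out->trailing_block_size == 0)
        return -1;  /* client did something weird */
    return 0;
}
```

For 40k + 2 bytes and 41 hashes this returns 40 blocks of 1024 bytes plus a
2-byte trailing block, matching example 1. The guess is not always unique,
though: a 42-byte file that the client split into 5 blocks of 8 bytes plus a
2-byte trailing block also has 6 hashes, and since 42 % 6 == 0 the logic
above would infer 6 blocks of 7 bytes. Sending the block size explicitly in
the request avoids that kind of mismatch.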
> >
> >
> > The alternative, as Toby said, is indeed to calculate hashes only for
> > complete blocks and always treat the trailing block (if any) as a literal
> > block. In that case, the client should be careful to pass only complete
> > blocks to the crc-library, and the server could simply use the
> > logic 'blocksize=floor(filesize/#hashes)' and not worry about the
> > trailing block. But given that Rusty's CRC library properly
> > supports a trailing block, I propose to use it, even though it
> > makes the logic to determine the total number of hashes a little bit
> > more complex.
> >
> >
> > Cheers,
> > Alex
> >
> > > If we use last_block_size = file % block_count the final block will
> > > have a maximum size of block_count, so if we have 40 blocks for 39k + 2
> > > byte file then we have 39 1k blocks and a trailing block of size 2
> > > bytes. Another option is simply to drop the trailing blocks and they
> > > will always be returned as a literal.
> > >
> > > Feel free to correct my maths if I am missing something...
> > >
> > > Toby
> > >
> > > 2009/11/2 Rusty Russell <rusty at rustcorp.com.au>
> > >
> > > > On Thu, 29 Oct 2009 06:11:53 am Toby Collett wrote:
> > > > > The current version in git now implements the standard document
> > > > > completely as far as I am aware (doc is available from git
> > > > > http://repo.or.cz/w/httpd-crcsyncproxy.git?a=tree;f=crccache/doc;h=37d90acd37bb0199a37e6d6a779c37c4f37da29b;hb=HEAD
> > > > > )
> > > > >
> > > > > So now we need some testing, not sure the best way to do this,
> > > > > Martin, do you want to set up access to a server?
> > > > >
> > > > > Rusty: There was an assertion that tailsize be < block-size in the
> > > > > crc code. The latest version has tail_size = blocksize + remainder.
> > > > > It seems to work when that assertion is removed and I couldn't see
> > > > > any reason why it can't be greater in the current implementation.
> > > > > Could you confirm?
> > > >
> > > > There's no real reason, but it seems wrong.  if tailsize > blocksize,
> > > > why isn't there simply one more block?
> > > >
> > > > Cheers,
> > > > Rusty (who hasn't really been paying any attention)
> >
> > _______________________________________________
> > Http-crcsync mailing list
> > Http-crcsync at lists.laptop.org
> > http://lists.laptop.org/listinfo/http-crcsync