[Http-crcsync] General comments on crcsync document

Mon Dec 28 10:37:17 EST 2009

All,

I have been reading back the mail archive and realize that a better solution 
was already proposed for this entire topic a while back. See 
http://lists.laptop.org/pipermail/http-crcsync/2009-July/000144.html

I'll update the doc and the code accordingly (I have some time on my hands this
week).

Cheers,
Alex

On Monday 28 December 2009, Alex Wulms wrote:
> I have (finally) been reading up on the latest spec and understand now
> where the complication comes from.
>
> It is because the server, in the current proposal, has to guess the
> block-size that the client used from the filesize and the number of hashes
> specified in the request.
>
> If we want to use a hash for the trailing block, the logic to use the same
> block-size on client and server is a little bit tricky due to the fact that
> the trailing block can have a zero size, implying that the number of
> complete blocks is sometimes the same as the number of hashes and sometimes
> one block less.
>
> I don't know how to express in a single mathematical formula how the client
> and server should behave, but algorithmically I would do it as illustrated
> in the examples below (inspired by Toby's example).
>
> Example 1:
>
> The client wants to make 40 full blocks while the file is 40k + 2 bytes
>
> Client will use (full-block-size = floor(filesize / #full-blocks) = 1k)
> bytes for the complete blocks
> And trailing block size will be (trailing-size = filesize % #full-blocks =
> 2 bytes)
>
> So this will give 40 1K blocks and one 2-byte block. In total 41 hashes in
> the request.
>
> Example 2:
>
> The client wants to make (again) 40 full blocks while the file is 40k bytes
>
> Client will use (full-block-size = floor(filesize / #full-blocks) = 1k)
> bytes for the complete blocks
> And trailing block size will be (trailing-size = filesize % #full-blocks =
> 0 bytes)
>
> So this will give 40 1K blocks but there will not be a trailing block. So
> in total, there will be 40 hashes in the request.
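
The client-side arithmetic from Examples 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the function name `client_split` and its return shape are hypothetical and do not come from the spec or from the crcsync code.

```python
def client_split(filesize, n_full_blocks):
    """Return (full_block_size, trailing_size, n_hashes) for the client.

    Hypothetical helper; assumes filesize >= n_full_blocks > 0.
    """
    if n_full_blocks <= 0:
        raise ValueError("n_full_blocks must be positive")
    # full-block-size = floor(filesize / #full-blocks)
    full_block_size = filesize // n_full_blocks
    # trailing-size = filesize % #full-blocks
    trailing_size = filesize % n_full_blocks
    # The trailing block only contributes a hash when it is non-empty.
    n_hashes = n_full_blocks + (1 if trailing_size else 0)
    return full_block_size, trailing_size, n_hashes


K = 1024
print(client_split(40 * K + 2, 40))  # Example 1: (1024, 2, 41)
print(client_split(40 * K, 40))      # Example 2: (1024, 0, 40)
```

The hash count differs between the two calls even though the requested number of full blocks is the same, which is exactly what the server has to untangle.
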
>
> In both examples, the server has to guess from the number of hashes and
> from the filesize if there is any trailing block or if there are only
> complete blocks.
>
> So the server would have to say something like
>
> if (filesize % #hashes == 0)
> {
>   full-block-size = filesize/#hashes;
>   n-complete-blocks = #hashes;
>   trailing-block-size = 0;
> }
> else
> {
>   full-block-size = floor(filesize / (#hashes-1));
>   n-complete-blocks = #hashes - 1;
>   trailing-block-size = filesize % (#hashes-1);
>   assert(trailing-block-size != 0); // fails if the client did something weird
> }
>
> This would cover both cases (with and without trailing block).
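
A runnable rendering of the server-side guess above might look like the sketch below. The name `server_guess` and the choice to raise `ValueError` instead of asserting are assumptions made for illustration, not part of the spec or the existing code.

```python
def server_guess(filesize, n_hashes):
    """Return (full_block_size, n_complete_blocks, trailing_block_size).

    Hypothetical helper mirroring the if/else pseudocode above.
    """
    if n_hashes <= 0:
        raise ValueError("n_hashes must be positive")
    if filesize % n_hashes == 0:
        # Every hash is assumed to cover a complete block.
        return filesize // n_hashes, n_hashes, 0
    # Otherwise the last hash is assumed to cover a shorter trailing block.
    # n_hashes is at least 2 here, because filesize % 1 is always 0.
    n_complete_blocks = n_hashes - 1
    trailing_block_size = filesize % n_complete_blocks
    if trailing_block_size == 0:
        raise ValueError("hash count is inconsistent with the file size")
    return filesize // n_complete_blocks, n_complete_blocks, trailing_block_size


K = 1024
print(server_guess(40 * K + 2, 41))  # Example 1: (1024, 40, 2)
print(server_guess(40 * K, 40))      # Example 2: (1024, 40, 0)
```

Both examples are recovered correctly, but the guess does not appear to be unambiguous for every input. A 20-byte file that the client splits into 3 full blocks of 6 bytes plus a 2-byte trailing block yields 4 hashes, and since 20 % 4 == 0 the first branch returns 4 blocks of 5 bytes instead.
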
>
>
> The alternative, as Toby said, is indeed to calculate hashes only for
> complete blocks and always treat the trailing block (if any) as a literal
> block. In that case, the client should be careful to pass only complete
> blocks to the crc-library, and the server could simply use the logic
> 'blocksize = floor(filesize / #hashes)' and not worry about the trailing
> block. But given that Rusty's CRC library properly supports a trailing
> block, I propose to use it, even though it makes the logic to determine
> the total number of hashes a little more complex.
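
For comparison, the literal-tail alternative can be sketched in a few lines. The helper name `literal_tail_layout` is hypothetical, and the sketch assumes that client and server both see the same file size and hash count.

```python
def literal_tail_layout(filesize, n_hashes):
    """Return (block_size, hashed_bytes, literal_tail_bytes).

    Hypothetical helper: only complete blocks are hashed, and whatever is
    left at the end of the file is sent as a literal.
    """
    if n_hashes <= 0:
        raise ValueError("n_hashes must be positive")
    block_size = filesize // n_hashes      # blocksize = floor(filesize / #hashes)
    hashed_bytes = block_size * n_hashes   # bytes covered by the hashes
    literal_tail_bytes = filesize - hashed_bytes
    return block_size, hashed_bytes, literal_tail_bytes


K = 1024
print(literal_tail_layout(40 * K + 2, 40))  # (1024, 40960, 2)
print(literal_tail_layout(40 * K, 40))      # (1024, 40960, 0)
```

Because both sides would evaluate the same expression on the same two inputs, there is no branch to guess; the cost is that up to #hashes - 1 trailing bytes are never matched against a hash.
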
>
>
> Cheers,
> Alex
>
> > If we use last_block_size = file % block_count the final block will have
> > a maximum size of block_count, so if we have 40 blocks for 39k + 2 byte
> > file then we have 39 1k blocks and a trailing block of size 2 bytes.
> > Another option is simply to drop the trailing blocks and they will always
> > be returned as a literal.
> >
> > Feel free to correct my maths if I am missing something...
> >
> > Toby
> >
> > 2009/11/2 Rusty Russell <rusty at rustcorp.com.au>
> >
> > > On Thu, 29 Oct 2009 06:11:53 am Toby Collett wrote:
> > > > > The current version in git now implements the standard document
> > > > > completely as far as I am aware (doc is available from git
> > > > > http://repo.or.cz/w/httpd-crcsyncproxy.git?a=tree;f=crccache/doc;h=37d90acd37bb0199a37e6d6a779c37c4f37da29b;hb=HEAD)
> > > > >
> > > > > So now we need some testing, not sure the best way to do this, Martin,
> > > > > do you want to set up access to a server?
> > > >
> > > > > Rusty: There was an assertion that tailsize be < block-size in the crc
> > > > > code. The latest version has tail_size = blocksize + remainder. It
> > > > > seems to work when that assertion is removed and I couldn't see any
> > > > > reason why it can't be greater in the current implementation. Could
> > > > > you confirm?
> > >
> > > > There's no real reason, but it seems wrong.  If tailsize > blocksize,
> > > > why isn't there simply one more block?
> > >
> > > Cheers,
> > > Rusty (who hasn't really been paying any attention)