[Http-crcsync] General comments on crcsync document

Toby Collett toby.collett at gmail.com
Fri Jul 10 01:46:40 EDT 2009


2009/7/10 Patrick McManus <mcmanus at ducksong.com>

>
> >
> > more related comments on the spec.. it took me a few minutes to figure
> > out that the if-block hashes are crc32's. the document just calls them
> > hashes or crcs. I had to go to the code to find out it was crc 32. So
> > that should really be documented.
>
> I've read some more code in the repo just now, and I see that it really
> is crc-60.. so I need to walk back a little of that previous message I
> sent about it being crc-32. There are outdated code comments that say it
> is crc-32 which is what led me down that path.
>

The original implementation was crc32 based; to get 30 bits we just
masked off the upper 2 bits, and the same approach gives 60 bits from a
64-bit CRC. There has been some back and forth over the size of the hashes.
Documenting the hash algorithm is a definite todo for the spec (in fact I
think it is marked as such in it).
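
A minimal sketch of the masking described above, in Python. The function
names are made up for illustration, and zlib.crc32 is only a stand-in: the
polynomial and bit order actually used by the crcsync code are not stated
here, so treat those as assumptions.

```python
import zlib


def mask_low_bits(value: int, bits: int) -> int:
    """Keep the low `bits` bits of `value`, dropping everything above."""
    if bits < 1:
        raise ValueError("bits must be positive")
    return value & ((1 << bits) - 1)


def truncated_crc32(data: bytes, bits: int = 30) -> int:
    """CRC-32 of `data` with the upper (32 - bits) bits masked off.

    Assumption: zlib's CRC-32 stands in for whatever CRC the real code uses.
    """
    if not 1 <= bits <= 32:
        raise ValueError("bits must be between 1 and 32")
    full = zlib.crc32(data) & 0xFFFFFFFF  # force an unsigned 32-bit value
    return mask_low_bits(full, bits)


if __name__ == "__main__":
    print(hex(truncated_crc32(b"123456789")))
    # The same helper trims any 64-bit CRC value down to 60 bits.
    print(hex(mask_low_bits(0xFFFFFFFFFFFFFFFF, 60)))
```

The same mask_low_bits helper applies to a 64-bit CRC; Python's standard
library has no CRC-64, so that value would have to come from elsewhere.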


>
> is it being calculated as 64 bits and just masked off on each pass?
>
> using normal b64 rules, I also figure the 60 bits need to be padded out
> to 72bits in order to generate 12 ascii characters.. but the last 2
> chars just represent those 12 bits of pad and they are dropped from the
> message header.. This is stuff that really ought to be written in the
> doc so others don't have to reverse engineer it too. I think it isn't
> the normal way to show the b64 string (which would always be a multiple
> of 4 characters), so you might hesitate before standardizing it and
> minimally show an example or two.


It is a kludge of base64, as we simply do not represent the padding bytes;
that is also the reason for selecting hash sizes that are multiples of 6 bits.
Actually, recently I have been thinking that we would be better off encoding
the hashes all in one go as packed bits, which would remove the need for so
much messing around with the base64 encoding...
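
As a rough illustration of the unpadded encoding discussed above, here is a
Python sketch that turns a 60-bit value into 10 base64 characters in two
ways: directly, 6 bits at a time, and by padding to 72 bits and dropping the
last 2 characters as Patrick describes. The function names, the big-endian
bit order and the standard base64 alphabet are assumptions for the example,
not taken from the crcsync code.

```python
import base64

B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_hash(value: int, bits: int = 60) -> str:
    """Encode `value` as bits/6 base64 characters, most significant first."""
    if bits % 6 != 0:
        raise ValueError("bits must be a multiple of 6")
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    nchars = bits // 6
    return "".join(
        B64_ALPHABET[(value >> (6 * (nchars - 1 - i))) & 0x3F]
        for i in range(nchars)
    )


def encode_hash_via_padding(value: int, bits: int = 60) -> str:
    """Same result via ordinary base64: pad with zero bits, then truncate."""
    if bits % 6 != 0:
        raise ValueError("bits must be a multiple of 6")
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    pad_bits = (-bits) % 24          # 12 pad bits for a 60-bit hash
    padded = value << pad_bits       # hash in the high bits, zeros below
    raw = padded.to_bytes((bits + pad_bits) // 8, "big")
    full = base64.b64encode(raw).decode("ascii")
    return full[: bits // 6]         # drop the characters that are only pad


if __name__ == "__main__":
    h = 0x0123456789ABCDE
    print(encode_hash(h), encode_hash_via_padding(h))
```

Both functions give the same string for any hash size that is a multiple of
6 bits, which is why those sizes avoid partial characters.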


>
> But bigger picture I gotta say, a 60 bit hash is more than a little
> unusual. From an implementation standpoint a lot of folks are just going
> to have crc16 and crc32 libraries and not have any easy way to perform
> that calculation and that won't help adoption of the spec.. so I'd like
> to see a little language in the document justifying the need for it and
> explaining why it is 60 and not 32 (or even 64).
>
> Is there a strong basis for 60, or is it just "more than 32". and was 32
> shown to have problems significant enough to warrant the change? (the
> strong sha is wrapped around the whole thing should it collide,
> after all.)
>
> Heck, Intel added CRC-32 as an SSE level instruction in SSE-4 in Nehalem
> and later as a potential offload. It's also in hardware on the Cavium
> Nitrox processors and (I think) other security processors you often find
> web appliances built around.. I don't think they're adding crc-60 in
> hardware any time soon;)
>
> If you're looking at a 40 in 4 billion chance (say with 40 crc-32
> blocks) that's 1 in 100 million that you have to redo the request
> without a delta. big deal. Is 1 in 100 million really enough of a
> performance problem (and because of the sha it is only a performance
> problem, not a correctness one) to justify going away from a widely
> deployed and available algorithm such as crc32? I think this is a pretty
> strong argument for doing 32 bit hashes on the blocks.
>
> in a somewhat related thought:
>
> "In case of a mismatch, the crcsync cache client should return an error
> condition to the classical cache client and discard the original
> instance from it's local store, to prevent the same error from
> re-occurring when the user retries the request. "
>
> I'm not sure this specification has any business telling a cache
> that it should discard legitimate instances from its local store. There
> might very well be administrative policy in place pinning them there! A
> more appropriate remedy would be prohibiting said client from sending
> the same if-block sequence for a subsequent request to the same resource
> without getting a successful transaction in between. Even changing the
> block size on the next request would cure a crc conflict.
>
> -me again
>
>
>

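For reference, the "40 in 4 billion" estimate quoted above can be reproduced
with a few lines of Python. This is only a sketch under simplifying
assumptions that are mine, not the spec's: hashes are uniformly distributed,
and each of the 40 blocks gets exactly one independent chance of a false
match. The function name is made up for the example.

```python
import math


def false_match_probability(blocks: int, bits: int) -> float:
    """Chance that at least one of `blocks` comparisons falsely matches.

    Computes 1 - (1 - 2**-bits)**blocks using log1p/expm1 so that very
    small probabilities are not lost to floating point rounding.
    """
    per_block = 2.0 ** -bits
    return -math.expm1(blocks * math.log1p(-per_block))


if __name__ == "__main__":
    for bits in (30, 32, 60):
        p = false_match_probability(40, bits)
        print(f"{bits}-bit hashes, 40 blocks: about 1 in {1 / p:.3g}")
```

Under these assumptions the 32-bit case comes out at a little over 1 in 100
million, matching the quoted figure; 60 bits makes a false match far rarer
still.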

