[Http-crcsync] General comments on crcsync document

Thu Jul 9 15:48:00 EDT 2009

Hi,

Regarding caching: I'll read up on the Vary part again. Does it work for a 
combination of request and response? The important point is that the 
response is only valid for a request with matching block headers.
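
To make the question concrete, here is a minimal sketch of how a cache that
honours Vary could key a stored response. The function name cache_key and the
idea of listing If-Block in Vary are assumptions for illustration only, not
something the spec defines. Real caches also normalise header values and
treat "Vary: *" specially, which this sketch omits.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus each request header named in Vary.

    Hypothetical helper: header names are compared case-insensitively and a
    header that is absent from the request is recorded as an empty string.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    selected = tuple((n, lowered.get(n, "")) for n in sorted(names))
    return (url, selected)


# Two requests for the same URL carrying different block hashes.
key_a = cache_key("/page", {"If-Block": "aaaa,bbbb"}, "If-Block")
key_b = cache_key("/page", {"If-Block": "cccc,dddd"}, "If-Block")
```

With If-Block named in Vary, the two requests get different keys, so a delta
response stored for one would not be served for the other.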

Regarding trailing block and file-size header: we specify the file-size and we 
(plan to) specify the number of blocks (though, it could indeed be implicitly 
derived from the blocks-header-size like somebody mentioned). With this 
combination we calculate the trailing block size on the server. We pass this 
info to Rusty's CRC validation library so that it matches the 
trailing CRC only when a trailing block of trailing-block-size bytes has been 
read. I'll clarify this in the spec document. It works like a charm in the 
implementation. In a first prototype we did not explicitly specify the 
trailing-block-size to the validation library and the library used some 
heuristics to guess if it was matching a trailing block. That code was more 
complex and led to some subtle bugs between the moving-window logic in the 
encoder module (which handled the HTTP stream) and Rusty's validation 
library.
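
The server-side calculation can be sketched as follows. This is a
hypothetical illustration, not the code from the implementation: it assumes
the number of blocks includes the trailing block, that every block but the
last holds floor(file-size / number-of-blocks) bytes, and that the trailing
block takes whatever remains. The function name block_layout is invented
here.

```python
def block_layout(file_size, num_blocks):
    """Return (block_size, trailing_block_size) for a file.

    Assumed scheme (not confirmed by the spec): the first num_blocks - 1
    blocks each have file_size // num_blocks bytes and the last block
    absorbs the remainder.
    """
    if file_size < 0 or num_blocks <= 0:
        raise ValueError("file_size must be >= 0 and num_blocks must be > 0")
    block_size = file_size // num_blocks
    trailing_block_size = file_size - block_size * (num_blocks - 1)
    return block_size, trailing_block_size


# A 1000-byte file split into 16 blocks: 15 blocks of 62 bytes, one of 70.
layout = block_layout(1000, 16)
```

Passing the trailing size explicitly to the validator means it only needs to
attempt the trailing CRC once exactly that many bytes have been read, rather
than guessing.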

Anyway, I'm happy to see this good feedback coming in on the document. There 
is certainly room for improvement.

Thanks and brs,
Alex

On Thursday 9 July 2009, Patrick McManus wrote:
> On Thu, 2009-07-09 at 07:30 +0200, Toby Collett wrote:
> > Hi, Some good points below, I don't have time for a full answer just
> > now, but thought I would quickly point you toward this RFC
> > http://www.ietf.org/rfc/rfc3229.txt . This was for an earlier delta
> > coding standard that relied on having the same cache content on server
> > and client, which crcsync tries to avoid. However, its discussion on
> > intermediate caches is informative, and it is where we have got most
> > of the concepts around the 226 response from.
>
> Hi Toby!
>
> I would strongly caution against taking advice in 3229 as state of the
> art. It is 7 years old and I am not aware of a single production use of
> it at scale. It cannot be expected to reflect the current Internet.
>
> For instance, it talks about broken caches (and yes they still exist
> once in a while) but it doesn't speak of firewalls at all. And today
> those are a more significant factor in deployed infrastructure. And,
> like I said before, I foresee inventing 226 as a significant interop
> problem with firewalls.
>
> HTTP/1.1 caching has gotten MUCH MUCH better over time. CRCSYNC is about
> speed - you want to leverage caching where you can, not run from it.
>
> At the very least, imo, all this stuff about 226 instead of 200 and
> mandatory cache busting should be dropped from any normative sections of
> a spec you write. It might be included as casual implementation advice,
> given more experience with your implementations, but HTTP contains
> perfectly sensible provisions for making this work with compliant
> infrastructure (specifically "Vary:") so it doesn't make any sense to me
> to prohibit such an implementation as being compliant with your spec.
> Does that make sense?
>
> Anyhow, thanks for the pointer. I hadn't read 3229 in a very long time!
>
> It also explains [1] "where does A-IM come from?". It still seems
> kind of redundant with If-Block. Do you think it is worth the extra
> bytes just to comply with a spec that doesn't really have any standing
> or meaning? (i.e. is there value in complying with 3229? It seems to be
> old debris on the side of the Internet, more interesting in terms of
> lessons learned than in terms of interop.)
>
> I guess I have a new one:
>
> [9] If the block sizes aren't variable, does the trailing block always
> fail to match, or is there some kind of implicit pad-with-0's rule that
> should be stated in the spec?
>
> >         1] What's the point of the A-IM request header? Does it serve a purpose
> >         that If-Block does not?
> >
> >         2] Why is <number-of-blocks> in If-Block a bit shift value instead of an
> >         integer? Is there some reason to prevent hash sets of sizes that aren't
> >         powers of 2?
> >
> >         3] Why does <number-of-blocks> exist at all? HTTP is an ascii based
> >         protocol.. the normal way would be to use the usual comma and whitespace
> >         rules and just list the hashes and terminate them with CRLF. Putting the
> >         count in there as a leader just forces some poor client to buffer.
> >
> >         4] doesn't the server need to know the block length the client used to
> >         calculate the hashes it sent in if-block? Otherwise how can it know it
> >         matches?
> >
> >         5] what's the point of the file-size request header? Or is it really the
> >         block-size I was getting at in #4?
> >
> >         6] the block match uses a single byte binary block id.. block id isn't
> >         defined anywhere - I assume it is the index of the offered hash in the
> >         request if-block? Starting at 0 for the first one? Would be good to say.
> >
> >         7] if the block match uses an 8 bit (single byte) ID how come up to 512
> >         blocks are allowed in an if-block ?
> >
> >         8] I think literal data blocks (both header and body) should have
> >         options for uncompressed data with a binary length indicator. Certainly
> >         not everything zlib's well.
> >
> >         Hope this helps,
> >
> >         -Patrick
> >
> >         _______________________________________________
> >         Http-crcsync mailing list
> >         Http-crcsync at lists.laptop.org
> >         http://lists.laptop.org/listinfo/http-crcsync
> >
> >
> >