[Http-crcsync] General comments on crcsync document
Alex Wulms
alex.wulms at scarlet.be
Thu Jul 9 15:48:00 EDT 2009
Hi,
Regarding caching: I'll read up on the Vary part again. Does it work for a
combination of request and response? The important point is that the
response is only valid for a request with the matching block-headers.
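
For reference, here is a minimal sketch of how a Vary-aware cache would tie a
response to the request headers that produced it. The VaryCache class and its
methods are hypothetical, and using If-Block as the selecting header is an
assumption based on the header name discussed in this thread. It is not code
from any real proxy or from our implementation.

```python
class VaryCache:
    """Toy cache: a stored response is reused only if the new request
    carries the same values for every header named in Vary."""

    def __init__(self):
        # url -> list of (selecting header values, body)
        self._entries = {}

    @staticmethod
    def _lower(headers):
        # HTTP header names are case-insensitive
        return {k.lower(): v for k, v in headers.items()}

    def store(self, url, request_headers, response_headers, body):
        req = self._lower(request_headers)
        resp = self._lower(response_headers)
        names = [n.strip().lower() for n in resp.get("vary", "").split(",") if n.strip()]
        selected = {name: req.get(name) for name in names}
        self._entries.setdefault(url, []).append((selected, body))

    def lookup(self, url, request_headers):
        req = self._lower(request_headers)
        for selected, body in self._entries.get(url, []):
            if all(req.get(name) == value for name, value in selected.items()):
                return body
        return None  # no stored variant matches this request
```

With "Vary: If-Block" on the response, a request with different block hashes
(or none at all) would miss the cache instead of receiving a delta it cannot
decode.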
Regarding trailing block and file-size header: we specify the file-size and we
(plan to) specify the number of blocks (though, it could indeed be implicitly
derived from the blocks-header-size like somebody mentioned). With this
combination we calculate the trailing block size on the server. We pass this
info to the crc-validation library from Rusty so that it can match the
trailing CRC only when a trailing block of trailing-block-size bytes has been
read. I'll clarify this in the spec document. It works like a charm in the
implementation. In a first prototype we did not explicitly specify the
trailing-block-size to the validation library and the library used some
heuristics to guess if it was matching a trailing block. That code was more
complex and led to some subtle bugs between the moving-window logic in the
encoder module (that was handling the http-stream) and Rusty's validation
library.
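
To illustrate the calculation, here is a minimal sketch. It assumes the file is
split into equal-sized blocks and that the last block absorbs the remainder;
that split rule is my assumption for the example and is not quoted from the
spec. The function name is hypothetical.

```python
from typing import Tuple


def trailing_block_size(file_size: int, num_blocks: int) -> Tuple[int, int]:
    """Return (block_size, trailing_block_size) in bytes.

    Assumption: the first num_blocks - 1 blocks all have
    file_size // num_blocks bytes, and the trailing block takes
    whatever is left over.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    if file_size < num_blocks:
        raise ValueError("file is too small to split into that many blocks")
    block_size = file_size // num_blocks
    tail_size = file_size - block_size * (num_blocks - 1)
    return block_size, tail_size


# A 1003-byte file in 40 blocks: 39 blocks of 25 bytes, trailing block of 28.
print(trailing_block_size(1003, 40))
```

The validation library would then try the trailing CRC only once exactly
tail_size bytes of a candidate trailing block have been read.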
Anyway, I'm happy to see this good feedback coming in on the document. There
is certainly room for improvement.
Thanks and brs,
Alex
On Thursday 9 July 2009, Patrick McManus wrote:
> On Thu, 2009-07-09 at 07:30 +0200, Toby Collett wrote:
> > Hi, Some good points below, I don't have time for a full answer just
> > now, but thought I would quickly point you toward this RFC
> > http://www.ietf.org/rfc/rfc3229.txt . This was for an earlier delta
> > coding standard that relied on having the same cache content server
> > and client which the crcsync tries to avoid. However, its discussion on
> > intermediate caches is informative, and it is where we have got most
> > of the concepts around the 226 response from.
>
> Hi Toby!
>
> I would strongly caution against taking advice in 3229 as state of the
> art. It is 7 years old and I am not aware of a single production use of
> it at scale. It cannot be expected to reflect the current Internet.
>
> For instance, it talks about broken caches (and yes they still exist
> once in a while) but it doesn't speak of firewalls at all. And today
> those are a more significant factor in deployed infrastructure. And,
> like I said before, I foresee inventing 226 as a significant interop
> problem with firewalls.
>
> HTTP/1.1 caching has gotten MUCH MUCH better over time. CRCSYNC is about
> speed - you want to leverage caching where you can, not run from it.
>
> At the very least, imo, all this stuff about 226 instead of 200 and
> mandatory cache busting should be dropped from any normative sections of
> a spec you write. It might be included as casual implementation advice,
> given more experience with your implementations, but HTTP contains
> perfectly sensible provisions for making this work with compliant
> infrastructure (specifically "Vary:") so it doesn't make any sense to me
> to prohibit such an implementation as being compliant with your spec.
> Does that make sense?
>
> Anyhow, thanks for the pointer. I hadn't read 3229 in a very long time!
>
> It also explains [1] "where does A-IM come from?" .. It still seems
> kind of redundant with if-block.. do you think it is worth the extra
> bytes just to comply with a spec that doesn't really have any standing
> or meaning? (i.e. is there value in complying with 3229? It seems to be
> old debris on the side of the Internet, more interesting in terms of
> lessons learned than in terms of interop.)
>
> I guess I have a new one:
>
> [9] : if the block sizes aren't variable does the trailing block always
> fail to match or is there some kind of implicit pad-with-0's rule that
> should be stated in the spec?
>
> > 1] What's the point of the A-IM request header? Does it serve a
> > purpose that If-Block does not?
> >
> > 2] Why is <number-of-blocks> in If-Block a bit shift value instead of
> > an integer? Is there some reason to prevent hash sets of sizes that
> > aren't powers of 2?
> >
> > 3] Why does <number-of-blocks> exist at all? HTTP is an ascii based
> > protocol.. the normal way would be to use the usual comma and
> > whitespace rules and just list the hashes and terminate them with
> > CRLF. Putting the count in there as a leader just forces some poor
> > client to buffer.
> >
> > 4] doesn't the server need to know the block length the client used
> > to calculate the hashes it sent in if-block? Otherwise how can it
> > know it matches?
> >
> > 5] what's the point of the file-size request header? Or is it really
> > the block-size I was getting at in #4?
> >
> > 6] the block match uses a single byte binary block id.. block id
> > isn't defined anywhere - I assume it is the index of the offered hash
> > in the request if-block? Starting at 0 for the first one? Would be
> > good to say.
> >
> > 7] if the block match uses an 8 bit (single byte) ID how come up to
> > 512 blocks are allowed in an if-block ?
> >
> > 8] I think literal data blocks (both header and body) should have
> > options for uncompressed data with a binary length indicator.
> > Certainly not everything zlib's well.
> >
> > Hope this helps,
> >
> > -Patrick
> >
> > _______________________________________________
> > Http-crcsync mailing list
> > Http-crcsync at lists.laptop.org
> > http://lists.laptop.org/listinfo/http-crcsync
> >
> >
> >
> > --
> > This email is intended for the addressee only and may contain
> > privileged and/or confidential information