[Http-crcsync] General comments on crcsync document

Toby Collett toby.collett at gmail.com
Thu Jul 9 01:31:30 EDT 2009


Didn't reply-all, so forwarding to the list.

---------- Forwarded message ----------
From: Toby Collett <toby.collett at gmail.com>
Date: 2009/7/9
Subject: Re: [Http-crcsync] General comments on crcsync document
To: Patrick McManus <mcmanus at ducksong.com>


Hi, Some good points below. I don't have time for a full answer just now, but
thought I would quickly point you toward this RFC:
http://www.ietf.org/rfc/rfc3229.txt . This was for an earlier delta coding
standard that relied on the server and client having the same cached
content, which crcsync tries to avoid. However, its discussion of
intermediate caches is informative, and it is where we got most of the
concepts around the 226 response from.

With regard to 2 and 7, the bit shift was to allow more blocks, but you
correctly point out that this is not useful if our response only has an 8
bit block number encoding, in which case we should just drop to a standard
byte. Although as you point out we can probably calculate that from the
string length of the hashes header...
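
A rough sketch of that last idea, in Python. It assumes (this is not taken
from the document) that the hashes header carries fixed-width hash strings
back to back with no separators; chars_per_hash and the function name are
made up for illustration.

```python
def block_count_from_header(hashes_value: str, chars_per_hash: int) -> int:
    """Derive the number of blocks from the length of the hashes header.

    Assumption: every hash is encoded with exactly chars_per_hash
    characters and the hashes are simply concatenated.
    """
    if chars_per_hash <= 0:
        raise ValueError("chars_per_hash must be positive")
    count, remainder = divmod(len(hashes_value), chars_per_hash)
    if remainder:
        # A partial hash at the end means the header is malformed.
        raise ValueError("header length is not a multiple of the hash width")
    return count
```

If that holds, the explicit block count field is redundant and could be
dropped from the header.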

Toby

2009/7/9 Patrick McManus <mcmanus at ducksong.com>

On Wed, 2009-07-08 at 16:33 -0700, Pedro R wrote:
> >
> >
> > Yes, but you are considering a client which only speaks with crcsync
> aware server.
>
> Hi Pedro and CRCSYNC team!
>
> I need to object to that characterization - that is not what I am
> considering.
>
> >By using the OPTIONS method, the non-crcsync server may not respond
> >properly.
>
> which tells you what you need to know, right? (that it's a non-crcsync
> resource.)
>
> I am simply saying that if you want to use HTTP to probe resources
> regarding their implementations of extensions (crcsync is an extension
> after all), then HTTP provides OPTIONS as the prescribed framework for
> doing so. In the best possible world this should just be done with
> something analogous to accept-ranges (e.g. "accept-crcsync: v1" ?) which
> can be put into the OPTIONS and normal GET responses even when deltas
> are not applied (see below) as a sort of advertisement.
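
A minimal sketch of the advertisement idea quoted above, in Python. The
header name "accept-crcsync" is only the hypothetical example from the
message, not a defined header, and the function name is made up for
illustration. It inspects response headers (from an OPTIONS or a normal GET
response) that have already been parsed into a dict.

```python
from typing import Mapping, Optional


def advertised_crcsync_version(headers: Mapping[str, str]) -> Optional[str]:
    """Return the advertised crcsync version, or None if not advertised.

    "accept-crcsync" is a hypothetical header name; HTTP header names
    are case-insensitive, so compare in lower case.
    """
    for name, value in headers.items():
        if name.lower() == "accept-crcsync":
            return value.strip() or None
    return None
```

A client could remember a non-None result per resource as "implemented",
independently of whether any single response actually came back as a delta.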
>
> It is certainly a good thing that there is no *requirement* in the spec
> to probe in order to create a backwards compatible request - but you
> started this thread concerned about the overhead (and therefore
> optimization) of including if-block when speaking to resources that did
> not have this extension implemented. And that's a reasonable
> implementation strategy imo even if it's out of scope for the spec.
>
> IMHO you really want to separate the concept of "is this implemented"
> from "is this applied on this transaction".. the former is more or less
> an immutable property while the latter might very well depend on some
> rather immediate conditions that shouldn't really be cached.
>
> There are a number of good reasons a capable server might not want to
> generate a 226 delta at any given time even though it could - and it
> should certainly not be required to do so under any circumstance by the
> presence of an If-Block on any particular request. (and because it isn't
> required to, it is not a useful probing and caching technique). The
> chief scenario in my mind is when it is going to send the "instance
> coded" delta anyhow because there is not a useful match in the request..
> instead of tunneling that through any intermediary with a 226 attached
> it should really send it as a 200 so that it can be properly cached and
> understood along the way. Basically the same thing goes if the delta
> being generated does not contain any literal blocks - a 304 would also
> be a legitimate (and in my mind preferable in most cases) response. It
> isn't really the place of the spec to mandate either the delta or the
> traditional response, just to specify when they are legal and what they
> mean. Think of it kind of like chunked encodings.. for a server to
> generate one it needs to know that the client is capable of
> understanding the response - but even then whether or not to delimit any
> particular message as chunked is an implementation decision.
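
One possible server-side policy following the reasoning quoted above,
sketched in Python. The function and parameter names are hypothetical, and
the policy is an implementation choice rather than anything a spec
mandates. It also glosses over details such as whether the matched blocks
arrive in their original order.

```python
def choose_status(client_sent_if_block: bool,
                  matched_blocks: int,
                  literal_bytes: int) -> int:
    """Pick a response status for one transaction (illustrative only)."""
    if not client_sent_if_block:
        # Client has not shown it understands deltas: plain response.
        return 200
    if matched_blocks == 0:
        # No useful match: the delta would just be the whole instance,
        # so send a normal, cacheable 200 instead of a 226.
        return 200
    if literal_bytes == 0:
        # Delta carries no literal data: a 304 says the same thing.
        return 304
    # Some blocks matched and some new data is needed: send the delta.
    return 226
```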
>
> Delving into a related topic - I'm not so sure that disabling one
> transfer optimization (caching) in order to support another (deltas) is
> the way to go. I came to this particular party a little late; can anyone
> explain to me why this "226" response is used instead of the
> Accept-Encoding/Content-Encoding/Vary (If-Block, A-IM) triumvirate, which
> is meant to give fine-grained cache control? This is really all just a
> variation on etags and i-m after all..
> crccache_doc_http_crcsync_protocol.odt (am I reading the right
> document?) seems to assert that the need to disable caching is a fait
> accompli - but that isn't obvious to me at all. It frankly seems more
> like an implementation simplification in order to get some code up and
> running (with which I keenly sympathize) but that's not going to fly in
> the http standardization world.
>
> Relatedly, mandating 3 variations of "cache-control: no-cache" on every
> response is never going to survive standardization. Is there some
> inherent reason that deltas cannot coexist in a hierarchical cache
> environment but ranges (to pick an example) can?
>
> We all know that firewalls like to prevent extensions of protocols, even
> in ways protocols were meant to grow. For HTTP that is especially true
> of response codes. I'd bet money you will have more interop problems
> inventing 226 responses (a very uncommon thing to do) than you will if
> you operate in the existing response code framework. Even inventing new
> headers will cause some problems, but generally the application-level
> firewalls will just strip them from your requests, which will still leave
> a working (if non-delta'd) application.. whereas an unknown response code
> will probably result in the whole response being blocked (because the
> firewall cannot semantically make sense of it - and that's its job.)
>
> my nickel contribution.
>
> Meanwhile, I admit that tonight is the first time I have read the above
> mentioned doc in any real detail.
>
> I have some additional questions about it, that hopefully are kind of
> naive. Can anyone help me out?
>
> 1] What's the point of the A-IM request header? Does it serve a purpose
> that If-Block does not?
>
> 2] Why is <number-of-blocks> in If-Block a bit shift value instead of an
> integer? Is there some reason to prevent hash sets of sizes that aren't
> powers of 2?
>
> 3] Why does <number-of-blocks> exist at all? HTTP is an ascii based
> protocol.. the normal way would be to use the usual comma and whitespace
> rules and just list the hashes and terminate them with CRLF. Putting the
> count in there as a leader just forces some poor client to buffer.
>
> 4] doesn't the server need to know the block length the client used to
> calculate the hashes it sent in if-block? Otherwise how can it know it
> matches?
>
> 5] what's the point of the file-size request header? Or is it really the
> block-size I was getting at in #4?
>
> 6] the block match uses a single byte binary block id.. block id isn't
> defined anywhere - I assume it is the index of the offered hash in the
> request if-block? Starting at 0 for the first one? Would be good to say.
>
> 7] if the block match uses an 8 bit (single byte) ID how come up to 512
> blocks are allowed in an if-block ?
>
> 8] I think literal data blocks (both header and body) should have
> options for uncompressed data with a binary length indicator. Certainly
> not everything compresses well with zlib.
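
To make questions 3, 6 and 7 concrete, here is a small Python sketch. It
assumes the hashes are sent as a plain comma-separated list as suggested in
3], and that the block id is the zero-based index of the hash in that list
as guessed in 6]. Neither assumption is confirmed by the document, and the
function names are made up for illustration.

```python
from typing import List


def parse_hash_list(value: str) -> List[str]:
    """Split a comma-separated header value, ignoring empty elements."""
    return [item.strip() for item in value.split(",") if item.strip()]


def block_id_byte(index: int) -> bytes:
    """Encode a zero-based block index as the single-byte block id."""
    if not 0 <= index <= 255:
        # One byte addresses only 256 blocks, so an if-block with 512
        # hashes cannot be fully referenced (the mismatch raised in 7]).
        raise ValueError("block index does not fit in one byte")
    return bytes([index])
```

With a list like this no leading count is needed: the number of blocks is
just the length of the parsed list.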
>
> Hope this helps,
>
> -Patrick
>
> _______________________________________________
> Http-crcsync mailing list
> Http-crcsync at lists.laptop.org
> http://lists.laptop.org/listinfo/http-crcsync
>



-- 
This email is intended for the addressee only and may contain privileged
and/or confidential information


