[Http-crcsync] http-sync standard
Toby Collett
thjc at plan9.net.nz
Fri Apr 3 18:25:38 EDT 2009
Hi,
Since we are starting to get more people involved in this project and
related discussions, I thought I would try and set down the 'standard' as I
see it at the moment.
I have attached a very quick first draft of the protocol as I see it,
looking forward to the comments.
On a related note, would it be more useful for this to be in the crcsync git
repo, or up on a wiki or something like that?
Toby
-------------- next part --------------
HTTP-SYNC standard
Authors: Toby Collett, ...
Date: 2009-04-04
Version: 1.0
Status: Draft
TODO:
Chunking
Aim:
The purpose of this document is to describe a method of reducing data transfer
in http requests based on information that the client already holds in a local
cache.
This is particularly applicable to slow links to the internet, and there are
several possible usage scenarios.
One possible scenario is to have a caching proxy on the client end of an internet
connection, and a 'sync' proxy at the ISP end. This would allow improved use
of a limited bandwidth connection with no modification either to the client (i.e.
a browser) or the upstream web server.
Another alternative would be for internet sites that change regularly. These could
send a delta to http sync enabled clients.
Scope:
This standard specifies a set of http headers added to a client request, and an
encoding for the server response.
Theory:
The basic operation requires the client to inform the server of the information
about the requested page that the client already holds.
This is transmitted in two forms. One is an SHA1 hash of the complete cached
file, and the other is a set of block hashes.
The block hashes are formed using a crc32 (64?) hash. This has the advantage that
it can be used as a rolling hash on the server end. The least significant 30 bits
of the hash are transmitted for each block.
The content should be divided into 20 equal-sized blocks. Any data remaining after
the last full block is not included in any block hash.
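As a rough illustration of the block hashing described above, here is a minimal
Python sketch. It assumes plain crc32 (as provided by zlib) is the hash in use,
that the block size is the body length divided by 20 and rounded down, and that
a body shorter than 20 bytes simply gets no block hashes. The function name
block_hashes is made up for this example and is not part of the draft.

```python
import zlib

NUM_BLOCKS = 20              # fixed by the draft
HASH_MASK = (1 << 30) - 1    # keep the least significant 30 bits


def block_hashes(body: bytes):
    """Return (block_size, hashes) for a cached body.

    Assumption: block_size is len(body) // 20, so up to 19 trailing
    bytes are left out of every block hash.
    """
    block_size = len(body) // NUM_BLOCKS
    if block_size == 0:
        # Assumption: bodies too short to split get no block hashes.
        return 0, []
    hashes = [
        zlib.crc32(body[i * block_size:(i + 1) * block_size]) & HASH_MASK
        for i in range(NUM_BLOCKS)
    ]
    return block_size, hashes
```

For a 1024 byte body this gives a block size of 51, so the 20 blocks cover the
first 1020 bytes and the last 4 bytes are not hashed.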
Request Headers:
An http-sync request must include the following headers:
Block-Size:
This will be the block size of the transmitted hashes, in integer bytes, e.g.
Block-Size: 20345
Block-Hashes:
This will be the block hashes, base64 encoded. No padding is needed for the
base64 encoding, as the hash size (30 bits) is a multiple of 6 bits.
Content-Hash:
This will be an SHA1 hash of the entire cached body, and will allow the server
to transmit deltas based on its knowledge of past versions of the page.
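The three request headers could be built along the following lines. This is
only a sketch: the draft does not say how the 30 bit hashes are ordered inside
the base64 text or how Content-Hash is written, so the code assumes the most
significant 6 bits come first, the standard base64 alphabet, and a hex digest.
The names encode_hash and build_request_headers are invented for the example.

```python
import hashlib

# Standard base64 alphabet (RFC 4648).
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_hash(h: int) -> str:
    """Encode one 30-bit hash as exactly 5 base64 characters.

    Assumption: most significant 6 bits first. 30 bits = 5 * 6 bits,
    which is why no '=' padding is ever needed.
    """
    h &= 0x3FFFFFFF
    return "".join(B64[(h >> shift) & 0x3F] for shift in (24, 18, 12, 6, 0))


def build_request_headers(body: bytes, block_size: int, hashes) -> dict:
    """Return the http-sync request headers for a cached body."""
    return {
        "Block-Size": str(block_size),
        "Block-Hashes": "".join(encode_hash(h) for h in hashes),
        # Assumption: the SHA1 is sent as lowercase hex.
        "Content-Hash": hashlib.sha1(body).hexdigest(),
    }
```

With 20 blocks the Block-Hashes value is always 100 characters long.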
Encoding:
This standard uses a simple binary encoding. The file is transmitted as a set of
sections. Each section starts with a single-byte header, which is one of:
'L' A literal section. This is followed by a four-byte size (in network byte order),
followed by the literal data.
'C' A cache section. This is followed by two four-byte integers (in network byte order).
The first is an offset into the cached file on the client system, the second
the size of the data to read from the cache.
A section starting with a binary value from 0 to 19 indicates that the block of that
number should be inserted from the cached file. This is shorthand for the 'C' section
when matching a full block.
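To make the encoding concrete, a client side decoder might look roughly like
the sketch below. It assumes the four-byte integers are unsigned, and that a
block number n refers to the bytes n * block_size up to (n + 1) * block_size
of the cached file. The function name decode_body is hypothetical.

```python
import struct

NUM_BLOCKS = 20  # block shorthand bytes are 0..19


def decode_body(encoded: bytes, cached: bytes, block_size: int) -> bytes:
    """Rebuild the response body from the encoded sections and the cache."""
    out = bytearray()
    pos = 0
    while pos < len(encoded):
        tag = encoded[pos]
        pos += 1
        if tag == ord("L"):
            # Literal: 4-byte size (network byte order), then the data.
            (size,) = struct.unpack_from("!I", encoded, pos)
            pos += 4
            if pos + size > len(encoded):
                raise ValueError("truncated literal section")
            out += encoded[pos:pos + size]
            pos += size
        elif tag == ord("C"):
            # Cache: 4-byte offset and 4-byte size into the cached file.
            offset, size = struct.unpack_from("!II", encoded, pos)
            pos += 8
            out += cached[offset:offset + size]
        elif tag < NUM_BLOCKS:
            # Shorthand: copy whole block number `tag` from the cache.
            start = tag * block_size
            out += cached[start:start + block_size]
        else:
            raise ValueError("unknown section header: %d" % tag)
    return bytes(out)
```

Since 'L' and 'C' are byte values 76 and 67, they cannot be confused with the
block numbers 0 to 19.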
The response should also include an SHA1 Content-Hash in its headers so the reassembled
file can be checked. Failures in this hash check should be considered as a transmission
error, and should be returned to the client as such.
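The final check could be as simple as the following sketch, which assumes the
Content-Hash header carries the SHA1 as a hex string. The exception name
HashMismatch and the function verify_body are invented for illustration; how
the error is actually reported back to the client is left open by the draft.

```python
import hashlib


class HashMismatch(Exception):
    """Raised when the reassembled body does not match Content-Hash."""


def verify_body(body: bytes, content_hash: str) -> bytes:
    """Return body unchanged if its SHA1 matches, else raise HashMismatch."""
    actual = hashlib.sha1(body).hexdigest()
    # Assumption: hex digest, compared case-insensitively.
    if actual != content_hash.strip().lower():
        raise HashMismatch("expected %s, got %s" % (content_hash, actual))
    return body
```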