[Http-crcsync] Apache proxy CRCsync & mozilla gsoc project?

Martin Langhoff martin.langhoff at gmail.com
Thu Apr 2 04:00:46 EDT 2009


On Thu, Apr 2, 2009 at 9:44 AM, WULMS Alexander <Alex.WULMS at swift.com> wrote:
>>
>>If the cache just blows away the cached base page it was using when one of
>>these errors occurs, that should Do The Right Thing even without seed.
> I see some concurrency issues and race conditions if the cache would simply blow away the cached base page, just in case that two
> different concurrent requests are using the same base page and only one of them suffers from the checksum clash.

Is it really a problem? Assuming no seed, the requests are idempotent,
so if one clashes and the other one doesn't, the successful one
replaces the local cache entry with the latest successfully downloaded
page. So it is actually 'reseeding' any requests that follow, as the
cached document has changed.
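
A rough sketch of that argument, not the actual proxy code. The class
and method names are made up for illustration, and the "drop only if
unchanged" check is my assumption about how a cache could avoid
discarding a page that a concurrent request has just refreshed:

```python
import threading


class BasePageCache:
    """Hypothetical in-memory cache of base pages, keyed by URL."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pages = {}

    def get(self, url):
        # Return the current base page for this URL, or None.
        with self._lock:
            return self._pages.get(url)

    def store(self, url, body):
        # A successful download replaces the entry, which effectively
        # 'reseeds' every request that follows.
        with self._lock:
            self._pages[url] = body

    def drop_if_same(self, url, base):
        # On a checksum clash, blow away the entry only if it is still
        # the base page this request used. If a concurrent request
        # already stored a newer page, leave that one alone.
        with self._lock:
            if self._pages.get(url) is base:
                del self._pages[url]
```

With this shape, the request that clashes and the request that succeeds
can finish in either order and the cache ends up either empty or holding
the newest page, never a stale one.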

Thinking in the opposite direction, will there be actual usable
opportunities for the upstream proxy to save cpu time or bandwidth
with content that doesn't change? The "win" scenario for avoiding a
random seed is:

 - The origin server didn't mark this as cacheable.
 - The upstream proxy can cache the files it serves, plus their
already computed hash.
 - On a request for an already-cached-and-hashed file, the upstream
proxy could avoid re-computing the hash by comparing the (de-chunked)
data stream to the file on disk.
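
The scenario above could look roughly like this. It is only a sketch
under assumptions: the HashCache name, the on-disk layout, the use of
zlib.crc32 per block and the block size are all placeholders, not what
crcsync actually does:

```python
import hashlib
import json
import os
import zlib


def block_hashes(data, block_size):
    # One CRC32 per fixed-size block (stand-in for the real block hash).
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


class HashCache:
    """Hypothetical upstream-proxy cache of served bodies and their hashes."""

    def __init__(self, directory, block_size=1024):
        self.directory = directory
        self.block_size = block_size

    def _paths(self, url):
        # Derive a filesystem-safe name from the URL.
        name = hashlib.sha1(url.encode("utf-8")).hexdigest()
        body_path = os.path.join(self.directory, name)
        return body_path, body_path + ".crc"

    def hashes_for(self, url, body):
        """Return (hashes, reused) for the de-chunked response body."""
        body_path, crc_path = self._paths(url)
        if os.path.exists(body_path) and os.path.exists(crc_path):
            with open(body_path, "rb") as f:
                unchanged = f.read() == body
            if unchanged:
                # Same bytes as last time: reuse the stored hashes.
                with open(crc_path) as f:
                    return json.load(f), True
        # New or changed content: hash it and store both for next time.
        hashes = block_hashes(body, self.block_size)
        with open(body_path, "wb") as f:
            f.write(body)
        with open(crc_path, "w") as f:
            json.dump(hashes, f)
        return hashes, False
```

Note that the "reuse" path still reads the whole cached file back from
disk to compare it, which is exactly the IO cost being traded against
the CPU cost of hashing.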

So the tradeoff is increased disk storage and IO while serving content,
in exchange for saved CPU time. Whether that is a win depends on
assumptions about the relative costs of IO and CPU time...

cheers,


martin
btw - pruned the CC's to stop mailman from complaining of too many recipients...
-- 
 martin.langhoff at gmail.com
 martin at laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

