ianb at colorstudy.com
Mon Aug 6 12:48:01 EDT 2007
C. Scott Ananian wrote:
> On 8/6/07, Yoric <Yoric at users.sf.net> wrote:
>> On Mon, 2007-08-06 at 03:12 -0400, C. Scott Ananian wrote:
>>> On 8/3/07, Yoric <Yoric at users.sf.net> wrote:
>>>> With this in mind, I intend to be able to reference
>>>> * the package itself (to be able to download it, from Firefox or from
>>>> anywhere else)
As an aside, but also because zip files seem to be complicating some
other use cases, in my tests zip files aren't substantially smaller than
JFFS2's normal zlib compression. I listed some examples here:
And a script to test it out for yourself here:
The zip files are a bit smaller, probably because zip can compress each
file as a whole while JFFS2 only compresses 4K chunks. Notably tar.gz files are
substantially smaller than either, but I don't believe tar files are
appropriate for storing pages since it's harder to access individual files.
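
For anyone who wants a rough feel for the chunking effect without a JFFS2
image at hand, here is a minimal sketch. It only approximates the idea by
running zlib over independent 4K slices; it does not model JFFS2's actual
on-flash format, and the function names and sample data are made up for
illustration.

```python
import zlib

CHUNK_SIZE = 4096  # the 4K chunk size mentioned above


def whole_size(data: bytes) -> int:
    """Compressed size when the entire file is one zlib stream."""
    return len(zlib.compress(data))


def chunked_size(data: bytes, chunk_size: int = CHUNK_SIZE) -> int:
    """Total compressed size when each chunk is compressed on its own."""
    return sum(
        len(zlib.compress(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    )


# Hypothetical sample: repetitive markup, loosely like a page of a book.
sample = b"".join(
    b"<p>Page %d of a sample book, with repeated markup.</p>\n" % i
    for i in range(2000)
)

if __name__ == "__main__":
    print("whole file:", whole_size(sample))
    print("4K chunks: ", chunked_size(sample))
```

Each chunk starts with an empty compression history and carries its own
stream overhead, which is one plausible reason whole-file compression
comes out ahead.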
Zip files would be faster to download, but not much smaller if you are
connecting with a web server and client that know gzip compression
(there will be some extra overhead from all the headers). If you have
Keep-Alive working on both sides too, the added latency won't be too bad
I'd think. (Though maybe HTTP pipelining would improve things even
more? Not many clients know how to do HTTP pipelining from what I
understand, but I'm a little fuzzy on the whole thing.)
More of a problem is actually figuring out which pages you want or
need. You can download them incrementally, which is nice (you can view
the first couple pages, for instance, while the rest is downloading).
But if you have to follow <link rel="next"> to get all the pages that's
going to be somewhat annoying to handle. A complete enumeration would
be nice to have up-front. If the server and client could agree on
passing around a tarball, that would be even nicer, but that's something
we'd have to rig up on our own. I can imagine a header:
On seeing that header, a client that knows it has a bunch of files
to fetch would fetch:
And the server would create a tar.gz file from all those urls. Anyway,
just one possible solution to the possible problem of getting lots of
files over HTTP.
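
The server-side half of that idea (bundling a set of resources into one
tar.gz) could look roughly like the sketch below. It deliberately leaves
out the HTTP header and request format, since those are still undecided;
bundle_resources and unbundle_resources are hypothetical names, and the
resources are assumed to be already available as a mapping from URL path
to bytes.

```python
import io
import tarfile


def bundle_resources(resources):
    """Pack a {url_path: body_bytes} mapping into tar.gz bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, body in sorted(resources.items()):
            # Store paths relative to the site root inside the archive.
            info = tarfile.TarInfo(name=path.lstrip("/"))
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    return buf.getvalue()


def unbundle_resources(blob):
    """Client side: unpack tar.gz bytes back into {url_path: body_bytes}."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out["/" + member.name] = tar.extractfile(member).read()
    return out
```

The client could then drop each unpacked body into its cache under the
original URL, so later requests never need to know a tarball was involved.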
>> Once the book has been downloaded locally or, say, added to a
>> hypothetical peer-to-peer library, do you refer to it with the same http
>> URL or with a file URL (respectively a peer-2-peer protocol URL) ?
> The same URL. That's the whole point of URLs! The hypothetical
> peer-to-peer library is just a fancy type of web cache: resources
> which live canonically at (say) http://cscott.net can actually be
> obtained from my neighbor (say) or the schoolserver. But this is done
> via the standard http proxying and caching mechanisms. We *could*
> return (say) a special http header which indicates that this resource
> is peer-to-peer cachable, but I'd prefer not: I still don't see how
> "books" are fundamentally different from other web content. It seems
> more likely that (for example) a schoolserver would be configured
> (server-side) to preferentially cache some content which it knows to
> be the textbooks for the class.
I wrote up some thoughts for this here:
When you have poor connectivity, you want the children to be able to get
a complete/consistent set of resources when they do have connectivity.
For instance, if they have material they are supposed to study you want
them to be able to "take it home" so to speak, by preloading their
laptop with that material.
Definitely fetching it from peers is a possibility. But that can't work
predictably; you can never be sure what your peers will have, or if
they'll be around. I suspect this will annoy teachers most of all.
The idea I was thinking of is to have some way to enumerate a set of
resources, and then use that as the basis for pre-fetching. A book is
an obvious set of resources (all the pages in the book), but you could
do the same for tutorials, or whatever.
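
As a rough sketch of the enumerate-then-prefetch idea, the code below
builds the list of pages by following <link rel="next"> from a starting
page, which is the fallback mentioned earlier when no up-front enumeration
exists. This is not the markup my proxy looked for; NextLinkParser and
enumerate_pages are hypothetical names, and fetch is assumed to be any
callable that returns a page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Remember the href of the first <link rel="next"> in a page."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.next_href is not None:
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if "next" in rels and attrs.get("href"):
            self.next_href = attrs["href"]


def enumerate_pages(start_url, fetch, limit=1000):
    """Follow rel="next" links and return the page URLs in order."""
    urls, seen = [], set()
    url = start_url
    # Stop at the last page, on a loop, or at the safety limit.
    while url and url not in seen and len(urls) < limit:
        seen.add(url)
        urls.append(url)
        parser = NextLinkParser()
        parser.feed(fetch(url))
        url = urljoin(url, parser.next_href) if parser.next_href else None
    return urls
```

The resulting list is what a prefetcher would walk through while the
laptop still has connectivity. Note that it costs one round trip per page
just to discover the set, which is why a complete enumeration up-front
would be nicer.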
I played around with a proxy that looked for this kind of markup, and
added links to cache and uncache pages based on that. The proxy itself
was kind of lame, though. Anyway, for the curious:
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
: Write code, do good : http://topp.openplans.org/careers