e-Book reader

Mon Aug 6 14:15:21 EDT 2007

On Mon, 2007-08-06 at 11:14 -0400, C. Scott Ananian wrote:
> > In what you write, is "canonical.source" a string constant that should
> > be interpreted by the proxy (say a variant on "localhost") or are there
> > a variety of different canonical sources ?
> 
> No, it is exactly what the URL says: the canonical source of the book.
>  The "publisher".  You should be able to use standard HTTP on that URL
> and get the contents of the book.

ok

> > Once the book has been downloaded locally or, say, added to a
> > hypothetical peer-to-peer library, do you refer to it with the same http
> > URL or with a file URL (respectively a peer-2-peer protocol URL) ?
> 
> The same URL.  That's the whole point of URLs!  The hypothetical
> peer-to-peer library is just a fancy type of web cache: resources
> which live canonically at (say) http://cscott.net can actually be
> obtained from my neighbor (say) or the schoolserver.  But this is done
> via the standard http proxying and caching mechanisms.  We *could*
> return (say) a special http header which indicates that this resource
> is peer-to-peer cachable, but I'd prefer not: 

Assuming that I've built a book myself or received that book from the SD
slot, how are you going to tell your proxy server about it? How should
the proxy server know the original URL of the book? Aren't you going to
end up needing to upload/install that book, plus yet more
meta-information, to the proxy?

> I still don't see how
> "books" are fundamentally different from other web content.  It seems
> more likely that (for example) a schoolserver would be configured
> (server-side) to preferentially cache some content which it knows to
> be the textbooks for the class.

I believe one of the main differences in point of view between us is
that you're fetching a book (through a controlled path, and while
on-line), while I'm receiving one (without prior knowledge, quite
possibly off-line). Maybe you're more of an on-line developer while I'm
more of a desktop guy.

> Why not http://canonical.source/alice.in.wonderland.zip/ ?
Oops. You're right, of course.

> If your .zip files are some super-special book-archive format, then
> they could contain a manifest which tells what the first page is.
Sure. Then, the web server needs to be able

> But this point is moot.  How do you find out about a book?  Someone
> gives you a URL, either via a link or some other means.  
...or someone gives me a file. While they could add a link alongside the
file, that somewhat defeats the point of "one-book-inside-a-package".
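
To make the "received as a file" case concrete, here is a minimal sketch,
assuming the archive carries a manifest naming both the canonical URL and
the first page. The file name manifest.json and the keys canonical_url and
first_page are hypothetical; nothing in this thread defines such a format.

```python
import io
import json
import zipfile

# Hypothetical manifest name and keys; assumptions for illustration only.
MANIFEST_NAME = "manifest.json"


def read_manifest(archive):
    """Return (canonical_url, first_page) from a book archive.

    Returns (None, None) when the archive has no manifest, which is the
    "bunch of files without metadata" case.
    """
    with zipfile.ZipFile(archive) as book:
        if MANIFEST_NAME not in book.namelist():
            return None, None
        manifest = json.loads(book.read(MANIFEST_NAME).decode("utf-8"))
    return manifest.get("canonical_url"), manifest.get("first_page")


def make_demo_book(with_manifest=True):
    """Build a tiny in-memory book archive, standing in for a file from the SD slot."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as book:
        if with_manifest:
            book.writestr(MANIFEST_NAME, json.dumps({
                "canonical_url": "http://canonical.source/alice.in.wonderland.zip/",
                "first_page": "index.html",
            }))
        book.writestr("index.html", "<html><body>Chapter 1</body></html>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(read_manifest(make_demo_book()))
    print(read_manifest(make_demo_book(with_manifest=False)))
```

With such a manifest, a reader or proxy handed the file off-line could
learn the original URL without any separate link; without it, the URL has
to come from somewhere else.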

> That's how you find the "first page".
> This is exactly how it works on the web today, and we don't need to
> reinvent any of it.
I only half agree with that, but that's beside the point.

> > Does this mean that in addition to hacking the proxy, you also need to
> > hack the web server ?
> 
> I could unpack the files and it will work just fine: you just won't be
> able to download the entire book at one go. That's a perfectly
> reasonable fall-back position, and should be completely transparent to
> the user (except that pages might take a bit longer to load, since
> they're being fetched on demand).  Or you can write a simple cgi
> script to serve both the .zip and the individual pages.  That's not
> "hacking the web server", any more than installing mediawiki is.
> Really, now.  Did we 'hack the web server' to make
> http://wiki.laptop.org/go/Autoreinstallation_image work?

My apologies, "hack" was poor wording. What I meant is that you start
needing a specially-configured web server. In other words, you're largely
moving the difficulty from the client to the server.
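
For reference, the "simple cgi script" idea quoted above could look roughly
like the following sketch. It is not an actual OLPC or web-server API: the
function serve(), the helper make_demo_zip(), and the index.html default
for a trailing slash are all assumptions made for illustration.

```python
import io
import zipfile


def serve(book_bytes, book_name, path):
    """Map a request path to (status, body) for one book archive.

    /<book_name>         -> the whole .zip (bulk download)
    /<book_name>/        -> index.html inside the archive (assumed default)
    /<book_name>/<page>  -> a single page extracted from the archive
    """
    prefix = "/" + book_name
    if path == prefix:
        return 200, book_bytes
    if not path.startswith(prefix + "/"):
        return 404, b""
    member = path[len(prefix) + 1:] or "index.html"
    with zipfile.ZipFile(io.BytesIO(book_bytes)) as book:
        try:
            return 200, book.read(member)
        except KeyError:
            # ZipFile.read raises KeyError for a member that does not exist.
            return 404, b""


def make_demo_zip():
    """Build a two-page archive in memory for the example."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as book:
        book.writestr("index.html", "first page")
        book.writestr("page2.html", "second page")
    return buf.getvalue()


if __name__ == "__main__":
    data = make_demo_zip()
    name = "alice.in.wonderland.zip"
    print(serve(data, name, "/" + name + "/page2.html"))
    print(serve(data, name, "/" + name + "/missing.html"))
```

The same URL space then supports both page-by-page fetching and a single
bulk download, at the cost of a server that knows about the archive.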

> The beauty of using standard HTTP is that the pieces work
> incrementally.   URLs work as they are without any fancy proxying or
> serving.  You can gradually make the individual pieces smarter to do
> fancy caching or bulk downloading or whatnot, without having to build
> the whole edifice at once.  

That may or may not be good, depending on what you intend to do. I
assume, perhaps wrongly, that XOs (like other e-book devices) will spend
most of their time off-line. If this is correct, then downloading a
single page by default is not sufficient, as one is likely to, say, read
the next page after having left the school building.

> Further, the content generated is still
> accessible to users *without* any of your fancy tools, avoiding the
> creation of an OLPC content ghetto.

That's a very good point. So, perhaps, I should return to my original
suggestion of bunch-of-files-inside-a-zip-archive-without-metadata.

>   --scott

Cheers,
 David