A few comments on recent work on Gadget.

Fri Jun 20 15:45:03 EDT 2008

I'm puzzled by what you could mean to say "I wrote two prototypes, one using
SQLite and one in Python".  Do you mean you placed content in a SQLite
database using its command line interface?  There is a Python interface to
SQLite.  Do you mean you reimplemented the functions of SQLite in Python?
Unlikely, since SQLite provides an ACID, mostly SQL92 conformant database
engine that is also accessible via a native call interface.   What were your
requirements for this application?

On Fri, Jun 20, 2008 at 12:12 PM, Dafydd Harries <
dafydd.harries at collabora.co.uk> wrote:

> Ar 17/06/2008 am 19:53, ysgrifennodd Michael Stone:
> > Dear Daf & Guillaume,
> >
> > I read through Guillaume's 'activity' branch a few days ago and recorded
> > several comments which I'd like to forward to you. I hope they help.
> >
> >   * Thanks for the nice documentation linked to from
> >
> >       http://wiki.laptop.org/go/Gadget
> >
> >     I really appreciate the effort spent here.
> >
> >   * It seems that we're using more naive algorithms than when I
> >     last looked. activity_by_participants() is even marked FIXME. Have
> >     you been able to execute any scaling tests on the server to see
> >     whether these algorithms are problematic?
>
> Not yet.
>
> >
> >   * We're still maintaining an entirely in-memory database.
> >       - How much memory does each additional connection, view, activity,
> >         etc. cost us?
> >       - How will the component respond to low-memory conditions?
> >       - Can we easily test this by rlimit()-ing the component separate
> >         from the rest of the system?
>
> When I started writing the database I wrote two prototypes, one using
> SQLite
> and one in Python. I then wrote a simulator for database access and timed
> both
> prototypes; the Python one was faster. Porting this simulator to the
> database
> we currently have is on my todo list.
>
> >   * I see that much of the parsing code is wrapped in try/except blocks
> >     which log parsing exceptions.
> >       - Does the parsing code never throw other errors?
> >       - If an uncaught exception propagates, how much of the component
> >         dies? Is it just the connection which generated the exception or
> >         will the whole component go down?
>
> The Twisted mainloop will catch the exception and log it with a traceback.
> The
> net effect would be the same as if the message was never received, assuming
> that Gadget doesn't make any changes to its state before the message
> parsing
> fails.
>
> >   * Thanks for keeping the unit tests up to date with the rest of the
> >     code.
> >       - Any progress on random-data tests or on load/scaling tests?
>
> No.
>
> >   * I see that we're beginning to use xpath queries instead of manually
> >     destructuring our input.
> >       - Are there any downsides to using xpath here?
>
> I'm not sure. It's not really xpath; it's an xpath-like language that
> twisted.words implements. The motivation for using it was to simplify the
> code, but I'm not entirely happy with the results. I think there's probably
> a
> ground between manually destructuring and using xpath, but my explorations
> in
> that direction didn't yield anything I considered worth using.
>
> Some of the Twisted developers have expressed a desire to use an XML
> library
> like lxml as the basis for twisted.words.protocols.jabber as opposed to the
> current home-grown API but this isn't something I've looked into in detail.
>
> >
> >   * I'm glad to see the use of __slots__ to reduce memory usage of
> >     several common objects.
> >       - How much memory is consumed under each of your test loads?
> >       - Have you done any testing for memory leaks?
> >       - If not, how long is the longest period of time you've left the
> >         component running under load (and idle?)?
>
> I haven't made any measurements yet. Certainly malicious clients could make
> Gadget consume a lot of memory.
>
> >
> >   * I see several places where we've got big if/elif trees.
> >       - Perhaps these should turn into dictionary lookups or method
> >         dispatch on objects?
>
> Absolutely. We dispatch <iq> messages from a table, and I think we have
> some
> FIXME comments for doing something similar with <presence> and <message>
> messages.
>
> >   * It seems that when handling an multi-part query, the server
> >     accumulates all its results in memory, then serializes them, then,
> >     transmits them.
> >       - How much memory/time could be consumed by a big query?
> >       - Could we profitably stream the results back to the client rather
> >         than batching them?
>
> Good question. This seems like something we should measure to see if it's
> worth optimising.
>
> >
> >   * I infer that "views" contain stored queries and cached results. This
> >     should be explained somewhere in the code. More generally, the code
> >     should contain pointers to the relevent wiki documentation. (Or
> >     vice-versa).
>
> Agreed.
>
> >
> >   * The "search", "random", and "all" logic seems to be duplicated
> >     throughout the code. We could profitably abstract this into a small
> >     class hierarchy for queries.
>
> Agreed.
>
> >
> >   * The activity_update() logic seems clunky to me. Why not just inform
> >     all the views of the request and let them apply their own update
> >     logic?
>
> I don't follow; could you elaborate?
>
> >   * I see a couple of instances of map and filter in this code.
> >     Unfortunately, this isn't Haskell. Think about using generators or
> >     regular iteration. Fuse your own loops.
>
> I assume you mean the roster code. I agree that it could be nicer; I've
> made some attempt to clean it up:
>
>  http://dev.laptop.org/git?p=users/daf/gadget;a=shortlog;h=roster
>
> >   * It seems that the Roster code doesn't adequately handle exceptions
> >     which occur during file loading. Please comment.
>
> Good point; I filed a ticket about this:
>
>  http://dev.laptop.org/ticket/7321
>
> >
> >   * How confident in are you in the unicode-safety of your code. Why?
>
> Not particularly confident, because our tests don't really cover it.
> Certainly
> something to work on. Ticket:
>
>  http://dev.laptop.org/ticket/7323
>
> > I'll try to get another person from 1cc or the general community
> > to review this code and to give you some more feedback. I've also
> > started reading the telepathy-gabble-gadget branch and will offer
> > feedback on it as I'm able. Please speak up loudly when you've got
> > something that you like someone else to review.
>
> Now that the code-review list exists, perhaps it would be an appropriate
> forum
> for requesting and posting Gadget reviews.
>
> You ask several good questions which I don't have answers to yet. Writing
> some
> stress tests so I can answer them better is a priority. Ticket:
>
>  http://dev.laptop.org/ticket/7322
>
> Thanks for the feedback!
>
> --
> Dafydd
> _______________________________________________
> Devel mailing list
> Devel at lists.laptop.org
> http://lists.laptop.org/listinfo/devel
>

-- 
"Always do right," said Mark Twain. "This will gratify some people and
astonish the rest."
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.laptop.org/pipermail/devel/attachments/20080620/3d45f46c/attachment.html>