A few comments on recent work on Gadget.

Tue Jun 17 19:53:14 EDT 2008

Dear Daf & Guillaume,

I read through Guillaume's 'activity' branch a few days ago and recorded
several comments which I'd like to forward to you. I hope they help.

  * Thanks for the nice documentation linked to from

      http://wiki.laptop.org/go/Gadget

    I really appreciate the effort spent here.

  * It seems that we're using more naive algorithms than when I
    last looked. activity_by_participants() is even marked FIXME. Have
    you been able to execute any scaling tests on the server to see
    whether these algorithms are problematic?

  * We're still maintaining an entirely in-memory database. 
      - How much memory does each additional connection, view, activity,
        etc. cost us? 
      - How will the component respond to low-memory conditions? 
      - Can we easily test this by rlimit()-ing the component separate
        from the rest of the system?

  * I see that much of the parsing code is wrapped in try/except blocks
    which log parsing exceptions. 
      - Does the parsing code never throw other errors? 
      - If an uncaught exception propagates, how much of the component
        dies? Is it just the connection which generated the exception or
        will the whole component go down?

  * Thanks for keeping the unit tests up to date with the rest of the
    code. 
      - Any progress on random-data tests or on load/scaling tests?

  * I see that we're beginning to use xpath queries instead of manually
    destructuring our input. 
      - Are there any downsides to using xpath here?

  * I'm glad to see the use of __slots__ to reduce memory usage of
    several common objects. 
      - How much memory is consumed under each of your test loads? 
      - Have you done any testing for memory leaks? 
      - If not, how long is the longest period of time you've left the
        component running under load (and idle?)?

  * I see several places where we've got big if/elif trees. 
      - Perhaps these should turn into dictionary lookups or method
        dispatch on objects?

  * It seems that when handling an multi-part query, the server
    accumulates all its results in memory, then serializes them, then,
    transmits them. 
      - How much memory/time could be consumed by a big query?
      - Could we profitably stream the results back to the client rather
        than batching them?

  * I infer that "views" contain stored queries and cached results. This
    should be explained somewhere in the code. More generally, the code
    should contain pointers to the relevent wiki documentation. (Or
    vice-versa).

  * The "search", "random", and "all" logic seems to be duplicated
    throughout the code. We could profitably abstract this into a small
    class hierarchy for queries.

  * The activity_update() logic seems clunky to me. Why not just inform
    all the views of the request and let them apply their own update
    logic?

  * I see a couple of instances of map and filter in this code.
    Unfortunately, this isn't Haskell. Think about using generators or
    regular iteration. Fuse your own loops. 

  * It seems that the Roster code doesn't adequately handle exceptions
    which occur during file loading. Please comment.

  * How confident in are you in the unicode-safety of your code. Why?

----

I'll try to get another person from 1cc or the general community
to review this code and to give you some more feedback. I've also
started reading the telepathy-gabble-gadget branch and will offer
feedback on it as I'm able. Please speak up loudly when you've got
something that you like someone else to review.

Thanks,

Michael