[Testing] [sugar] Automated testing, OLPC, code+screencasts.

Titus Brown titus at caltech.edu
Wed Mar 26 22:24:18 EDT 2008


On Wed, Mar 26, 2008 at 09:19:19PM -0400, Benjamin M. Schwartz wrote:
-> Titus Brown wrote:
-> | Other than that, I'm interested in figuring out how to encourage the
-> | OLPC project to build some automated tests without doing it all myself
-> | :).  Thoughts welcome.
-> |
-> | Basically I think it's incredibly important for the OLPC to invest in
-> | automated tests.
-> 
-> There are many people in and around OLPC who feel the same.  There are
-> also a few who are exceptionally familiar with Tinderbox, and for a time
-> we had a tinderbox running tests on every build, hooked up to an actual XO
-> to maximize realism.  The closest I'm aware of to actual GUI testing was
-> ensuring that Sugar started successfully.  See
-> http://wiki.laptop.org/go/Tinderbox_Testing .

I've gone through the testing docs several times (and I've been
lurking on this list for a while, too).  I applaud the Tinderbox setup,
but I think the complete absence of unit tests, functional tests,
automated sugar-jhbuild builds, acceptance tests, and automated test
infrastructure in general sort of speaks for itself.  There are tons of
packages -- some of them at least moderately mature -- that can help you
manage large Python code bases, establish tests, and otherwise ensure
some sort of code quality.  None of them are in evidence.
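
Even a trivial start would help.  A first unit test might look
something like the sketch below -- the helper function here is a
made-up stand-in; in practice you would import the code under test
from the Sugar tree instead:

    import unittest

    def parse_activity_name(bundle_id):
        # Made-up stand-in for a real Sugar helper; in practice,
        # import the function under test from the Sugar code base.
        return bundle_id.split('.')[-1]

    class ParseActivityNameTest(unittest.TestCase):
        def test_dotted_bundle_id(self):
            self.assertEqual(parse_activity_name('org.laptop.Chat'),
                             'Chat')

        def test_bare_name(self):
            self.assertEqual(parse_activity_name('Chat'), 'Chat')

    if __name__ == '__main__':
        unittest.main()

Run under nose or plain unittest, a few hundred of these accumulate
surprisingly quickly.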

Automated testing may not quite be Software Engineering 101, but I don't
think anyone would venture to claim that its complete absence reflects
good, competent software engineering practice.

...but enough said.  I'll try not to do any more carpet-bombing until
I've worked with the existing code base a bit more & have built up some
more cred.

-> | As you can guess from the "death spiral" post, I'm not terribly
-> | interested in *debating* the usefulness of automated tests, and I don't
-> | buy into the notion that GUI automation is impossible, either.  I would
-> | very much like to try to introduce systematic and robust GUI testing
-> | into the OLPC and I will be working towards that end as time and
-> | resources permit.  Constructive comments and genuine interest are
-> | welcome!
-> 
-> You will get a warmer welcome if you take a more positive tone.

I'm being intentionally obnoxious because I think it's the quickest way
to confront the issues facing the OLPC software effort.  If you look at
the comments on that post, you'll also see what we're up against --
people who will seize on any excuse they can find to get out of writing
tests.

-> Personally, I am skeptical, for several reasons:
-> 0. We already know what the bugs are.  The problem is not new bugs, and
-> it's not regressions.  The problem is bugs that we've known about for a
-> long time that are just not easy to fix.
-> 1. Regressions are rare, and they are not the main problem.  At least 90%
-> of OLPC's problems fall into the category of unimplemented functionality,
-> not really bugs at all.  Recently, a number of features have been removed
-> because we finally did enough testing to discover underlying bugs, but
-> these were not regressions.
-> 2. Many, and perhaps most, of OLPC's remaining difficult bugs are related
-> to the network.  They are most commonly related to the closed wireless
-> firmware, which is buggy and lacks key features regarding mesh routing and
-> multicast.
-> 3. Almost all of OLPC's major bugs are Heisenbugs.  They often don't
-> appear at all with only one laptop, and appear rarely until one has 12 or
-> more laptops sharing a wireless mesh.
->
-> One way to prove me wrong would be to filter the ~6800 bugs in Trac into
-> "Regression", "Non-regression", and "Other".  I think you will find that
-> there are very few regressions.  Another perfectly good way to prove me
-> wrong is to build a complete testing suite and start spotting regressions,
-> but that's more difficult.

Well, my goal isn't to prove you wrong, but rather to get some sort of
handle on whatever problems are facing the OLPC with its software
reliability :).  Nothing would make me happier than to be proven wrong &
to find that the OLPC software is quite reliable on individual laptops!
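
(As an aside, mechanically triaging the ~6800 Trac tickets would be
pretty tractable.  Assuming the XML-RPC plugin is enabled on the OLPC
Trac -- the URL and query string below are guesses -- a short script
could dump open tickets for hand-sorting into regression vs.
non-regression:

    import xmlrpclib

    # URL is a guess; this needs the Trac XML-RPC plugin enabled.
    server = xmlrpclib.ServerProxy('http://dev.laptop.org/login/xmlrpc')

    # ticket.query returns the ids of tickets matching a query string;
    # ticket.get returns (id, time_created, time_changed, attributes).
    for ticket_id in server.ticket.query('status!=closed&max=0'):
        tid, created, changed, attrs = server.ticket.get(ticket_id)
        print '%5d  %s' % (tid, attrs.get('summary', ''))

One ticket.get per ticket is slow, but for a one-off triage run that
hardly matters.)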

However, as a fairly experienced open source hacker myself -- albeit one
now passionate about testing -- I sincerely doubt that anyone has a good
idea of where the bugs lurk within the OLPC code.  Regressions tend to
go unmentioned, especially minor ones, but their cumulative effect on a
code base is destructive and hard to measure.  Moreover, progressively
constraining a code base with automated unit and functional tests has a
real quenching effect on "Heisenbugs".  Obviously non-determinism is an
unsolved (and probably insoluble) problem in general -- remind me to
tell you sometime about a brief exchange I had with Google on the
subject -- but I honestly don't see how one can even start tackling
non-determinism until some sort of automated test infrastructure is up
and running.
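
To make that concrete: a surprising amount of apparent non-determinism
evaporates once the harness pins down the obvious sources of
randomness.  A minimal sketch, covering just the PRNG (the cheapest
source to nail down):

    import random
    import unittest

    class SeededTestCase(unittest.TestCase):
        """Tests inheriting from this see a fixed, replayable PRNG."""

        SEED = 42  # arbitrary but fixed, so failures can be replayed

        def setUp(self):
            random.seed(self.SEED)

The same idea extends, with more work, to clocks, thread scheduling,
and simulated network delay.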

Over the next few months, ideally aided by Zach Riggle and Grig
Gheorghiu through the GSoC, I hope we can put together some sort of GUI
testing framework, a continuous build system, and basic fixtures for
some internal functional tests.  After that I should be in a better
position to address your specific points.
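
As a very first step, I'd like something in the spirit of the Tinderbox
"did Sugar start?" check that anyone can run locally.  A rough sketch --
the Xvfb invocation, display number, and timeout are all guesses about
the test host:

    import os
    import signal
    import subprocess
    import sys
    import time

    def sugar_starts(timeout=30):
        """Smoke test: does Sugar stay up under a virtual X server?"""
        xvfb = subprocess.Popen(['Xvfb', ':99'])
        time.sleep(2)  # give the X server a moment to come up
        try:
            env = dict(os.environ, DISPLAY=':99')
            sugar = subprocess.Popen(['sugar'], env=env)
            time.sleep(timeout)
            alive = sugar.poll() is None  # still running == pass
            if alive:
                os.kill(sugar.pid, signal.SIGTERM)
            return alive
        finally:
            os.kill(xvfb.pid, signal.SIGTERM)

    if __name__ == '__main__':
        sys.exit(0 if sugar_starts() else 1)

Hooked up to cron or a buildbot, even that much would automatically
catch the "Sugar won't start at all" class of regression.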

cheers,
--titus

