Trac: release management

Garrett Goebel garrett.goebel at gmail.com
Fri Jun 13 14:09:20 EDT 2008


On Thu, Jun 5, 2008 at 6:57 PM, Martin Dengler <martin at martindengler.com> wrote:
> On Thu, Jun 05, 2008 at 04:25:53PM -0400, Garrett Goebel wrote:
>
>> ... I'll write you a query which will give all the
>> non-closed tickets which have never been changed by the owner.
>
> Are you hoping to get OLPC management more justification for hiring
> more people from this metric?  Or convince others that OLPC is
> overworked?

I'm hoping to:
o  make the state of inactive tickets easier to see, and distinguish
   between tickets which have had:
   - no human interaction
   - no owner interaction
   - no activity for longer than a given period of time
o  make trac more useful for release planning and scheduling
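As a concrete (if simplified) example of the first goal, the "non-closed tickets never changed by the owner" query can be expressed against a cut-down version of Trac's ticket and ticket_change tables. The schema and sample data below are a sketch for illustration, not Trac's actual DDL:

```python
import sqlite3

# Sketch of the "non-closed tickets never changed by the owner" query,
# against a simplified version of Trac's ticket / ticket_change schema
# (table and column names are assumptions based on Trac's data model).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ticket (id INTEGER PRIMARY KEY, owner TEXT, status TEXT);
CREATE TABLE ticket_change (ticket INTEGER, author TEXT, field TEXT);
INSERT INTO ticket VALUES (1, 'alice', 'new');      -- never touched by alice
INSERT INTO ticket VALUES (2, 'bob',   'assigned'); -- bob has commented
INSERT INTO ticket VALUES (3, 'carol', 'closed');   -- closed, excluded
INSERT INTO ticket_change VALUES (2, 'bob', 'comment');
""")

rows = conn.execute("""
    SELECT t.id, t.owner, t.status
      FROM ticket t
     WHERE t.status <> 'closed'
       AND NOT EXISTS (SELECT 1 FROM ticket_change c
                        WHERE c.ticket = t.id AND c.author = t.owner)
""").fetchall()
print(rows)  # only ticket 1 qualifies
```

The same NOT EXISTS shape extends naturally to "no human interaction" (no changes at all) or "no activity since a cutoff time".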

It won't be perfect. Each problem to be solved is unique, and each
programmer is different. But if we use running aggregates based on the
last n months of historic data, we can refine those back-of-the-envelope
guesstimations until they're more than just guesses.

At which point, time-based estimations and FTEs will give the
release manager the ability to do more than just guess at when X, Y,
and Z can be delivered.
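To illustrate the kind of projection this enables, here is a hypothetical sketch; the feature queue, hour estimates, and FTE figures below are all invented for illustration:

```python
# Hypothetical sketch: project delivery weeks for a feature queue from
# adjusted hour estimates and available FTE hours per week.
# All numbers below are invented for illustration.
queue = [("X", 120), ("Y", 300), ("Z", 60)]   # (feature, adjusted hours)
fte_hours_per_week = 2 * 40 + 1 * 25          # e.g. 2 contractors + 1 employee

week = 0.0
for feature, hours in queue:
    week += hours / fte_hours_per_week
    print(f"{feature}: deliverable around week {week:.1f}")
```

Inserting feature 'B' at the head of the queue simply pushes every later feature's week out by B's hours divided by the weekly FTE capacity, which is exactly the conversation one has with upper management.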

Which is a nice position to be in when you have to explain to upper
management why new feature 'B', which they want to put at the head of
the queue, is going to push back the features already in the works.
Especially when 'B' touches a lot of other code and is going to
require a lot of FTE hours. And it is nice when you can turn around
and point to historic data which shows that tickets which have
impacted more than 1 or 2 other subsystems and required over 40 hours
to complete have historically resulted in an average of 1.5x the
number of FTE hours in new defects.


>> Whatever you want to call it, you might find it useful to track the
>> scope and complexity of the changes required to fix an issue. Priority
>> doesn't get at that. It would allow you to collect historic data which
>> could be used to project how much time tickets will take to be
>> implemented and how many bug hours you'll get per change.
>
> Do you know of any situations where this type of information is
> usefully collected?  It sounds like trying to do a number of chained
> correlation exercises (complexity/scope estimate, complexity/scope
> actual, time to fix estimate, time to fix actual) that are based on
> partially subjective, known-hard-to-observe/predict data and expect to
> come up with something useful.  More power to you if you succeed - you
> will be able to make millions consulting / selling your software to
> project management-focused groups.  Have you ever done this analysis
> before?

For the past 10+ years where I work.

It has been one of my hats to customize our issue tracking system and
generate web-based reports per my boss's needs. In that time we've
grown from 4 to ~30 developers. We've gone back and forth over what
makes for the lightest-weight system which is useful for release and
internal management of the development team, and over how to mine the
issue tracking system to help in discussions with upper management,
so that explanations and opinions can be backed up with historic data.


>> >> How many Full Time Equivalent hours does a given developer represent?
>> >
>> > A guesstimate: about 25 hrs/wk of coding and 30 hrs/wk of talking for
>> > social folks, maybe 30 hrs/wk of coding and 10 hrs/wk of talking for
>> > contractors; and 5-8 full days off a month (including weekends).
>>
>> Is there any list of developers and which slot each fit into?
>
> Why?  What is the use of asking questions that are somewhat private (a
> co-worker's opinion as to who's social or not) and unactionable by
> you?  These are actually rhetorical questions, so let me get to the
> point (below)...

You are either joking or willfully missing the point due to what you
probably view as previous provocations...

The slots I was asking about were employee vs. contractor, because
Michael Stone has estimated different FTE hours for each: 55 vs. 40.


>> >> What components are the given developers capable of working on?
>> >
>> > I don't understand this question.
>>
>> You've got folks who have particular areas of expertise. Or to put it
>> the other way, developers who can work in certain areas but not
>> others. If your Trac ticket classifies a ticket as belonging to a
>> particular area, you can then project how many FTE's you've got on
>> hand to work in that area.
>>
>> I realize that this being an open source project leaves a lot open
>> ended. But if you collect the data in a way that you can get at it
>> effectively, you can use historic data to verify your assumptions and
>> track and make projections against non-employee/non-contractor
>> developers as well.
>
> You could, if 1) it were feasible to collect; 2) its analysis was a
> tractable problem; and 3) it analysis had (significantly) greater
> benefit than cost.
>
> 1) is possible to collect in this case (who has worked on what) but
> not (I contend) in your other point (predicting future development
> speed/progress)

You are expressing an opinion, whereas I can share the experience that
it has worked quite well at my company. That said, it may not work for
the OLPC. Where I work, we don't have much turnover, we rarely have
contractors except as contract-to-hire, and we have only the
occasional intern.


> 2) tractability: is highly unlikely to be the case, for both inherent
> (individual productivity over time has huge variance, high
> periodicity, significant auto-correlation (positive and negative), and
> other issues I could think of not OTTOMH) and empirical (enough people
> have had enough time to make enough money to make it likely that if
> they could've, they would've) reasons

My suggestions were more open-ended, but let's look for a moment at
an example list of changes that might result from my recommendations:
o  identify the releases in which an issue is to be addressed ('Milestone')
o  identify when the schedule wrangler thinks it should be completed
    ('Schedule')
o  identify the branch+builds a defect was 'Found In'
o  identify the branch+builds in which a defect or enhancement was
    implemented ('Impl In')
o  identify the branch+builds in which a defect or enhancement was
    tested ('Tested In')
o  identify the subsystem(s) most closely related to the issue:
    'Component' or 'Subsystem'
o  identify who is in the pool of developers
o  identify average FTEs for employee and contractor developers
o  identify developers who work exclusively on particular subsystems
o  identify at assignment time (overridable by the assignee) how long a
    ticket is expected to take
   Ex: 'Difficulty' == {Easy <1 hour, Medium <4 hours, Hard <8 hours,
   Formidable <40 hours}
o  identify after implementation how long a ticket is estimated to
    have actually taken ('Hours')

Which of these pieces of data aren't useful?

Which is too much of a burden for developers?

For release management and scheduling purposes... how would this _not_
be an improvement over back-of-the-envelope guesstimations?
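For what it's worth, several of the fields above could be added without code via Trac's [ticket-custom] configuration section. A minimal sketch; the field names, labels, and option buckets here are illustrative, and the exact syntax may vary by Trac version:

```ini
[ticket-custom]
difficulty = select
difficulty.label = Difficulty
difficulty.options = Easy|Medium|Hard|Formidable
hours = text
hours.label = Hours
found_in = text
found_in.label = Found In
impl_in = text
impl_in.label = Impl In
tested_in = text
tested_in.label = Tested In
```

Custom fields show up on the ticket form and in reports, so the data becomes queryable without touching Trac's core schema.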

From experience, I fully agree that some developers are more
productive than others on average and depending on the context. And
developers aren't, as a whole, great at estimating how much time
something is going to take. Nor do they have a great record at
accurately noting how much time they actually spent on a particular
issue.

But I can also tell you, based on historic data and the FTE hours
we have had where I work, that tickets have on average taken roughly
twice as long as estimated. Taking this into account, we can compare
hour estimates with FTEs based on the last 6 months of historic data
and adjust estimations accordingly.
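The running-aggregate correction described above can be sketched in a few lines; the (estimated, actual) hour pairs below are invented sample data:

```python
# Sketch of a running correction factor: compare estimated vs. actual
# hours over a trailing window (e.g. the last 6 months of closed
# tickets) and scale new estimates accordingly.
# The sample numbers below are invented for illustration.
history = [(4, 9), (8, 15), (2, 5), (8, 14)]  # (estimated, actual) hours

factor = sum(actual for _, actual in history) / sum(est for est, _ in history)
print(f"correction factor: {factor:.2f}")  # roughly 2x, as observed

new_estimate = 6
adjusted = new_estimate * factor
print(f"6h estimate -> {adjusted:.1f}h adjusted")
```

Recomputing the factor monthly keeps the adjustment tied to recent reality rather than to a one-time guess.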

No, it isn't perfect. A statistician would have a field day. But it
gives our release manager a 'good enough' set of knobs to play with
when juggling expectations. As long as your eyes are open to the
limitations, it is pretty darn useful to be able to identify monthly
targets and which developers' work queues are overtasked.


> 3) benefit: You have just described a way to determine how many FTEs
> are available for which areas in a way that is expensive, onerous on
> the measured, of highly questionable, undemonstrated feasibility, and
> of highly questionable accuracy.  On the other hand, people have been
> answering this question on projects much larger by just counting the
> paid FTEs and a back-of-the envelope estimation of the unpaid
> contributors, which has none of the disadvantages of, and many more
> advantages than your proposed method.

How is it expensive or onerous on the measured? What proposed field
shouldn't be in a trac ticket? Does OLPC have high turnover which
would require daily management of lists of employees and contractors?
Do developers work across all subsystems? Are there no bottlenecks
based on developer expertise in particular subsystems? Is there
insufficient specialization by subsystem to merit correlating
available FTE resources by subsystem? Fine, drop the subsystem-to-FTE
correlation if it isn't useful. But if you collect the data, you can
try to find useful correlations and... use them.

It has been demonstrated to work for my employer.

There is very little difference between your solution, 'counting the
paid FTEs and a back-of-the-envelope estimation', and mine. My
recommendation is little more than extending your method to collect
the historical data necessary to check your back-of-the-envelope
estimations. Using a database, you can use running aggregates of
historical data to keep the estimations more closely tied to reality.


> This meta discussion takes valuable time - I don't think it's worth
> the costs, given all of the above (yeah, I know this is hypocritical,
> but this is the internet so I can do that :)).

;-)



FWIW IMHO what follows shifts from 'the big picture' to picking at the
details...

>> >> What is your rate of defects per change? How does that break down by
>> >> severity and difficulty?
>> >
>> > Are you measuring by source commits, packages, test builds, candidate
>> > builds, or releases?
>>
>> Trac tickets.  Source commits might be better.
>
> Why (are source commits better)?

Ask Michael Stone... he was the one who listed source commits. I was
suggesting Trac tickets, but respecting his opinion that counting
source commits might be useful.


>> Some analysis of the composition of git changesets associated with a
>> Trac ticket would be better.
>
> What type of automated analysis do you propose?  If you mean  manual
> analysis of each commit for defects, that's code review, which is done
> already.

Perforce has a workflow app/extension which allows changesets to be
associated with tickets in an external issue tracking system. We don't
use it, so I don't know how useful it would be. I don't know if trac
can do something similar with git.

The point I was trying to make was that associating source commits
with a trac ticket would probably be a good thing. If it is easy to
collect useful data... why not? If it were easy to count the number of
files in a changeset, the lines of code involved, and the number of
different subsystems (unique directory paths), all of those things
could give you some real numbers to correlate with the estimations of
the scope, extent, and complexity of a ticket.
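Those counts are cheap to extract from `git diff --numstat` output (one `added<TAB>deleted<TAB>path` line per file). A sketch; the sample numstat text and paths below are invented:

```python
# Hypothetical sketch: summarize a changeset from `git diff --numstat`
# style output, counting files touched, total lines changed, and
# distinct top-level "subsystem" directories.
# The sample numstat text and file paths are invented for illustration.
numstat = """\
12\t3\tsugar/activity/activity.py
40\t0\tsugar/graphics/icon.py
5\t5\tservices/presence/server.py
"""

files = [line.split("\t") for line in numstat.strip().splitlines()]
n_files = len(files)
n_lines = sum(int(added) + int(deleted) for added, deleted, _ in files)
subsystems = {path.split("/")[0] for _, _, path in files}
print(n_files, n_lines, sorted(subsystems))
```

Fed by a commit hook that knows the ticket number, these three numbers per ticket would be enough to start checking the Difficulty estimates against reality.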

However, it isn't clear that digging these numbers out of a changeset
would be as useful as a simple estimate for scope, extent, and/or
complexity given by the developers, however you boil it down. Which
is why, in the part you left out:

> But in practice Trac tickets with fields for complexity and length of work required
>  are probably just as good and a whole lot less complicated.

I recommended boiling it down to one or two fields.  However, if there
is a git plugin for trac that makes the data accessible... why not
gather it?


>> >> Are tickets reviewed before being closed?  By someone other than the
>> >> implementer. Who?
>> >
>> > See #7014 for an example of the problem.
>>
>> Looks like the big problems are easier to solve if you identify them
>> as a bunch of little problems.
>
> That's either entirely obvious or exactly what the ticket's done or
> both :).
>
>> The ticket should probably have been broken down into lots of
>> smaller tickets and then updated to list them as its blockers.
[...]
> It is broken down into smaller tickets.  They are 'listed'.  You imply
> both immediately in your next...

I don't imply both. While the related tickets are mentioned
throughout, they are _not_ 'listed'. Nor are they listed using
'blocking' and 'blocked by'.

The problem identified was the difficulty of reviewing big complex
tickets. If a big ticket is subdivided into little ones through the
use of blockers, and if each blocker must be reviewed before being
closed, and if reopening a blocker reopens its ancestors for review,
then the problem of verifying big all-encompassing tickets becomes a
much simpler one.

The point was that blocking and blocked-by appear to be in place...
but weren't being used... where if they had been... there shouldn't
have been a problem.


>> There's mention of other tickets all throughout... but you're not
>> using the 'Blocked By' or 'Blocking' fields...
>
> If you're happy that's the case, update the ticket.  If you think
> Michael would've done that in case it was and thus conclude that it's
> not the right thing to do, then...this point seems pointless.
> Assuming the blockers list is known to the degree one would want it
> recorded for all time (as opposed to for immediate human
> consumption/triaging *right now* only), how are you envisioning to make
> use of it in the future?

I don't know why Michael didn't use 'blocking' and 'blocked by'.
Perhaps he will enlighten us? Perhaps 'blocking' and 'blocked by'
don't work well enough. A simple test shows they allow circular
dependencies to be defined: assign a ticket to itself as a blocker.
If it is hard to review big tickets in trac... how can trac be
improved to make it easier?


>> [how can] the OLPC process [be] so broken as to allow a newly filed
>> tickets to be completely ignored forever.
>
> Are you claiming that your one ignored ticket, or a significant number
> of ignored tickets (as you assume your report will demonstrate), is an
> anomaly among viable, ongoing, value-delivering software projects?  In
> my experience it's not, and when other prominent people have the same
> complaint (google for "jwz cadt") I don't find that claim surprising,
> and thus would not assume that a lot of resources should be expended
> to fix it (as it's not a necessary - or, separate point, sufficient -
> precondition for success).  Clearly fixing it is a Good Thing, but I
> think a periodic report of untriaged trac items and volunteers to
> triage them (google for "Sugar BugSquad" or maybe "OLPC Support Gang")
> would be much better focus for motivated and able people (such as
> perhaps yourself).

So because most open source projects suck in this regard... the OLPC
should too? That is an interesting argument. Even if you disregard the
fact that the OLPC is a project that has paid employees and
contractors.

I'm glad you appear to agree that a periodic report of trac items
needing to be (re)triaged is a Good Thing (TM). When I find the TUITs
I'll take a swing at providing them. However, call me narrow-minded,
but I can't imagine paid employees and contractors taking un-fun
work-queue suggestions from an unpaid volunteer.


> That's a very fair desire, IMHO. I personally thought your initial
> email was not so neutrally worded and invited such comments as you
> received (see point about Everybody Gets Angry But Then Calms Down
> below).

Well let me contribute toward arriving at 'calms down' by apologizing
for my not so neutrally worded remarks. I have a tendency to argue
points instead of discussing them. I hope that won't keep valid points
from being heard.



>> > Also notice how I'm splitting release prioritization from development
>> > prioritization into separate management problems.
>>
>> No I didn't notice. I don't see the words 'release' or 'development'
>> mentioned on that URL.
>
> This separation was clear to me.

Call me obtuse, but it still isn't clear to me. I hear that
prioritization is split into two problems. Prioritizing releases and
prioritizing development. But I don't see _how_ it is split.


>> Barter with what? My overly polite and friendly disposition :-) ?
>
> For a start, yes!  You put a smiley so I assume you're not being
> serious, which is a shame because yes, that's exactly what you should
> use as part of your barter offering (and the fact - as I read it -
> that you weren't may have led to the vehemence of some responses
> (Everybody Gets Angry But Then Calms Down)...but I think we all
> realize that frustration at an immediate/recent/repeated issue can be
> excused (why I'd excuse both your and everyone's vehemence on this,
> btw: you had a neglected ticket and others had a crowd shouting 'pick
> me! pick me!' (shrek reference))) (previous parenthetical sponsored by
> SBCL).

Please excuse this as an attempt to lighten the mood by poking fun at
my own shortcomings. I didn't mean to imply that I shouldn't be
polite. Merely that I may not be well endowed with a polite and
friendly disposition. The fact that you didn't get my point...
underscores it.


