Two Wrongs

Estimating Work Lag

With some basic queueing theory and a team that works in a ticket system like Trello or Jira, we can compute the work lag for the team. The work lag is our expectation of how long it will take the team to finish the next thing on the top of the backlog.

What’s particularly neat about it is that it’s a leading metric; it’s not just extrapolating from the past, but it actually tries to predict what will happen in the near future based on what’s in front of us right now. It will react much faster to changes than a pure extrapolation would, which would only show us the problem once it’s too late.

Please note that work lag is not a performance metric. Oh boy what terrible mistakes you will make if you think of it that way. Instead, work lag is an effective way to discuss how busy the team is; how much concurrent work they are trying to accomplish. It is based on a very general law: the more things we’re trying to do at the same time, the longer it will take until any of the things are done.

| Who    | Work in progress | Work lag     |
|--------|------------------|--------------|
| Steve  | 3 tasks          | 2.6 weekdays |
| Alicia | 9 tasks          | 4.4 weekdays |
| Gloria | 11 tasks         | 13 weekdays  |
| Robert | 6 tasks          | 20 weekdays  |
| Team   | 29 tasks         | 6.7 weekdays |

So for example, this team has 29 tasks open. That doesn’t tell us very much. But it also means that the very next task in the backlog will take more than a week to get done – now that’s something we can easily have an opinion on! In most organisations I’ve worked for, that counts as too much lag, and the team needs to decrease their concurrency.

This is one reason work lag is an effective indicator: it gives a tangible sense of the consequences of concurrency. Another reason is that it lets us account for individual variation in the pace of work. Look at Robert in the table above. His six active tasks might seem modest, until we realise that for him, that translates to a work lag of almost a month! Robert probably needs to try to do fewer things at once.

Why Focus On Work Lag

Concurrency – keeping busy – is associated with a cost we don’t talk about enough. Concurrency comes with latency. There are two broad reasons to focus on latency:

  • If the team is slow to finish queued work, they will get requests to expedite tasks (“My customer really needs this thing asap, do you mind doing it right away?”). This is bad because it ruins whatever prioritisation existed in the first place, and since it also increases the concurrency, it makes the work lag even worse, which means even more tasks will be expedited. In other words, high work lag feeds into a reinforcing feedback loop that increases work lag. It’s important to have structures in place to actively counteract this, or it spins out of control.
  • Software product development is inherently an uncertain business. You want to try out ideas fast, so you can evaluate the assumptions that went into the idea as soon as possible, and then redouble the effort on valid ideas and drop invalid ones before you’ve invested too much in them. Trying ideas fast requires being able to move from backlog to done quickly.

It’s still important to keep busy. Software developers are paid a lot, so they can’t spend most of their day twiddling their thumbs. But this is a linear effect: a software engineer who is idle twice as much of the time wastes twice the money. That’s not what’s going to give you a headache. The work lag effects are non-linear and reinforcing; thus they are more valuable targets to pay attention to.

How To Estimate Work Lag

For each person on the team, dig up a list of their recently completed tasks. Make a note of how many weekdays passed between completed tasks. The data I have on Steve above starts with

0.35 0.79 0.24 4.76 0.57 0.92 0.13 0.84 1.12 1.51

In other words, whenever Steve finishes a task, the previous task he finished was almost always done less than a weekday ago. But sometimes almost a week passes between two tasks he finishes. There’s nothing wrong with this – high variation is typical for the work we do.

We can then compute the mean time between finished tasks for Steve. In my data, this is 0.87 weekdays. If we invert this number we get Steve’s completion rate. Since \(1/0.87 \approx 1.15\), we learn that on average, Steve finishes 1.15 tasks per weekday. We’re going to borrow from queueing theory, so we’ll call this number \(\mu\).
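As a minimal sketch of this step in Python: the function name `completion_rate` is made up for illustration, and the list holds only the ten gaps shown above. Since those are just the start of the data, their mean differs from the 0.87 weekdays of the full data set.

```python
from statistics import mean

# First ten gaps (in weekdays) between Steve's completed tasks, as listed above.
# This is only the start of the data, so its mean is not the 0.87 quoted in the text.
gaps = [0.35, 0.79, 0.24, 4.76, 0.57, 0.92, 0.13, 0.84, 1.12, 1.51]

def completion_rate(gaps_in_weekdays):
    """Tasks finished per weekday: the inverse of the mean gap between completions."""
    return 1 / mean(gaps_in_weekdays)

# Rate from the ten-value sample only.
mu_sample = completion_rate(gaps)

# Rate from the full-data mean of 0.87 weekdays quoted in the text.
mu_full = 1 / 0.87

print(f"sample: {mu_sample:.2f} tasks/weekday, full data: {mu_full:.2f} tasks/weekday")
```

The same function can be run on each team member's list of gaps to fill in the rest of the completion rates.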

We compute this number for everyone in the team. Note that we get the whole-team completion rate by simply summing up all the individual team members’ \(\mu\).

| Who    | \(\mu\) (tasks per weekday) |
|--------|-----------------------------|
| Steve  | 1.15                        |
| Alicia | 2.05                        |
| Gloria | 0.85                        |
| Robert | 0.3                         |
| Team   | 4.35                        |

Now, very importantly, don’t show these numbers to anyone. If management so much as gets a whiff of these numbers, they will be used as performance metrics. They’re not. A million factors go into completion rates (difficulty of tasks, tendency to split big tasks into smaller ones, getting interrupted by meetings, being out sick, etc., etc.), so they are completely useless to gauge individual performance.

But we will use them in the following calculations. Now, look up how many tasks each person is trying to work on at the same time, their work in progress (wip).

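| Who    | \(\mu\) (tasks per weekday) | wip (tasks) |
|--------|-----------------------------|-------------|
| Steve  | 1.15                        | 3           |
| Alicia | 2.05                        | 9           |
| Gloria | 0.85                        | 11          |
| Robert | 0.3                         | 6           |
| Team   | 4.35                        | 29          |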

Now you know both how many tasks each person has open and how many tasks they complete per weekday. Divide the first by the second and you will get how many weekdays it takes them to complete what they have open. (This is the queueing theory we used. It’s called Little’s law. It’s very intuitive.)

This number is how many weekdays of work there are in their active queue of open tasks. This is a good estimate of how long it will take them to complete their next task in the backlog – this is the work lag!
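Written out as a formula, the division looks like this. The symbols \(W\) (work lag) and \(L\) (wip) are labels introduced here for illustration, and the sketch assumes the person's recent completion rate \(\mu\) is a fair stand-in for their throughput in the near future:

\[
W = \frac{L}{\mu}, \qquad \text{so for Steve: } W = \frac{3 \text{ tasks}}{1.15 \text{ tasks per weekday}} \approx 2.6 \text{ weekdays}.
\]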

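| Who    | \(\mu\) (tasks per weekday) | wip (tasks) | Work lag (weekdays) |
|--------|-----------------------------|-------------|---------------------|
| Steve  | 1.15                        | 3           | 2.6                 |
| Alicia | 2.05                        | 9           | 4.4                 |
| Gloria | 0.85                        | 11          | 13                  |
| Robert | 0.3                         | 6           | 20                  |
| Team   | 4.35                        | 29          | 6.7                 |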
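The whole table can be reproduced with a few lines of Python. This is only a sketch: the `team` dictionary and the `work_lag` function are names made up for illustration, and the numbers are the completion rates and wip counts from the tables above.

```python
# Completion rate (tasks per weekday) and wip (open tasks) per person,
# copied from the tables above.
team = {
    "Steve": (1.15, 3),
    "Alicia": (2.05, 9),
    "Gloria": (0.85, 11),
    "Robert": (0.3, 6),
}

def work_lag(mu, wip):
    """Weekdays needed to clear the open tasks: wip divided by completion rate."""
    return wip / mu

lags = {name: work_lag(mu, wip) for name, (mu, wip) in team.items()}

# Team-level figures: both completion rates and wip add up across members.
team_mu = sum(mu for mu, _ in team.values())
team_wip = sum(wip for _, wip in team.values())
lags["Team"] = work_lag(team_mu, team_wip)

for name, lag in lags.items():
    print(f"{name}: {lag:.1f} weekdays")
```

Note that the team row is not the average of the individual lags: it is the total wip divided by the summed completion rate.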

Now remove the column with \(\mu\) so that nobody accidentally sees that and thinks it’s a performance metric. Done!

Note that if you want to use this as an ongoing guideline or health check, it needs to be updated frequently, because it’s very sensitive to the number of active tasks a person has open. In those cases, it pays to automate it. If you’re just using it to spot check or argue for wip limits (which you should!) then it’s no big deal to construct it manually.

Advanced Version

The computations we just looked at assume that all tasks are roughly equal. This is – surprisingly – good enough for most purposes. (I have previously collected data that indicate all tasks are on average the same size. I believe Maersk arrived at the same conclusion. I’ll dig up the reference when I write the article about this.) But if your team is verifiably estimating effort, you can improve the accuracy of the work lag metric by accounting for the different (estimated) sizes of tasks. You run the exact same calculations as above, but instead of counting tasks as the basic unit, you count estimated hours.

So for completion rate, don’t look at the average number of tasks completed per day, look at the average number of hours of work completed per day. When measuring wip, look not at how many tasks the person has open, but how many hours of work those tasks are estimated to require. Otherwise it’s all the same.
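A sketch of the hour-based variant might look like the following. All the numbers are hypothetical, invented purely to show the arithmetic, and the function names are made up as well. It assumes the completion rate is measured as estimated hours of finished work divided by the number of weekdays observed.

```python
def hourly_completion_rate(completed_estimates, weekdays_observed):
    """Estimated hours of work completed per weekday over the observed period."""
    return sum(completed_estimates) / weekdays_observed

def work_lag_from_hours(open_estimates, hours_per_weekday):
    """Weekdays of estimated work sitting in the open tasks."""
    return sum(open_estimates) / hours_per_weekday

# Hypothetical data: estimates (in hours) of tasks finished during six weekdays...
completed = [3, 8, 5, 4, 10]
rate = hourly_completion_rate(completed, weekdays_observed=6)

# ...and estimates (in hours) of the tasks currently open.
open_tasks = [4, 16, 2, 8]
lag = work_lag_from_hours(open_tasks, rate)

print(f"{rate:.1f} estimated hours per weekday, work lag {lag:.1f} weekdays")
```

The structure is the same as in the task-counting version; only the unit being summed has changed.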

Validation

I have, of course, tried this in practice and followed up afterward on how accurate the predictions were. As long as tasks aren’t expedited, the methodology seems to hold up – certainly it’s better than a constant guess, better than pure extrapolation, and definitely better than no information about work lag at all!

If you try it, please let me know how it goes. Likewise, if you think you have a better method for extracting this information, I’m all ears.