Making better time estimates

Has the following ever happened to you? You’re being tasked with a small project. At the start, the boss asks how much time it will take, and you give an honest estimate. Let’s say you estimate the effort at about two months’ work. As it’s a relatively small project, you know well in advance what to do and you make a happy start. The first few weeks, everything is going fine and the deadline poses no real danger. However, instead of delivering on the due date, you watch the deadline zip by much faster than you could have imagined. Four months later, you’re still working on the project. How is that possible? What did you overlook during the initial estimate? Where did it go wrong?

If you’re like me, you’ve experienced this more than once.

Making absolute estimates

When a manager asks how long something takes, he wants an absolute estimate. By “absolute estimate”, I mean that he wants to know the amount of time it takes. That’s what the business plans with. Sales people want to tell your company’s customers when they can expect the new feature, so they’re going to ask you for a date. They also very much want a low number, because if the number is low, they can both make a good offer and make the customer happy quickly. There may also be competing offers, so the lower their offer, the better. As an aside, note that two measures of time are being mixed up here: asking how long it takes does not automatically lead to a delivery date. In the meantime, you may have planned a holiday, be attending courses, or be tasked with something else for a while.

Which brings me to the observation that those estimates are not very honest at all. The developers are not to blame: they do their best to estimate the time it takes. However, as it turns out, we humans are notoriously bad at giving good time estimates. The estimate we tend to give is very optimistic, especially when we think we know exactly what needs to be done. However, when developing software, we’re usually doing something we never did before. This means that we encounter unknown work during the course of the project. We may think we take that into account when making an estimate, but in reality we don’t know how big the unknown portion of the work is. We don’t even have a way to find out, other than just starting the work!

This means that absolute estimates are not honest: they do not properly take unknown unknowns into account. They assume no setbacks and usually leave out unforeseen work. Hence they are notoriously optimistic, even when they include a margin of, say, 20% for unforeseen work. In practice, unforeseen work could even be the major part of the project.

Developing is learning

Setbacks are part of the project. So are new requirements. This is not unexpected, given that you are doing something that has not been done before. I think it is important to realize that writing software is a form of writing down knowledge, in such a basic way that even a machine can operate on it. And while the high-level ideas may be clear at the start, the low-level details all need to be worked out. And those details sometimes invalidate high-level ideas. In short, developing software involves a very large amount of learning.

Imagine that for some stupid reason all your source code got lost, on the very day you wanted to deliver the project. How long would it take to write the same software from scratch? Would it take as long as the first time? I’m convinced that it would be an order of magnitude faster. That’s because you now have all the necessary knowledge. The excess time that you spent on the first try was all learning effort.

Now, if you’re being asked for a time estimate and you realize that you don’t know something, then a disclaimer would be in order. After all, it may easily turn out that you must do additional work. It’s kind of a black box where you don’t know how much is inside. However, it is much more difficult when you don’t know that you’re missing knowledge. You cannot possibly take that into account when giving an estimate, and that is exactly the problem in software development: you learn things that you didn’t know you didn’t know! That’s why time estimates for software development work are notoriously optimistic: we don’t know which activities we will miss.

It can be improved, though

At the beginning of the post, I defined the term ‘absolute estimate’ as an estimate for a task in a time unit: days, weeks or months. I also argued that those estimates are almost always too optimistic. You may be wondering: if there is an ‘absolute’ estimate, would there also be a ‘relative’ estimate? And would that be more realistic? How would that work then? Relative estimates are indeed possible, and they turn out to be much more accurate in predicting the time needed for a task.

It works like this: instead of guessing a number for the amount of time that would be required, you define a standard package of work. Its size does not matter so much, but smaller packages tend to give more fine-grained numbers. This work package is defined as a fixed number of points, let’s say 8 points. Other work packages are compared to this work package: if one is of comparable size, it’s estimated at 8 points; if we think it is twice as large, it’s given 16 points. (Many teams use some Fibonacci-like scheme to make clear that higher estimates have higher uncertainty.) So all work packages are estimated relative to the standard package of work. This way of comparing packages is fairly accurate, because humans tend to be relatively good at comparing amounts of work.

But now we still don’t have an estimate, do we? Not so fast, the project has just begun. The point is that as long as we work on items that are properly estimated, we have a feedback cycle in place. We can simply measure how long a work package takes. After the first work package has been completed, we already have some idea of how much time the remainder of the work might take. The more work packages are completed, the more accurately we can estimate how long it will take.
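To make the feedback cycle concrete, here is a minimal sketch in Python. The function name, the data layout (points and working days per completed package) and the numbers are my own assumptions for illustration; they do not come from a real project.

```python
def forecast_remaining_days(completed, remaining_points):
    """Forecast the remaining working days from measured work packages.

    completed: list of (points, days_spent) tuples, one per finished package.
    remaining_points: total estimated points of the work still to do.
    """
    done_points = sum(points for points, _ in completed)
    days_spent = sum(days for _, days in completed)
    if done_points == 0:
        raise ValueError("no completed work yet, so nothing to measure")
    # Measured, not guessed: how many days one point has cost so far.
    days_per_point = days_spent / done_points
    return remaining_points * days_per_point


# Hypothetical history: three packages finished, 80 points still open.
history = [(8, 5), (16, 11), (8, 4)]   # 32 points took 20 days
forecast = forecast_remaining_days(history, 80)
print(forecast)  # 50.0
```

The forecast is only as good as the history behind it: with a single completed package it is a rough indication, and it gets steadier as more packages are measured.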

This also works when new requirements are added during the course of the project. In that case, the number of remaining points decreases more slowly, because work is being added. You only need to track the total number of estimated points of the remaining work, and measure how much this number changes over time.
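As a rough sketch of that bookkeeping, assume you note down the total remaining points every now and then. The snapshot format, the function name and the numbers below are hypothetical, and the forecast simply extends the net trend between the first and the last snapshot.

```python
def forecast_days_to_finish(snapshots):
    """Forecast the days left from the trend in remaining points.

    snapshots: list of (day, remaining_points) tuples, oldest first.
    Added work raises remaining_points, so it is part of the measured trend.
    Returns None when the remaining work is not shrinking at all.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots to measure a trend")
    first_day, first_remaining = snapshots[0]
    last_day, last_remaining = snapshots[-1]
    elapsed = last_day - first_day
    burned = first_remaining - last_remaining  # net of any added work
    if burned <= 0:
        return None  # work is added at least as fast as it gets done
    return last_remaining * elapsed / burned


# Hypothetical project: 200 points at the start, 150 left after 30 days,
# even though some new requirements were added along the way.
print(forecast_days_to_finish([(0, 200), (10, 185), (20, 170), (30, 150)]))  # 90.0
```

Because only the net change is measured, you don't need to record separately how much work was added and how much was completed.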

If you divide the work up into deliverable features, you can also do it the other way around: given a date, you can tell clients which features they may expect.
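A small sketch of this reverse calculation, under the assumption that features are delivered in priority order and that you have a measured number of points per week. The feature names, the velocity and the function itself are made up for the example.

```python
def features_by_date(features, points_per_week, weeks_available):
    """Return the names of the features that fit before the given date.

    features: list of (name, points) tuples in priority order.
    points_per_week: measured velocity of the team (an assumed input).
    weeks_available: number of weeks until the date the client asked about.
    """
    budget = points_per_week * weeks_available
    promised = []
    used = 0
    for name, points in features:
        if used + points > budget:
            break  # keep the priority order: stop at the first one that doesn't fit
        used += points
        promised.append(name)
    return promised


backlog = [("login", 8), ("search", 16), ("export", 8), ("reports", 32)]
print(features_by_date(backlog, points_per_week=10, weeks_available=4))
# ['login', 'search', 'export']
```

Stopping at the first feature that doesn't fit is a design choice. You could also skip it and look for smaller features further down the list, but then the delivery order no longer matches the priorities.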

Why do relative estimates work?

We’ve been working with relative estimates in a large real-world development project, and we’ve found that we can deliver much more reliably. Usually, we’re less than 10% off. Why does it work so well?

I already mentioned the first reason that it works well: comparing work packages is much easier than giving time estimates. The other reason is feedback: the time it takes to complete 10 points of work is not estimated, but measured. So this includes all unknowns, all setbacks, all the things you couldn’t even think of when you began. In the short term, those unknown unknowns vary a lot, but when measured over months, they tend to stabilize. So now the unknown unknowns are included in the estimates. The estimates you give are now based on actual data; they are not wild guesses anymore.

Relative estimates are really relative

Relative estimates, although much more accurate than absolute estimates, do have some drawbacks. The first is that you can only give estimates after the project has started, while sales people need an estimate beforehand to offer to the client. In that particular situation, relative estimates may not help much.

This is worsened by the fact that you cannot mix & match estimates: they are context sensitive. If the development team changes, so does the number of points delivered per time period, and therefore the resulting time estimate changes as well. But the technique is also sensitive to the kind of work: when the same team estimates different kinds of work (for example, work on different products), those estimates result in different time scales. The points actually have a different meaning.

Some time ago, our team had to spend a number of sprints on validation tasks rather than on software development. We were able to deliver twice the number of points each sprint with the same people. However, when we returned to software development, the old velocity returned as well.

So be careful not to mix up points from different contexts, because then the result does not tell you anything anymore.

High level estimates

In our team, we extended this technique for a long-running project. The work was defined as a list of features, but those were very high level. One feature could easily take multiple months. For the features that we started working on, we split the features up into epics, and the epics were split up into stories. Stories were estimated in points. But this meant that only part of the project had a valid estimate, and on a given day, the project manager wanted to know how much work was still remaining. However, most of the features had no epics yet, and most epics did not have stories. So, next to story points, we introduced epic points and feature points. By measuring completed epics and features, we were able to give an estimate for the remainder of the work.
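One way such a multi-level estimate could be put together is sketched below. This is an illustration under my own assumptions, not a description of our actual tooling: each level (story, epic, feature) gets its own measured days per point, and work is counted only at the finest level at which it has been estimated, so nothing is counted twice. All names and numbers are hypothetical.

```python
def days_per_point(completed):
    """completed: list of (points, days_spent) for finished items at one level."""
    points = sum(p for p, _ in completed)
    days = sum(d for _, d in completed)
    if points == 0:
        raise ValueError("no completed items at this level yet")
    return days / points


def forecast_backlog_days(levels):
    """Sum the forecasts of all levels.

    levels: dict mapping a level name to (completed_items, remaining_points).
    remaining_points holds only work not yet split into a finer level,
    e.g. epic points only for epics that have no stories yet.
    """
    total = 0.0
    for completed, remaining_points in levels.values():
        total += remaining_points * days_per_point(completed)
    return total


# Hypothetical measurements per level.
levels = {
    "story":   ([(8, 4), (16, 8)], 40),   # 0.5 days per story point
    "epic":    ([(5, 30), (3, 18)], 10),  # 6 days per epic point
    "feature": ([(2, 100)], 3),           # 50 days per feature point
}
print(forecast_backlog_days(levels))  # 230.0
```

Points from different levels are never added together; each level is first converted to days with its own measured rate, which matches the warning above about mixing points from different contexts.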