On our team, we use Kanban-style queues to manage our story production. We measure cycle time from the point a story enters the queue (moves out of to do) to the point it leaves the queue (done-done). Currently, our average cycle time for stories is 15.7 days, which isn’t good, especially considering we still track iterations as 2 weeks (10 days). So my entire focus at the moment is how to get our cycle time down.
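To make the measurement concrete, here is a minimal sketch of how an average cycle time could be computed. It assumes cycle time is counted in working days (consistent with a 2-week iteration being 10 days) and ignores holidays; the function names and the sample dates are hypothetical, not taken from our tracking tool.

```python
from datetime import date, timedelta
from statistics import mean


def working_days(entered: date, done: date) -> int:
    """Count weekdays after `entered` up to and including `done`."""
    days = 0
    current = entered
    while current < done:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            days += 1
    return days


def average_cycle_time(stories) -> float:
    """Average cycle time over (entered_queue, done_done) date pairs."""
    return mean(working_days(entered, done) for entered, done in stories)


# Hypothetical stories: (date entered the queue, date reached done-done)
stories = [
    (date(2024, 1, 1), date(2024, 1, 15)),
    (date(2024, 1, 3), date(2024, 1, 26)),
]
print(average_cycle_time(stories))
```

Counting only weekdays keeps the number comparable to the 10-day iteration length; counting calendar days would work just as well as long as it is done consistently.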
Recently, there have been a lot of threads on both the Extreme Programming and Scrum Development lists talking about productivity vs. throughput as well as collocated vs. distributed teams. This has been rather timely from my perspective, since for the past several months my manager and I have been discussing productivity vs. throughput, how to measure it, and where our struggles lie. One of those struggles is that we are a distributed team, and there is no changing that.
We do track velocity (by 2-week iteration), but the big conclusion I’ve come to is that this is not a measure of productivity. Velocity is a measure that is beneficial in estimating. Productivity is really about output and shouldn’t be expressed in terms of estimates. We are starting to measure output more in terms of “forward moving stories”. Now there’s a concept out there of the Minimal Marketable Feature (MMF)… which I believe is similar to a story for us, but I need to do some more reading around that. In any case, if our story is equal to or smaller than a true MMF, I don’t see that as a problem for us at this point. This even brings my thinking into alignment with the other current thread, “cost estimates are quaint”. (Aside: is it just me, or anytime the word “quaint” comes up, J.B. Rainsberger isn’t far away? :)) Anyway, in this thread there is a lot of discussion about teams not worrying about estimating their stories: just eyeball the stories you have and “commit” to some number of them. With this approach, it appears especially important to keep your releases small (we release every 3 weeks) and keep your stories small. I really like this idea of not worrying about estimates in points. And it suggests even further that estimating tasks within stories and the ideal hours for those tasks is even more of a waste of time, but that’s a different topic.
We also track bug counts. A bug to us is a defect from a user’s perspective that is either deployed to production or simply due to work considered done in a prior iteration. Bugs can be found by the team, by support, or by clients. What the count specifically doesn’t include is issues related to a story currently in process. The actual number we track is bugs/developer-month. This number is important to me, and even more important is its trend: with my focus on reducing cycle time, I really want to make sure I don’t see an upward trend in my bug metric.
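As an illustration of the metric and its trend, the sketch below computes bugs/developer-month for a few months and fits a simple least-squares slope to the result. The monthly figures and function names are hypothetical, and the slope is just one assumed way to summarize "trend"; a plain chart would serve the same purpose.

```python
def bugs_per_developer_month(bug_count: int, developers: float) -> float:
    """Bugs reported in one month divided by developers on the team that month."""
    return bug_count / developers


def trend_slope(values) -> float:
    """Least-squares slope of the values against their position (0, 1, 2, ...)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical data: (bugs found that month, developers that month)
months = [(12, 6), (9, 6), (6, 6)]
rates = [bugs_per_developer_month(bugs, devs) for bugs, devs in months]
slope = trend_slope(rates)
print(rates, slope)  # a positive slope would be the warning sign
```

A negative or flat slope while cycle time drops is the outcome to hope for; a positive one suggests speed is being bought with quality.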
My entire premise is that if we can reduce cycle time while not negatively impacting the bug metric and not incurring new technical debt, then we will better serve our clients and be able to be more responsive to any of their requests.
The two major impediments to our cycle time that we see are:
· Distributed team (and long feedback loops that result)
· Technical debt (plenty of it)
With respect to being a distributed team, the number one problem is the lack of overlap in our time zones. Between people in EST and MST, things move pretty smoothly. We use all kinds of tools and techniques to communicate, but the point is that if you have a question or want to pair, you can. Between the US and India, it’s a whole other story. The group in India is collocated, so their own interaction is fine, but there is very little overlap in our days, so we end up having to adjust our schedules to accommodate. In India, they stay late. In the US, we’re online early, and when we get to daylight saving time, some of us will likely switch to being online later. The big problem is that we are not able to operate as independent teams working on separate projects. Instead, we all need to work on the same codebase. We do have tools in place to help, such as a central Subversion repository, CruiseControl, and a wiki, but if I’m in the middle of my day, I can’t simply pair with someone in India.
The price we pay for this (read: extra cost) is longer feedback loops. Instead of pairing on a piece of code, I code it, check it in, and wait for review comments tomorrow. And we have to pay close attention to detailed written communication to manage the handoffs effectively. We are of course operating under the mantra of “communication rather than documentation”, but notes in a wiki and email always fall short of face-to-face or a phone call.
On the topic of technical debt, there have also been recent threads (and even a mini-conference coming up at the end of the week) looking for a better metaphor, such as flipping it on its head and talking about making technical investments rather than calling it technical debt, but anyway I’ll assume you know what I mean. Our number one debt is legacy code with no automated tests. For this reason, we do work on stories whose only purpose is to get a particular feature under at least some sort of automated test. The other big class of debt is a non-OO code base, which means that before anything can be reused it must first be refactored. This, combined with the lack of tests, means moving pretty slowly to try to avoid introducing bugs.
Outside of keeping focused on those two topics, we’ve employed a Kanban process to reduce the number of stories in progress at any given moment. It’s normal for a collocated team to only work on as many stories as the number of pairs will allow. If you’re blocked on any given story, you find and ask the right person and move on. With a distributed team, if you have to wait 12 to 24 hours for an answer, it’s very tempting to put the blocked story on hold and start another story. But this leads us down the path of additional context switching (among n+1 stories now) and of course incurs additional waste by putting story 1 in inventory while beginning to build inventory around story 2. A limited number of Kanban tickets is how we at least limit this amount of waste.
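The effect of a limited number of tickets can be sketched in a few lines. This is only an illustrative model, not a description of our actual board or tooling: the class and method names are hypothetical, and it assumes a blocked story keeps holding its ticket until it is finished.

```python
class WipLimitExceeded(Exception):
    """Raised when someone tries to start a story with no free ticket."""


class KanbanBoard:
    def __init__(self, wip_limit: int):
        self.wip_limit = wip_limit  # number of Kanban tickets available
        self.in_progress = set()    # stories holding a ticket (blocked ones included)
        self.done = []

    def start(self, story: str) -> None:
        # A blocked story still occupies its ticket, so the only way to
        # start something new is to finish (or unblock and finish) old work.
        if len(self.in_progress) >= self.wip_limit:
            raise WipLimitExceeded(f"cannot start {story!r}: all tickets in use")
        self.in_progress.add(story)

    def finish(self, story: str) -> None:
        self.in_progress.remove(story)
        self.done.append(story)


board = KanbanBoard(wip_limit=2)
board.start("story 1")
board.start("story 2")
try:
    board.start("story 3")  # tempting while story 1 waits on an answer
except WipLimitExceeded as error:
    print(error)
board.finish("story 1")
board.start("story 3")      # a ticket is free again
```

The refusal is the whole point: it turns the temptation to start another story into pressure to unblock the ones already in flight.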
On a more practical level, what happens when we start even more stories is that we end up with even more stories blocked over time due to lack of attention, and this directly increases (rather than reduces) our cycle time. This to me is direct evidence that cycle time is far more important than efficiency. It’s efficiency, and wanting to use those programmer hours efficiently, that tempts you to start yet another story. Yet it turns around and has a negative impact on the number of stories you can get done.
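One way to see why this happens is Little’s Law, which for a stable queue relates the averages as cycle time = work in progress / throughput. The sketch below applies it to made-up numbers; the function name and the figures are hypothetical, and the law only holds on average over a period in which the system is reasonably steady.

```python
def average_cycle_time_days(wip: float, throughput_per_day: float) -> float:
    """Little's Law: average cycle time = average WIP / average throughput."""
    return wip / throughput_per_day


# Hypothetical team finishing one story every two working days.
throughput = 0.5  # stories per working day

print(average_cycle_time_days(4, throughput))  # 4 stories in progress
print(average_cycle_time_days(6, throughput))  # 6 stories in progress
```

If starting two extra stories does not raise throughput, the same formula says every story simply waits longer, which matches what we observe.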
I’m very interested to hear any suggestions from people out there as to what you think we could do to improve. Or do you think we just need to plug along and get the technical debt down enough that we can finally start flowing faster?
Lastly, it occurs to me that actually drawing out a value stream map and looking there for waste may be a valuable step. With that, it seems the real cycle time is from the point of having a story defined to the point that it is in production. Which raises the question: is that the cycle time we should be tracking and minimizing to truly “optimize the whole”?