I just finished reading the awesome book Actionable Agile Metrics for Predictability by Daniel S. Vacanti. For me a must read to get a deep insight in metrics for agile environments.
“The only metrics that entrepreneurs should invest in are those that help them make decisions” (Eric Ries – e.g. on Vanity Metrics vs. Actionable Metrics)
1 – Definitions
ACTIONABLE METRICS: The set of metrics that will suggest specific interventions that will result in the outcomes you are expecting
WORK IN PROGRESS – WIP: The total count of items currently being worked on; the number of items that we are working on at any given time; all discrete units of customer value that have entered a given process but have not exited.
 predictor of over overall system performance
 all work items between arrival point and departure point
 can be segmented across different types
CYCLE TIME: The amount of elapsed time that a work item spends as Work in Progress.
 How long it takes each of those items to get through our process?
 How long to complete?
 When will it be done?
 Can also be used as a predictor of cost
 The amount of time it takes to get customer feedback
Unpredictability lies it the time an items spends waiting to be worked on – that’s why it’s the elapsed time that is important.
THROUGHPUT: The amount of WIP completed per unit of time.
 How many of those items complete per unit of time?
 How many features am I going to get in the next release
FLOW EFFICIENCY: Ratio of total elapsed time that an item was actively worked on to the total elapsed time that it took for an item to complete.
 not actively worked = waiting to be pulled, waiting for feedback
 often a starting point of 15% flow efficiency
LITTLE’S LAW:
implies that increasing WIP leads to a higher CT and vice versa – check to reduce WIP to increase CT … in order to get stuff done faster, you need to work on less (on average)
 L = average number of items in the queuing system
 lambda = average number of items arriving per unit of time
 W = average wait time in the system for an item
Thanks Daniel S. Vacanti for your explanation!
2 – Cumulative Flow Diagrams (CFD)
 offer a concise, coherent visualization of the three metrics of flow (Avg. Cycle Time, WIP, Throughput)
 provides qualitative and quantitative insight into problems with flow
 shows cumulative process arrivals and departures over time
 not a tool for projection (but introspection)
 Backlog should not be part of the CFD
 they are not committed too (but in the diagram it looks like they are)
 it will destroy cycle time calculation
 Try to show active and done states (as it shows areas of delays)
Avoid trap of drawing conclusions just by looking at the CFD! It’s a tool to ask the right questions
Important Properties
 Top line = cumulative arrivals
 Bottom line = cumulative departures
 No line can ever decrease! (it’s a cumulative chart) … If it happens the chart is wrong (very likely work items disappeared in the process)
 The vertical distance between any two lines is the total amount of work that is in progress between the two workflow steps represented by the two chosen lines
 The horizontal distance between any two lines represents the approximate average cycle time for items that finished between the two workflow steps represented by the chosen two lines (average cycle time = bottom line date – top line date + 1)
 The data displayed depicts only what has happened (no projections allowed)
 The slope (top line) of any line between any two reporting intervals represents the average arrival rate
 The slope (bottom line) of any line between any two reporting intervals represents the average departure
 Average throughput = rise of average departure / run (Nr. of days)
Necessary assumptions
 The average Arrival Rate (Lambda) should equal the average Departure Rate (TH) = we will only start new work at about the same rate that we finish old work
 Needs a more late binding (commitment) approach.
 Monitor policies around the order in which we pull items through the system – so that work items do not sit and age unnecessarily
 all work started will be completed and exit the system
 WIP should be roughly the same in the chosen interval
 average WIP is neither increasing nor decreasing
 CT,WIP,TH are measured using consistent units
Some patterns

Flat lines


Stair steps


Bulging bands


Disappearing bands


SCurve


A boring one

3 – Cycle time scatterplots
 xaxis it the timeline
 yaxis represents the cycle time (per item)
 percentile lines – e.g. 85% of all items finish within 43 days (per item)
 recommended is to use 50th, 70th, 85th, 95th percentiles
 work item cycle time data is not a normal distribution! – that’s why applying standard deviation and arithmetic mean is not appropriate (as e.g. done with control charts)
 percentile lines are preferable – it’s their robustness in the face of outliers
“If Bill Gates walks into a bar, then on average everyone in the bar is a millionare”
4 – Cycle time histograms
 a condensed, spatial view based on the frequency of occurrence of Cycle Times
 yaxis are the number of items
 xaxis is the cycle time
 vertical percentile lines (like in the scatterplot)
 in addition to the scatterplot the histogram shows the shape of the data. You can better detect patterns for Cycle Times over a given timeline … it’s a more advanced cycle time analysis
5 – SLAs
 Cycle Time Target; Service Level Expectation
 it is expressed using a probability to meet a cycle time range
 e.g. with a 50% percent probability a work item finishes in 10 days (according to the 50% percentile of all previously finished work items in our system)
 can be used as a substitute for many upfront planning and estimation activities
 the choice of a teams SLA should be made in close collaboration with their customers
 get predictable at an overall system level first … very likely good enough … only optimize for subtypes if really necessary
 use the SLA for right sizing items too – SLA as the litmus test for right size of an item for flow through the process
 the older a work item gets, the greater chance it has of aging still more
 true definition of Agile is – to respond quickly to new information
One of the most common things we do that hinders our predictability is not pay attention to the order in which items are pulled through our process.
Mistakes
 set a SLA independent of analyzing cycle time data
 set by an external manager
 set without close customer collaboration
Classes of Services (CoS)
 For all practical purposes, introducing COS is one of the worst things you can do to predictability
 CoS – every time you put a policy in place around the order in which you pull something
 will introduce variability and unpredictability into the process (e.g. will produce flow debt)
 Only introduce if you have operated your process for a while and are confident that CoS is necessary
In his book Daniel S. Vacanti shows via a simulation the terrible effect on predictability of random pulling from queues in combination with having an expedite lane. A cycle time increase from 50 days to 100 days – meaning 100% more time.
FIFS (FIFO) – is the clear winner for cycle time predictability … the further you stray from FIFO, the less predictable you are
Slack
Slack is pretty much the only way to PREDICTABLY deliver in the face of variability introduced by different CoS is to build slack into the system.
6 – Forecasting
 a proper forecast includes a date range and a probability
 for forecasting a single items use SLAs
 do not use Little’s Law and averages (as data is not in a normal distribution)
 straight line projections are problematic – they do not communicate a probability
Monte Carlo Method
The Monte Carlo Method is the future of forecasting in knowledge work.
Further readings
 Actionable Agile Metrics for Predictability by Daniel S. Vacanti
 my Book summary with more details and e.g. patterns for charts
 Troy Magennis work on Lean Forecasting e.g. Agile Metrics for the Metrically Challenged
[…] #1 Actionable Agile Metrics for predictability […]