“The only metrics that entrepreneurs should invest in are those that help them make decisions” (Eric Ries – e.g. on Vanity Metrics vs. Actionable Metrics)
1 – Definitions
ACTIONABLE METRICS: The set of metrics that will suggest specific interventions that will result in the outcomes you are expecting
WORK IN PROGRESS – WIP: The total count of items currently being worked on; the number of items that we are working on at any given time; all discrete units of customer value that have entered a given process but have not exited.
- predictor of overall system performance
- all work items between arrival point and departure point
- can be segmented across different types
CYCLE TIME: The amount of elapsed time that a work item spends as Work in Progress.
- How long does it take each of those items to get through our process?
- How long to complete?
- When will it be done?
- Can also be used as a predictor of cost
- The amount of time it takes to get customer feedback
Unpredictability lies in the time an item spends waiting to be worked on – that’s why it’s the elapsed time that is important.
THROUGHPUT: The amount of WIP completed per unit of time.
- How many of those items complete per unit of time?
- How many features am I going to get in the next release?
FLOW EFFICIENCY: Ratio of total elapsed time that an item was actively worked on to the total elapsed time that it took for an item to complete.
- not actively worked = waiting to be pulled, waiting for feedback
- often a starting point of 15% flow efficiency
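As a rough illustration of the definitions above, the sketch below computes WIP, Cycle Time, Throughput and Flow Efficiency for a handful of made-up work items. The `WorkItem` type, its field names and the sample dates are assumptions for illustration only. Cycle Time is counted as finish date minus start date plus one, and an item no longer counts as WIP on the day it finishes; other conventions are possible.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class WorkItem:
    name: str
    started: date
    finished: Optional[date]  # None means the item is still in progress
    active_days: int          # days of hands-on work (assumed to be tracked)


def cycle_time(item):
    """Elapsed calendar days as WIP, counting both start and finish day."""
    return (item.finished - item.started).days + 1


def wip_on(items, day):
    """Number of items that have entered the process but not exited on `day`."""
    return sum(
        1 for i in items
        if i.started <= day and (i.finished is None or i.finished > day)
    )


def throughput(items, first_day, last_day):
    """Items finished per day within the closed interval [first_day, last_day]."""
    done = sum(
        1 for i in items
        if i.finished is not None and first_day <= i.finished <= last_day
    )
    return done / ((last_day - first_day).days + 1)


def flow_efficiency(item):
    """Share of the elapsed time in which the item was actively worked on."""
    return item.active_days / cycle_time(item)


# Hypothetical sample data.
items = [
    WorkItem("A", date(2024, 1, 1), date(2024, 1, 10), 2),
    WorkItem("B", date(2024, 1, 2), date(2024, 1, 5), 1),
    WorkItem("C", date(2024, 1, 4), None, 1),
]

print(cycle_time(items[0]))                                  # 10
print(wip_on(items, date(2024, 1, 4)))                       # 3
print(throughput(items, date(2024, 1, 1), date(2024, 1, 10)))  # 0.2
print(flow_efficiency(items[1]))                             # 0.25
```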
Little’s Law (L = lambda × W, or in flow terms: average Cycle Time = average WIP / average Throughput) implies that increasing WIP leads to a higher CT and vice versa – reduce WIP to reduce CT … in order to get stuff done faster, you need to work on less (on average)
- L = average number of items in the queuing system
- lambda = average number of items arriving per unit of time
- W = average wait time in the system for an item
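A small worked example may help here. Little's Law reads L = lambda × W; mapped to the flow metrics, L is the average WIP, lambda the average Throughput and W the average Cycle Time, so average CT = average WIP / average TH. The numbers below are invented, and the sketch assumes the law's preconditions hold (stable WIP, matching arrival and departure rates, consistent units).

```python
def avg_cycle_time(avg_wip, avg_throughput):
    """Little's Law in flow terms: average CT = average WIP / average TH."""
    if avg_throughput <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / avg_throughput


# Hypothetical team: 20 items in progress, 2 items finished per day.
before = avg_cycle_time(avg_wip=20, avg_throughput=2)
# Same throughput, but WIP limited to 10 items.
after = avg_cycle_time(avg_wip=10, avg_throughput=2)

print(before, after)  # 10.0 5.0
```

Halving the WIP at constant throughput halves the average Cycle Time, which is the "work on less to finish faster" point in numbers.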
Thanks Daniel S. Vacanti for your explanation!
2 – Cumulative Flow Diagrams (CFD)
- offer a concise, coherent visualization of the three metrics of flow (Avg. Cycle Time, WIP, Throughput)
- provides qualitative and quantitative insight into problems with flow
- shows cumulative process arrivals and departures over time
- not a tool for projection (but introspection)
- Backlog should not be part of the CFD
- they are not committed to (but in the diagram it looks like they are)
- it will destroy cycle time calculation
- Try to show active and done states (as it shows areas of delays)
Avoid trap of drawing conclusions just by looking at the CFD! It’s a tool to ask the right questions
- Top line = cumulative arrivals
- Bottom line = cumulative departures
- No line can ever decrease! (it’s a cumulative chart) … If it happens the chart is wrong (very likely work items disappeared in the process)
- The vertical distance between any two lines is the total amount of work that is in progress between the two workflow steps represented by the two chosen lines
- The horizontal distance between any two lines represents the approximate average cycle time for items that finished between the two workflow steps represented by the chosen two lines (average cycle time = bottom line date – top line date + 1)
- The data displayed depicts only what has happened (no projections allowed)
- The slope of the top line between any two reporting intervals represents the average arrival rate
- The slope of the bottom line between any two reporting intervals represents the average departure rate (throughput)
- Average throughput = rise of the bottom line (items departed) / run (number of days)
- The average Arrival Rate (Lambda) should equal the average Departure Rate (TH) = we will only start new work at about the same rate that we finish old work
- Needs a more late binding (commitment) approach.
- Monitor policies around the order in which we pull items through the system – so that work items do not sit and age unnecessarily
- all work started will be completed and exit the system
- WIP should be roughly the same in the chosen interval
- average WIP is neither increasing nor decreasing
- CT,WIP,TH are measured using consistent units
A boring one
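The CFD properties listed above can be read directly off the data behind the chart. The following is a minimal sketch with invented arrival and departure days and only two lines (arrivals and departures); all function names are hypothetical helpers, not part of any charting tool.

```python
def cumulative_counts(event_days, n_days):
    """Cumulative number of events on or before each day 1..n_days."""
    counts, total = [], 0
    for day in range(1, n_days + 1):
        total += event_days.count(day)
        counts.append(total)
    return counts


def never_decreases(line):
    """A cumulative line must never go down."""
    return all(later >= earlier for earlier, later in zip(line, line[1:]))


def first_day_reaching(line, level):
    """First day (1-based) on which the cumulative line reaches `level`."""
    for idx, value in enumerate(line):
        if value >= level:
            return idx + 1
    return None


# Hypothetical data: day numbers on which items arrived / departed.
arrival_days = [1, 1, 2, 3, 3, 4, 5, 6, 7, 8]
departure_days = [3, 4, 5, 6, 6, 7, 8]
n_days = 8

arrivals = cumulative_counts(arrival_days, n_days)      # top line
departures = cumulative_counts(departure_days, n_days)  # bottom line

# Vertical distance between the lines = WIP on each day.
wip = [a - d for a, d in zip(arrivals, departures)]

# Slope of each line between day 3 and day 8 = average rate.
avg_arrival_rate = (arrivals[7] - arrivals[2]) / (8 - 3)
avg_throughput = (departures[7] - departures[2]) / (8 - 3)


def approx_cycle_time(level):
    """Horizontal distance at a given item count: bottom date - top date + 1."""
    return first_day_reaching(departures, level) - first_day_reaching(arrivals, level) + 1


print(wip)                               # [2, 3, 4, 4, 4, 3, 3, 3]
print(avg_arrival_rate, avg_throughput)  # 1.0 1.2
print(approx_cycle_time(5))              # 4
```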
3 – Cycle time scatterplots
- x-axis is the timeline
- y-axis represents the cycle time (per item)
- percentile lines – e.g. 85% of all items finish within 43 days (per item)
- recommended is to use 50th, 70th, 85th, 95th percentiles
- work item cycle time data is not a normal distribution! – that’s why applying standard deviation and arithmetic mean is not appropriate (as e.g. done with control charts)
- percentile lines are preferable because of their robustness in the face of outliers
“If Bill Gates walks into a bar, then on average everyone in the bar is a millionaire”
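To see why percentiles hold up better than the mean, here is a small sketch on invented cycle times containing one extreme outlier. The `percentile` helper uses the nearest-rank method, which is one of several common percentile definitions and an assumption of this example.

```python
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the
    data at or below it. `pct` is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, -(-(pct * len(ordered)) // 100))  # integer ceiling division
    return ordered[rank - 1]


# Hypothetical cycle times in days; 120 is the "Bill Gates" of this data set.
cycle_times = [2, 3, 3, 4, 5, 5, 6, 7, 8, 9,
               10, 12, 14, 15, 18, 21, 25, 30, 43, 120]
without_outlier = cycle_times[:-1]

for pct in (50, 70, 85, 95):
    print(f"{pct}th percentile: {percentile(cycle_times, pct)} days")

print("mean with outlier:   ", mean(cycle_times))
print("mean without outlier:", round(mean(without_outlier), 2))
print("50th percentile without outlier:", percentile(without_outlier, 50))
```

Removing the single 120-day item moves the mean from 18 to about 12.6 days, while the 50th percentile stays at 9 days.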
4 – Cycle time histograms
- a condensed, spatial view based on the frequency of occurrence of Cycle Times
- y-axis is the number of items
- x-axis is the cycle time
- vertical percentile lines (like in the scatterplot)
- in addition to the scatterplot the histogram shows the shape of the data. You can better detect patterns for Cycle Times over a given timeline … it’s a more advanced cycle time analysis
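A histogram of this kind can be sketched in a few lines. The data and the bin width of 5 days are assumptions chosen for illustration; the text output stands in for a real chart.

```python
from collections import Counter


def histogram(cycle_times, bin_width=5):
    """Count items per cycle-time bin; keys are the lower bin edges in days."""
    bins = Counter(ct // bin_width * bin_width for ct in cycle_times)
    return dict(sorted(bins.items()))


# Hypothetical cycle times in days.
cycle_times = [2, 3, 3, 4, 5, 5, 6, 7, 8, 9,
               10, 12, 14, 15, 18, 21, 25, 30, 43]

bin_width = 5
for edge, count in histogram(cycle_times, bin_width).items():
    print(f"{edge:>3}-{edge + bin_width - 1:<3} {'#' * count}")
```

The printed bars show a long tail to the right, which is the shape the scatterplot alone does not make obvious.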
5 – SLAs
- Cycle Time Target; Service Level Expectation
- it is expressed using a probability to meet a cycle time range
- e.g. with a 50% probability a work item finishes within 10 days (according to the 50th percentile of all previously finished work items in our system)
- can be used as a substitute for many upfront planning and estimation activities
- the choice of a team’s SLA should be made in close collaboration with their customers
- get predictable at an overall system level first … very likely good enough … only optimize for subtypes if really necessary
- use the SLA for right sizing items too – SLA as the litmus test for right size of an item for flow through the process
- the older a work item gets, the greater chance it has of aging still more
- true definition of Agile is – to respond quickly to new information
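One possible way to derive an SLA from historical data and use it to spot aging items is sketched below. The finished cycle times, the in-progress ages and both helper functions are hypothetical; the percentile uses the nearest-rank method.

```python
def service_level(finished_cycle_times, probability_pct):
    """Cycle time (in days) within which `probability_pct` percent of the
    previously finished items were completed (nearest-rank percentile)."""
    ordered = sorted(finished_cycle_times)
    rank = max(1, -(-(probability_pct * len(ordered)) // 100))
    return ordered[rank - 1]


def items_exceeding(ages_in_days, limit_days):
    """In-progress items whose current age is already above the SLA."""
    return {name: age for name, age in ages_in_days.items() if age > limit_days}


# Hypothetical history of finished items (cycle times in days).
finished = [2, 3, 3, 4, 5, 5, 6, 7, 8, 9,
            10, 12, 14, 15, 18, 21, 25, 30, 43, 120]

sla_85 = service_level(finished, 85)
print(f"SLA: 85% of items finish in {sla_85} days or less")  # 25 days

# Hypothetical ages of items currently in progress.
in_progress = {"X": 4, "Y": 19, "Z": 31}
print(items_exceeding(in_progress, sla_85))  # {'Z': 31}
```

The same threshold can serve as the right-sizing check: an item that is not expected to fit within the SLA is a candidate for splitting before it is started.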
One of the most common things we do that hinders our predictability is not pay attention to the order in which items are pulled through our process.
- set an SLA independent of analyzing cycle time data
- set by an external manager
- set without close customer collaboration
Classes of Service (CoS)
- For all practical purposes, introducing CoS is one of the worst things you can do to predictability
- CoS applies every time you put a policy in place around the order in which you pull something
- will introduce variability and unpredictability into the process (e.g. will produce flow debt)
- Only introduce if you have operated your process for a while and are confident that CoS is necessary
In his book Daniel S. Vacanti shows via a simulation the terrible effect on predictability of random pulling from queues in combination with having an expedite lane. A cycle time increase from 50 days to 100 days – meaning 100% more time.
FIFS (FIFO) – is the clear winner for cycle time predictability … the further you stray from FIFO, the less predictable you are
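The effect of pull order can be tried out with a toy queue. This is not the simulation from the book: it is a simplified sketch with a single queue, one item finished per tick, no expedite lane, and invented arrival data. It compares strict FIFO pulling with random pulling on the same arrivals.

```python
import random
from statistics import mean, pstdev


def simulate(arrivals_per_tick, pick):
    """One item finishes per tick while the queue is non-empty.

    `pick(queue)` returns the index of the waiting item to pull next.
    Returns the cycle time (in ticks) of every item.
    """
    queue = []        # arrival ticks of waiting items, oldest first
    cycle_times = []
    tick = 0
    while tick < len(arrivals_per_tick) or queue:
        if tick < len(arrivals_per_tick):
            queue.extend([tick] * arrivals_per_tick[tick])
        if queue:
            arrived = queue.pop(pick(queue))
            cycle_times.append(tick - arrived + 1)
        tick += 1
    return cycle_times


# Hypothetical arrivals: 0, 1 or 2 items per tick, one per tick on average.
arrival_rng = random.Random(42)
arrivals = [arrival_rng.choice([0, 1, 1, 2]) for _ in range(200)]

fifo = simulate(arrivals, lambda q: 0)  # always pull the oldest item
pull_rng = random.Random(7)
rand = simulate(arrivals, lambda q: pull_rng.randrange(len(q)))  # pull any item

for label, result in (("FIFO", fifo), ("random", rand)):
    print(f"{label:>6}: mean={mean(result):.1f} "
          f"stdev={pstdev(result):.1f} max={max(result)}")
```

In this toy model the average cycle time is identical for both policies, because the same number of items finishes on the same ticks. What changes is the spread: random pulling lets some items age far longer, which is exactly what hurts predictability.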
Pretty much the only way to PREDICTABLY deliver in the face of variability introduced by different CoS is to build slack into the system.
6 – Forecasting
- a proper forecast includes a date range and a probability
- for forecasting a single item use SLAs
- do not use Little’s Law and averages (as data is not in a normal distribution)
- straight line projections are problematic – they do not communicate a probability
Monte Carlo Method
The Monte Carlo Method is the future of forecasting in knowledge work.
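A minimal sketch of such a forecast, assuming the only input is a history of items finished per day: each trial replays the future by drawing random days from that history until the backlog is empty, and the spread of the trials gives the probability for each date. The history, backlog size, trial count and function names are all invented for illustration.

```python
import random


def forecast_days(backlog_size, daily_throughput_history, trials=10_000, seed=1):
    """Simulate how many days it takes to finish `backlog_size` items by
    repeatedly sampling one day's throughput from history.
    Returns the sorted list of simulated durations."""
    if max(daily_throughput_history) <= 0:
        raise ValueError("history needs at least one day with finished items")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    outcomes = []
    for _ in range(trials):
        remaining, days = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(daily_throughput_history)
            days += 1
        outcomes.append(days)
    return sorted(outcomes)


def days_at_confidence(sorted_outcomes, pct):
    """Number of days within which `pct` percent of the trials finished."""
    rank = max(1, -(-(pct * len(sorted_outcomes)) // 100))
    return sorted_outcomes[rank - 1]


# Hypothetical history: items finished on each of the last ten days.
history = [0, 1, 0, 2, 1, 3, 0, 1, 2, 1]

outcomes = forecast_days(30, history)
for pct in (50, 85, 95):
    print(f"{pct}% chance to finish 30 items within "
          f"{days_at_confidence(outcomes, pct)} days")
```

The result is a set of durations with a probability attached to each, which is what a straight-line projection based on an average cannot provide.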