In this article we explore how thinking more deeply and deliberately about KPIs and dashboard visualisation can lead to better decision-making and improvement in organisations.
Here is an example dashboard which is at first glance quite appealing and uses elements such as doughnuts and bubbles commonly seen in visualisation tools.
This dashboard displays information about whether tasks have been closed within an SLA. Task resolution time is a common feature of transactional customer service or operational requests. The audience for this information may well be a service manager, a manager of the request handling or fulfilment processes or perhaps an internal customer of the process. Different categories of task (AD, FC, I3 etc.) have different SLAs which are calculated daily and reported over a 7 day period.
What decisions does this dashboard trigger?
Presumably, the proportion of tasks completed within the SLA is a result which a service provider might want to actively improve (or prevent from getting worse). With this in mind we’d expect that the primary decision requirements for this dashboard are to:
- Decide whether intervention is needed to improve this result (or prevent it from degrading)
- Decide which categories of task to focus improvement efforts and investment on.
- Decide whether to consider task volume as a factor in this improvement effort.
These decision requirements give rise to questions which this dashboard ought to answer:
- What is overall task closure performance doing?
- What is overall task closure performance doing relative to targets?
- Which task category is contributing most to this overall performance?
- Which task category is contributing most to a shift in this overall performance?
- Does task volume contribute to overall performance?
- Which task category contributes the most to the consequences (impact) of unmet SLAs?
Does the dashboard answer these questions adequately?
Percentage of Tasks Closed Inside SLA
What it shows
- The upper panel displays the proportion of tasks which were closed within an SLA in a doughnut chart. There’s an implied comparison between the actual closure time of a task and an SLA target.
- Today's performance (so far) is compared with yesterday's and with the aggregate for the previous 7 day period.
- The lower panel provides a daily comparison of SLA performance over the previous 7 day period in a line chart.
Doughnut charts are very popular because of their visual appeal. In this case, however, they fail to provide a direct comparison of the measure across the three time periods as well as a bar or bullet chart would. The spacing of the charts also discourages this comparison and wastes screen real estate.
Because the doughnut scale encodes 100% as 360 degrees, it's hard to distinguish between 96% and 97%, especially when the figures are consistently at 90% or above.
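The arithmetic behind this is simple, and worth making explicit. A minimal sketch (illustrative only) of how little arc a one-point difference occupies on a doughnut:

```python
# Each percentage point maps to 3.6 degrees of arc on a full 360-degree ring.
def arc_degrees(pct):
    """Convert a percentage to its arc angle on a doughnut chart."""
    return pct * 3.6

# The visual gap between 96% and 97% is a single 3.6-degree sliver --
# roughly one hundredth of the ring, which the eye can't reliably judge.
delta = arc_degrees(97) - arc_degrees(96)
print(round(delta, 1))
```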
Is a comparison between today, yesterday and the last 7 days a useful one? If the SLA today is 97% compared with a 7 day aggregate of 96%, what insight and decision does that produce? Comparing both these figures to some threshold would perhaps be more meaningful. Even then, we don't know whether today's figure, taken in isolation, is anything more than natural variation.
The use of a line chart is appropriate for time series data but the Y axis scale doesn’t provide adequate differentiation of the daily data points. The scale could start at 75% for example or, if we knew the % SLA target, be expressed as a variance.
The colour of the line – bright green – implies 'good', but we don't know whether that is the case. The chart border and grid lines are largely superfluous, and the x-axis dates would be more immediately readable if rotated to horizontal.
Using more than 7 days of historical context might reveal a visual pattern for how performance is changing.
We don't know whether day-to-day variation in this SLA is random or exceptional; an XmR chart would enable robust signal detection and help avoid knee-jerk over-correction.
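To show what that signal detection looks like, here is a minimal sketch of XmR (individuals and moving range) limits, using made-up daily SLA figures rather than the dashboard's actual data:

```python
# Illustrative daily SLA-achievement percentages (not real data).
daily_pct = [96.0, 94.5, 97.2, 95.8, 96.4, 93.9, 96.9]

# Individuals (X) chart: the centre line is the mean of the values.
x_bar = sum(daily_pct) / len(daily_pct)

# Moving ranges: absolute differences between consecutive days.
moving_ranges = [abs(b - a) for a, b in zip(daily_pct, daily_pct[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Standard XmR constant: natural process limits sit at +/- 2.66 * mR-bar.
upper_limit = x_bar + 2.66 * mr_bar
lower_limit = x_bar - 2.66 * mr_bar

# A point outside these limits is a signal worth investigating; anything
# inside is routine variation and doesn't warrant intervention.
signals = [p for p in daily_pct if p > upper_limit or p < lower_limit]
```

With these illustrative figures, every point falls inside the limits, i.e. nothing here justifies a reaction.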
Breakdown of Tasks Outside SLA
What it shows
- The upper panel shows the total number of tasks which missed the SLA, broken down by task category as proportions of that total.
- The lower panel shows how these proportions changed over a 7 day period in a stacked bar chart.
Part to Whole Weights
By only counting the tasks which fell outside the SLA, the context of the total task volumes is lost and the part to whole proportion becomes misleading. Some categories of task might end up being given more attention than they deserve.
Actual Closure Times vs. Targets
The actual closure time for tasks isn’t revealed in this data, either as a distribution or as an aggregate (mean, median, percentile etc.). Without this tangible yardstick it’s hard to give context to the SLAs. What does 97% SLA achievement look like to customers? Knowing the closure time target level for each task category would place this in the real world. Percentiles can be useful because they take into account the task volume and are expressed in the same units as the target closure time itself.
If the 95th percentile closure time SLA is 4 hours, we know that 5 hours is ‘worse’ on a human scale.
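A percentile is straightforward to compute. This sketch uses a nearest-rank definition and illustrative closure times (the actual distribution isn't in the data):

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the smallest value at or below which
    at least p% of observations fall."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

# Illustrative closure times in hours (not real data).
closure_hours = [1.2, 2.5, 3.1, 0.8, 4.6, 2.0, 3.8, 5.2, 1.9, 2.7]

# Expressed in hours, this compares directly with a 4-hour target.
p95 = percentile(closure_hours, 95)
```

Because the result is in hours, not a percentage, a manager can immediately see how far real experience sits from the target.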
SLA Breach Impacts
We don’t know whether certain categories of task have greater impact, either because they dominate the volume or have different relative consequences for the service provider or customer. We might discover that a single breach of the closure time SLA for an ‘AD’ category task has far greater impact than 100 breaches of an ‘I3’ task.
If closure time matters as the SLA suggests, we might expect that more non-closure time is a ‘bad thing’. By aggregating closure time by task category we could assign ‘loss’ weights to each and discover that 10 hours of ‘AD’ non-closure time has a greater economic impact than 10 ‘I3’ non-closure hours.
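A sketch of that weighting, with entirely hypothetical loss figures (the per-category weights are assumptions, not taken from the dashboard):

```python
# Hypothetical loss weights: economic impact per hour of non-closure,
# by task category (illustrative numbers only).
loss_per_hour = {"AD": 50.0, "I3": 2.0}

# Hours of non-closure (time beyond SLA) accumulated per category.
non_closure_hours = {"AD": 10.0, "I3": 10.0}

# Weighted impact: the same 10 hours weigh very differently per category.
impact = {cat: hours * loss_per_hour[cat]
          for cat, hours in non_closure_hours.items()}
```

Under these assumed weights, 10 'AD' hours cost 25 times as much as 10 'I3' hours, which is exactly the kind of asymmetry the raw breach counts hide.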
Why not represent impact weights so that improvements are focused on breaches which are most damaging for customers?
Stacked Bar Chart
In a time series chart we want to see how a value changes from one period to the next. Apart from the first series (AD), the relative sizes of the segments within stacked bars can't easily be compared. A six-series line chart would have been more useful.
The use of varying shades of blue distinguishes five of the task categories, but encoding the 'AM' category differently, in green, suggests a significance which may not be intended.
Total Task Volume
What it shows
- The upper panel compares the volume of all tasks open, closed and cancelled over the 3 reference periods.
- The lower panel shows the total volume of closed tasks over the 7 day period including a weekend (6th/7th June).
Open vs. Created
It would appear that ‘Open’ means currently open rather than ‘Created’ because there are no ‘Open’ tasks recorded for previous periods. No tasks were counted as still open at the end of yesterday or at the end of the previous 7 day period. This either implies very short service times or that tasks which were open at the end of these prior periods have been closed in a later period.
To fully understand how task demand may be contributing to task closure cycle time, the daily creation/open rate is more important.
Task Closure Cycle Time
There’s an inference that task volume may contribute to task closure performance. This is reasonable if task closure cycle time is sensitive to the utilisation of the resources involved in task fulfilment.
What we want to know at this level is whether different task categories from the previous Task Breakdown have different throughput characteristics and whether this contributes to the task completion performance. One way to represent this would be to compare the average backlog of tasks awaiting service in each task category. This could be calculated from the cycle time & arrival rate using Little’s Law or counted in the request handling system. The number of en-queued tasks would indicate whether resource utilisation by task category could be a factor in falling performance.
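Little's Law makes this backlog calculation trivial: average queue length L equals arrival rate λ times cycle time W. A sketch with assumed per-category figures (illustrative, not from the dashboard):

```python
# Little's Law: average backlog L = arrival rate (lambda) * cycle time (W).
# Illustrative per-category figures: arrivals in tasks/day, cycle time in days.
categories = {
    "AD": {"arrival_rate": 40.0, "cycle_time": 0.5},
    "I3": {"arrival_rate": 10.0, "cycle_time": 3.0},
}

backlog = {cat: v["arrival_rate"] * v["cycle_time"]
           for cat, v in categories.items()}
```

Note that under these assumed numbers the lower-volume 'I3' category carries the larger queue, because its cycle time is longer: volume alone doesn't tell you where the load is.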
Using bubble charts to represent task volumes fails to convey comparisons. The general problem is that we can't judge bubble sizes as easily as, say, bar lengths. In this case the bubble proportions are also clearly wrong, both within and between the clusters: the circle representing 662 open tasks isn't twice the radius or area of the one for 343 closed tasks, nor is the bubble for 932 tasks closed yesterday a quarter of the size of the 4.0K bubble for the period.
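Even a correctly drawn bubble chart understates differences, because area, not radius, should encode the value. A quick check using the dashboard's own figures:

```python
import math

def bubble_radius(value, scale=1.0):
    """Radius that makes the bubble's AREA proportional to the value."""
    return scale * math.sqrt(value / math.pi)

# The dashboard's figures: 662 open tasks vs 343 closed tasks.
ratio = bubble_radius(662) / bubble_radius(343)
# A correctly scaled 662-task bubble is only ~39% wider than the
# 343-task bubble, despite representing nearly double the volume --
# part of why bubble sizes are so hard to judge by eye.
```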
A comparison of the total daily volume (today or yesterday) to the total period volume isn’t very meaningful. If we closed 932 tasks yesterday and 4000 in the last 7 days including a weekend, what does that tell us? We might want instead to compare these to the average weekday volume.
The line chart shows total task closure volume by day, whereas task creation by day would be a better indicator of demand. An XmR chart would signal whether demand was increasing or decreasing overall. It would also be useful at this level to have a line for each task category. If service isn't provided at weekends, the Saturday/Sunday volumes could be omitted, or the chart aggregated over a full week.
A dashboard can be superficially appealing yet fail to answer the questions which will help its intended audience make good decisions.
Fundamentally in this example the dashboard audience still doesn’t know:
- Which category of task is having the biggest economic impact on customers.
- Whether intervention is needed to improve something.
- Whether task demand is a candidate cause of an over-burdened process.
Deeper thinking in this case would have improved the customer value of both the KPIs and the dashboard.
Put better visual signals in front of your decision makers by starting with an easy experiment: Translate your existing management reports into a visual prototype.
A Visual Management Pack is a quick, practical exercise – one of our Visible Sprints – and only takes a couple of weeks (for your next leadership meeting?).
If you want to set up a discovery call to explore the idea of a Visual Management Pack then get in touch.