Testable Improvements



… the Scrum Team has completed a Sprint and wishes to improve in the next Sprint. Team members have collected information on their current performance, and are doing a Sprint Retrospective to determine what they should do to improve. Naturally, a team wants to take actions that have a real, lasting effect on performance.

✥       ✥       ✥ 

Self-improvement efforts are typically abstract platitudes. If a performance boost follows a planned change, it may simply be a coincidence.

It’s easy to decide to do something in the hope that it will improve team performance. The success of a good kaizen (incremental improvement) depends first on agreeing on a plan of attack, second on adhering to that plan, and third on testing whether the plan worked. To change the plan without testing the consequences is arbitrary behavior. Not following the plan of attack may mean that different team members have interpreted the plan differently and that everyone is following their own interpretation. It will be a waste of effort if the team blindly follows its plan without heeding data that suggest that the team is headed in the wrong direction. It feels good to focus on improvement, and it’s easy to confuse how hard the team is trying with the degree to which the team is remaining faithful to the new discipline. Without feedback about results, it is as likely that a change is wreaking damage as that it is doing good. If the team is not conscious about what it is doing and the degree to which it is faithful to the agreed kaizen, it will be difficult to ascribe any change in performance to the planned kaizen itself. This leaves the team not knowing whether to continue a given behavior in the long term.

When we take action that we expect to improve results, we expect to see improved results. Without specific objective measurement, we might just imagine improvement. For example, people who buy a dietary supplement designed to make them feel healthier and stronger may enjoy a placebo effect that leads them to feel healthier and stronger, regardless of any physiological improvement.

Similarly, when we take some action to improve results, we may subconsciously change other behavior because we know we are watching ourselves. For example, a person who buys a fuel efficiency device will probably see an improvement in gas mileage — not from the device but because they subconsciously change their driving habits to drive more efficiently. Who knows if the fuel efficiency device is even operating? A kaizen that proposes that inspections might increase the fault detection rate brings as much focus to fault density as to the inspections themselves. Team members are likely to unconsciously be more circumspect about preventing faults during their work leading up to the inspection. It’s an evil form of the old adage that you will get the results for which you measure, and the fault lies in measuring results alone.

Some actions sound good, but without measurement we may invest considerable time and effort, with no objective understanding of their impact. One of the authors worked in a large company when his division worked to obtain ISO 9001 certification. He asked the process coordinator whether he thought that the actions required by certification would really result in any team improvement. The process coordinator was fully confident that ISO certification would spark considerable improvement. However, this author never saw any studies to determine whether quality or productivity had improved after ISO certification. The unspoken goal had become the certification rather than the culture of improvement it might have introduced. Furthermore, the certification evaluation is by nature somewhat subjective, and entails personal judgment about the degree of compliance.

Therefore:

Write improvement plans in terms of specific concrete actions (not goals) that the team can measure objectively, to assess whether the team is applying the process change. First measure to see if the team is following the planned action. Second, measure the change in performance to evaluate whether the kaizen had the desired results.

In short: say what you will do, and do what you say.

✥       ✥       ✥ 

If the team knows that it has been following a planned set of improvement actions, then its members can assess results to know whether to continue on that path or rather to try something else.

In a Community of Trust such as a good Scrum Team there is no independent or external testing agent to scrutinize adherence to a plan of action. While Scrum follows the Toyota Production System in promoting transparency, research shows that decreasing employee monitoring can actually increase net transparency. [1] The ScrumMaster should encourage team members to assess themselves with checklists and reflection.

Important: This pattern is as much about knowing whether people are taking the agreed actions as it is about measuring to see whether the team has improved (e.g., did your velocity increase?). First, we want to find out if the team is actually adhering to the planned action. Second, we measure whether the desired improvement actually came to pass.

This is all about getting away from wishing and hoping we will get better, and moving to actually doing something concrete to get better, and understanding whether actions really had the desired results.

Examples:

Avoid measures that are hard to quantify, such as:

Note that if the actions are not testable, it’s usually really hard to know how to do them. For example, exactly how do you go about “communicating better?” So the natural reaction is to not bother trying. And you end up with retrospectives that are only window dressing, and the sense that this may be true becomes the elephant in the room. [2] Such pro forma actions lead people to view Scrum as an arbitrarily imposed set of hoops through which everyone must jump, and that can fuel apathy and cynicism.

Most Scrum tradition measures success in terms of ROI, some other impact on financials, or some other value proposition (see Value and ROI). Chris Matts [3] suggests that the business hold the Product Owners accountable to some direct, measurable outcome of their Value Stream management, within their purview of influence. For example, instead of measuring ROI, we might measure the market engagement with the product; a financial group, or management, can convert that to ROI if needed.

In order to implement Testable Improvements, you must have regular Retrospectives. Within a Retrospective you do the following:

  1. Examine the previous Testable Improvements to see whether the team actually did them, and whether they had positive impact.
  2. For each proposed improvement, ask how the team will validate the improvement (how you will know whether the team took the planned action, and to what extent). If you can’t validate the proposed improvement, don’t accept it.

Note that some improvements provide that the team “stop doing” something (e.g., “Stop picking your nose!”). Such improvements are generally easy to measure (e.g., make a video recording of yourself and count the number of times you picked your nose).

To measure whether an improvement action actually works you need a meter, a scale, and a baseline of the current performance. The team uses the scale to quantify the improvement: for example, the percentage of Daily Scrum meetings nobody missed. The meter is the process that establishes a location on the scale: for example, counting the Daily Scrum meetings nobody missed and calculating the percentage at the end of the Sprint. The baseline is the same measurement taken before the change, against which the team compares the new value.
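As a rough illustration of the meter, scale, and baseline described above, the following Python sketch computes the percentage of Daily Scrums nobody missed for two Sprints and compares them. The `DailyScrum` record, the `full_attendance_percentage` function, and all the attendance figures are invented for this example; a real team would substitute its own records and its own measure.

```python
from dataclasses import dataclass


@dataclass
class DailyScrum:
    """One Daily Scrum; `absentees` is a hypothetical count of missing team members."""
    absentees: int


def full_attendance_percentage(meetings):
    """Meter: count the Daily Scrums nobody missed and place the
    result on the scale (0 to 100 percent)."""
    if not meetings:
        raise ValueError("need at least one Daily Scrum to measure")
    full = sum(1 for m in meetings if m.absentees == 0)
    return 100.0 * full / len(meetings)


# Baseline: the Sprint before the kaizen (invented sample data).
baseline_sprint = [DailyScrum(a) for a in [0, 1, 0, 2, 0, 1, 0, 0, 1, 0]]
# The Sprint in which the team applied the kaizen (invented sample data).
current_sprint = [DailyScrum(a) for a in [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]]

baseline = full_attendance_percentage(baseline_sprint)
current = full_attendance_percentage(current_sprint)
change = current - baseline

print(f"baseline {baseline:.0f}%, current {current:.0f}%, change {change:+.0f} points")
```

A difference between two Sprints is only a starting point for discussion in the Retrospective; as the pattern warns, the team still needs to know whether it actually followed the planned action before crediting the kaizen with the change.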

You will be able to test whether you are moving towards Greatest Value.

Regarding objectivity and subjectivity, Jerry Weinberg says, “You can turn anything into a number.” (Jerry has long said this, and confirmed it again in a conversation with Jim Coplien on 15 December, 2017.) This is a two-edged sword. On one hand you can come up with numbers that are meaningless, and people credit them with far more statistical significance than they deserve. On the other hand, one can take a deeply meaningful “subjective” concept such as team engagement and quantify its trend as improving (+1), getting worse (-1), or staying the same (0). These measures provide a foundation for discussion and powerful discovery, particularly as one explores the why behind them. See Happiness Metric.
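One possible way to record such a trend is sketched below in Python. It assumes, purely for illustration, that the team gives engagement a self-rated score each Sprint; the `trend` function and the scores are hypothetical, and a team could just as well agree on the +1, -1, or 0 value directly in conversation.

```python
def trend(previous, current):
    """Return +1 if the score improved, -1 if it got worse, 0 if it stayed the same."""
    return (current > previous) - (current < previous)


# Hypothetical self-rated engagement scores, one per Sprint.
scores = [3, 3, 4, 2, 2, 5]

# Compare each Sprint with the one before it.
trends = [trend(a, b) for a, b in zip(scores, scores[1:])]
print(trends)
```

The numbers themselves carry little weight; their use is to prompt the team to ask why the trend moved.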

Contrast this pattern with Definition of Done, which is more about the result than the process used to obtain it. To a rough approximation, Testable Improvements is best when the process is the primary focus, and Done when the focus is on improving the product.

This is loosely related to the Japanese concept of kamishibai, which is a record of observations of conformance to standard work. The term kamishibai usually applies to management activities. Here, we intend that every team member self-monitor and that, in addition, the ScrumMaster continuously assess the team members’ faithfulness to their charted kaizen direction.

See also [4], pp. 73-128.


[1] Ethan S. Bernstein. “The Transparency Paradox: A Role for Privacy in Organizational Learning and Operational Control.” In Administrative Science Quarterly 57(2), 21 June, 2012, http://journals.sagepub.com/doi/abs/10.1177/0001839212453028 (accessed 2 November 2017).

[2] —. Wikipedia: ElephantInTheRoom. https://en.wikipedia.org/wiki/Elephant_in_the_room.

[3] Chris Matts. “Why business cases are toxic.” The Risk Manager, https://theitriskmanager.wordpress.com/2017/08/20/why-business-cases-are-toxic/, 20 August, 2017, accessed 7 April 2018.

[4] Mike Rother. Toyota Kata: Managing People for Improvement, Adaptiveness and Superior Results. New York: McGraw-Hill Education, 2009, pp. 73-128.


Picture credits: https://pixabay.com/en/eyeglasses-exam-optometry-vision-2003188/ (under CC0 license).