Emergency Procedure

Alias: Stop the Line

Jeff Sutherland’s colleague Ed Atterbury being shot down over Hanoi by a surface-to-air missile (SAM). You can see him bailing out!

...companies, teams, and individuals often find their efforts are failing to deliver on time, and the Sprint Burndown Chart shows failure is virtually certain. Rapid identification of problems and quick response are fundamental to the spirit of agility.

✥       ✥       ✥ 

Problems arise in the middle of a Sprint due to emergent requirements or unanticipated changes. By mid-Sprint, it may be obvious that the Development Team cannot complete the Sprint Backlog successfully. The team is high on the Sprint Burndown Chart and sees that it cannot achieve the Sprint Goal at the current rate of getting things done.
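One way to make "high on the burndown" concrete is to project the remaining work forward at the rate observed so far. The following is a minimal sketch, not part of the pattern itself: the function names, the use of a simple linear projection, and the sample numbers are all assumptions made for illustration.

```python
def projected_remaining_at_end(remaining_by_day, sprint_days):
    """Linearly project the work left at Sprint end from the burndown so far.

    remaining_by_day[0] is the work at Sprint start; each later entry is
    the remaining work at the end of that day. (Hypothetical data layout.)
    """
    days_elapsed = len(remaining_by_day) - 1
    if days_elapsed < 1:
        raise ValueError("need at least one full day of burndown data")
    burned = remaining_by_day[0] - remaining_by_day[-1]
    rate = burned / days_elapsed  # average work burned per day so far
    days_left = sprint_days - days_elapsed
    # Never report less than zero remaining work.
    return max(0.0, remaining_by_day[-1] - rate * days_left)


def is_emergency(remaining_by_day, sprint_days):
    """True if, at the current rate, work would remain when the Sprint ends."""
    return projected_remaining_at_end(remaining_by_day, sprint_days) > 0


# Mid-Sprint in a 10-day Sprint: 8 of 40 points burned in 5 days.
burndown = [40, 38, 36, 35, 33, 32]
print(is_emergency(burndown, sprint_days=10))  # True
```

A real team would likely temper such a projection with judgment, since burn rates are rarely linear; the point is only that the signal is available by mid-Sprint.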

Causes of Sprint dysfunction are legion and this pattern focuses primarily on the top three of these common problems:

Agility requires rapid response to change, and that means making problems visible as early as possible. Unfortunately, new teams and average teams often do not want problems to become visible. In particular, they do not want to stop work, fix problems, and risk criticism. At the first NUMMI (New United Motor Manufacturing, Inc.) Toyota plant in America, Japanese management visited the plant after six months and saw that employees were afraid to pull the andon (lamp) cord, the cord that causes a trouble lamp to turn on and that starts a countdown timer to stop the production line. Workers had not stopped the line often enough to fix their impediments. The management pulled the andon cord to stop the production line then and there, to communicate to the workers that their biggest impediment was their reluctance to stop the line. Stopping the line makes problems visible so they are fixed properly. “No problem is a problem” is the Japanese management mantra ([1]).

The team must consult the Product Owner when things are not going well. Not only that, the Development Team should agree with the Product Owner on how to quickly address major problems that affect reaching the Sprint Goal.

Therefore:

When high on the burndown, try a technique used routinely by pilots. When bad things happen, execute the Emergency Procedure designed specifically for the problem.

Do not delay execution while trying to figure out what is wrong or what to do. In a fighter aircraft, you could be dead in less time than it takes to figure out what is going on. It is the responsibility of the ScrumMaster to make sure the team immediately executes the Scrum Emergency Procedure, preferably by mid-Sprint, when things are going off-track. This will require careful coordination with the Product Owner, yet kaizen mind requires execution of this pattern even when the Product Owner is not available. Great teams act without permission and ask for forgiveness later (see Community of Trust).

Scrum Emergency Procedure: (do only as much as necessary)

  1. Change the way the team does the work. Do something different.
  2. Get help, usually by offloading backlog to someone else.
  3. Reduce scope.
  4. Abort the Sprint and replan.
  5. Inform management how the emergency affects release dates.
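The list above reads as an escalation ladder: take the steps in order and stop once the Sprint is back on track. The sketch below illustrates that reading only. The `back_on_track` callback is hypothetical (it stands for the team trying a step and judging whether the Sprint Goal is reachable again), and treating "do only as much as necessary" as "stop at the first step that works" is an assumption.

```python
from typing import Callable, List

# The five steps, in the order the pattern lists them.
STEPS = [
    "Change the way the team does the work",
    "Get help, usually by offloading backlog to someone else",
    "Reduce scope",
    "Abort the Sprint and replan",
    "Inform management how the emergency affects release dates",
]


def run_emergency_procedure(back_on_track: Callable[[str], bool]) -> List[str]:
    """Apply the steps in order; stop as soon as one resolves the emergency.

    back_on_track(step) is a hypothetical callback: the team carries out
    the step and reports whether the Sprint is on track again.
    Returns the list of steps actually taken.
    """
    taken = []
    for step in STEPS:
        taken.append(step)
        if back_on_track(step):
            break  # do only as much as necessary
    return taken


# Example: the first two steps do not help, but reducing scope does.
taken = run_emergency_procedure(lambda step: step == "Reduce scope")
print(len(taken))  # 3
```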

Teams often want to reduce scope when they encounter difficulty. Great teams instead find a way to execute a different strategy to achieve the Sprint Goal. In the 2006–07 football (soccer) season, John Terry, Chelsea’s captain and center back, had to take over as goalkeeper in a game against Reading after Petr Čech suffered a fractured skull and the substitute goalkeeper who replaced him, Carlo Cudicini, was carried off unconscious late in the game. Terry made two fine saves and Chelsea won the game 1-0. Similarly in software, adopting new practices that remove waste can multiply performance while drastically cutting effort.

When multiple teams are working on the same products, one team can often pass backlog to another team that has slack. The company PatientKeeper, a pioneer in agile development in the medical sector, automated this strategy ([2]). If a team was behind, it could assign Sprint Backlog Items to another team. If the second team could not take them, it passed them to a third team. If the third team could not take them, all three met to decide what to do. This automatically leveled the loading of backlog across teams so they could all finish together.
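The hand-off rule described above can be sketched in a few lines. This is an illustration only and not PatientKeeper's actual system, whose details the text does not give: the `Team` class, the idea of measuring slack in points, and the `offload` function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    name: str
    slack: int  # spare capacity this Sprint, in points (assumed measure)
    accepted: list = field(default_factory=list)

    def can_take(self, points: int) -> bool:
        return points <= self.slack


def offload(item: str, points: int, other_teams: list):
    """Offer an item to a second team, then a third.

    Returns the name of the team that took the item, or None when
    neither could, meaning all three teams meet to decide what to do.
    """
    for team in other_teams[:2]:  # second team first, then the third
        if team.can_take(points):
            team.slack -= points
            team.accepted.append(item)
            return team.name
    return None


second, third = Team("B", slack=2), Team("C", slack=8)
print(offload("item-1", 5, [second, third]))  # C
print(offload("item-2", 5, [second, third]))  # None
```

In the second call neither team has five points of slack left, so the function returns `None`, which corresponds to the three teams meeting.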

Reducing scope early so the team can finish planned work is better than coasting into failure. The organization can inspect and adapt to problems rather than be surprised. See Teams That Finish Early Accelerate Faster.

Aborting the Sprint (stop the line) may be the best option, particularly if the team consistently fails to deliver. Only the Product Owner may decide whether to abort the Sprint: as bad as things may be, the Product Owner may judge that the business payoff may not be worth it, or that aborting the Sprint may otherwise have long-term negative consequences in the market or the business.

After terminating the Sprint the team typically convenes a brief Sprint Planning for an abbreviated Sprint (to stay on cadence as per Organizational Sprint Pulse; see also Follow the Moon) to achieve the Sprint Goal, if possible, and to deliver as much value as possible. Alternatively, the team may convene a more protracted Retrospective to explore and rectify problems in the environment and the team’s Scrum implementation, and then replan and move on to the next Sprint. But, again, much of the value in Sprint termination comes in making it publicly visible that there are fundamental impediments that keep the team from doing its job. A visible problem is one that the team can fix.

Sprint termination sends a strong message throughout the organization that something is wrong, and it increases the organization’s capability to remove the impediments that cause failure. One playful Scrum tradition (or at least metaphor) is the “Abnormal Termination of Sprint Ceremony,” which is ostensibly carried out in the lobby of corporate headquarters, where the Development Team members gather to lie on their backs, scream, and flail their arms and legs in the air to let off steam. The intent is to make it visible that the Product Owner has abrogated the team’s commitment.

The Scrum Team executes this pattern, and it is particularly useful for high-performing teams. For teams that are serious about kaizen (see Kaizen and Kaikaku), Scrum is an extreme sport and they enter into a Sprint with some risk in order to go faster. Their primary risk is emergent requirements or unexpected technical problems as the team has addressed most other causes of failure. The team may need to use this pattern every third or fourth Sprint, particularly when implementing new technologies and pushing the state of the art. However, for most emergencies, great teams will recover and meet Sprint Goals. And if they stop the line (abort the Sprint) they will poka-yoke ([3]) their process so the same problem does not recur.

Poka-yoke (ポカヨケ) [poka joke] is a Japanese term that means “fail-safing” or “mistake-proofing.” A poka-yoke is any mechanism in a lean manufacturing process that helps an equipment operator avoid (yokeru) mistakes (poka). Its purpose is to eliminate product defects by preventing, correcting, or drawing attention to human errors as they occur. [4]



Making problems visible is part of kaizen mind; see Kaizen and Kaikaku.

✥       ✥       ✥ 

The team will learn to rapidly respond to change in a disciplined way and overcome challenges. In many organizations, when things are not going well, teams are not thinking clearly and are frustrated and demotivated. They fail to understand the cause of their problems and the way to fix them. Executing the Emergency Procedure will train the team to focus on success and systematically remove impediments. Great teams will surprise themselves with their ability to overcome adversity and move from strength to strength. It increases chances for successfully delivering a Potentially Shippable Product Increment both in the short term and long term. The team will feel it is doing all it can when using Emergency Procedure to get back on the right track, out of both professional pride (see Team Pride) and Product Pride.

You can use Emergency Procedure in a more disciplined way to raise transparency into unmanaged requirements with the pattern Illegitimus Non Interruptus.

See also Take No Small Slips.

This pattern anticipates use by highly disciplined teams. If a team is using this too often (e.g., more often than once every four Sprints) and is not improving its value, quality, and rate of delivery, then the team might reflect on whether something is fundamentally wrong in the environment or in the team’s use of Scrum. It is usually better for young teams to do their best to deliver, to take the Sprint to the end, then fail the Sprint. Away from the heat of battle in the Sprint Retrospective, the team can explore the drivers for failure and plan kaizen. Some process improvements may help the team resort to Emergency Procedure in analogous future situations.


[1] John Shook. “How to Change a Culture: Lessons from NUMMI.” In MIT Sloan Management Review 51, Winter 2010, pp. 63–68.

[2] Jeff Sutherland. “Future of Scrum: Parallel Pipelining of Sprints in Complex Projects.” In Proceedings of Agile Development Conference (ADC'05), Denver, CO: IEEE Press, 2005.

[3] Mike Rother. Toyota Kata: Managing People for Improvement, Adaptiveness and Superior Results. New York: McGraw-Hill, 2010.

[4] —. “Poka-yoke.” Wikipedia, https://en.wikipedia.org/wiki/Poka-yoke, 19 May 2018 (accessed 6 June 2018).


Picture credits: National Museum of the U.S. Air Force photo 090605-F-1234P-021, http://www.nationalmuseum.af.mil/Upcoming/Photos.aspx?igphoto=2000558698 (in public domain).