Playbook: turning around a software engineering team

A note-to-self kind of post: a playbook for turning around a struggling software engineering team.

Core principles

  1. always behave trustworthily
  2. slow down and make time to address problems
  3. do you have the right people?
  4. if you can't get consensus, seek consent

Foundational engineering best practices

With regard to engineering best practices, the following are foundational and should be introduced somewhere between steps 4 and 8 of the playbook:
  • trunk-based development
  • continuous integration
  • no separate tester or devops team (this can be relaxed after the team begins performing); seek out a stream-aligned team instead
  • Scrum, with its process, is useful to align the team and at least one main stakeholder
  • automate as much as you can, especially the parts that come up often for discussion; one obvious but often overlooked example is customized coding styles (use the consent-over-consensus principle to reach a decision)
If the team resists these practices or does not make progress, then see the right-people principle.

Playbook

Building on those principles, the playbook is about the following:
  1. observe the team as it performs in its current setup, do not intervene yet
  2. get to know the people, preferably in 1:1s where the group/herd effect is removed. Listen to their concerns
  3. understand what conditions are preventing them from performing
  4. figure out whether they have a plan or just complain (if they only complain, the right-people principle kicks in and you need to get the right people)
    If they have a plan, what is preventing its execution?
  5. at the first problem, pull the metaphorical andon cord and guide the team to swarm the problem (do a post-mortem)
  6. agree with stakeholders on either a full pause or a slowdown where only one item at a time can be worked on
  7. use the capacity freed up by the slowification to address the current problem
  8. repeat from step 5 until the team performs as expected

Metrics

With regard to "performing as expected", it is useful to establish metrics. There are plenty of engineering metrics available; here is a short list in no particular order:
  • cycle time (from first commit to merge)
  • PRs merged/week
  • the 4 DORA metrics: lead time for changes (from code committed to production), mean time to restore, change failure rate, deployment frequency
  • CI stability (failed runs/total runs)
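
As a rough illustration, a few of these metrics can be computed from data most code hosts can export. This is a minimal sketch, not a definitive implementation: the `PullRequest` record and the function names are hypothetical, and you would need to fill them from your own tooling.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class PullRequest:
    # Hypothetical record; populate it from your code host's export or API.
    first_commit_at: datetime
    merged_at: datetime


def median_cycle_time(prs: list[PullRequest]) -> timedelta:
    """Median time from first commit to merge."""
    seconds = [(pr.merged_at - pr.first_commit_at).total_seconds() for pr in prs]
    return timedelta(seconds=median(seconds))


def prs_merged_per_week(prs: list[PullRequest]) -> Counter:
    """Number of merged PRs, keyed by (ISO year, ISO week)."""
    weeks = Counter()
    for pr in prs:
        iso = pr.merged_at.isocalendar()
        weeks[(iso[0], iso[1])] += 1
    return weeks


def ci_failure_ratio(failed_runs: int, total_runs: int) -> float:
    """CI stability expressed as failed runs / total runs (lower is better)."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    return failed_runs / total_runs
```

The median is used instead of the mean so that a single long-lived PR does not dominate the number; that is a design choice, not a requirement.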

Dealing with the inevitable "Big Refactoring"

Make sure the refactoring targets parts of the codebase that are touched frequently. These are what CodeScene calls hotspots.
The purpose of a refactoring is to make the code simpler to change; if that code is rarely touched, the refactoring is a bad investment of the team's time.
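
If you do not have a dedicated tool at hand, change frequency alone can serve as a rough proxy for finding frequently touched files. The sketch below counts how often each file appears in the git history; it is an assumption-laden approximation, not CodeScene's analysis, and the function names and the six-month window are hypothetical choices.

```python
import subprocess
from collections import Counter


def count_changes(log_output: str) -> Counter:
    """Count how often each path appears in `git log --name-only` output."""
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def most_changed_files(repo: str, since: str = "6 months ago", top: int = 10):
    """Return the `top` most frequently changed files in `repo` since `since`."""
    # --pretty=format: suppresses commit headers, leaving only file paths.
    result = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}",
         "--name-only", "--pretty=format:"],
        check=True,
        capture_output=True,
        text=True,
    )
    return count_changes(result.stdout).most_common(top)
```

Calling `most_changed_files(".")` from inside a repository lists candidate files; a refactoring that touches none of them deserves a second look.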

Follow or draw inspiration from the prescriptions on this page: https://max.engineer/long-term-refactors
Summarized below for your convenience (and in case the page disappears):
  1. identify code needing refactoring.
  2. determine the refactoring pattern: explore the codebase to find common patterns of required changes, focusing on commonalities rather than special cases.
  3. create an example: implement the refactor on the smallest representative sample. Make it exemplary and well-documented, as it will serve as a reference for the team.
  4. prepare the codebase: make minor preparatory changes to smooth the path for others, like restructuring code and improving naming, but don't do the entire refactor yourself.
  5. name the refactor: choose a clear, concise, and descriptive name for reference in discussions and documentation.
  6. document instructions: write specific, step-by-step guidelines in your knowledge base. Include the example and relevant context, but keep the main instructions brief and clear.
  7. add to refactor list: include this refactor in your knowledge base's list of long-term refactors.
  8. share with the team: introduce the refactor through an announcement or meeting, explaining the process and sharing the documentation.
  9. create tasks gradually: instead of planning all tasks upfront, assign refactoring work as natural opportunities arise during development.
  10. maintain awareness: include the refactor list in onboarding and keep it present in team discussions.
  11. mark as complete: consider the refactor done when the major parts are complete and the new approach is clear in the codebase.

When all else fails

If you can't change your job, change your job.
