This is the second post in a series. If you haven’t read the first one yet, I recommend you check it out here; otherwise some of the ideas might not make sense.
With an understanding of our system, its goal and measurements, we know what kinds of work we shouldn’t be doing: work that doesn’t contribute to the goal. The next step is to figure out how to do the work we should be doing more efficiently and reliably.
I prefer to open with techniques I tried that didn’t work, starting with the most common.
If everyone works as hard as they can all the time, we get more work done.
This is the most widespread conviction I notice among managers. Here is a story from one of my friends; let’s call him Tim.
Tim is an engineering team lead. The structure of the company he works at changed after a merger. His new superior has a peculiar approach to management. The main document they use at their one-on-ones is a spreadsheet that looks something like this (I don’t have the actual document; I reproduced it from his description):
Employee load spreadsheet. Data is made up.
Employees are in rows, workdays in columns. Each cell of this spreadsheet shows how much work an employee has on a specific date: the amount of work assigned to them relative to the number of hours they have in a workday.
Tim’s manager wants my friend to make sure the numbers on that spreadsheet never go below 120%. Every employee should have 20% more work assigned to them than they can handle… every day. You know, just in case they finish early.
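To get a feel for what a constant 120% means, here is a minimal sketch of the arithmetic. The 8-hour workday and the function name are my assumptions for illustration; they are not taken from Tim’s spreadsheet.

```python
# Assumption (not from Tim's spreadsheet): an 8-hour workday.
HOURS_PER_DAY = 8


def unfinished_hours(load_percent, days, hours_per_day=HOURS_PER_DAY):
    """Hours of assigned work left undone after `days` at a constant load."""
    # Only the part of the load above 100% piles up; at or below 100%
    # everything assigned can be finished within the day.
    overload_percent = max(load_percent - 100, 0)
    return hours_per_day * days * overload_percent / 100


for days in (1, 5, 20):
    print(days, "workdays at 120% ->", unfinished_hours(120, days), "hours left undone")
```

Under these assumptions, a person loaded at 120% falls a full workday behind every five workdays, and the pile never shrinks.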
I think psychology has yet to discover the disorder I call “American Dream Syndrome”. Managers fall victim to it all the time. The good news is that it’s easy to diagnose; the bad news is that it’s hard to cure. Here are the symptoms:
Obligatory graph for a management article. Drawing by Cristina Amate.
I won’t go in depth on how we can treat the syndrome in this article. But I will try to show why working at 100% of your capabilities is bad for business.
Let’s run a simulation. Say we are managing a team in a startup. The goal of the team is to satisfy frequently changing business requirements. We have one product owner (PO) who writes specs, four engineers who implement them and one QA person verifying that features work as expected.
The simulation runs under perfect conditions: no rejections from QA, zero variation in people’s performance, no disruptions. Tasks are processed in a first-in-first-out manner.
We’ll measure the team’s performance using metrics from the previous article:
Lead time per task: the number of days from the time the PO starts working on a ticket until the time it is released. Since the goal of the team is to respond to business requirements fast, we can take lead time as a measure of throughput. The faster the team processes work, the higher the throughput.
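For readers who prefer code to slides, here is a rough sketch of a setup like this one. It is not the simulation from the slide deck. The durations are assumptions I picked for illustration: the PO needs 1 day per spec, an engineer needs 5 days per task, QA needs 1 day. The sketch also ignores the release schedule and counts a ticket as released the moment QA finishes.

```python
import heapq


def simulate(n_tasks, engineers=4, spec_days=1, eng_days=5, qa_days=1):
    """Return the lead time in days of each task, in FIFO order.

    All durations are illustrative assumptions. The PO never idles:
    a new spec is started as soon as the previous one is done.
    """
    eng_free = [0] * engineers  # min-heap: day each engineer becomes free
    qa_free = 0                 # day the single QA person becomes free
    lead_times = []
    for i in range(n_tasks):
        po_start = i * spec_days
        spec_done = po_start + spec_days
        # The task waits until the earliest engineer is available.
        eng_start = max(spec_done, heapq.heappop(eng_free))
        eng_done = eng_start + eng_days
        heapq.heappush(eng_free, eng_done)
        # Then it waits for QA, if QA is still busy with an earlier task.
        qa_start = max(eng_done, qa_free)
        qa_free = qa_start + qa_days
        lead_times.append(qa_free - po_start)
    return lead_times


lead_times = simulate(40)
print(lead_times[0], lead_times[-1])  # 7 16
```

With these numbers, four engineers finish 0.8 tasks per day while the PO produces 1 spec per day. Nobody is ever idle, yet the fortieth task takes more than twice as long as the first one.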
Here’s a slide deck with this simulation:
Hard-working team simulation on a Trello board
Some observations:
Everyone is 100% busy and, in the long run, infinitely ineffective.
Lead time and Inventory over 60 days
Keep in mind that this team is “perfect”. Everyone works at 100% of their capability. Still, all our metrics get worse and worse with every release. What is going on?
First, let’s tackle the problem of disappearing days. Why do we have longer lead times if the release schedule is constant?
Take a look at the fastest (#4) and slowest (#21) tasks.
The fastest (#4) spent:
The slowest (#21) spent:
Here’s our answer. The slowest ticket spent 7 more days waiting for Engineering and QA to become available. We have overloaded the system by putting in more work than it can process.
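The same split between working and waiting can be sketched in a few lines. The two tickets below use invented numbers; they are not the figures for tickets #4 and #21, and the `StageVisit` structure is a hypothetical one I made up for this example.

```python
from dataclasses import dataclass


@dataclass
class StageVisit:
    stage: str
    arrived: int   # day the ticket entered this stage's queue
    started: int   # day someone picked it up
    finished: int  # day this stage's work was done


def breakdown(visits):
    """Split a ticket's lead time into (days waiting, days being worked on)."""
    waiting = sum(v.started - v.arrived for v in visits)
    working = sum(v.finished - v.started for v in visits)
    return waiting, working


# Invented example: the same amount of work, with and without queues.
fast = [
    StageVisit("spec", 0, 0, 1),
    StageVisit("engineering", 1, 1, 6),
    StageVisit("qa", 6, 6, 7),
]
slow = [
    StageVisit("spec", 0, 0, 1),
    StageVisit("engineering", 1, 4, 9),  # waited 3 days for an engineer
    StageVisit("qa", 9, 10, 11),         # waited 1 day for QA
]

print(breakdown(fast))  # (0, 7)
print(breakdown(slow))  # (4, 7)
```

Both tickets received 7 days of actual work. The entire difference in lead time is time spent sitting in a queue.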
We simulated work within one department because it’s easy to show on one Trello board. Of course, the same principle applies to a company as a whole.
At the scale of an organization, departments are the system’s “work centers”, and customers’ orders, projects and requests are the “tasks” moving through the system.
I’ve seen the issue of overload in every company I’ve worked at. Sometimes I was the one causing it. In the past, as a team lead, I made sure everyone had work to do at all times. Here is what this type of management often leads to:
“Lobbying at a workplace” by Cristina Amate
In the real world things are always more dramatic than in simulations.
If we make everyone work as hard as they can, finishing projects becomes harder and harder. Management often misjudges this as a resource allocation problem and pushes for hiring more people or spending more money. If they get the resources they think they need, it might help… temporarily. Once inventory reaches a high enough level, waiting time rises, projects get stuck in the system again, and the company is back at square one.
Hiring people or spending more resources won’t solve this issue.
This is one of the mantras I used as an engineer. When we encounter a problem, there’s a high chance someone already solved it. If only there were a StackOverflow for managers…
In manufacturing, Taiichi Ohno and his peers developed the Toyota Production System (TPS) between the late 1940s and the 1970s. One of the practices used in TPS is Kanban. It addresses the same issues we faced in the simulation: growing inventory and lead times, overproduction, low quality.
In the next article we’ll take Kanban for a spin in our simulated environment and see what it’s good and not so good for. We’ll also get a glimpse of the ideas that go beyond Kanban and help us manage the flow of work in infinitely complex systems.
I’d like to thank the people who shared their experience and useful insights with me. Their input is the foundation of this series. In no particular order, these people are: Stefan Willuda, Ricardo J. Méndez, Ed Hill, Adiya Mohr, Conny Petrovic, Goran Ојkić.
Special thanks to Cristina Amate for the illustrations as well as her support and early feedback on the talk and articles.
I’m looking for opportunities to talk about Theory of Constraints in startups. If you’d like to recommend a conference or invite me to speak, please reach out: https://flpvsk.com