How to Not Be A Micro-Manager - Part 2

Blast Radius FTW

This is going to be a short one, a follow-up to How to Not Be A Micro-Manager - Part 1, where we discussed Cadence.

The second technique I use to prevent myself from micro-managing is the concept of Blast Radius - i.e. how much damage a person can do if they blow up their work. Cadence deals with the issue of time; Blast Radius is its material analogue.

In my experience, orgs that have high autonomy are more fun to work in because they go faster. But autonomy brings with it risks to alignment and, sometimes, the risk of catastrophic failure. Cadence deals with the issue of alignment and Blast Radius addresses the risk of catastrophic failure.

True Story - At @ShaadiTech, any developer can deploy any API they want to. Front-end developers routinely deploy API endpoints that best serve their needs without needing to get them built by the API team. This is great for autonomy and, as a result, for velocity. But in order to get this level of autonomy, we had to secure the blast radius. So one can deploy any API *but* it has to go through Shaadi’s API Gateway. The API Gateway ensures authentication, rate-limiting, protection from bots and so on. What about data? You can read off Kafka, populate your own data store and send that data out over your API. For anything more, talk to the API team. But in fast-moving product development this usually suffices.
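
To make the idea concrete, here is a minimal sketch of how a gateway can contain the blast radius of endpoints it did not write. It is not Shaadi's actual gateway; the class name `Gateway`, the token check, the sliding-window limiter and the example route are all assumptions made up for illustration. The point is only that authentication and rate-limiting live in one shared place, so an endpoint author cannot skip them.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

# A handler is whatever a developer deploys: payload in, response body out.
Handler = Callable[[dict], dict]


@dataclass
class Gateway:
    """Hypothetical gateway: every deployed endpoint is reached only through handle()."""

    valid_tokens: Set[str]                      # assumed auth scheme: a set of known tokens
    max_requests: int = 3                       # allowed requests per token per window
    window_seconds: float = 60.0
    clock: Callable[[], float] = time.monotonic  # injectable so the limiter can be tested
    routes: Dict[str, Handler] = field(default_factory=dict)
    _hits: Dict[str, List[float]] = field(default_factory=dict)

    def register(self, path: str, handler: Handler) -> None:
        # Any developer may register any endpoint; no API team involved.
        self.routes[path] = handler

    def handle(self, path: str, token: str, payload: Optional[dict] = None) -> Tuple[int, dict]:
        # 1. Authentication is enforced before any developer code runs.
        if token not in self.valid_tokens:
            return 401, {"error": "unauthenticated"}

        # 2. Sliding-window rate limit per token.
        now = self.clock()
        recent = [t for t in self._hits.get(token, []) if now - t < self.window_seconds]
        if len(recent) >= self.max_requests:
            self._hits[token] = recent
            return 429, {"error": "rate limit exceeded"}
        recent.append(now)
        self._hits[token] = recent

        # 3. Only now is the developer's endpoint invoked.
        handler = self.routes.get(path)
        if handler is None:
            return 404, {"error": "no such endpoint"}
        return 200, handler(payload or {})
```

A front-end developer would then call something like `gateway.register("/profiles/summary", my_handler)` and never touch the auth or rate-limit code. If the handler is buggy, the damage is limited to that one endpoint's responses.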

There are tonnes of places where you are already using this concept - code reviews are an example of limiting blast radius. CI/CD and automated testing are others. Think about how your work goes from dev to production and you’ll see examples of Blast Radius everywhere. Now that you have a name for them, you can start to manipulate them to provide maximum autonomy while addressing the risks of catastrophic failure.

To increase autonomy, we have one guiding principle - If a developer wants to do something but isn’t allowed to do it, that’s a bug in the organisation and it needs to be fixed.

Naturally, this comes with its own set of exceptions - access to the production DB, for example. But using this guiding principle we’ve been able to build tooling and instrumentation that allows devs maximum autonomy while maintaining the security, sanity and constraints of the system.

In a growing or scaled-up org, one often finds that most of the time is spent in a ‘Ready for X’ stage. Lack of co-ordination between teams (i.e. organisational debt) is a bigger drag on your velocity than even technical debt. Growing the autonomy of teams reduces the need for co-ordination, so you should push for every team to be as autonomous as it needs to be, which is to say it should rarely depend on work from other teams.

Being conscious about the Blast Radius allowed at various levels of your org will allow you to sleep peacefully while still providing engineers the autonomy they require.

That’s all for today :-)