Leadership That Ships: Calm Teams, Clean Pipelines
How we lead without theatrics, while delivery stays steady.
Leadership Starts With What We Repeat Daily
Leadership gets romanticised: big speeches, heroic all-nighters, sudden “turnarounds.” In DevOps land, we don’t get much value from romance. We get value from what we do on an ordinary Tuesday when a deploy goes sideways and someone’s coffee has gone cold.
So we start with repetition. What do we consistently reward? If we praise speed but punish mistakes, we’ll get risky changes and hidden incidents. If we praise learning and protect people who surface problems early, we’ll get fewer surprises and quicker recovery. This isn’t motivational-poster stuff; it’s cause and effect.
One habit that’s paid off for us: we narrate trade-offs out loud. “We can ship today with a manual step, but it increases on-call pain” is a leadership move, because it teaches the team how decisions are made. Another: we separate urgency from importance. Incidents are urgent; reducing incident frequency is important. If we only ever fund urgent work, we’ll stay busy forever and feel oddly proud of it.
We also try to be boringly consistent in how we show up: same expectations, same definitions, same escalation paths. Consistency is kindness. People can’t do good work if they’re decoding leadership moods.
If we want a simple north star, it’s this: our leadership should make it easier to do the right thing than the wrong thing—especially under pressure.
Psychological Safety Isn’t Soft—It’s Operational
Let’s say the quiet part out loud: most outages have a “human” component, but that doesn’t mean the human is the root cause. Leadership that treats every incident like a trial will create a team that optimises for self-protection. That’s when people stop reporting near-misses, stop asking questions, and start doing risky work in the shadows.
We treat psychological safety as an operational requirement. If we can’t speak plainly in a tense moment, we’ll ship the same failure repeatedly—just with different characters in the story. That’s expensive.
A practical tactic: during incidents, we ban adjectives about people. No “careless,” no “lazy,” no “incompetent.” We stick to observable facts: what changed, what failed, what we saw in logs, what mitigations were attempted. This keeps the room calm and makes the timeline easier to build later.
We also coach leaders (and senior engineers) to model uncertainty. “I don’t know yet, but here’s what I’m checking” is a superpower. It invites collaboration and reduces panic. Teams follow what we do, not what we say.
If you want a framework that aligns with this, the concepts behind blameless postmortems are worth a read. Not because Google said it, but because it works: when people can tell the truth, we can fix the system.
The Incident Is a Leadership Exam (Open-Book)
Every incident is an unplanned assessment of our leadership. The “open-book” part is that the answers are already in our runbooks, tooling, and habits—or they’re missing, and we’ll find out in real time.
We like to define three roles during an incident, even for small ones: Incident Commander, Communications, and Ops/Investigations. One person can cover two roles in a pinch, but if everyone investigates, nobody communicates, and stakeholders start “helping” by sending Slack pings that feel like bees.
A steady incident rhythm helps: acknowledge, assess, mitigate, communicate, and only then deep-dive. Leadership means protecting the team’s focus and keeping the tempo calm. We’ve all seen leaders who sprint straight to “who did this?” That’s how you lose minutes, morale, and sometimes customers.
Here’s a trimmed incident checklist we keep in Git:
```markdown
# Incident Checklist

## First 5 minutes
- [ ] Declare severity (SEV1/2/3)
- [ ] Assign Incident Commander (IC)
- [ ] Start incident channel + timeline doc
- [ ] Confirm customer impact + scope
- [ ] Identify safe rollback/mitigation options

## Every 15 minutes
- [ ] Update status page / internal updates
- [ ] Reassess blast radius
- [ ] Decide: continue mitigation vs rollback

## After stabilization
- [ ] Capture key timestamps
- [ ] Schedule post-incident review within 48 hours
- [ ] Track follow-ups as work items, not “good intentions”
```
For status comms, we’ve borrowed ideas from Atlassian’s incident communication practices. The trick isn’t fancy wording; it’s predictable updates that reduce anxiety and prevent leaders from improvising under stress.
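Predictability is easier when nobody has to invent the format mid-incident. As one illustration (every field here is made up, not a standard), a pinned update in the incident channel might look like:

```text
[SEV2] Checkout latency — Update #3 (14:45 UTC)
Impact:      ~12% of checkout requests timing out in eu-west
Status:      Mitigation in progress; rolling back deploy 2024-05-10.1
Next update: 15:00 UTC, or sooner if status changes
IC: @dana    Comms: @sam
```

The fixed "next update" time does most of the anxiety reduction: stakeholders stop pinging because they know when to expect news.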
Metrics We Use Without Worshipping Them
Leadership loves a number. It’s comforting: a dashboard doesn’t argue back. But metrics are like kitchen knives—useful, and also excellent at removing toes if we’re careless.
We’ve had success with a small set of metrics that balance speed and stability: deployment frequency, change failure rate, mean time to restore, and lead time for changes. If those sound familiar, it’s because they map well to the DORA metrics. The key is how we use them.
We don’t use metrics to rank teams or individuals. That turns measurement into theatre and encourages gaming. We use them to spot bottlenecks and to justify investment. If lead time is growing, we ask: reviews? CI duration? flaky tests? environment provisioning? If change failure rate spikes, we ask: big batch sizes? weak canaries? missing runbooks?
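To keep these numbers honest, we compute them from raw records rather than gut feel. A minimal sketch, assuming simple deploy and incident records (the field names `caused_incident`, `started`, and `restored` are illustrative, not from any particular tool):

```python
from datetime import datetime, timedelta

# Toy records; in practice these come from deploy and incident tooling.
deploys = [
    {"id": "d1", "at": datetime(2024, 5, 1, 10), "caused_incident": False},
    {"id": "d2", "at": datetime(2024, 5, 2, 11), "caused_incident": True},
    {"id": "d3", "at": datetime(2024, 5, 3, 9),  "caused_incident": False},
    {"id": "d4", "at": datetime(2024, 5, 4, 16), "caused_incident": False},
]
incidents = [
    {"started": datetime(2024, 5, 2, 11, 30),
     "restored": datetime(2024, 5, 2, 12, 30)},
]

def change_failure_rate(deploys):
    """Share of deployments that led to an incident or rollback."""
    return sum(d["caused_incident"] for d in deploys) / len(deploys)

def mean_time_to_restore(incidents):
    """Average time from incident start to service restoration."""
    total = sum((i["restored"] - i["started"] for i in incidents), timedelta())
    return total / len(incidents)

print(f"CFR: {change_failure_rate(deploys):.0%}")   # CFR: 25%
print(f"MTTR: {mean_time_to_restore(incidents)}")   # MTTR: 1:00:00
```

The value isn't the arithmetic; it's that everyone agrees on the definitions before the numbers get discussed.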
We also watch workload health: on-call interrupts, after-hours pages, and incident frequency. A team can “hit the numbers” and still burn out. Burnout is technical debt that accrues in the human nervous system.
When we present metrics, we attach an action. “MTTR is 90 minutes” is trivia. “MTTR is 90 minutes because logs are missing correlation IDs; we’ll add them and retest” is leadership.
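The correlation-ID fix in that example is mostly plumbing. A minimal stdlib sketch of the idea (the names `correlation_id` and `CorrelationFilter` are ours, not a library API): set one ID at the edge of each request or task, and a logging filter stamps it onto every record.

```python
import logging
import uuid
from contextvars import ContextVar

# Correlation ID for the current request/task; "-" when none is set.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")

class CorrelationFilter(logging.Filter):
    """Inject the current correlation ID into every log record."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True

handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(correlation_id)s %(message)s"))
handler.addFilter(CorrelationFilter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# At the edge (request handler, queue consumer), set an ID once:
correlation_id.set(uuid.uuid4().hex[:8])
logger.info("payment accepted")  # every subsequent line carries the same ID
```

With the ID in every line, "grep the request across services" becomes a one-liner instead of an archaeology project.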
Finally, we share metrics with context, not judgement. We’re not running a talent show. We’re running a system that ships.
Standardisation Without Turning Into a Bureaucracy
Standardisation is one of those words that can make engineers reach for headphones. Fair. Bad standardisation is centralised control dressed up as “consistency.” Good standardisation is removing pointless decisions so people can spend energy on the work that matters.
Our leadership approach: we standardise interfaces, not creativity. We want teams to have freedom inside a guardrail, not freedom to rebuild the guardrail every quarter.
A practical example: we standardise CI stages (lint/test/build/scan/package), artifact naming, and deployment promotion rules. We don’t mandate everyone uses the same framework or logging library—unless it creates cross-team pain.
Here’s a minimal GitHub Actions template we’ve used as a starting point:
```yaml
name: ci
on:
  pull_request:
  push:
    branches: [ main ]
jobs:
  test-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up runtime
        uses: actions/setup-node@v4
        with:
          node-version: "20"
      - name: Install
        run: npm ci
      - name: Test
        run: npm test -- --ci
      - name: Build
        run: npm run build
      - name: Security scan
        run: npm audit --audit-level=high
```
The leadership move isn’t the YAML. It’s making the default path safe and fast, so teams don’t need to negotiate for basics. When we publish templates, we also publish the “why,” and we keep an owner responsible for maintaining them. Otherwise templates rot, and everyone quietly forks them into a dozen incompatible snowflakes.
We also keep an escape hatch. If a team needs to diverge, we ask for a short write-up and a revisit date. Standards should earn their place, not squat forever.
Feedback Loops: The Only Leadership Superpower We Need
We can’t out-manage a slow feedback loop. If our tests take an hour, our reviews take three days, and our deploy approvals require a council of elders, leadership will devolve into pushing people harder—and nobody wants that (including us, if we like sleeping).
So we lead by tightening feedback loops. That often means investing in unglamorous things: faster CI, stable test data, reproducible environments, clear ownership, and a culture that prefers small changes over “big bang” releases.
A habit we’ve adopted: make work visible and discuss it early. That can be as simple as a lightweight RFC process for changes that affect other teams. Not a novel, just enough to surface risks while they’re cheap.
We also keep retrospectives short and frequent. If we wait for the quarterly retro, we’ll forget what happened and we’ll be too annoyed to be constructive. Weekly or biweekly team retros, plus post-incident reviews, keep learning close to the moment.
And yes, leaders need feedback too. We ask: “What’s one thing leadership did this sprint that helped? One thing that got in the way?” It stings sometimes, which is how we know it’s real.
If you want inspiration for making systems observable (which improves feedback loops dramatically), OpenTelemetry is a solid place to start. Better telemetry makes debugging less like folklore and more like engineering.
Growing Leaders: Promote the Ones Who Reduce Drama
Leadership pipelines matter as much as software pipelines. If we only promote the loudest firefighter, we’ll train the whole org to set small fires for attention. We’ve seen it. It’s not even malicious—it’s just incentives doing their thing.
We try to promote people who:
- make incidents calmer, not louder
- write runbooks, not legends
- reduce toil, not tolerate it
- mentor others without hoarding context
- ship improvements that outlive them
That means we need career ladders that value reliability work, documentation, operational excellence, and cross-team enabling. Otherwise our best operators will leave or burn out, because the org only rewards feature delivery.
We also rotate opportunities. Incident Commander duty shouldn’t be a permanent crown. It should be a skill the team develops. Same with presenting postmortems, owning templates, or leading a migration. Leadership is learned by doing, but it’s learned safely when there’s coaching, pairing, and clear expectations.
Finally, we treat “being busy” as a smell. If someone is always the only person who can do a thing, that’s a leadership failure in knowledge distribution. Our job is to make ourselves less critical over time. If that sounds threatening, congratulations—you’re human. We all have to grow past it.
Leadership that scales is mostly about multiplying competence, not multiplying meetings.