Scrum Without the Ceremony: Ship More, Suffer Less
How we keep scrum useful in DevOps-heavy teams.
Keep Scrum Small: Outcomes Over Acrobatics
We’ve all seen it: scrum “done right” turns into a weekly theatre production where everyone’s busy yet nothing ships. The trick isn’t to ditch scrum—it’s to keep it small and stubbornly focused on outcomes. For us, scrum is a lightweight operating system: it helps us decide what matters this sprint, how we’ll know it’s done, and how we’ll avoid stepping on each other’s toes while production keeps doing production things.
We start by naming the outcome in plain language. Not “Improve reliability” (that’s a wish), but “Reduce checkout API 500s from 1% to 0.2%” (that’s a target). Then we shape work around it. This also stops the classic “ten half-finished tickets” problem; if it doesn’t move the outcome, it’s not sprint-critical.
A useful pattern: keep a short “sprint goal statement” that anyone can repeat without checking Jira. If the goal can’t fit in one sentence, it’s probably two sprints or a quarterly initiative wearing a sprint moustache.
We also treat incidents and interrupts as first-class citizens. Real teams get paged. Pretending they don’t is how scrum becomes a guilt factory. Reserve capacity (we like 20–30% depending on on-call load) and be explicit about it. If you’re not, your sprint plan is just a work of fiction—and we’re not trying to win literary awards.
If you want a quick refresher on the official bits, the Scrum Guide is still the clearest reference, even if we don’t follow it like scripture.
Backlog Hygiene That Doesn’t Make Us Miserable
A backlog is either a helpful queue or a junk drawer with aspirations. We’ve learned that backlog hygiene works best when it’s ruthlessly pragmatic: fewer items, clearer “done,” and just enough detail to start. Not every ticket needs a novella; it needs a reason to exist.
We keep three tiers:
1) Now (this sprint or next): thin slices, testable, sized.
2) Soon (next 1–2 months): shaped, but not over-specified.
3) Later (maybe): a parking lot with ruthless pruning.
The big move is writing tickets as verifiable behaviour changes. “Add retries” isn’t a behaviour. “When upstream times out, we retry up to 2 times with jitter and cap total latency at 800ms” is. That makes review and testing less of a guessing game, and it prevents “done-ish” from sneaking into the sprint.
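To make that ticket concrete, here's a minimal sketch of the behaviour it describes: up to 2 retries with jitter, capped at an 800 ms total budget. The function name, timeouts, and backoff values are illustrative, not from any specific library.

```python
import random
import time

def call_with_retries(fn, max_retries=2, per_try_timeout=0.25, total_budget=0.8):
    """Retry a flaky call up to max_retries times with jittered backoff,
    keeping total elapsed time under total_budget seconds (800 ms)."""
    start = time.monotonic()
    for attempt in range(max_retries + 1):
        try:
            return fn(timeout=per_try_timeout)
        except TimeoutError:
            # Jittered backoff (0-50 ms) so retries don't synchronise.
            backoff = random.uniform(0, 0.05)
            elapsed = time.monotonic() - start
            # Give up if another attempt (plus backoff) would blow the budget.
            if attempt == max_retries or elapsed + backoff + per_try_timeout > total_budget:
                raise
            time.sleep(backoff)
```

The point isn't this exact code; it's that the ticket's acceptance criteria map one-to-one onto parameters you can assert in a test.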
We also standardise a tiny checklist per ticket: Why, What, Done, Risk. “Risk” is where we note if it touches prod paths, needs migration, or might need a feature flag. This is how scrum starts helping DevOps instead of fighting it.
For teams that struggle with slicing, we borrow a page from the INVEST idea—independent, testable work beats “one ticket to rule them all.” And for planning, we’d rather be roughly right than precisely wrong. If everything is “8 points,” we don’t have estimation—we have a shrug with numbers.
Planning: Capacity, On-Call Reality, and the Boring Math
Sprint planning gets a bad rap because people use it to negotiate with physics. Our approach is boring, which is why it works. First, we compute capacity like grown-ups: available days minus on-call load, meetings, holidays, and the fact that humans occasionally need lunch.
Here’s a simple capacity template we’ve used in team docs (or drop it into a shared spreadsheet if you’re feeling fancy):
```yaml
sprint:
  length_days: 10
  team:
    engineers: 6
    avg_focus_per_day_hours: 4.5
    oncall_factor: 0.75  # 25% hit overall due to interrupts
    planned_time_off_days: 3
  calc:
    total_focus_hours: (length_days*engineers - planned_time_off_days) * avg_focus_per_day_hours * oncall_factor
  notes:
    - Reduce oncall_factor to 0.6 during big launches
    - Increase to 0.85 if oncall is quiet and services are stable
```
Then we pick work to match the computed focus hours (or your equivalent points). If we’re debating the fourth or fifth item, that’s usually a sign we’ve hit the line. We stop adding. That’s the whole secret.
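The boring math from the template is one multiplication, which is sort of the point. A sketch, using the template's numbers:

```python
def sprint_focus_hours(length_days, engineers, planned_time_off_days,
                       avg_focus_per_day_hours, oncall_factor):
    """Mirror of the capacity template's formula: raw engineer-days,
    minus planned time off, scaled by realistic focus time and on-call drag."""
    available_days = length_days * engineers - planned_time_off_days
    return available_days * avg_focus_per_day_hours * oncall_factor

# Values from the template: 10-day sprint, 6 engineers, 3 days off.
print(sprint_focus_hours(10, 6, 3, 4.5, 0.75))  # → 192.375
```

About 192 focus hours across six people, not 270. That gap is exactly the fiction a naive sprint plan is built on.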
We also plan for “risk spikes” inside the sprint. If a story includes migration risk, we schedule the risky step early. Nothing’s worse than discovering on day nine that your schema change is… optimistic.
One more thing: if your sprint goal depends on another team, write that dependency down and treat it as a risk, not a hope. We’ll often add a small “dependency ticket” with a named owner and a due date for the handshake. Boring? Yes. Effective? Also yes.
Daily Scrum That's Actually Daily (and Actually Useful)
Daily scrum should not be a status meeting where we recite our calendars to each other. We aim for 10 minutes, and we keep it centred on flow: what’s stuck, what’s risky, and what we’re doing today to move the sprint goal.
A helpful framing we use:
– Sprint goal progress: are we closer than yesterday?
– Blockers: anything preventing progress in the next 24 hours?
– Work-in-progress limits: who’s juggling too much?
If someone starts deep-diving, we park it and spin off a “two-person chat after standup.” That’s not rude; that’s how you respect everyone’s time.
For distributed teams, we’ve had good results with a hybrid: a short live standup plus an async note in Slack/Teams. The async note becomes a lightweight log that’s searchable during incident review (“Wait, when did we start seeing errors?”).
Also: rotate facilitation. When one person runs daily scrum forever, it becomes their meeting. When facilitation rotates, it stays the team’s meeting. A tiny change, big payoff.
If your team’s daily scrum is consistently painful, that’s a smell. The cure usually isn’t “do scrum harder.” It’s to reduce WIP, clarify acceptance criteria, and stop starting work without a clear path to merge and deploy. The daily scrum is where those issues surface; it shouldn’t be where they go to die.
Definition of Done: Where Dev Meets Ops Meets Reality
If scrum has a hidden superpower, it’s the Definition of Done (DoD). If scrum has a hidden trap, it’s also the Definition of Done. Teams either make it so strict nothing finishes, or so vague everything is “done” until prod says otherwise.
Our DoD is layered:
– Code done: merged, reviewed, linted, tested.
– Release-ready: feature flagged or safely deployable, rollout plan exists.
– Operable: metrics/logs updated, alerting considered, runbook notes added.
We keep it short and we treat it as a contract. Here’s a trimmed version you can steal:
## Definition of Done (Team)
- [ ] Code merged to main with 1+ review
- [ ] Unit/integration tests added or explicitly not needed (with reason)
- [ ] CI green; no new high-severity security findings
- [ ] Deployable safely (flag, canary, or backwards-compatible change)
- [ ] Observability updated (dashboard/metric/log as appropriate)
- [ ] Runbook note for on-call if behaviour changes
- [ ] Acceptance criteria validated in staging or prod (as agreed)
If that looks “Ops-heavy,” good. In DevOps-flavoured teams, “done” isn’t done until it’s survivable at 2 a.m. We don’t add these items to punish ourselves; we add them to avoid future pain.
We also explicitly allow exceptions—because reality. But exceptions must be written down in the ticket, with a follow-up task. That way “temporary” doesn’t become “forever,” which is the natural state of all things in software.
For security posture, we’ll often align DoD with baseline guidance like the OWASP Top 10—not to overcomplicate, just to keep the obvious traps from repeating.
Reviews and Demos: Show the Work, Not the Slide Deck
Sprint review should be about what changed in the product and systems, not what we intend to do someday. We aim to demo production-like behaviour: a new endpoint, a faster job, a cleaner dashboard, fewer alerts. The goal is to make progress visible and invite feedback early, before we’ve invested three sprints in the wrong direction.
We keep a simple review agenda:
1) Sprint goal recap (one sentence).
2) Demo the shipped work (live if possible).
3) What didn’t ship and why (no shame, just facts).
4) Metrics: did the outcome move?
5) Next sprint risks/dependencies (quick scan).
The “metrics” part is where DevOps teams shine. If we claim we improved reliability, let’s show error rates. If we claim performance wins, show latency percentiles. We don’t need a PhD thesis—just a graph and a sentence. If you’re already using DORA metrics, a sprint review is a perfect place to sanity-check trends without turning it into a KPI ritual.
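Two of those DORA trends fall straight out of a deploy log. A sketch with hypothetical deploy records (the dates and flags below are made up for illustration):

```python
from datetime import date

# Hypothetical deploy log for one sprint: (deploy_date, caused_incident).
deploys = [
    (date(2024, 5, 6), False),
    (date(2024, 5, 7), True),
    (date(2024, 5, 9), False),
    (date(2024, 5, 13), False),
    (date(2024, 5, 16), False),
]

sprint_days = 10
deploy_frequency = len(deploys) / sprint_days  # deploys per working day
change_failure_rate = sum(1 for _, failed in deploys if failed) / len(deploys)

print(f"{deploy_frequency:.1f} deploys/day, "
      f"{change_failure_rate:.0%} change failure rate")
```

A graph and a sentence, generated in ten lines, is plenty for a sprint review.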
Also, invite the right people: product partners, support, maybe someone from security if the work touches auth. But keep the group small enough that it stays a review, not a town hall.
And yes, sometimes the demo fails. That’s fine. If our systems can’t survive a demo, they probably can’t survive Friday traffic either.
Retrospectives: Fix One Thing, Not Everything
Retros are where scrum teams either get better or get philosophical. We prefer “better.” Our retro rule: pick one improvement we’ll actually implement next sprint, and make it visible as a backlog item. If we pick five, we implement zero, and everyone learns that retros are just feelings with sticky notes.
We use a simple format:
– Keep: what helped?
– Stop: what hurt?
– Start: what should we try?
Then we vote, pick one, and define success. Example: “Reduce PR review time.” Success might be “80% of PRs reviewed within 24 hours” plus a small policy change (like a rotating reviewer).
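Checking a success metric like that doesn't need tooling. A sketch against hypothetical timestamps (in practice you'd pull `opened`/`first_review` pairs from your VCS API):

```python
from datetime import datetime, timedelta

# Hypothetical (opened, first_review) timestamps for last sprint's PRs.
prs = [
    (datetime(2024, 5, 6, 9),  datetime(2024, 5, 6, 14)),
    (datetime(2024, 5, 6, 15), datetime(2024, 5, 7, 10)),
    (datetime(2024, 5, 7, 11), datetime(2024, 5, 9, 9)),   # the slow one
    (datetime(2024, 5, 8, 10), datetime(2024, 5, 8, 16)),
    (datetime(2024, 5, 9, 9),  datetime(2024, 5, 9, 17)),
]

within_sla = sum(1 for opened, reviewed in prs
                 if reviewed - opened <= timedelta(hours=24))
print(f"{within_sla / len(prs):.0%} of PRs reviewed within 24h")  # → 80%
```

If the number moves the right way next sprint, the retro action worked; if not, you have something concrete to discuss instead of vibes.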
For DevOps-y teams, our best retro improvements tend to be workflow tweaks: smaller PRs, clearer ticket acceptance criteria, WIP limits, or an on-call handoff checklist. Occasionally it’s a technical investment—like tightening CI, adding a canary, or cleaning up flaky tests. The key is making it doable in one sprint.
We also keep a “retro log” in a doc. Not because we love paperwork, but because teams forget. Seeing last month’s action items is a gentle nudge toward accountability—and a reminder that we’ve already tried some things and learned from them.
If you want inspiration, Atlassian’s retro techniques are a decent menu. We just recommend ordering one dish, not the whole buffet.