Cybersecurity That Works: Pragmatic Defences For Busy Teams


Ship features, not incidents—solid controls without the circus.

Start With A Threat Model We’ll Actually Use

We’ve all seen the “threat model” that’s a 40-page PDF no one opens again. Let’s not do that. For most teams, a lightweight, repeatable approach beats a perfect one. We start with three questions: what are we protecting, from whom, and what happens if it breaks? Then we write it down in a format we’ll revisit—usually a one-pager in the repo.

Our favourite trick: pick two “crown jewels” per service (customer data table, signing keys, payments workflow) and focus on them first. Then list the most likely threats: credential theft, exposed storage, dependency compromise, and “someone merged to main at 5pm Friday.” (That last one’s not in the framework, but it’s real.)

We map the flow: browser → API gateway → service → database → third parties. Each arrow is an opportunity for auth mistakes, logging leaks, or missing rate limits. We also identify trust boundaries: public internet, internal network, CI runners, cloud control plane. That helps us decide where we need strong authentication, where encryption matters, and where we should add detection.

If you want a simple reference for common categories, skim MITRE ATT&CK and steal the bits that fit. For risk ranking, we keep it blunt: likelihood (low/med/high) × impact (low/med/high). The output isn’t “security theatre”—it’s a backlog: a handful of concrete controls we’ll implement this sprint and review quarterly.
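The blunt likelihood × impact ranking can be a few lines of code. A sketch, assuming a 3×3 scale and illustrative threat names (neither comes from any particular framework):

```python
# Minimal likelihood x impact risk ranking -- a sketch, not a framework.
# The 1-3 scale and the threat names are illustrative assumptions.
LEVELS = {"low": 1, "med": 2, "high": 3}

def risk_score(likelihood: str, impact: str) -> int:
    """Blunt 1-9 score: likelihood (1-3) times impact (1-3)."""
    return LEVELS[likelihood] * LEVELS[impact]

threats = [
    ("credential theft", "high", "high"),
    ("exposed storage", "med", "high"),
    ("dependency compromise", "med", "med"),
]

# Sort the backlog so the worst risks come first.
backlog = sorted(threats, key=lambda t: risk_score(t[1], t[2]), reverse=True)
for name, likelihood, impact in backlog:
    print(f"{risk_score(likelihood, impact)}  {name}")
```

The output is exactly what the paragraph above describes: a ranked backlog, not a report.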

Lock Down Identity First (Because Passwords Are A Chaos Gremlin)

If we’re honest, most real-world breaches we’ve had to respond to start with identity: leaked credentials, overly permissive roles, or tokens living forever in a place they shouldn’t. So our cybersecurity baseline begins with identity and access management.

Rule one: SSO + MFA everywhere. Not just GitHub and Google Workspace—CI, cloud consoles, monitoring tools, incident tooling. If an app can’t do SSO, we treat it like it’s radioactive: strict access, strong unique passwords in a manager, and an exit plan.

Rule two: least privilege that’s maintained, not least privilege that’s promised. We create roles that match job functions, then automate membership. We aim for short-lived credentials (OIDC to cloud, ephemeral tokens), and we kill “shared admin” accounts with prejudice.

Rule three: separate human and machine identity. Humans assume roles; machines use workload identities. We also set up guardrails like conditional access (geo/risk checks) and mandatory device posture where possible.

Helpful reading: the NIST Cybersecurity Framework is a decent compass, and OWASP keeps us honest about common app-level identity mistakes.

Finally, we make access reviews boring and scheduled: quarterly for normal systems, monthly for crown jewels. If it’s exciting, something’s wrong. The goal is to reduce the blast radius when credentials leak—because they will, and we’d rather it be a small splash than a tidal wave.
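Making those reviews boring usually means generating the review list automatically. A minimal sketch, assuming a made-up membership record shape with a `last_used` timestamp pulled from your IdP or cloud audit logs:

```python
from datetime import datetime, timedelta

# Sketch: flag role memberships whose last use is older than the review
# window. The record shape (user, role, last_used) is a made-up example.
REVIEW_WINDOW = timedelta(days=90)  # quarterly for normal systems

def stale_memberships(memberships, now):
    return [m for m in memberships if now - m["last_used"] > REVIEW_WINDOW]

now = datetime(2024, 6, 1)
memberships = [
    {"user": "alice", "role": "prod-admin", "last_used": datetime(2024, 1, 2)},
    {"user": "bob", "role": "read-only", "last_used": datetime(2024, 5, 20)},
]
flagged = stale_memberships(memberships, now)  # alice's admin role is stale
```

Unused access that survives a review is exactly the blast radius we’re trying to shrink.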

Secure Our CI/CD With A Pipeline That Doesn’t Trust Us

CI/CD is where good intentions go to die. It’s also where attackers love to hang out: build logs, injected dependencies, stolen tokens, and runners that can reach everything. We treat the pipeline as production-grade infrastructure, because it is.

We start with basics: pin actions, pin container images, and restrict who can modify pipeline definitions. We also avoid long-lived cloud keys in CI. Where possible, we use OIDC federation so the runner gets a short-lived cloud role, not a secret that’ll eventually show up in a pastebin.

Here’s a minimal GitHub Actions example using OIDC to assume an AWS role (no static AWS keys):

name: deploy
on:
  push:
    branches: [ "main" ]

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4  # prefer pinning to a full commit SHA

      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4  # pin to a SHA in practice
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gha-deploy-role
          aws-region: eu-west-1

      - name: Deploy
        run: ./scripts/deploy.sh
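The workflow above only works if the IAM role’s trust policy accepts GitHub’s OIDC provider and restricts which repository can assume it. A sketch of that trust policy, built as a plain dict—the account ID matches the placeholder in the workflow, and the repository name is hypothetical:

```python
import json

# Sketch of the IAM trust policy behind the role in the workflow above.
# ACCOUNT_ID matches the placeholder ARN; REPO is a hypothetical repository.
ACCOUNT_ID = "123456789012"
REPO = "our-org/our-app"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/"
                         "token.actions.githubusercontent.com"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
            # Only tokens minted for STS...
            "StringEquals": {
                "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
            },
            # ...and only from this repo's main branch.
            "StringLike": {
                "token.actions.githubusercontent.com:sub":
                    f"repo:{REPO}:ref:refs/heads/main"
            },
        },
    }],
}
print(json.dumps(trust_policy, indent=2))
```

The `sub` condition is the part people forget; without it, any repository that can mint a GitHub OIDC token could assume the role.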

We add policy guardrails: environment protection rules, required reviewers for production, and signed releases where it matters. We also scan dependencies and containers, but we keep it actionable: fail builds only on high severity with known exploits, and track the rest as debt.

If you want to go deeper on supply chain hygiene, SLSA is a solid guide. The aim isn’t perfection; it’s reducing the odds that “our build system” becomes “their build system.”

Ship Secure Defaults In Infrastructure As Code

Our cloud environments don’t drift into insecurity—they’re pushed there. That’s why infrastructure as code (IaC) is one of the best places to enforce cybersecurity controls: it’s repeatable, reviewable, and you can fail fast.

We keep a small set of non-negotiable defaults: encryption at rest, private-by-default networking, no public buckets, no public databases, and logs enabled. Then we enforce those defaults with policy checks in CI.

Here’s a tiny Terraform snippet for an S3 bucket with sane defaults; for the policy-check side, tools like tfsec, Checkov, or OPA all work:

resource "aws_s3_bucket" "app_data" {
  bucket = "our-app-data-prod"
}

resource "aws_s3_bucket_public_access_block" "app_data" {
  bucket = aws_s3_bucket.app_data.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_server_side_encryption_configuration" "app_data" {
  bucket = aws_s3_bucket.app_data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

This isn’t fancy, but it stops a surprising number of “oops” moments. We also tag everything (owner, environment, data classification) and use that for alerts and access boundaries later.

A useful practice: treat network exposure as a change requiring extra scrutiny. If a PR opens inbound rules to 0.0.0.0/0, it should trigger a review and a question: “Do we really need this public?” Most times, the answer is “no, we needed a load balancer and got impatient.”
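That kind of check is easy to automate over `terraform show -json` output. A minimal sketch that flags security-group ingress rules open to the world—the plan structure is simplified here, and the resource address is a made-up example:

```python
# Sketch: flag security-group ingress rules open to the world in a
# `terraform show -json tfplan` output. Plan structure is simplified.
WORLD = {"0.0.0.0/0", "::/0"}

def open_ingress(plan: dict):
    findings = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_security_group_rule":
            continue
        after = (rc.get("change") or {}).get("after") or {}
        if after.get("type") == "ingress" and WORLD & set(after.get("cidr_blocks") or []):
            findings.append(rc["address"])
    return findings

plan = {"resource_changes": [{
    "address": "aws_security_group_rule.web_in",   # hypothetical resource
    "type": "aws_security_group_rule",
    "change": {"after": {"type": "ingress", "cidr_blocks": ["0.0.0.0/0"]}},
}]}
# open_ingress(plan) -> ["aws_security_group_rule.web_in"]
```

Wired into CI, a non-empty result turns “do we really need this public?” into a required conversation rather than an optional one.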

Make Logging And Detection Boring (In A Good Way)

Prevention is great. Detection is what saves us when prevention fails at 2am. The trick is to build logging and alerting that’s useful, not noisy. Our standard: if an alert fires, someone should know what to do next.

We separate logs into three buckets: audit logs (who did what), security signals (auth failures, suspicious API usage), and application logs (debugging). Audit logs go to a write-once or protected store, because attackers love deleting evidence.

We focus on high-signal detections:
– Impossible travel / unusual login patterns (SSO provider)
– Privilege escalation and policy changes (cloud control plane)
– New access keys, token creation, or secret reads (secrets manager)
– Unexpected outbound traffic spikes (egress monitoring)
– Container exec in production (runtime alerts)
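A high-signal detection doesn’t need to be clever. The impossible-travel check, for instance, is mostly arithmetic. A sketch with a made-up event shape and an assumed speed threshold:

```python
from datetime import datetime

# Sketch of an impossible-travel check: two logins whose distance implies
# a speed no flight can manage. Event shape and threshold are assumptions.
MAX_KMH = 900  # roughly airliner cruise speed

def impossible_travel(e1, e2, distance_km):
    """True if covering distance_km between the two logins needs > MAX_KMH."""
    hours = abs((e2["ts"] - e1["ts"]).total_seconds()) / 3600
    if hours == 0:
        return distance_km > 0
    return distance_km / hours > MAX_KMH

a = {"user": "alice", "ts": datetime(2024, 6, 1, 9, 0)}   # login in London
b = {"user": "alice", "ts": datetime(2024, 6, 1, 10, 0)}  # Sydney, 1h later
assert impossible_travel(a, b, distance_km=17000)  # ~17000 km/h: flag it
```

Most SSO providers ship this detection built in; the point of the sketch is that “high signal” usually means simple rules over trustworthy data.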

We also commit to a retention policy. “Keep everything forever” is expensive and usually pointless. We align retention with compliance needs and incident response reality: e.g., 30–90 days hot, 6–12 months cold, longer for audit trails if required.

For guidance on incident-friendly logging, the CIS Controls are practical and not too preachy. The outcome we want: when something weird happens, we can answer “what changed?” and “who touched it?” in minutes, not days.

Manage Secrets Like We Don’t Want To Apologise Later

Secrets management is one of those topics where everyone agrees and then quietly commits a .env file. We keep our approach simple: no secrets in git, no secrets in build logs, no long-lived secrets if we can avoid them.

We standardise on one secrets manager per environment (cloud-native is fine), and we use workload identity to access it. Developers shouldn’t need production credentials locally; they should have dev credentials and a path to test safely.

Rotation is where the bodies are buried. So we automate it: database passwords, API keys, signing keys—rotated on a schedule, with immediate rotation available during incidents. When rotation is painful, people avoid it, and then one leaked key becomes a long-running subscription service for an attacker.

We also classify secrets:
Tier 0: root keys, org admin creds, signing keys (extra controls, break-glass)
Tier 1: production DB credentials, third-party API keys
Tier 2: dev/test credentials
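The tiers earn their keep when they drive rotation schedules automatically. A sketch—the per-tier intervals here are illustrative, not a recommendation:

```python
from datetime import datetime, timedelta

# Sketch: which secrets are overdue for rotation, per tier.
# The rotation intervals per tier are illustrative assumptions.
ROTATION = {0: timedelta(days=30), 1: timedelta(days=90), 2: timedelta(days=180)}

def overdue(secrets, now):
    return [s["name"] for s in secrets
            if now - s["rotated"] > ROTATION[s["tier"]]]

now = datetime(2024, 6, 1)
secrets = [
    {"name": "signing-key", "tier": 0, "rotated": datetime(2024, 3, 1)},
    {"name": "prod-db-password", "tier": 1, "rotated": datetime(2024, 5, 1)},
    {"name": "dev-api-key", "tier": 2, "rotated": datetime(2024, 1, 1)},
]
# overdue(secrets, now) -> ["signing-key"]
```

A daily job that opens a ticket for each overdue secret keeps rotation boring—which, as with access reviews, is the goal.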

And we limit where secrets can be used. If a CI job doesn’t need access to prod, it doesn’t get it. This sounds obvious, but it’s the difference between a contained incident and a headline.

One more thing: we treat secret scanning findings as “stop the bleeding” events. Rotate first, investigate second. The investigation matters—but not as much as shutting the door.

Practise Incident Response Before The Incident Practises On Us

The day we need incident response isn’t the day we want to discover missing runbooks, nonexistent contacts, or that nobody knows how to revoke tokens. We keep incident response lightweight but real: a short playbook, a few rehearsals per year, and clear ownership.

Our baseline playbook includes:
– How to declare an incident (and who can)
– How to communicate (internal + customer-facing)
– How to preserve evidence (logs, snapshots, access records)
– How to contain (disable accounts, rotate secrets, block egress)
– How to recover (redeploy, validate, monitor)
– How to do a post-incident review without blame

We run tabletop exercises with scenarios that match our threat model: compromised CI token, leaked database creds, exposed bucket, dependency backdoor. We time ourselves: how fast can we detect, contain, and confirm recovery? We learn a lot from the awkward pauses.

We also keep a “break-glass” path: a tightly controlled admin account with strong MFA, stored and monitored, for when SSO is down or the normal path is compromised. Break-glass should be rare and loud.

If you’re in regulated land, align with ISO/IEC 27001 where needed, but don’t let compliance replace readiness. The best incident is the one we prevent; the second-best is the one we handle calmly, quickly, and with receipts.
