Practical DevOps delivery guide for CI/CD, environments, rollout safety, observability, and release workflows.
If I were defining DevOps defaults for a product team today, I would start with a boring rule: every change should earn trust in layers — branch, pull request, staging, canary, full rollout.
- DevOps Delivery Playbook
- Table of Contents
- Companion playbooks
- The defaults I'd reach for first
- Branch and pull request flow
- CI test lanes
- Staging validation
- Canary releases
- Rollback strategy
- Secret scanning
- A practical workflow model
- Example GitHub Actions layout
- Node.js test runners and monorepos
- Things I would avoid
- References and inspiration
- License
These repositories form one playbook suite:
- Auth & Identity Playbook — sessions, tokens, OAuth, and identity boundaries across the stack
- Backend Architecture Playbook — APIs, boundaries, OpenAPI, persistence, and errors
- Best of JavaScript — curated JS/TS tooling and stack defaults
- Caching Playbook — HTTP, CDN, and application caches; consistency and invalidation
- Code Review Playbook — PR quality, ownership, and review culture
- DevOps Delivery Playbook — CI/CD, environments, rollout safety, and observability
- Engineering Lead Playbook — standards, RFCs, and technical leadership habits
- Frontend Architecture Playbook — React structure, performance, and consuming API contracts
- Marketing and SEO Playbook — growth, SEO, experimentation, and marketing surfaces
- Monorepo Architecture Playbook — workspaces, package boundaries, and shared code at scale
- Node.js Runtime & Performance Playbook — event loop, streams, memory, and production Node performance
- Testing Strategy Playbook — unit, integration, contract, E2E, and CI-friendly test layers
If I were setting release rules for a team today, I would usually start here:
- Feature branch push: lint, unit tests, and fast smoke coverage
- Pull request: reviewable diff, status checks, staging validation path
- Main branch merge: full test suite
- Manual dispatch: ability to run the full suite on a feature branch when risk is high
- Secrets: scan the repository for exposed keys and tokens on every meaningful path
- Deploy: stage first, then canary, then full rollout
- Rollback: automatic where signals are clear, manual where judgment is needed
- Visibility: error rate, latency, throughput, and business metrics visible during rollout
The goal is not "more pipelines" The goal is progressive confidence.
A healthy DevOps flow starts before deployment.
- every change starts on a branch;
- pull requests are the collaboration and review boundary;
- checks run on PRs before merge;
- merged work is what earns heavier validation and deployment rights.
A branch gives isolation. A pull request gives review. Status checks give a gate. That combination is simple, scalable, and easy to explain to a team.
Do not wait until main to discover something your feature branch could have told you in minutes.
The source notes contain a very good layered model. I would keep it almost exactly, but make it explicit.
Run the things that should almost never be skipped:
- linting;
- unit tests;
- fast static checks;
- lightweight build validation.
For Node + TypeScript frontend and API repositories, this is where Vitest (vitest run or pnpm exec vitest run) or, in legacy setups, Jest should run on every push. Pick one primary runner per package and document it in package.json so CI stays copy-pasteable.
Smoke tests are not the whole e2e catalog. They are the minimum critical path that tells you whether the branch is fundamentally broken.
Use them for:
- app boots;
- login or auth shell works;
- the most critical happy-path flows do not immediately fail.
Once a branch is merged to the main branch, run the expensive confidence layer:
- broader e2e coverage;
- integration suites;
- slower contract checks (including OpenAPI codegen or contract tests when the web app depends on generated types — see the backend and frontend playbooks);
- deployment packaging if appropriate.
Sometimes you know a branch is large, risky, or hard to reason about. That is when manual full-suite execution on a branch is worth the time.
This is a very healthy capability. It gives teams a way to buy extra certainty without making every single push unbearably slow.
One of the strongest lines in the source notes is also one of the most operationally useful:
All PRs should be tested on staging using feature branches
That is exactly the kind of sentence a repository guide should contain.
- the deployable artifact from the branch can run in a realistic environment;
- downstream dependencies are present or acceptably simulated;
- the team can verify critical flows before production traffic touches the build.
- branch checks passed;
- staging deploy is healthy;
- critical smoke or acceptance path is validated;
- rollback path is understood.
Canary deployment is one of the best ways to reduce release risk without freezing delivery.
- expose the new version to a small percentage of traffic first;
- compare the canary against the stable version;
- expand only if health stays good;
- rollback automatically when clear alarm thresholds are crossed.
- error rate;
- latency;
- throughput;
- resource saturation;
- business outcomes if the change can affect them.
- deploy to staging;
- validate readiness and health checks;
- release to a small traffic slice;
- watch alarms and dashboards during the evaluation window;
- expand traffic if healthy;
- rollback if the canary degrades.
A canary is not just "deploy to 5%" It is "deploy to 5% with enough observability and authority to stop".
Rollback is not a note you add because it sounds mature. It is part of the release design.
- automatic rollback
- when health checks, error rates, or alarm thresholds fail clearly;
- manual rollback
- when the issue is subtle, business-specific, or not captured by simple thresholds.
- who can execute rollback;
- which signals trigger it;
- where the rollback command or workflow lives;
- how to verify that rollback actually restored health.
A team that cannot explain rollback in one minute does not yet have a finished deploy process.
The source material explicitly mentions Gitleaks. That is a good choice.
Credential leaks are rarely "interesting" incidents. They are expensive, preventable, and embarrassing.
Scan for hardcoded secrets in:
- commits;
- pull requests;
- repositories;
- local pre-commit or CI paths where possible.
Gitleaks is a strong default because it is easy to run in CI and directly targets passwords, API keys, tokens, and similar credential patterns.
This is the model I would share with a team:
Feature branch push
-> lint + unit tests
-> smoke tests
-> optional secret scan
Pull request
-> review
-> status checks
-> staging validation path
Merge to main
-> full suite
-> build release artifact
-> deploy to staging or pre-prod
-> canary rollout
-> monitor
-> full rollout or rollbackThat sequence is simple enough to remember and strict enough to protect production.
This is only an example, but it reflects the intended flow:
name: ci
on:
push:
branches-ignore:
- main
pull_request:
workflow_dispatch:
jobs:
fast-checks:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install
run: npm ci
- name: Lint
run: npm run lint
- name: Unit tests
# Vitest (typical for new Vite/React + Node TS repos):
run: npx vitest run
# Jest (legacy): npm test -- --runInBand
smoke-tests:
if: github.event_name != 'workflow_dispatch' || github.ref != 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run smoke tests
run: npm run test:smoke
full-suite:
if: github.ref == 'refs/heads/main' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run full test suite
run: npm run test:full
secret-scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Gitleaks
run: gitleaks dir .The exact toolchain can change. The layered confidence model should not.
- Vitest is the default this playbook assumes for new React (Vite) and Node + TypeScript packages (
vitest run, orpnpm exec vitest run). Jest remains valid for large legacy repos; usenpm test -- --runInBand(or your existing script) only when that is what the package already defines. - In a pnpm workspace or npm workspaces monorepo, prefer
pnpm turbo run test/nx test(or equivalent) so API, web, andpackages/*run in dependency order; add aopenapi:generate(or codegen) task to that graph when the UI imports generated types (see the backend and frontend playbooks). - Keep one documented unit-test entry point per package so
fast-checksjobs stay boring to copy across repos.
- only running serious tests after merge;
- treating staging as ceremonial instead of useful;
- giant all-or-nothing rollouts by default;
- canaries without dashboards and alarms;
- rollback plans that live only in tribal knowledge;
- storing secrets in tracked files;
- making the slowest suite run on every tiny push when a layered model would work better.
- GitHub Actions documentation
- Continuous integration with GitHub Actions
- GitHub flow
- Amazon ECS canary deployments
MIT is a sensible default for a playbook repository like this, but choose the license that fits your sharing goals.