DevOps from Zero to Hero: Automated Testing
Introduction
Welcome to article four of the DevOps from Zero to Hero series. In the previous articles we covered the fundamentals of Linux, networking, and version control with Git. Now it is time to talk about something that separates hobby projects from production-ready software: automated testing.
If you have ever pushed a change to production and immediately regretted it, you already understand why testing matters. Automated tests give you confidence that your code works as expected before it reaches users. In a DevOps context, tests are the gate between “code written” and “code deployed.” Without them, your CI/CD pipeline is just a fast way to ship bugs.
In this article we will cover the testing pyramid, write real unit and integration tests in TypeScript using Vitest and Supertest, talk about what coverage actually means (and why chasing 100% is a trap), and lay the groundwork for running tests in CI, which we will cover in depth in article five.
Let’s get into it.
Why testing matters for DevOps
Testing is not just a developer concern. In a DevOps workflow, tests are the foundation of everything else you build. Here is why:
- Confidence to deploy: If your tests pass, you can deploy without fear. If they do not, you know something is broken before users do.
- Fast feedback: A good test suite tells you within minutes whether a change is safe. Compare that to waiting for manual QA or finding out from a user report.
- Catch regressions: Code that worked yesterday can break today because of a seemingly unrelated change. Tests catch these regressions automatically.
- Enable automation: CI/CD pipelines depend on tests. Without automated tests, your pipeline is just automated deployment of untested code.
- Documentation: Well-written tests describe what your code should do. They serve as living documentation that stays in sync with the actual behavior.
Think of it this way: every test you write is a tiny contract that says “this behavior must be preserved.” When someone changes the code six months from now, those contracts catch anything that breaks. That is incredibly valuable in a team environment where multiple people touch the same codebase.
The testing pyramid
The testing pyramid is a model that helps you decide how many tests of each type to write. It looks like this:
         /  E2E   \          Few, slow, expensive
        /----------\
       / Integration\        Some, moderate speed
      /--------------\
     /   Unit Tests   \      Many, fast, cheap
    /__________________\
The shape matters. Here is why:
- Unit tests (base of the pyramid): These test individual functions or modules in isolation. They are fast, cheap to write, and cheap to run. You should have the most of these.
- Integration tests (middle): These test how multiple pieces work together, like an API endpoint hitting a database. They are slower and more complex, but they catch issues that unit tests miss.
- End-to-end tests (top): These test the entire application from the user’s perspective, often through a browser. They are the slowest, most fragile, and most expensive to maintain. You should have the fewest of these.
The pyramid shape exists because of a tradeoff between speed and confidence. Unit tests run in milliseconds but only test small pieces. E2E tests take seconds or minutes but test the full flow. If you invert the pyramid (lots of E2E, few unit tests), your test suite becomes slow, flaky, and painful to maintain.
A healthy ratio might look something like 70% unit, 20% integration, 10% E2E. These numbers are guidelines, not rules. The key insight is: push testing down to the lowest level that gives you confidence. If you can catch a bug with a unit test, do not write an E2E test for it.
Setting up the project
Let’s build a small TypeScript project with tests. We will use Vitest as our test runner because it is fast, modern, and works great with TypeScript out of the box.
First, initialize the project:
mkdir testing-demo && cd testing-demo
npm init -y
npm install -D typescript vitest @types/node
npm install express
npm install -D @types/express supertest @types/supertest
Create a tsconfig.json:
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "bundler",
"strict": true,
"esModuleInterop": true,
"outDir": "./dist",
"rootDir": "./src",
"declaration": true,
"sourceMap": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist"]
}
Add the test script to package.json:
{
"scripts": {
"test": "vitest run",
"test:watch": "vitest",
"test:coverage": "vitest run --coverage"
}
}
Unit testing with Vitest
Let’s start with the base of the pyramid. Unit tests verify that individual functions do what they are supposed to do. They should be fast, isolated, and deterministic.
Here is a simple utility module at src/utils.ts:
// src/utils.ts
export function slugify(text: string): string {
return text
.toLowerCase()
.trim()
.replace(/[^\w\s-]/g, "")
.replace(/[\s_]+/g, "-")
.replace(/-+/g, "-")
.replace(/^-|-$/g, "");
}
export function truncate(text: string, maxLength: number): string {
if (text.length <= maxLength) {
return text;
}
const truncated = text.slice(0, maxLength);
const lastSpace = truncated.lastIndexOf(" ");
if (lastSpace > 0) {
return truncated.slice(0, lastSpace) + "...";
}
return truncated + "...";
}
export function parseQueryParams(query: string): Record<string, string> {
if (!query || query.trim() === "") {
return {};
}
const cleaned = query.startsWith("?") ? query.slice(1) : query;
return cleaned.split("&").reduce(
(params, pair) => {
const [key, value] = pair.split("=");
if (key) {
params[decodeURIComponent(key)] = decodeURIComponent(value ?? "");
}
return params;
},
{} as Record<string, string>,
);
}
Now let’s write the tests at src/utils.test.ts:
// src/utils.test.ts
import { describe, it, expect } from "vitest";
import { slugify, truncate, parseQueryParams } from "./utils";
describe("slugify", () => {
it("converts a simple string to a slug", () => {
expect(slugify("Hello World")).toBe("hello-world");
});
it("handles special characters", () => {
expect(slugify("Hello, World! How's it going?")).toBe(
"hello-world-hows-it-going",
);
});
it("collapses multiple spaces and dashes", () => {
expect(slugify("too many spaces")).toBe("too-many-spaces");
expect(slugify("too---many---dashes")).toBe("too-many-dashes");
});
it("trims leading and trailing dashes", () => {
expect(slugify(" -hello- ")).toBe("hello");
});
it("handles empty string", () => {
expect(slugify("")).toBe("");
});
});
describe("truncate", () => {
it("returns the full string if it is shorter than maxLength", () => {
expect(truncate("short", 10)).toBe("short");
});
it("returns the full string if it equals maxLength", () => {
expect(truncate("exact", 5)).toBe("exact");
});
it("truncates at the last space before maxLength", () => {
expect(truncate("this is a longer sentence", 15)).toBe("this is a...");
});
it("truncates without space if no space is found", () => {
expect(truncate("superlongwordwithoutspaces", 10)).toBe(
"superlongw...",
);
});
});
describe("parseQueryParams", () => {
it("parses a simple query string", () => {
expect(parseQueryParams("name=alice&age=30")).toEqual({
name: "alice",
age: "30",
});
});
it("handles a leading question mark", () => {
expect(parseQueryParams("?foo=bar")).toEqual({ foo: "bar" });
});
it("handles URL-encoded values", () => {
expect(parseQueryParams("msg=hello%20world")).toEqual({
msg: "hello world",
});
});
it("returns an empty object for empty input", () => {
expect(parseQueryParams("")).toEqual({});
});
it("handles keys with empty values", () => {
expect(parseQueryParams("flag=")).toEqual({ flag: "" });
});
});
Let’s break down the test structure:
- describe groups related tests. Think of it as a section header for a set of behaviors.
- it defines an individual test case. The string should read like a sentence: “it converts a simple string to a slug.”
- expect is the assertion. It takes a value and chains a matcher like toBe, toEqual, toContain, or toThrow.
Run the tests:
npx vitest run
# Output:
# ✓ src/utils.test.ts (14 tests) 5ms
# ✓ slugify (5 tests)
# ✓ truncate (4 tests)
# ✓ parseQueryParams (5 tests)
# Test Files 1 passed (1)
# Tests 14 passed (14)
Mocking dependencies
Real-world code has dependencies: databases, APIs, file systems. In unit tests, you want to isolate the function under test by replacing those dependencies with controlled substitutes. This is called mocking.
Here is a module that depends on an external API at src/weather.ts:
// src/weather.ts
export interface WeatherData {
city: string;
temperature: number;
description: string;
}
export async function fetchWeather(city: string): Promise<WeatherData> {
const response = await fetch(
`https://api.weather.example.com/v1/current?city=${encodeURIComponent(city)}`,
);
if (!response.ok) {
throw new Error(`Weather API returned ${response.status}`);
}
const data = await response.json();
return {
city: data.location.name,
temperature: data.current.temp_c,
description: data.current.condition.text,
};
}
export function formatWeatherReport(weather: WeatherData): string {
return `${weather.city}: ${weather.temperature}C, ${weather.description}`;
}
And the tests at src/weather.test.ts:
// src/weather.test.ts
import { describe, it, expect, vi, beforeEach } from "vitest";
import { fetchWeather, formatWeatherReport } from "./weather";
// Mock the global fetch function
const mockFetch = vi.fn();
vi.stubGlobal("fetch", mockFetch);
beforeEach(() => {
mockFetch.mockReset();
});
describe("fetchWeather", () => {
it("returns parsed weather data on success", async () => {
mockFetch.mockResolvedValueOnce({
ok: true,
json: async () => ({
location: { name: "London" },
current: { temp_c: 15, condition: { text: "Partly cloudy" } },
}),
});
const result = await fetchWeather("London");
expect(result).toEqual({
city: "London",
temperature: 15,
description: "Partly cloudy",
});
expect(mockFetch).toHaveBeenCalledWith(
"https://api.weather.example.com/v1/current?city=London",
);
});
it("throws on non-ok response", async () => {
mockFetch.mockResolvedValueOnce({
ok: false,
status: 404,
});
await expect(fetchWeather("Nowhere")).rejects.toThrow(
"Weather API returned 404",
);
});
});
describe("formatWeatherReport", () => {
it("formats the weather data as a readable string", () => {
const weather = {
city: "Berlin",
temperature: 22,
description: "Sunny",
};
expect(formatWeatherReport(weather)).toBe("Berlin: 22C, Sunny");
});
});
Key mocking concepts:
- vi.fn() creates a mock function that records how it was called.
- vi.stubGlobal() replaces a global like fetch with your mock.
- mockResolvedValueOnce() tells the mock what to return the next time it is called.
- mockReset() clears the mock state between tests so they do not leak into each other.
The important thing to understand about mocking is this: you are not testing fetch. You are testing
that your code correctly handles the response from fetch. The mock lets you simulate different
scenarios (success, error, timeout) without making real network calls.
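An alternative to stubbing globals is dependency injection: pass the fetch function in as a parameter, and tests hand in a fake. Here is a sketch of that style (the fetchWeatherWith name and the simplified FetchLike type are mine, not part of the weather module above):

```typescript
// A sketch of the dependency-injection style. The function accepts any
// fetch-like callable, so tests can pass a fake with no global stubbing.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status?: number; json: () => Promise<any> }>;

interface WeatherData {
  city: string;
  temperature: number;
  description: string;
}

async function fetchWeatherWith(
  fetchFn: FetchLike,
  city: string,
): Promise<WeatherData> {
  const response = await fetchFn(
    `https://api.weather.example.com/v1/current?city=${encodeURIComponent(city)}`,
  );
  if (!response.ok) {
    throw new Error(`Weather API returned ${response.status}`);
  }
  const data = await response.json();
  return {
    city: data.location.name,
    temperature: data.current.temp_c,
    description: data.current.condition.text,
  };
}

// In a test, the fake is just a plain async function:
const fakeFetch: FetchLike = async () => ({
  ok: true,
  json: async () => ({
    location: { name: "London" },
    current: { temp_c: 15, condition: { text: "Partly cloudy" } },
  }),
});
```

Now `await fetchWeatherWith(fakeFetch, "London")` resolves to parsed WeatherData with no network call and no global state to reset between tests. Both styles are valid; injection trades a slightly noisier signature for tests with zero setup and teardown.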
Integration testing with Supertest
Integration tests verify that multiple pieces of your application work together. For web applications, the most common integration test is hitting an API endpoint and verifying the response.
Here is a simple Express app at src/app.ts:
// src/app.ts
import express from "express";
import { slugify, truncate } from "./utils";
export const app = express();
app.use(express.json());
interface Article {
id: number;
title: string;
slug: string;
content: string;
summary?: string;
}
const articles: Article[] = [];
let nextId = 1;
app.get("/api/articles", (_req, res) => {
res.json(articles);
});
app.get("/api/articles/:slug", (req, res) => {
const article = articles.find((a) => a.slug === req.params.slug);
if (!article) {
res.status(404).json({ error: "Article not found" });
return;
}
res.json(article);
});
app.post("/api/articles", (req, res) => {
const { title, content } = req.body;
if (!title || !content) {
res.status(400).json({ error: "Title and content are required" });
return;
}
const article: Article = {
id: nextId++,
title,
slug: slugify(title),
content,
summary: truncate(content, 100),
};
articles.push(article);
res.status(201).json(article);
});
app.delete("/api/articles/:slug", (req, res) => {
const index = articles.findIndex((a) => a.slug === req.params.slug);
if (index === -1) {
res.status(404).json({ error: "Article not found" });
return;
}
articles.splice(index, 1);
res.status(204).send();
});
Now the integration tests at src/app.test.ts:
// src/app.test.ts
import { describe, it, expect, beforeEach } from "vitest";
import request from "supertest";
import { app } from "./app";
describe("Articles API", () => {
// Note: In a real app, you would reset the database between tests.
// Here we rely on the in-memory array.
describe("POST /api/articles", () => {
it("creates a new article", async () => {
const response = await request(app)
.post("/api/articles")
.send({
title: "My First Post",
content:
"This is the content of my first blog post. It has enough words to test truncation properly.",
})
.expect(201);
expect(response.body).toMatchObject({
title: "My First Post",
slug: "my-first-post",
content:
"This is the content of my first blog post. It has enough words to test truncation properly.",
});
expect(response.body.id).toBeDefined();
expect(response.body.summary).toBeDefined();
});
it("returns 400 when title is missing", async () => {
const response = await request(app)
.post("/api/articles")
.send({ content: "some content" })
.expect(400);
expect(response.body.error).toBe("Title and content are required");
});
it("returns 400 when content is missing", async () => {
const response = await request(app)
.post("/api/articles")
.send({ title: "A Title" })
.expect(400);
expect(response.body.error).toBe("Title and content are required");
});
});
describe("GET /api/articles", () => {
it("returns the list of articles", async () => {
const response = await request(app).get("/api/articles").expect(200);
expect(Array.isArray(response.body)).toBe(true);
expect(response.body.length).toBeGreaterThan(0);
});
});
describe("GET /api/articles/:slug", () => {
it("returns an article by slug", async () => {
const response = await request(app)
.get("/api/articles/my-first-post")
.expect(200);
expect(response.body.slug).toBe("my-first-post");
expect(response.body.title).toBe("My First Post");
});
it("returns 404 for a non-existent slug", async () => {
const response = await request(app)
.get("/api/articles/does-not-exist")
.expect(404);
expect(response.body.error).toBe("Article not found");
});
});
describe("DELETE /api/articles/:slug", () => {
it("deletes an article by slug", async () => {
// First, create an article to delete
await request(app)
.post("/api/articles")
.send({ title: "To Be Deleted", content: "This will be removed" });
await request(app)
.delete("/api/articles/to-be-deleted")
.expect(204);
// Verify it is gone
await request(app)
.get("/api/articles/to-be-deleted")
.expect(404);
});
it("returns 404 when deleting a non-existent article", async () => {
await request(app)
.delete("/api/articles/ghost-article")
.expect(404);
});
});
});
Notice how integration tests differ from unit tests:
- They test the full request/response cycle, not just a single function.
- They exercise multiple layers (routing, validation, business logic) together.
- They are slower because they spin up the HTTP layer, but they catch bugs that unit tests cannot, like incorrect route definitions or missing middleware.
Supertest is excellent because it does not require you to start the server on a port. It hooks directly into Express, so tests are fast and do not conflict with each other.
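The comment in the test file about resetting state hints at a real problem: the POST tests leak articles into the GET tests, so test order matters. One fix is to put the storage behind a small module with a reset hook and call it from beforeEach. A sketch (the createArticleStore name is mine, not part of the app above):

```typescript
// A minimal resettable in-memory store. Tests call reset() in beforeEach,
// so every test starts from a known-empty state and ordering stops mattering.
interface Article {
  id: number;
  title: string;
  slug: string;
  content: string;
}

function createArticleStore() {
  let articles: Article[] = [];
  let nextId = 1;
  return {
    all: () => articles,
    add(data: Omit<Article, "id">): Article {
      const article = { id: nextId++, ...data };
      articles.push(article);
      return article;
    },
    findBySlug: (slug: string) => articles.find((a) => a.slug === slug),
    reset() {
      articles = [];
      nextId = 1;
    },
  };
}

const store = createArticleStore();
store.add({ title: "Hello", slug: "hello", content: "..." });
store.reset();
// After reset() the store is empty again and ids start back at 1.
```

In the Express app, the routes would call store.add and store.findBySlug instead of touching the array directly, and the test file would add beforeEach(() => store.reset()).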
Testing against real databases with Testcontainers
For applications that use a database, you need to decide: do you mock the database or use a real one? Mocking is faster but can hide bugs related to SQL syntax, constraints, or query behavior. Testcontainers gives you the best of both worlds by spinning up a real database in Docker for your tests.
Here is what using Testcontainers looks like conceptually:
// src/db.integration.test.ts (conceptual example)
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { PostgreSqlContainer, StartedPostgreSqlContainer } from "@testcontainers/postgresql";
import { Client } from "pg";
describe("Database integration", () => {
let container: StartedPostgreSqlContainer;
let client: Client;
beforeAll(async () => {
// Start a real PostgreSQL container
container = await new PostgreSqlContainer("postgres:16")
.withDatabase("testdb")
.start();
client = new Client({
connectionString: container.getConnectionUri(),
});
await client.connect();
// Run migrations
await client.query(`
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
slug TEXT UNIQUE NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
)
`);
}, 60000); // Containers can take a moment to start
afterAll(async () => {
await client.end();
await container.stop();
});
it("inserts and retrieves an article", async () => {
await client.query(
"INSERT INTO articles (title, slug, content) VALUES ($1, $2, $3)",
["Test Article", "test-article", "Some content"],
);
const result = await client.query(
"SELECT * FROM articles WHERE slug = $1",
["test-article"],
);
expect(result.rows).toHaveLength(1);
expect(result.rows[0].title).toBe("Test Article");
});
});
Testcontainers is especially useful because:
- Tests run against the same database engine you use in production, catching driver-specific bugs.
- Each test suite gets a fresh container, so tests do not interfere with each other.
- It works in CI as long as Docker is available (which it usually is in GitHub Actions).
- No shared test database that accumulates stale data or causes flaky tests from parallel runs.
The tradeoff is speed: starting a container takes a few seconds. For this reason, Testcontainers tests belong in the integration tier, not the unit tier.
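One practical consequence: keep the slow tier out of your default test run so unit feedback stays instant. A sketch of one way to do it with Vitest's exclude option (the *.integration.test.ts naming convention is my choice here, not a Vitest requirement):

```typescript
// vitest.config.ts — the default `npm test` run skips the slow tier
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    exclude: ["**/node_modules/**", "**/dist/**", "**/*.integration.test.ts"],
  },
});
```

The integration tier then runs on demand, for example from a second config file passed with vitest run --config, typically as a separate CI job.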
Test naming conventions and organization
How you name and organize tests matters more than you might think. In six months, when a test fails in CI, the test name is the first thing you will see. A good name tells you exactly what broke without reading the code.
Here are some conventions that work well:
File organization:
src/
utils.ts
utils.test.ts # Co-located with the source file
app.ts
app.test.ts
weather.ts
weather.test.ts
Co-locating tests with source files makes it obvious which file a test covers. Some teams prefer a
separate __tests__ directory, but co-location has the advantage that when you rename or move a file,
the test moves with it.
Naming patterns:
// Good: Describes the behavior clearly
describe("slugify", () => {
it("converts spaces to dashes", () => {});
it("removes special characters", () => {});
it("handles empty string", () => {});
});
// Bad: Vague or implementation-focused
describe("slugify", () => {
it("works", () => {});
it("test 1", () => {});
it("uses regex", () => {}); // who cares about the implementation?
});
The test name should answer: “What behavior does this test verify?” When it fails, the output should
read like a bug report: slugify > removes special characters: FAILED.
What coverage actually means
Code coverage measures what percentage of your code is executed when your tests run. You can generate a coverage report with Vitest:
npx vitest run --coverage
This gives you metrics like:
- Line coverage: What percentage of lines were executed?
- Branch coverage: What percentage of if/else paths were taken?
- Function coverage: What percentage of functions were called?
- Statement coverage: What percentage of statements were executed?
A coverage report might look like this:
# ------------------|---------|----------|---------|---------|
# File | % Stmts | % Branch | % Funcs | % Lines |
# ------------------|---------|----------|---------|---------|
# src/utils.ts | 100 | 100 | 100 | 100 |
# src/weather.ts | 85 | 75 | 100 | 85 |
# src/app.ts | 92 | 80 | 100 | 92 |
# ------------------|---------|----------|---------|---------|
Why 100% coverage is a trap:
Coverage tells you what code was executed, not what code was tested correctly. Consider this:
// This test has 100% coverage of the add function
function add(a: number, b: number): number {
return a + b;
}
it("covers the add function", () => {
add(1, 2); // Look, we called it! 100% coverage!
// But we never checked the result...
});
That test executes every line of add but proves nothing. The function could return "banana" and
the test would still pass. Coverage without meaningful assertions is theater.
What metrics to watch instead:
- Mutation testing: Tools like Stryker modify your code (change + to -, remove conditionals) and check if any tests fail. If a mutation survives, your tests have a blind spot. This is far more meaningful than line coverage.
- Branch coverage over line coverage: Branch coverage catches untested conditional paths. A function with an if/else might have 100% line coverage but only 50% branch coverage if you never test the else path.
- Test failure rate in CI: If tests never fail, they might not be testing anything meaningful. If they fail constantly, they might be flaky. A healthy test suite fails occasionally when real bugs are introduced.
- Time to detection: How quickly do tests catch a real bug after it is introduced? This is the metric that actually matters for DevOps.
A reasonable coverage target is somewhere between 70% and 90%. Anything above 90% usually means you are writing tests for trivial code just to hit a number.
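If you do settle on a target, make it enforceable so the suite fails when coverage drops below the floor, instead of relying on someone reading the report. A sketch using Vitest's coverage thresholds (assumes the @vitest/coverage-v8 provider is installed as a dev dependency; the numbers are examples, not recommendations):

```typescript
// vitest.config.ts — fail the run if coverage drops below the agreed floor
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      reporter: ["text", "html"],
      thresholds: {
        lines: 80,
        branches: 75,
        functions: 80,
        statements: 80,
      },
    },
  },
});
```

With this in place, npm run test:coverage becomes a real CI gate rather than an informational step.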
When to NOT write tests
Testing everything is not the goal. Testing the right things is. Here are cases where writing tests adds cost without meaningful value:
- Generated code: If a tool generates your API client, ORM models, or GraphQL types, do not test the generation output. Test the code that uses them.
- Simple getters and setters: A function that just returns a property does not need a test. If you feel the need to test it, the function is probably too simple to break.
- Framework internals: Do not test that Express routes requests or that React renders components. Those are the framework’s job. Test your logic that runs inside the framework.
- Third-party libraries: Do not test that lodash.groupBy works correctly. The library maintainers already did that.
- Configuration files: JSON configs, environment variable listings, and static data do not need unit tests.
Focus your testing effort where bugs are most likely and most expensive: business logic, data transformations, edge cases in parsing, and integration points between systems.
Running tests in CI
We will cover CI/CD in detail in the next article, but here is a preview of what running tests in GitHub Actions looks like:
# .github/workflows/test.yml
name: Tests
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: "22"
cache: "npm"
- run: npm ci
- run: npm test
- run: npm run test:coverage
- name: Upload coverage report
uses: actions/upload-artifact@v4
with:
name: coverage-report
path: coverage/
This workflow runs on every push and pull request. If any test fails, the CI run fails and the PR cannot be merged (assuming you have branch protection enabled). This is the gate we talked about earlier: code does not reach production unless it passes the tests.
Key things to notice:
- npm ci instead of npm install: this installs exact versions from package-lock.json, ensuring reproducible builds.
- Separate test and coverage steps: run tests first for fast feedback, then coverage as a separate step.
- Upload artifacts: coverage reports are saved so you can download and review them later.
We will expand on this significantly in article five, covering caching, matrix builds, parallel test execution, and more.
Putting it all together
Let’s review what a well-tested project looks like. Here is the full directory structure:
testing-demo/
package.json
tsconfig.json
src/
utils.ts # Pure utility functions
utils.test.ts # Unit tests for utils
weather.ts # Module with external dependency
weather.test.ts # Unit tests with mocks
app.ts # Express application
app.test.ts # Integration tests with Supertest
Each layer of the pyramid is covered:
- Unit tests (utils.test.ts, weather.test.ts): Fast, isolated, no external dependencies. These catch logic bugs.
- Integration tests (app.test.ts): Test the HTTP layer end to end (within the app). These catch wiring bugs.
- E2E tests (not shown here): Would use a tool like Playwright or Cypress to test the full stack through a browser.
The testing workflow in a DevOps pipeline looks like this:
- Developer pushes code.
- CI runs unit tests (seconds).
- CI runs integration tests (seconds to minutes).
- CI runs E2E tests (minutes).
- If all pass, the code is eligible for deployment.
- If any fail, the pipeline stops and the developer is notified.
This is the fast feedback loop that makes DevOps work. You find bugs in minutes, not days.
Closing notes
Testing is not optional in a DevOps workflow. It is the foundation that makes everything else possible: continuous integration, continuous deployment, and the confidence to ship changes multiple times a day.
Start with unit tests. They are the cheapest and give you the most value per line of test code. Add integration tests for your API endpoints and critical data flows. Use E2E tests sparingly for your most important user journeys.
Do not chase coverage numbers. Focus on testing behavior that matters: business logic, edge cases, and integration points. A well-placed test that catches a real bug is worth more than a hundred tests that just inflate a coverage metric.
In the next article, we will take these tests and wire them into a proper CI/CD pipeline with GitHub Actions. You will see how to run tests automatically, cache dependencies for speed, and set up branch protection so untested code never reaches production.
Hope you found this useful and enjoyed reading it, until next time!
Errata
If you spot any error or have any suggestion, please send me a message so it gets fixed.
Also, you can check the source code and changes in the sources here