DevOps from Zero to Hero: What It Actually Means and Why You Should Care

2026-04-21 | Gabriel Garrido | 10 min read

Introduction#

This is the first article in a twenty-part series called "DevOps from Zero to Hero." The goal is to take you from knowing nothing about DevOps to being comfortable with the tools and practices that modern teams use every day. We will use TypeScript, AWS, Kubernetes, and GitHub Actions throughout the series, building real things along the way.

But before we touch any tools, we need to understand what DevOps actually is. This word gets thrown around a lot. Job postings ask for "DevOps Engineers," companies buy "DevOps tools," and somehow everyone has a different definition. In this article we are going to cut through the noise and talk about what DevOps really means, where it came from, how to measure it, and what it is definitely not.

Let's get into it.

What is DevOps?#

DevOps is not a tool. It is not a job title. It is not a team you create so developers can stop caring about production. DevOps is a combination of cultural practices, processes, and tools that increases an organization's ability to deliver software faster and more reliably.

The simplest way to think about it: DevOps is about removing the walls between the people who write code and the people who run it in production.

There are three pillars to DevOps:

Culture: Teams share responsibility for the full lifecycle of their software, from writing it to running it

Practices: Continuous integration, continuous delivery, infrastructure as code, monitoring, and fast feedback loops

Tools: The automation that makes those practices possible at scale

If you only adopt the tools without changing how your teams work, you are not doing DevOps. You are just automating the same broken process. This is a critical point that many organizations miss.

A brief history: the wall of confusion#

To understand why DevOps exists, you need to know what came before it. For decades, software organizations had two separate groups:

Development (Dev): Writes the code, ships features, moves fast, wants to deploy often

Operations (Ops): Runs the servers, keeps things stable, moves carefully, wants to deploy never

These two groups had completely different incentives. Dev wanted change because change meant new features. Ops wanted stability because change meant risk. The handoff between them was called "the wall of confusion." Dev would throw code over the wall, Ops would try to figure out how to run it, and when things broke, everyone blamed each other.

This created a painful cycle:

Deployments were rare (monthly or quarterly) because they were risky and stressful

Each deployment was huge because all the changes piled up

Huge deployments meant more things could go wrong

When things went wrong, it took forever to figure out which change caused the problem

So deployments became even more rare, and the cycle continued

In 2008 and 2009, a few people started talking about breaking this cycle. Patrick Debois organized the first "DevOpsDays" conference in Ghent, Belgium in 2009. The idea was simple: what if Dev and Ops worked together instead of against each other? What if we deployed small changes frequently instead of big changes rarely? What if we automated everything that could be automated?

These ideas were not entirely new. Google had been practicing something similar internally for years (they later published it as Site Reliability Engineering). But the DevOps movement gave it a name and made it accessible to everyone, not just companies with Google-scale resources.

The DORA metrics: measuring DevOps performance#

One of the most important contributions to the DevOps movement came from the DORA (DevOps Research and Assessment) team, led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. They spent years researching what separates high-performing teams from low-performing ones. Their findings were published in the book "Accelerate" and in annual State of DevOps reports.

They identified four key metrics that predict software delivery performance:

Deployment Frequency: How often your team deploys to production. Elite teams deploy on demand, multiple times per day. Low performers deploy monthly or less.

Lead Time for Changes: How long it takes from a code commit to that code running in production. For elite teams this takes less than one hour. For low performers it takes between one and six months.

Change Failure Rate: What percentage of deployments cause a failure in production that requires a fix (rollback, patch, etc.). Elite teams have a rate of 0-15%. Low performers hit 46-60%.

Mean Time to Recovery (MTTR): When something breaks in production, how long does it take to restore service? Elite teams recover in less than one hour. Low performers take between one week and one month.
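To make the four definitions above concrete, here is a minimal TypeScript sketch that computes them from a list of deployment records. The `Deployment` shape, the `summarize` function, and the sample data are all made up for illustration. Using the median for lead time, and measuring recovery from the moment of deployment, are my own simplifying assumptions, not something the DORA reports prescribe.

```typescript
// Hypothetical record shape: one entry per production deployment.
interface Deployment {
  commitAt: Date;    // when the change was committed
  deployedAt: Date;  // when it reached production
  failed: boolean;   // did it cause a production failure that needed a fix?
  restoredAt?: Date; // when service was restored (failed deployments only)
}

interface DoraSummary {
  deploymentsPerDay: number;
  medianLeadTimeHours: number;
  changeFailureRate: number;       // fraction between 0 and 1
  meanTimeToRecoveryHours: number; // NaN when no recoveries were recorded
}

const HOUR_MS = 60 * 60 * 1000;

function hoursBetween(start: Date, end: Date): number {
  return (end.getTime() - start.getTime()) / HOUR_MS;
}

function median(values: number[]): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

function mean(values: number[]): number {
  if (values.length === 0) return NaN;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function summarize(deployments: Deployment[], windowDays: number): DoraSummary {
  const leadTimes = deployments.map((d) => hoursBetween(d.commitAt, d.deployedAt));
  const recoveryTimes: number[] = [];
  let failures = 0;
  for (const d of deployments) {
    if (!d.failed) continue;
    failures += 1;
    // Assumption: the failure starts at deploy time, so recovery is measured from there.
    if (d.restoredAt) recoveryTimes.push(hoursBetween(d.deployedAt, d.restoredAt));
  }
  return {
    deploymentsPerDay: deployments.length / windowDays,
    medianLeadTimeHours: median(leadTimes),
    changeFailureRate: deployments.length === 0 ? NaN : failures / deployments.length,
    meanTimeToRecoveryHours: mean(recoveryTimes),
  };
}

// Made-up sample: four deployments in a seven-day window, one of which failed.
const sample: Deployment[] = [
  { commitAt: new Date("2026-04-01T09:00:00Z"), deployedAt: new Date("2026-04-01T10:00:00Z"), failed: false },
  { commitAt: new Date("2026-04-02T09:00:00Z"), deployedAt: new Date("2026-04-02T12:00:00Z"), failed: true,
    restoredAt: new Date("2026-04-02T12:30:00Z") },
  { commitAt: new Date("2026-04-03T09:00:00Z"), deployedAt: new Date("2026-04-03T11:00:00Z"), failed: false },
  { commitAt: new Date("2026-04-04T08:00:00Z"), deployedAt: new Date("2026-04-04T12:00:00Z"), failed: false },
];

const summary = summarize(sample, 7);
console.log(summary);
```

For this sample the median lead time is 2.5 hours, the change failure rate is 25%, and the mean time to recovery is half an hour. The exact numbers matter less than the habit: once the data is recorded per deployment, all four metrics fall out of the same list.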

Here is the key insight from their research: these four metrics are correlated. Teams that deploy more frequently also have lower failure rates and faster recovery times. Speed and stability are not enemies. They reinforce each other.

Traditional thinking:
  "If we deploy more often, more things will break"

What DORA research actually shows:
  "Teams that deploy more often break fewer things AND recover faster"

Why? Because:
  - Smaller changes are easier to understand and debug
  - Frequent deployments mean faster feedback loops
  - Fast feedback loops mean problems get caught earlier
  - Earlier problems are cheaper and simpler to fix

This might feel counterintuitive at first. But think about it this way: would you rather debug a deployment that contains 3 commits or one that contains 300? The answer is obvious. Deploying frequently forces you to keep changes small, and small changes are inherently less risky.
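If you want rough numbers for that intuition, here is a toy model in TypeScript. It assumes every commit independently has the same small chance of breaking production, which is a simplification of mine and not a result from the DORA research. The function name and the 1% figure are made up for illustration.

```typescript
// Toy model (an assumption, not a DORA result): each commit independently
// has probability `perCommitRisk` of causing a production failure.
// Returns the chance that a deployment with `commits` commits contains
// at least one bad commit.
function deploymentFailureProbability(commits: number, perCommitRisk: number): number {
  if (commits < 0 || perCommitRisk < 0 || perCommitRisk > 1) {
    throw new RangeError("commits must be >= 0 and perCommitRisk must be in [0, 1]");
  }
  return 1 - Math.pow(1 - perCommitRisk, commits);
}

const smallDeploy = deploymentFailureProbability(3, 0.01);
const hugeDeploy = deploymentFailureProbability(300, 0.01);

console.log(`3 commits:   ${(smallDeploy * 100).toFixed(1)}%`);
console.log(`300 commits: ${(hugeDeploy * 100).toFixed(1)}%`);
```

With a made-up 1% risk per commit, the 3-commit deployment fails about 3% of the time and the 300-commit deployment about 95% of the time. And when the big one fails, you have 300 suspects instead of 3.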

DevOps vs SRE vs Platform Engineering#

You will hear these three terms used interchangeably, but they are distinct (and complementary) disciplines. Understanding how they relate will save you a lot of confusion.

DevOps is the cultural movement. It is the philosophy that says Dev and Ops should work together, share responsibility, and use automation to deliver software faster and more reliably. DevOps is about principles: you own what you build, you automate everything you can, and you measure outcomes.

Site Reliability Engineering (SRE) is one way to implement DevOps principles. Google created it in the early 2000s before the term "DevOps" even existed. SRE treats operations as a software engineering problem. SRE teams write code to automate operational work, define Service Level Objectives (SLOs) to measure reliability, and use error budgets to balance reliability with feature velocity.

Ben Treynor Sloss, the founder of Google's SRE team, described it this way:

"SRE is what happens when you ask a software engineer to design an operations function."

If DevOps is the "what" (principles and culture), SRE is one answer to the "how" (specific practices and frameworks).
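Error budgets are easier to grasp with a little arithmetic. The sketch below assumes a simple availability SLO measured in minutes of downtime over a fixed window. The function names and the "stop shipping when the budget is gone" rule are illustrative assumptions, one possible policy rather than a standard API or the only way to do it.

```typescript
// Total error budget, in minutes, for an availability SLO over a time window.
// sloTarget is a fraction, for example 0.999 for "99.9% available".
function errorBudgetMinutes(sloTarget: number, windowDays: number): number {
  if (sloTarget <= 0 || sloTarget > 1) {
    throw new RangeError("sloTarget must be in the range (0, 1]");
  }
  const windowMinutes = windowDays * 24 * 60;
  return (1 - sloTarget) * windowMinutes;
}

// Budget left after the downtime already spent in this window.
// A negative result means the SLO has been missed.
function remainingBudgetMinutes(sloTarget: number, windowDays: number, downtimeMinutes: number): number {
  return errorBudgetMinutes(sloTarget, windowDays) - downtimeMinutes;
}

// One possible policy (an assumption): keep shipping features only while budget remains.
function canShipFeatures(sloTarget: number, windowDays: number, downtimeMinutes: number): boolean {
  return remainingBudgetMinutes(sloTarget, windowDays, downtimeMinutes) > 0;
}

const budget = errorBudgetMinutes(0.999, 30);
console.log(`Budget for 99.9% over 30 days: ${budget.toFixed(1)} minutes`);
console.log(`After 30 minutes of downtime, ship features? ${canShipFeatures(0.999, 30, 30)}`);
console.log(`After 50 minutes of downtime, ship features? ${canShipFeatures(0.999, 30, 50)}`);
```

A 99.9% target over 30 days leaves about 43.2 minutes of allowed downtime. While the team is under that number it can keep taking risks. Once it goes over, reliability work takes priority.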

Platform Engineering is the newest of the three. It emerged as organizations realized that asking every development team to fully own its infrastructure did not scale. Platform Engineering teams build internal developer platforms (IDPs) that abstract away infrastructure complexity. Instead of every team learning Kubernetes, Terraform, and CI/CD pipelines from scratch, the platform team provides golden paths, templates, and self-service tools.

Think of it this way:

DevOps says:     "You build it, you run it"
SRE says:        "Here are the practices and metrics to run it well"
Platform Eng:    "Here is a platform that makes running it easy"

These three approaches are not competing. In a mature organization, they work together. DevOps provides the culture, SRE provides the reliability framework, and Platform Engineering provides the developer experience layer on top.

The DevOps toolchain#

While DevOps is not just about tools, the tools do matter. They are what make the practices possible at scale. Here is the typical DevOps toolchain, organized by stage:

Plan and track
  - Issue trackers (GitHub Issues, Jira, Linear)
  - Project boards, documentation wikis

Version control
  - Git (GitHub, GitLab, Bitbucket)
  - Branching strategies, pull requests, code review

Continuous Integration (CI)
  - Automatically build, test, and validate every code change
  - Tools: GitHub Actions, GitLab CI, Jenkins, CircleCI

Continuous Delivery/Deployment (CD)
  - Automatically deploy validated changes to production
  - Tools: ArgoCD, Flux, Spinnaker, GitHub Actions

Containers and orchestration
  - Package applications consistently across environments
  - Tools: Docker, Kubernetes, ECS

Infrastructure as Code (IaC)
  - Define and manage infrastructure through code, not clicking in consoles
  - Tools: Terraform, Pulumi, AWS CDK, CloudFormation

Monitoring and observability
  - Know what is happening in production before your users tell you
  - Tools: Prometheus, Grafana, Datadog, OpenTelemetry

Security
  - Shift security left, automate scanning, manage secrets
  - Tools: Trivy, Snyk, HashiCorp Vault, GitHub security features

In this series we will focus on a specific subset of these tools: TypeScript for application code, GitHub Actions for CI/CD, Docker for containers, Kubernetes for orchestration, and AWS for cloud infrastructure. This stack is widely used, well documented, and gives you skills that transfer to almost any organization.
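To give the CI and CD stages above a concrete shape before we meet the real tools, here is a tiny TypeScript sketch of the core idea: run the stages in order, stop at the first failure, and only deploy changes that passed everything before. The `Stage` type and `runPipeline` function are hypothetical and are not how GitHub Actions or any other tool is implemented. Real stages would run asynchronously, but the sketch keeps them synchronous to stay short.

```typescript
// A pipeline stage: a name plus a function that returns true when the stage passes.
interface Stage {
  name: string;
  run: () => boolean;
}

interface StageResult {
  name: string;
  ok: boolean;
}

// Runs stages in order and stops at the first failure (fail fast),
// so later stages such as deploy never run on a broken change.
function runPipeline(stages: Stage[]): StageResult[] {
  const results: StageResult[] = [];
  for (const stage of stages) {
    const ok = stage.run();
    results.push({ name: stage.name, ok });
    if (!ok) break;
  }
  return results;
}

// Made-up stages that just record that they ran and return a fixed result.
const executed: string[] = [];
function fakeStage(name: string, ok: boolean): Stage {
  return {
    name,
    run: () => {
      executed.push(name);
      return ok;
    },
  };
}

const results = runPipeline([
  fakeStage("build", true),
  fakeStage("test", false), // a failing test suite
  fakeStage("deploy", true),
]);

console.log(results);
console.log(`Stages that actually ran: ${executed.join(", ")}`);
```

Here the failing test stage stops the run, so the deploy stage is never executed. That gate is the whole point of putting CI in front of CD.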

What this series will cover#

Here is the roadmap for the twenty articles in this series:

Article 1 (this one): What DevOps actually means

Articles 2-3: Version control with Git and GitHub workflows

Articles 4-5: Containers with Docker, from basics to multi-stage builds

Articles 6-8: CI/CD with GitHub Actions, from simple pipelines to advanced workflows

Articles 9-11: Cloud fundamentals with AWS (networking, compute, storage)

Articles 12-14: Kubernetes from scratch, deploying and managing real applications

Articles 15-16: Infrastructure as Code with Terraform

Articles 17-18: Monitoring, logging, and observability

Article 19: Security practices and secrets management

Article 20: Putting it all together, a complete DevOps pipeline from commit to production

Each article builds on the previous ones. By the end of the series, you will have built a complete pipeline that takes a TypeScript application from a git commit all the way to a production Kubernetes cluster on AWS, with automated testing, security scanning, monitoring, and alerting.

Who is this for?

Developers who want to understand what happens to their code after they push it

Junior engineers or students who want to learn modern DevOps practices from scratch

Ops people who want to adopt a more engineering-driven approach

Anyone who keeps hearing "DevOps" in meetings and wants to actually understand what it means

You do not need prior experience with any of the tools we will use. I will explain everything from the ground up. Basic programming knowledge and comfort with the command line are helpful but not strictly required.

What DevOps is NOT: common anti-patterns#

Let's close with something equally important: what DevOps is not. These are real anti-patterns that organizations fall into constantly.

Anti-pattern 1: Renaming your Ops team to "DevOps"

If you take your existing operations team, change their title to "DevOps Engineer," and nothing else changes, you have not adopted DevOps. You have renamed a team. DevOps requires cultural change, not just a title change.

Anti-pattern 2: Buying tools and calling it DevOps

Purchasing a CI/CD platform, a container orchestrator, and a monitoring tool does not make you a DevOps organization. Tools without the right practices and culture are just expensive shelfware. I have seen organizations spend millions on tooling while their teams still deploy manually every two weeks.

Anti-pattern 3: Creating a DevOps silo

The irony of this one is painful. DevOps was created to break down silos between Dev and Ops. Some organizations responded by creating a third silo called "the DevOps team" that sits between Dev and Ops. Now you have three silos and two walls of confusion instead of one.

Anti-pattern 4: All tools, no culture

This is worth repeating because it is the most common mistake. If your developers write code and then throw it over the wall to someone else to deploy, you are not doing DevOps no matter what tools you use. DevOps means shared ownership. The team that builds the software is responsible for running it.

Anti-pattern 5: DevOps means "developers do everything"

DevOps does not mean firing your ops team and making developers manage servers. It means that development and operations work together, share knowledge, and both contribute to automation. Developers gain operational awareness, and ops engineers gain development skills. The goal is collaboration, not consolidation.

Closing notes#

DevOps is, at its core, a simple idea: the people who build software and the people who run it should work together, share responsibility, and use automation to move faster without sacrificing stability. The DORA research supports this approach: teams that ship more often also tend to fail less and recover faster. Speed and reliability are not opposites. They go hand in hand.

In the next article, we will start getting practical. We will set up a development environment, create a TypeScript project, initialize a Git repository, and learn the version control fundamentals that everything else in this series will build on.

I hope you found this useful and enjoyed reading it. Until next time!

Errata#

If you spot any error or have any suggestion, please send me a message so it gets fixed.

You can also check the source code and the history of changes in the blog's sources.
