DevOps from Zero to Hero: What It Actually Means and Why You Should Care

2026-04-21 | Gabriel Garrido | 10 min read

Introduction

This is the first article in a twenty-part series called “DevOps from Zero to Hero.” The goal is to take you from knowing nothing about DevOps to being comfortable with the tools and practices that modern teams use every day. We will use TypeScript, AWS, Kubernetes, and GitHub Actions throughout the series, building real things along the way.


But before we touch any tools, we need to understand what DevOps actually is. This word gets thrown around a lot. Job postings ask for “DevOps Engineers,” companies buy “DevOps tools,” and somehow everyone has a different definition. In this article we are going to cut through the noise and talk about what DevOps really means, where it came from, how to measure it, and what it is definitely not.


Let’s get into it.


What is DevOps?

DevOps is not a tool. It is not a job title. It is not a team you create so developers can stop caring about production. DevOps is a combination of cultural practices, processes, and tools that increases an organization’s ability to deliver software faster and more reliably.


The simplest way to think about it: DevOps is about removing the walls between the people who write code and the people who run it in production.


There are three pillars to DevOps:


  • Culture: Teams share responsibility for the full lifecycle of their software, from writing it to running it
  • Practices: Continuous integration, continuous delivery, infrastructure as code, monitoring, and fast feedback loops
  • Tools: The automation that makes those practices possible at scale

If you only adopt the tools without changing how your teams work, you are not doing DevOps. You are just automating the same broken process. This is a critical point that many organizations miss.


A brief history: the wall of confusion

To understand why DevOps exists, you need to know what came before it. For decades, software organizations had two separate groups:


  • Development (Dev): Writes the code, ships features, moves fast, wants to deploy often
  • Operations (Ops): Runs the servers, keeps things stable, moves carefully, wants to deploy never

These two groups had completely different incentives. Dev wanted change because change meant new features. Ops wanted stability because change meant risk. The handoff between them was called “the wall of confusion.” Dev would throw code over the wall, Ops would try to figure out how to run it, and when things broke, everyone blamed each other.


This created a painful cycle:


  • Deployments were rare (monthly or quarterly) because they were risky and stressful
  • Each deployment was huge because all the changes piled up
  • Huge deployments meant more things could go wrong
  • When things went wrong, it took forever to figure out which change caused the problem
  • So deployments became even more rare, and the cycle continued

In 2008 and 2009, a few people started talking about breaking this cycle. Patrick Debois organized the first “DevOpsDays” conference in Ghent, Belgium in 2009. The idea was simple: what if Dev and Ops worked together instead of against each other? What if we deployed small changes frequently instead of big changes rarely? What if we automated everything that could be automated?


These ideas were not entirely new. Google had been practicing something similar internally for years (they later published it as Site Reliability Engineering). But the DevOps movement gave it a name and made it accessible to everyone, not just companies with Google-scale resources.


The DORA metrics: measuring DevOps performance

One of the most important contributions to the DevOps movement came from the DORA (DevOps Research and Assessment) team, led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. They spent years researching what separates high-performing teams from low-performing ones. Their findings were published in the book “Accelerate” and in annual State of DevOps reports.


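They identified four key metrics that measure software delivery performance. The exact thresholds for each performance tier shift from one report edition to the next, so treat the figures below as representative rather than fixed: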


  • Deployment Frequency: How often your team deploys to production. Elite teams deploy on demand, multiple times per day. Low performers deploy monthly or less.
  • Lead Time for Changes: How long it takes from a code commit to that code running in production. Elite teams measure this in less than one hour. Low performers take between one and six months.
  • Change Failure Rate: What percentage of deployments cause a failure in production that requires a fix (rollback, patch, etc.). Elite teams have a rate of 0-15%. Low performers hit 46-60%.
  • Mean Time to Recovery (MTTR): When something breaks in production, how long does it take to restore service? DORA's reports call this metric Time to Restore Service. Elite teams recover in less than one hour. Low performers take between one week and one month.
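To make the four metrics above more concrete, here is a minimal sketch of how they could be computed from a list of deployment records. The `Deployment` shape, the `summarizeDora` function, and the choice to measure recovery from the moment of deployment are all assumptions made for illustration. They are not an official DORA formula, and real data would come from your CI/CD and incident tooling.

```typescript
// Hypothetical record shape: real data would come from your CI/CD and incident tooling.
interface Deployment {
  commitAt: Date; // when the change was committed
  deployedAt: Date; // when it reached production
  failed: boolean; // whether it caused a production failure
  restoredAt?: Date; // when service was restored, if it failed
}

interface DoraSummary {
  deploymentsPerDay: number;
  medianLeadTimeHours: number;
  changeFailureRate: number; // fraction between 0 and 1
  meanTimeToRecoveryHours: number; // NaN when no recoveries were recorded
}

const HOUR_MS = 60 * 60 * 1000;

function median(values: number[]): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

function summarizeDora(deployments: Deployment[], periodDays: number): DoraSummary {
  const leadTimes: number[] = [];
  const recoveryTimes: number[] = [];
  let failures = 0;

  for (const d of deployments) {
    // Lead time: commit to running in production.
    leadTimes.push((d.deployedAt.getTime() - d.commitAt.getTime()) / HOUR_MS);
    if (d.failed) {
      failures += 1;
      // Assumption: the outage starts at deploy time and ends at restoredAt.
      if (d.restoredAt) {
        recoveryTimes.push((d.restoredAt.getTime() - d.deployedAt.getTime()) / HOUR_MS);
      }
    }
  }

  const totalRecovery = recoveryTimes.reduce((sum, hours) => sum + hours, 0);
  return {
    deploymentsPerDay: deployments.length / periodDays,
    medianLeadTimeHours: median(leadTimes),
    changeFailureRate: deployments.length === 0 ? 0 : failures / deployments.length,
    meanTimeToRecoveryHours: totalRecovery / recoveryTimes.length, // 0 / 0 yields NaN
  };
}
```

Running something like this over a few months of deployment history gives you a baseline you can compare against the tiers listed above.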

Here is the key insight from their research: these four metrics are correlated. Teams that deploy more frequently also have lower failure rates and faster recovery times. Speed and stability are not enemies. They reinforce each other.


Traditional thinking:
  "If we deploy more often, more things will break"

What DORA research actually shows:
  "Teams that deploy more often break fewer things AND recover faster"

Why? Because:
  - Smaller changes are easier to understand and debug
  - Frequent deployments mean faster feedback loops
  - Fast feedback loops mean problems get caught earlier
  - Earlier problems are cheaper and simpler to fix

This might feel counterintuitive at first. But think about it this way: would you rather debug a deployment that contains 3 commits or one that contains 300? The answer is obvious. Deploying frequently forces you to keep changes small, and small changes are inherently less risky.
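We can put rough numbers on that intuition with a toy model. Assume, purely for illustration, that every commit independently has the same small chance of introducing a production failure. This assumption is ours and is not something the DORA research states. Under it, the chance that a deployment of n commits contains at least one bad change is 1 - (1 - p)^n. The function name and the 1% figure below are hypothetical.

```typescript
// Assumption (illustrative only): every commit independently has the same
// probability `perCommitFailure` of introducing a production failure.
function batchFailureProbability(commits: number, perCommitFailure: number): number {
  if (commits < 0 || perCommitFailure < 0 || perCommitFailure > 1) {
    throw new RangeError("commits must be >= 0 and perCommitFailure must be within [0, 1]");
  }
  // P(at least one bad commit) = 1 - P(every commit is fine)
  return 1 - Math.pow(1 - perCommitFailure, commits);
}

for (const commits of [3, 300]) {
  const probability = batchFailureProbability(commits, 0.01);
  console.log(`${commits} commits: ${(probability * 100).toFixed(1)}% chance of at least one bad change`);
}
```

With a 1% chance per commit, the 3-commit deployment has roughly a 3% chance of containing a bad change, while the 300-commit deployment is at roughly 95%. Real commits are not independent, so do not read too much into the exact values. The shape of the curve is the point.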


DevOps vs SRE vs Platform Engineering

You will hear these three terms used interchangeably, but they are distinct (and complementary) disciplines. Understanding how they relate will save you a lot of confusion.


DevOps is the cultural movement. It is the philosophy that says Dev and Ops should work together, share responsibility, and use automation to deliver software faster and more reliably. DevOps is about principles: you own what you build, you automate everything you can, and you measure outcomes.


Site Reliability Engineering (SRE) is one way to implement DevOps principles. Google created it in the early 2000s before the term “DevOps” even existed. SRE treats operations as a software engineering problem. SRE teams write code to automate operational work, define Service Level Objectives (SLOs) to measure reliability, and use error budgets to balance reliability with feature velocity.


Ben Treynor Sloss, the founder of Google’s SRE team, described it this way:


"SRE is what happens when you ask a software engineer to design an operations team."

If DevOps is the “what” (principles and culture), SRE is one answer to the “how” (specific practices and frameworks).
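To see how an error budget works in practice, here is a minimal sketch that assumes a simple time-based availability SLO. The function names and the "prioritize reliability when the budget is spent" policy are hypothetical simplifications. Real SRE teams often define SLOs over request success rates instead of minutes of uptime.

```typescript
const MINUTES_PER_DAY = 24 * 60;

// Total downtime, in minutes, that an availability SLO tolerates over a window.
// sloTarget is a fraction, for example 0.999 for "99.9% available".
function errorBudgetMinutes(sloTarget: number, windowDays: number): number {
  if (sloTarget <= 0 || sloTarget > 1 || windowDays <= 0) {
    throw new RangeError("sloTarget must be in (0, 1] and windowDays must be positive");
  }
  return (1 - sloTarget) * windowDays * MINUTES_PER_DAY;
}

// Budget left after the downtime already observed in the window.
// A negative result means the budget is overspent.
function remainingBudgetMinutes(
  sloTarget: number,
  windowDays: number,
  downtimeMinutes: number,
): number {
  return errorBudgetMinutes(sloTarget, windowDays) - downtimeMinutes;
}

// Hypothetical policy: keep shipping features while budget remains,
// otherwise shift the team's focus to reliability work.
function shouldPrioritizeReliability(
  sloTarget: number,
  windowDays: number,
  downtimeMinutes: number,
): boolean {
  return remainingBudgetMinutes(sloTarget, windowDays, downtimeMinutes) <= 0;
}

const budget = errorBudgetMinutes(0.999, 30);
console.log(`A 99.9% SLO over 30 days allows about ${budget.toFixed(1)} minutes of downtime`);
```

A 99.9% target over 30 days leaves about 43 minutes of downtime to "spend." That number is what turns the reliability versus velocity debate into a shared, measurable decision.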


Platform Engineering is the newest of the three. It emerged as organizations realized that asking every development team to fully own their infrastructure was not scaling. Platform Engineering teams build internal developer platforms (IDPs) that abstract away infrastructure complexity. Instead of every team learning Kubernetes, Terraform, and CI/CD pipelines from scratch, the platform team provides golden paths, templates, and self-service tools.


Think of it this way:


DevOps says:     "You build it, you run it"
SRE says:        "Here are the practices and metrics to run it well"
Platform Eng:    "Here is a platform that makes running it easy"

These three approaches are not competing. In a mature organization, they work together. DevOps provides the culture, SRE provides the reliability framework, and Platform Engineering provides the developer experience layer on top.


The DevOps toolchain

While DevOps is not just about tools, the tools do matter. They are what make the practices possible at scale. Here is the typical DevOps toolchain, organized by stage:


Plan and track

  • Issue trackers (GitHub Issues, Jira, Linear)
  • Project boards, documentation wikis

Version control

  • Git (GitHub, GitLab, Bitbucket)
  • Branching strategies, pull requests, code review

Continuous Integration (CI)

  • Automatically build, test, and validate every code change
  • Tools: GitHub Actions, GitLab CI, Jenkins, CircleCI

Continuous Delivery/Deployment (CD)

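  • Keep every validated change ready to release (continuous delivery), or go one step further and deploy it to production automatically (continuous deployment)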
  • Tools: ArgoCD, Flux, Spinnaker, GitHub Actions

Containers and orchestration

  • Package applications consistently across environments
  • Tools: Docker, Kubernetes, ECS

Infrastructure as Code (IaC)

  • Define and manage infrastructure through code, not clicking in consoles
  • Tools: Terraform, Pulumi, AWS CDK, CloudFormation

Monitoring and observability

  • Know what is happening in production before your users tell you
  • Tools: Prometheus, Grafana, Datadog, OpenTelemetry

Security

  • Shift security left, automate scanning, manage secrets
  • Tools: Trivy, Snyk, HashiCorp Vault, GitHub security features

In this series we will focus on a specific subset of these tools: TypeScript for application code, GitHub Actions for CI/CD, Docker for containers, Kubernetes for orchestration, and AWS for cloud infrastructure. This stack is widely used, well documented, and gives you skills that transfer to almost any organization.


What this series will cover

Here is the roadmap for the twenty articles in this series:


  • Article 1 (this one): What DevOps actually means
  • Articles 2-3: Version control with Git and GitHub workflows
  • Articles 4-5: Containers with Docker, from basics to multi-stage builds
  • Articles 6-8: CI/CD with GitHub Actions, from simple pipelines to advanced workflows
  • Articles 9-11: Cloud fundamentals with AWS (networking, compute, storage)
  • Articles 12-14: Kubernetes from scratch, deploying and managing real applications
  • Articles 15-16: Infrastructure as Code with Terraform
  • Articles 17-18: Monitoring, logging, and observability
  • Article 19: Security practices and secrets management
  • Article 20: Putting it all together, a complete DevOps pipeline from commit to production

Each article builds on the previous ones. By the end of the series, you will have built a complete pipeline that takes a TypeScript application from a git commit all the way to a production Kubernetes cluster on AWS, with automated testing, security scanning, monitoring, and alerting.


Who is this for?


  • Developers who want to understand what happens to their code after they push it
  • Junior engineers or students who want to learn modern DevOps practices from scratch
  • Ops people who want to adopt a more engineering-driven approach
  • Anyone who keeps hearing “DevOps” in meetings and wants to actually understand what it means

You do not need prior experience with any of the tools we will use. I will explain everything from the ground up. Basic programming knowledge and comfort with the command line are helpful but not strictly required.


What DevOps is NOT: common anti-patterns

Let’s close with something equally important: what DevOps is not. These are real anti-patterns that organizations fall into constantly.


Anti-pattern 1: Renaming your Ops team to “DevOps”


If you take your existing operations team, change their title to “DevOps Engineer,” and nothing else changes, you have not adopted DevOps. You have renamed a team. DevOps requires cultural change, not just a title change.


Anti-pattern 2: Buying tools and calling it DevOps


Purchasing a CI/CD platform, a container orchestrator, and a monitoring tool does not make you a DevOps organization. Tools without the right practices and culture are just expensive shelfware. I have seen organizations spend millions on tooling while their teams still deploy manually every two weeks.


Anti-pattern 3: Creating a DevOps silo


The irony of this one is painful. DevOps was created to break down silos between Dev and Ops. Some organizations responded by creating a third silo called “the DevOps team” that sits between Dev and Ops. Now you have two walls of confusion instead of one.


Anti-pattern 4: All tools, no culture


This is worth repeating because it is the most common mistake. If your developers write code and then throw it over the wall to someone else to deploy, you are not doing DevOps no matter what tools you use. DevOps means shared ownership. The team that builds the software is responsible for running it.


Anti-pattern 5: DevOps means “developers do everything”


DevOps does not mean firing your ops team and making developers manage servers. It means that development and operations work together, share knowledge, and both contribute to automation. Developers gain operational awareness, and ops engineers gain development skills. The goal is collaboration, not consolidation.


Closing notes

DevOps is, at its core, a simple idea: the people who build software and the people who run it should work together, share responsibility, and use automation to move faster without sacrificing stability. The DORA research supports this approach: in their data, speed and reliability are not opposites. They go hand in hand.


In the next article, we will start getting practical. We will set up a development environment, create a TypeScript project, initialize a Git repository, and learn the version control fundamentals that everything else in this series will build on.


I hope you found this useful and enjoyed reading it. Until next time!


Errata

If you spot any error or have any suggestion, please send me a message so it gets fixed.

Also, you can check the source code and changes in the sources here


