NetBox-Driven Infrastructure as Code: Making the Inventory the Source of Truth

Managed IT service provider
Managing 100+ devices and services

The Challenge

An IT operation responsible for more than a hundred devices and services was tracking its estate in spreadsheets and tribal knowledge, while its infrastructure-as-code repositories slowly drifted away from what was actually deployed. Nobody could say with confidence which record was right: the inventory, the code, or the running systems. Every audit meant a manual reconciliation exercise, and every change carried the risk of being applied against stale assumptions.

The Approach

We made NetBox the single source of truth and built agent-driven reconciliation around it: automated processes that continuously compare the inventory, the infrastructure-as-code definitions, and the deployed state, then surface the differences and, where appropriate, resolve them. Everything ran dry-run by default under a three-tier approval model: read-only reconnaissance, non-destructive checks, and gated changes that require explicit sign-off. The aim was to replace periodic manual audits with a continuous loop that keeps all three views of the estate in agreement.
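
As an illustration only, the tiering and the dry-run default could be expressed as a small gate that every proposed action passes through. The names here (Tier, Action, decide) are hypothetical and not taken from the project; the sketch assumes that only the change tier needs sign-off and that a real apply must be requested explicitly.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    RECON = 1    # read-only reconnaissance
    CHECK = 2    # non-destructive checks
    CHANGE = 3   # anything that modifies a system


@dataclass(frozen=True)
class Action:
    name: str
    tier: Tier


def decide(action: Action, approved_by: Optional[str] = None, dry_run: bool = True) -> str:
    """Return 'run', 'dry-run', or 'blocked' for a proposed action."""
    if action.tier < Tier.CHANGE:
        return "run"        # tiers 1 and 2 never modify anything
    if dry_run:
        return "dry-run"    # default: render the diff only, apply nothing
    if approved_by:
        return "run"        # explicit sign-off has been recorded
    return "blocked"        # a real apply without sign-off is refused
```

With this shape, decide(Action("push-config", Tier.CHANGE)) yields "dry-run", and the same action only yields "run" when called with both dry_run=False and a named approver.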

System Architecture

Key Components

  • NetBox as Source of Truth: Devices, services, IP addressing, and ownership modelled in NetBox as the authoritative record the rest of the system reads from.
  • Discovery & Ingestion Agents: Scheduled collectors that observe the deployed estate and normalise what they find into a comparable form.
  • Reconciliation Engine: Compares NetBox, the infrastructure-as-code repositories, and observed state, classifying each discrepancy as inventory drift, code drift, or deployment drift.
  • Tiered Approval Workflow: Read-only reconnaissance runs freely; non-destructive checks run on a schedule; anything that changes a system is gated behind explicit human sign-off.
  • Dry-Run Execution: Every proposed change is rendered as a diff and reviewed before any apply step is permitted.
  • Drift Alerting: Discrepancies raise notifications with enough context to act on, rather than waiting to be discovered at audit time.
  • Audit Trail: Immutable log of every agent action—what was read, what was proposed, what was approved, and what was changed.
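
One plausible way to implement the Reconciliation Engine's classification is an odd-one-out rule: when two of the three views agree on a value, the view that disagrees is treated as the origin of the drift. The sketch below assumes that rule and assumes each view has already been normalised into a flat dictionary keyed by attribute; the function names and the "unclassified" fallback are illustrative, not the project's actual logic.

```python
from typing import Any, Dict, Optional


def classify(inventory: Any, code: Any, observed: Any) -> Optional[str]:
    """Label one attribute by which of the three views disagrees."""
    if inventory == code == observed:
        return None                   # all three views agree
    if code == observed:
        return "inventory drift"      # NetBox is the odd one out
    if inventory == observed:
        return "code drift"           # the IaC definition is the odd one out
    if inventory == code:
        return "deployment drift"     # the running system is the odd one out
    return "unclassified"             # all three differ: needs a human


def reconcile(inventory: Dict[str, Any], code: Dict[str, Any],
              observed: Dict[str, Any]) -> Dict[str, str]:
    """Compare every attribute seen in any view; a missing key reads as None."""
    findings: Dict[str, str] = {}
    for key in sorted(set(inventory) | set(code) | set(observed)):
        label = classify(inventory.get(key), code.get(key), observed.get(key))
        if label:
            findings[key] = label
    return findings


# Made-up example data, for illustration only.
netbox = {"sw1.vlan": 20, "fw1.mgmt_ip": "10.0.0.1"}
iac = {"sw1.vlan": 30, "fw1.mgmt_ip": "10.0.0.9"}
live = {"sw1.vlan": 20, "fw1.mgmt_ip": "10.0.0.9"}
print(reconcile(netbox, iac, live))
# {'fw1.mgmt_ip': 'inventory drift', 'sw1.vlan': 'code drift'}
```

The label then suggests the fix: inventory drift points at updating NetBox, code drift at updating the repository, and deployment drift at a gated change to the running system.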

What Was Built

The system was implemented as a set of scheduled agents and automation workflows integrating NetBox's API with the existing configuration management and infrastructure-as-code tooling, rather than replacing any of it. Custom reconciliation logic handled the estate's local conventions, and all runs were logged centrally so that the state of the estate—and every decision made about it—could be evidenced on demand.
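
The case study does not say how the central log was made trustworthy enough to serve as evidence. One common technique, shown here purely as an assumption, is a hash chain: each entry stores the hash of the previous one, so any later edit to an earlier entry is detectable. This makes a log tamper-evident rather than strictly immutable, and all names below are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Add an entry that is chained to the one before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: List[Dict[str, Any]] = []
append(log, {"agent": "discovery", "action": "read", "target": "sw1"})
append(log, {"agent": "reconciler", "action": "proposed", "target": "sw1.vlan"})
print(verify(log))  # True
```

In practice the chain would live in append-only storage outside the agents' write access; the in-memory list here only demonstrates the verification step.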

Measurable Outcome

Drift detection became automatic rather than an occasional project: discrepancies between inventory, code, and deployed state were surfaced within hours instead of being discovered months later during an audit. Manual reconciliation effort dropped materially, because the routine comparison work was handled by agents and humans were only involved where a decision was genuinely required.

Just as importantly, confidence in the inventory recovered. Once the team could trust that NetBox reflected reality, it became the natural starting point for change planning, onboarding, and incident response rather than a record nobody quite believed.

Lessons Learned

Declaring a source of truth is easy; earning trust in it is the real work, and that only happened once reconciliation ran continuously rather than occasionally. Dry-run by default proved essential—seeing proposed changes as diffs before anything was applied made the approval tiers meaningful rather than theatrical. Finally, classifying drift by origin (inventory, code, or deployment) mattered more than merely detecting it, because each type calls for a different fix.
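
To make the dry-run idea concrete, the sketch below renders a proposed configuration change as a unified diff using Python's standard difflib module and applies nothing unless dry_run is explicitly turned off. The apply_change function and the sample configuration text are hypothetical stand-ins; the real system's apply step is not described in this case study.

```python
import difflib
from typing import List


def render_diff(current: str, proposed: str, name: str) -> List[str]:
    """Return the proposed change as unified-diff lines."""
    return list(difflib.unified_diff(
        current.splitlines(),
        proposed.splitlines(),
        fromfile=f"{name} (deployed)",
        tofile=f"{name} (proposed)",
        lineterm="",
    ))


def apply_change(current: str, proposed: str, name: str, dry_run: bool = True) -> str:
    """Always show the diff; only return the new config when dry_run is False."""
    diff = render_diff(current, proposed, name)
    print("\n".join(diff) if diff else "no changes")
    if dry_run:
        return current      # nothing is applied in a dry run
    return proposed         # stand-in for the real, approval-gated apply step


deployed = "interface eth0\n vlan 20\n"
wanted = "interface eth0\n vlan 30\n"
result = apply_change(deployed, wanted, "sw1")
```

Because the reviewer sees exactly the lines that would change, approving the gated step becomes a decision about a specific diff rather than about a description of intent.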

Why This Approach Worked

This case study demonstrates that infrastructure as code only stays trustworthy when something is continuously checking it against reality. Making the inventory authoritative and putting agents in charge of the routine comparison work turns drift from a slow, invisible liability into an operational signal that is surfaced within hours of appearing.