Personal AI ops — a multi-agent operating partner

An AI agent as persistent operating partner for personal infrastructure. The foundation that lets one person ship at the pace this site shows — drift detection, runbooks, segmented identities, every change in git with rationale, the rails the proactive-AIOps turn will run on.

build time
7 weeks · 89 commits · daily use

In one paragraph

This is how I run my personal infrastructure with an AI as operating partner, and what it lets me ship as a result. The artifact is a version-controlled homelab-ops repo: docs, runbooks, per-host sudoers references, baselines, operational scripts. The collaborator is Claude Code (Anthropic’s coding-agent CLI), held to project-specific instructions in a CLAUDE.md at the repo root, with a memory layer that accumulates feedback across sessions. The boundary is a hardened bastion LXC that holds the only SSH keys reaching the rest of the estate, so a repo compromise means a bastion-only foothold rather than a full takeover. The point of all of it: personal infrastructure that runs itself frees the attention budget that ships everything else on this site. The longer arc is AIOps: the same rails are what proactive AI agents will need to act on infrastructure under supervision, then with diminishing supervision.

7 weeks from first commit to today
89 commits, all with rationale messages
runbooks for risky operations
8 categories closed in one sitting

The attention cost this pattern eliminates

Personal infrastructure ages badly when it costs attention to maintain. Secrets accumulate in committed compose files. Sudoers grow wildcards. Drift between deployed state and the readme creeps in unnoticed. Security observability never gets set up because the budget for setting it up never arrives.

This pattern moves the catalog of “things to check this week” into a place a long-context AI agent re-reads every session: the repo, the runbooks, the memory layer. The agent inspects live state through the bastion, proposes changes against reference files, asks before doing anything destructive. My time goes into deciding and approving — and into the other systems shipped elsewhere on this site.

how operational hygiene scales for one operator before vs after the operating-partner pattern
manual catalog
  • the catalog of 'things to check' lives only in working memory — first thing dropped when attention is needed elsewhere
  • drift between deployed config and reference docs accumulates silently
  • secrets land in committed compose files; the refactor never gets time
  • sudoers grow wildcards as one-off needs are added; no audit ever follows
  • risky operations are reasoned about each time from scratch
agent-shared catalog
  • the catalog lives in the repo + memory layer; the agent re-reads it every session
  • drift is detected by a daily cron diffing live state against reference configs in git
  • secrets live in gitignored `.env` files with 0600 perms; a CI scanner blocks regressions
  • sudoers are a checked-in per-host reference; 'what is the agent allowed to do' is a grep, not a vibe
  • risky operations have a runbook in `docs/runbooks/` before they run; subsequent runs reuse it
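
A minimal sketch of the permission half of the secrets rule above, assuming env files are identified by a `.env` suffix. The function name is hypothetical and this is not the repo's actual CI scanner; it only shows how "0600 or it fails" can be checked mechanically.

```python
import os
import stat
from pathlib import Path


def loose_env_files(root: Path) -> list:
    """Return .env files under root whose mode is anything other than 0600."""
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Matches both a bare ".env" and names like "db.env" (assumed convention).
            if name.endswith(".env"):
                path = Path(dirpath) / name
                mode = stat.S_IMODE(path.stat().st_mode)
                if mode != 0o600:
                    offenders.append(path)
    return sorted(offenders)
```

A CI step could fail the build whenever the returned list is non-empty.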

How it’s wired

Three artifacts tied together. The operator surface is a terminal in Claude Code on whichever Mac I happen to be at; the agent inspects, drafts, and commits; the bastion is where its hands actually reach.

the three artifacts: repo · agent · bastion (each carries a piece of the trust model)
  1. the repo: homelab-ops on Mac · Syncthing-replicated · holds docs, runbooks, references, baselines
  2. the agent: Claude Code · CLAUDE.md sets identity + SOPs · memory layer accumulates across sessions
  3. the bastion: thin Debian LXC · holds the only SSH keys reaching other hosts · tightest sudoers in fleet

The Mac holds exactly one outbound key — Mac → bastion. The bastion holds the chain into every other host. If the laptop is stolen or the repo leaks, the blast radius is “bastion-only foothold” — every other host stays unreachable.

The collaboration pattern

The agent runs a “recon before action” loop every session. Before it proposes a change against any host, it reads CLAUDE.md, the relevant memory file (one per recurring gotcha), and the target host’s reference config. Memory entries look like “on this host, sudo-rs doesn’t support wildcard subcommand patterns” or “Pi-hole v6 records live in pihole.toml’s hosts array, not in custom.list”: gotchas the catalog now holds so the next session doesn’t rediscover them.

Three discipline rules govern what happens after recon:

  1. Every operational change ends in a commit. Commit messages explain why, not just what — the next session reads them as authoritative context.
  2. Destructive operations require explicit confirmation. The agent asks before deleting, rotating, or restarting; my explicit yes is the gate.
  3. The agent defaults to the limited-sudo identity. A luna user lives on every host with a narrow, checked-in sudoers file; the agent uses my personal account only when luna’s whitelist is genuinely insufficient and I’ve approved the exception.
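
The confirmation rule can be sketched as a small gate in front of command execution. This is an illustration under stated assumptions, not how Claude Code implements it: the verb lists and the `gate` helper are hypothetical, and the source does not enumerate which commands count as destructive.

```python
import shlex
from typing import Callable

# Assumed for illustration: the real set of destructive commands is not given.
DESTRUCTIVE_VERBS = {"rm", "shred", "wipefs", "reboot", "shutdown"}
DESTRUCTIVE_PAIRS = {("systemctl", "restart"), ("systemctl", "stop"), ("docker", "rm")}


def is_destructive(command: str) -> bool:
    """Classify a shell command by its verb (and subcommand, where relevant)."""
    argv = shlex.split(command)
    if argv and argv[0] == "sudo":
        argv = argv[1:]  # naive: does not handle sudo flags such as -u
    if not argv:
        return False
    verb = argv[0].rsplit("/", 1)[-1]  # treat /usr/bin/rm like rm
    if verb in DESTRUCTIVE_VERBS:
        return True
    return len(argv) > 1 and (verb, argv[1]) in DESTRUCTIVE_PAIRS


def gate(command: str, confirm: Callable[[str], bool]) -> bool:
    """Return True when the command may run; destructive ones need a yes."""
    return confirm(command) if is_destructive(command) else True
```

In an interactive session, `confirm` could wrap `input()`; in tests it can be a lambda that always refuses, which makes the gate easy to verify.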

That third rule pays the most dividends. “What is the agent allowed to do, and on which host?” is answerable in one sentence: “grep the configs/sudoers/luna-* files in the repo.”

agent boundary, per host (checked into the repo as `configs/sudoers/luna-<host>`)

| criterion | luna (agent's identity) | max (personal identity) |
| --- | --- | --- |
| default identity for routine ops | yes | no |
| read live state | full | full |
| package upgrades, service restarts | limited per-host whitelist | unrestricted |
| container lifecycle (run/exec/build) | dropped; read-only subcommands only | unrestricted |
| shell-escape patterns (less, cat, vi) | closed; bare paths only, no wildcards | unrestricted |
| destructive ops (delete, rotate, wipe) | requires opt-in escalation to personal account | default privilege |
| auditability | per-host reference file in git | ad-hoc, by admin discretion |
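
Because the agent's permissions live in a checked-in file, listing them and linting them for wildcards is a few lines of parsing. The sketch below assumes simple single-line user specs of the form `luna ALL=(root) NOPASSWD: cmd, cmd`; real sudoers syntax (aliases, line continuations, escaped commas) is richer, and the function names are hypothetical.

```python
import re


def allowed_commands(sudoers_text: str, user: str = "luna") -> list:
    """Extract the command list for one user from simple sudoers lines."""
    commands = []
    for raw in sudoers_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] != user:
            continue
        _, _, spec = parts[1].partition("=")
        spec = re.sub(r"^\s*\([^)]*\)\s*", "", spec)      # drop runas, e.g. (root)
        spec = re.sub(r"^\s*(?:[A-Z_]+:\s*)+", "", spec)  # drop tags, e.g. NOPASSWD:
        commands.extend(c.strip() for c in spec.split(",") if c.strip())
    return commands


def wildcard_entries(sudoers_text: str, user: str = "luna") -> list:
    """Return allowed commands that contain glob characters."""
    return [c for c in allowed_commands(sudoers_text, user)
            if any(ch in c for ch in "*?[")]
```

A non-empty result from `wildcard_entries` would flag exactly the pattern the "bare paths only, no wildcards" row rules out.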

What “production-grade” means in practice

The hygiene this enforces is observable, not aspirational. Two cron jobs on the bastion run every day and post to Discord on any deviation:

  • sudoers drift: per host · deployed vs the in-git reference
  • SSH hardening: PermitRootLogin + PasswordAuthentication · per-host exceptions whitelisted inline
  • port baseline: loopback-aware diff vs the expected port list
  • failed sudo: count per host since last run
  • failed SSH: count per host since last run
  • FreePBX MCP handshake: via dedicated forced-command SSH key
  • FreePBX permissions: .pm2 directory mode
  • SUID inventory: diff vs baseline · new entries alert
  • cron inventory: diff vs baseline · new entries alert
  • cert expiry: all public-facing TLS
  • disk + RAM: thresholds per host
  • containers: restart loops + unhealthy state
  • services: is-active across the fleet
  • backup freshness: PBS snapshots within 24h
  • cluster quorum: Proxmox + qdevice over Tailscale
  • qdevice reachability: separate from quorum · cluster-survivability check
  • DNS resolution: both Pi-holes answering authoritatively
  • Tailscale node status: all nodes reachable across the mesh
  • git-repo update lag: tracked projects vs upstream HEAD
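
Two of the checks above reduce to very small functions. The sketch below is an illustration only: `new_entries` models the "diff vs baseline · new entries alert" rule used for the SUID and cron inventories, and `is_fresh` models the 24-hour backup window. Both names are hypothetical, and how the real jobs collect their inputs is not described here.

```python
from datetime import datetime, timedelta, timezone


def new_entries(baseline, current) -> list:
    """Entries present now but absent from the baseline; these would alert."""
    return sorted(set(current) - set(baseline))


def is_fresh(last_snapshot: datetime, now: datetime,
             max_age: timedelta = timedelta(hours=24)) -> bool:
    """True when the newest snapshot is no older than max_age."""
    return now - last_snapshot <= max_age


# Example inputs (made up for illustration).
baseline_suid = ["/usr/bin/passwd", "/usr/bin/sudo"]
current_suid = ["/usr/bin/passwd", "/usr/bin/sudo", "/opt/tool/helper"]
now = datetime(2026, 1, 2, 12, 0, tzinfo=timezone.utc)
```

Removed entries are deliberately ignored in this sketch, since the rule as stated alerts on new entries only.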

The reference configs themselves live in git: per-host sudoers, expected port lists, SUID baselines, cron baselines, reference compose files for tracked services. Drift = an unannounced change between deployed state and the reference. Drift fires an alert. Drift-clean across the fleet is the current state.
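
The definition of drift above maps directly onto a text diff between the in-git reference and the deployed file. A minimal sketch using Python's standard `difflib`, assuming both files have already been read into strings (fetching the deployed copy through the bastion is out of scope here, and `drift_report` is a hypothetical name):

```python
import difflib


def drift_report(reference: str, deployed: str, name: str) -> str:
    """Return a unified diff of reference vs deployed; empty string means no drift."""
    diff = difflib.unified_diff(
        reference.splitlines(keepends=True),
        deployed.splitlines(keepends=True),
        fromfile=f"git:{name}",
        tofile=f"live:{name}",
    )
    return "".join(diff)
```

An empty report corresponds to "drift-clean"; any other output is what an alert would carry.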

Throughput at full stride

A representative example of what this pattern produces, in a single sitting: eight hardening categories closed in succession — identity narrowing across the fleet, secrets hygiene, deploy-key restriction, container privilege reduction, credential rotation, cluster-quorum visibility, SSH key-only auth, and automated security updates. Each ended in its own commit with rationale. Each updated the runbook the next session reuses.

What looks like one morning’s work from the outside is a week of distributed attention compressed into a single session. The compression is the throughput. It’s also the lever: the attention budget this pattern factors out shows up directly as work shipped elsewhere on this site — five client systems, two production tools, an active writing surface — work that doesn’t exist if a personal estate is taxing weekly attention.

Why this matters: the AIOps foundation

Today’s pattern is reactive: I say, the agent does. The arc this project is bending toward is proactive: the agent watches live state continuously, sees problems first, decides what should change, asks permission before acting, and documents the outcome back into the same repo + memory layer the next session reads.

The proactive turn depends on the rails this project established. With a checked-in inventory of allowed actions, the proactive agent has a literal definition of “in-scope” to grep against. The rails are what make the proactive turn safe to take.
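
One way to picture the "literal definition of in-scope" is an exact-match check of a proposed command against the checked-in allowlist. This is a sketch under the assumption that entries are bare command strings with no wildcards; the `Decision` type and `classify` function are hypothetical and not part of the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    command: str
    in_scope: bool
    reason: str


def classify(proposed: str, allowlist: frozenset) -> Decision:
    """Mark a proposed command in-scope only on an exact allowlist match."""
    normalized = " ".join(proposed.split())  # collapse repeated whitespace
    if normalized in allowlist:
        return Decision(normalized, True, "exact match in checked-in allowlist")
    return Decision(normalized, False, "not in allowlist; ask the operator")
```

Anything that falls outside the allowlist is not rejected outright; it is routed to the operator, which matches the ask-permission step of the loop.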

The first proactive components are already running:

  • A Hermes-class agent on a dedicated host — systemd service with outbound-only Discord access, persistent memory via honcho, a learning loop that distills skills out of experience.
  • A FreePBX MCP surface — the agent has scoped access to the on-prem PBX (call logs, extensions) through the same forced-command-SSH-key pattern that secures the public VPS deploy keys.
  • A multifunction-printer MCP surface — localhost-bound HTTP behind a systemd-managed scan/print service. The agent can scan documents and trigger prints, with the same containment.
  • A Notion workspace with agent-readable structure — naming prefixes, status properties, and a domain map that mirrors the agent’s internal model so the agent reads and writes into the same surface its human collaborator does.
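
The forced-command SSH key pattern mentioned in the list above can be sketched as a small dispatcher. When an `authorized_keys` entry carries a `command="..."` option, sshd runs that command instead of whatever the client asked for and exposes the client's request in the `SSH_ORIGINAL_COMMAND` environment variable. The action names and script paths below are hypothetical; the actual FreePBX surface is not described in this case.

```python
import shlex
from typing import Optional

# Hypothetical action names and script paths, for illustration only.
ALLOWED_ACTIONS = {
    "call-logs": ["/usr/local/bin/pbx-call-logs"],
    "extensions": ["/usr/local/bin/pbx-extensions"],
}


def resolve(original_command: Optional[str]) -> Optional[list]:
    """Map the client's requested command to a fixed argv, or None to refuse."""
    if not original_command:
        return None
    try:
        parts = shlex.split(original_command)
    except ValueError:  # e.g. unbalanced quotes
        return None
    if len(parts) != 1:  # no arguments, no chained commands
        return None
    return ALLOWED_ACTIONS.get(parts[0])
```

A wrapper script would call `resolve(os.environ.get("SSH_ORIGINAL_COMMAND"))` and exit non-zero on `None`, so the key can only ever trigger the listed actions.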

The same agent pattern — read live state, propose action, ask permission, document outcome — transposes directly to commercial RevOps work with minimal new architecture. The homelab is the testbed where the proactive pattern gets debugged with low blast radius. The same agent that’s hardening sudoers today is the agent that, in a quarter or two, watches telemetry and proposes interventions before the Discord alert reaches me.

What this enables elsewhere

The other cases on this site exist because personal infrastructure doesn’t ask for weekly attention. Five client systems shipped this year. Two daily-use production tools maintained at production hygiene. An active writing surface. Skills shipping into the corporate kit. The attention budget that would have been spent here shows up as work shipped elsewhere on the same person’s calendar.

The pattern is what dissolves that cost. Every operational change ends in a commit with rationale; every gotcha lands as a memory note the agent re-reads next session; every risky operation has a runbook before it runs. The catalog of “things to check” lives in the agent’s working set, not in mine.

The standards applied are the ones a production team would apply to a 30-host estate: drift detection, forced-command keys, segmented identities, runbooks, blast-radius reasoning, baseline diffs, defense in depth concentrated at the boundary. The scale is one operator plus a long-context AI agent. The throughput is the proof — eight hardening categories closed in one sitting; eighty-nine commits across seven weeks; drift-clean across the fleet.

This is the foundation everything else on this site runs on top of.


This case describes the operating pattern, not the inventory. Specific sites and services hosted in this estate are deliberately left abstract; what matters is the agent-collaboration model and the governance scaffolding around it. Built from the homelab-ops repo (89 commits at HEAD 014e889), CLAUDE.md, docs/security.md, and the 2026-05-24/25 hardening session transcript.