How to build an autonomous red team, in Rust, by hand.
This is the working notebook for Kerox — a Rust-native, terminal-first, multi-agent autonomous red team. None of it is finished; this book is the design thinking as it happens. Sections that describe something still being built are marked ETA so you know what is real and what is planned.
Overview
Kerox is a Rust-native, terminal-first, vendor-neutral autonomous red team. The short version: an orchestrator reads an engagement plan and pursues an objective the way an adversary would — not the way a scanner does. It is open, it is in progress, and it is built by hand.
Most automated "offensive" tooling runs a fixed battery of checks and prints a report. That is useful, but it is not what an attacker does. An attacker has a goal, improvises a path toward it, and chains small wins into a big one. Kerox is an attempt to build a system that works that second way — under tight discipline, against authorized scope only.
What you will find in this book
- ›The agent architecture — one orchestrator, a roster of specialists
- ›How the orchestrator turns an engagement plan into a real attack chain
- ›spearhead, the LLM/AI red-team agent, and how findings map to OWASP LLM Top 10 and MITRE ATLAS
- ›The engagement package (RoE, ConOps, Deconfliction, OPPLAN) and why it comes before any packet
- ›The human gate, the dry-run default, and the sandbox the whole thing runs inside
- ›The attack → defend → verify loop that points offense back at defense
- ›The krx CLI, the planned tech stack, and how to contribute
How to read it
Read top to bottom for the first 30 minutes to get the shape of the system. After that, jump by chapter — each section stands on its own. Where a chapter shows config or commands, treat it as a design sketch, not a shipped interface; the names will move before they settle.
Architecture
The architecture is deliberately small: one orchestrator that owns the plan and the decisions, and a set of specialist agents it dispatches to do the actual work. The orchestrator never touches a target directly — it reasons, it sequences, and it hands concrete tasks to agents that know one domain well.
     engagement plan (authorized scope)
                      │
              ┌───────▼────────┐
              │  ORCHESTRATOR  │  reads plan · sequences chain
              │  (human gate)  │  holds every live action
              └───────┬────────┘
                      │ dispatch
    ┌─────────┬───────┼────────┬──────────┐
    ▼         ▼       ▼        ▼          ▼
spearhead  network report     web      (more…)
 LLM/AI    surface write-up  apps
    │         │       │        │          │
    └─────────┴───────┴────────┴──────────┘
                      │
            isolated Kali sandbox
           (dedicated op network)
Why split it this way
- ›The orchestrator carries the goal and the rules; agents carry the craft. Neither leaks into the other.
- ›Agents are replaceable. A better network agent drops in without the orchestrator noticing.
- ›Every live action funnels back through one place — the gate — so there is exactly one chokepoint to audit.
- ›Vendor-neutral by design: nothing is tied to a single model provider or a single C2.
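One way to read "agents are replaceable" is as a single trait that the orchestrator dispatches through. The sketch below is an illustration only: `Task`, `Observation`, `Agent`, `Roster`, and `StubNetwork` are hypothetical names, not the Kerox agent protocol, and the stub agent touches nothing.

```rust
use std::collections::HashMap;

/// A concrete task handed from the orchestrator to a specialist.
/// (Hypothetical shape, not the Kerox agent protocol.)
#[derive(Debug, Clone)]
pub struct Task {
    pub agent: String,
    pub instruction: String,
}

/// What an agent reports back to the orchestrator.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub agent: String,
    pub summary: String,
}

/// The only surface the orchestrator knows about an agent.
pub trait Agent {
    fn name(&self) -> &str;
    fn run(&mut self, task: &Task) -> Observation;
}

/// Stand-in network agent: it only describes what it would do.
pub struct StubNetwork;

impl Agent for StubNetwork {
    fn name(&self) -> &str {
        "network"
    }

    fn run(&mut self, task: &Task) -> Observation {
        Observation {
            agent: self.name().to_string(),
            summary: format!("dry-run: would {}", task.instruction),
        }
    }
}

/// Roster keyed by agent name; replacing an agent is one `register` call.
#[derive(Default)]
pub struct Roster {
    agents: HashMap<String, Box<dyn Agent>>,
}

impl Roster {
    pub fn register(&mut self, agent: Box<dyn Agent>) {
        self.agents.insert(agent.name().to_string(), agent);
    }

    /// Route a task to the named agent, or report that none is registered.
    pub fn dispatch(&mut self, task: &Task) -> Result<Observation, String> {
        match self.agents.get_mut(&task.agent) {
            Some(agent) => Ok(agent.run(task)),
            None => Err(format!("no agent registered as '{}'", task.agent)),
        }
    }
}
```

Registering a different `Box<dyn Agent>` under the name "network" swaps the implementation without any change to the code that calls `dispatch`, which is the property the list above asks for.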
Written in Rust, run from a terminal
The control plane is Rust — for the type system, the error discipline, and a single static binary that is easy to reason about. The interface is a terminal, because that is where this work actually happens and because a TUI is honest about what it is doing.
The orchestrator
The orchestrator is the brain. It reads an engagement plan, fixes on an objective, and works toward it through whatever path opens up — chaining reconnaissance, exploitation, privilege escalation, lateral movement, and C2. When one route closes, it backs up and tries another. This is the part that makes Kerox an adversary and not a checklist.
The loop
- ›Orient — read the plan, the scope, and whatever the agents have learned so far
- ›Decide — pick the next technique that moves the objective forward
- ›Gate — if the step is a live action, stop and ask a human
- ›Act — dispatch the approved task to the right specialist agent
- ›Observe — fold the result back in, then orient again
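The five stages above can be sketched as one small function. This is a toy under stated assumptions: `Step`, `Outcome`, and `run_loop` are hypothetical names, and "Decide" is reduced to taking the next step in a fixed plan, where a real orchestrator would choose based on what it has observed.

```rust
/// One planned step in the chain (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Step {
    /// ATT&CK technique ID, for example "T1595".
    pub technique: &'static str,
    /// True if running the step would touch a target.
    pub live: bool,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// The plan ran to the end; the log records each observation.
    Done(Vec<String>),
    /// The loop stopped at the gate before step index `at`.
    Held { at: usize, log: Vec<String> },
}

/// Walk the plan: orient, decide, gate, act, observe.
pub fn run_loop<A, D>(plan: &[Step], approve: A, mut dispatch: D) -> Outcome
where
    A: Fn(&Step) -> bool,
    D: FnMut(&Step) -> String,
{
    let mut log = Vec::new();
    // Orient + Decide: in this sketch the decision is simply "the next step".
    for (index, step) in plan.iter().enumerate() {
        // Gate: a live action stops here unless a human approved it.
        if step.live && !approve(step) {
            return Outcome::Held { at: index, log };
        }
        // Act: hand the step to whichever agent `dispatch` represents.
        let result = dispatch(step);
        // Observe: fold the result back in before the next iteration.
        log.push(format!("{}: {}", step.technique, result));
    }
    Outcome::Done(log)
}
```

With an `approve` closure that always returns `false`, the loop runs every non-live step and returns `Held` at the first live one, which matches the gate described above.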
Mapped to ATT&CK as it goes
Every step the orchestrator plans is tagged with the MITRE ATT&CK technique it corresponds to, so the chain reads like a real operation and so the eventual report speaks the language a blue team already uses.
# illustrative: names and IDs will move
recon    T1595  active scanning          [plan]
access   T1190  exploit public app       [plan]
privesc  T1068  exploit for escalation   [plan]
lateral  T1021  remote services          [plan]
collect  T1119  automated collection     [plan]
c2       T1071  application-layer C2     [plan]
# nothing fires until a human approves the step
Spearhead — LLM red team
Spearhead is the lead agent and the reason Kerox exists in the shape it does. It is pointed at the AI in the stack — the chatbots, copilots, and tool-using agents that are now wired into real systems — and it probes the failure modes that are unique to language models.
What it probes
- ›Prompt injection — getting the model to follow attacker text instead of its instructions
- ›System-prompt leakage — pulling the hidden instructions and configuration back out
- ›Guardrail bypass — routing around the safety layer to reach restricted behavior
- ›Tool-call exfiltration — abusing the model's tools to move data it should never move
Mapped to the frameworks defenders use
A finding that nobody can act on is noise. Spearhead is designed to report every result against the OWASP LLM Top 10 and MITRE ATLAS, so it lands in vocabulary a security team already has policies and detections for.
# illustrative mapping (OWASP LLM Top 10, 2025 numbering)
prompt injection     LLM01  AML.T0051
system-prompt leak   LLM07  AML.T0056
guardrail bypass     LLM01  AML.T0054
tool-call exfil      LLM06  AML.T0057
Spearhead leads; the supporting agents follow it onto the rest of the attack surface once it has found a way in — or once it has proven there isn't one.
Supporting agents
Spearhead handles the model. The rest of the roster handles everything around it — the conventional surface, and turning the run into something a defender can act on. Each one is a specialist with a narrow brief.
network
Maps the attack surface and works it. Recon and enumeration first — the handful of openings that matter, not a thousand low-signal findings — then services, trust paths, and lateral movement once a foothold lands.
report
Turns the engagement into a deliverable. Narrative plus findings, mapped to MITRE ATT&CK and ATLAS, rendered as Markdown, JSON, or SARIF — including the dropped and ruled-out reasons, which are half the value.
web, appsec — later
The web surface (injection, access-control, logic flaws) and the source-code surface are designed into the roster and stubbed for now. They ship once Spearhead (the wedge), the network agent, and the report agent are solid.
Engagement discipline
This is the part that separates a red team from a vandal. Before a packet hits the wire, Kerox writes the engagement down, and then is built to refuse to step outside what it wrote.
The engagement package
- ›Rules of Engagement (RoE) — what is in scope, what is off-limits, the hours, the hard stops
- ›ConOps — the concept of operations: what we are trying to achieve and how, in plain language
- ›Deconfliction Plan — how to tell our activity apart from a real incident, and who to call
- ›OPPLAN — the operational plan, with each intended action mapped to MITRE ATT&CK
The package is generated first, reviewed by a human, and then treated as the boundary for everything that follows. An action that is not covered by the package does not run.
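The "not covered, does not run" rule comes down to a default-deny membership check. The sketch below is a minimal version, assuming IPv4 targets and exact hostname matches; `Cidr` and `Scope` are hypothetical types, and the sample values are the lab addresses used elsewhere in this chapter. Because anything unlisted is refused, an explicit out-list is not needed for the check itself.

```rust
use std::net::Ipv4Addr;

/// An IPv4 network in CIDR form, for example 10.10.0.0/24.
#[derive(Debug, Clone, Copy)]
pub struct Cidr {
    network: u32,
    mask: u32,
}

impl Cidr {
    /// Parse "a.b.c.d/len"; returns None on any malformed input.
    pub fn parse(s: &str) -> Option<Cidr> {
        let (addr, len) = s.split_once('/')?;
        let addr: Ipv4Addr = addr.parse().ok()?;
        let len: u32 = len.parse().ok()?;
        if len > 32 {
            return None;
        }
        // A /0 mask is all zeros; shifting a u32 by 32 bits would overflow.
        let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
        Some(Cidr { network: u32::from(addr) & mask, mask })
    }

    pub fn contains(&self, ip: Ipv4Addr) -> bool {
        (u32::from(ip) & self.mask) == self.network
    }
}

/// The authorized boundary (hypothetical shape).
pub struct Scope {
    pub in_nets: Vec<Cidr>,
    pub in_hosts: Vec<String>,
}

impl Scope {
    /// Default-deny: a target is allowed only if it is explicitly listed.
    pub fn allows(&self, target: &str) -> bool {
        match target.parse::<Ipv4Addr>() {
            Ok(ip) => self.in_nets.iter().any(|net| net.contains(ip)),
            Err(_) => self
                .in_hosts
                .iter()
                .any(|host| host.eq_ignore_ascii_case(target)),
        }
    }
}
```

A scope built from `10.10.0.0/24` and `app.example-lab.internal` allows `10.10.0.14` and refuses `10.10.1.14` or any hostname that was never written down.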
# illustrative scope file — the boundary, not a suggestion
scope:
  in: ["10.10.0.0/24", "app.example-lab.internal"]
  out: ["*.prod.example.com", "anything not listed"]
hours: "Mon–Fri 09:00–17:00, operator timezone"
hard_stops:
  - "any sign of real user impact"
  - "any host outside 'in'"
The human gate
Autonomy without a brake is just a liability. Kerox is autonomous in how it reasons and plans, but every action is dry-run by default, and a live action waits for explicit human approval before it touches anything real.
Dry-run is the default, not an option
In its planning mode, Kerox produces the full chain — every technique, every command, every target — without sending a thing. You read the plan. You approve the steps you want. Only then does a live action go out, and only the steps you approved.
$ krx run --plan engagement.kx        # build the chain, send nothing
→ 11 steps planned · 0 executed · review required
$ krx approve --step 03               # arm a single live action
→ step 03 armed · scope check passed
$ krx run --execute --step 03         # fire only what was approved
→ [HOLD] confirm: live action against 10.10.0.14 ? [y/N]
- ›Scope is re-checked at execution time, not just at plan time
- ›Approvals are per-step, not a single blanket "go"
- ›The default answer to every prompt is no
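Those three rules fit in a few lines of state. The sketch below is one possible shape, not the shipped gate: `Gate`, `GateError`, and `confirmed` are hypothetical names, the scope result is passed in as a boolean so the example stays self-contained, and treating an approval as single-use is an assumption of this sketch.

```rust
use std::collections::BTreeSet;

/// Interpret an answer to a `[y/N]` prompt. Anything but an explicit yes is no.
pub fn confirmed(answer: &str) -> bool {
    matches!(answer.trim().to_ascii_lowercase().as_str(), "y" | "yes")
}

#[derive(Debug, PartialEq)]
pub enum GateError {
    NotApproved(u32),
    OutOfScope(u32),
    Declined(u32),
}

/// Tracks which individual steps a human has armed.
#[derive(Default)]
pub struct Gate {
    armed: BTreeSet<u32>,
}

impl Gate {
    /// Arm exactly one step; there is deliberately no "approve all".
    pub fn approve(&mut self, step: u32) {
        self.armed.insert(step);
    }

    /// Release a step for live execution only if it was armed, is still in
    /// scope at execution time, and the operator answers yes.
    pub fn release(
        &mut self,
        step: u32,
        in_scope_now: bool,
        answer: &str,
    ) -> Result<(), GateError> {
        if !self.armed.contains(&step) {
            return Err(GateError::NotApproved(step));
        }
        if !in_scope_now {
            return Err(GateError::OutOfScope(step));
        }
        if !confirmed(answer) {
            return Err(GateError::Declined(step));
        }
        // Assumption of this sketch: an approval is consumed when used.
        self.armed.remove(&step);
        Ok(())
    }
}
```

An empty answer at the prompt yields `Declined`, and releasing the same step twice needs two approvals.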
Interactive tooling
Real offensive tools are interactive. They drop you into a console, ask questions, and keep state across a session. A lot of automation pretends this isn't true and scripts around it; Kerox is being built to sit inside it instead.
Persistent terminal sessions
The agents drive tools like msfconsole, sliver-client, and evil-winrm inside persistent terminal sessions — sending input, reading output, and answering interactive prompts the way a person at the keyboard would. A session that is established stays established; a pivot does not drop because the automation forgot to hold the handle.
- ›Tools run in their native console, not behind a brittle one-shot wrapper
- ›Interactive prompts are handled, not avoided
- ›Session state survives across steps in the chain
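The core of "hold the handle" is keeping a child process and its pipes alive between steps. The sketch below shows that with the standard library only, using `sh` as a stand-in for a real console. It is a simplification under stated assumptions: `Session` is a hypothetical type, it reads exactly one line per exchange, and it uses plain pipes, whereas tools like msfconsole generally expect a pseudo-terminal.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

/// A long-lived child process whose stdin/stdout stay open across steps.
/// Plain pipes only; a real console tool would need a pseudo-terminal.
pub struct Session {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
}

impl Session {
    pub fn spawn(program: &str) -> io::Result<Session> {
        let mut child = Command::new(program)
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()?;
        let stdin = child.stdin.take().expect("stdin was piped");
        let stdout = BufReader::new(child.stdout.take().expect("stdout was piped"));
        Ok(Session { child, stdin, stdout })
    }

    /// Send one line of input that is not expected to produce output.
    pub fn send(&mut self, line: &str) -> io::Result<()> {
        writeln!(self.stdin, "{}", line)?;
        self.stdin.flush()
    }

    /// Send one line of input and read one line of output back.
    pub fn exchange(&mut self, line: &str) -> io::Result<String> {
        self.send(line)?;
        let mut reply = String::new();
        self.stdout.read_line(&mut reply)?;
        Ok(reply.trim_end().to_string())
    }

    /// Close stdin so the child sees end-of-input, then wait for it to exit.
    pub fn close(self) -> io::Result<()> {
        let Session { mut child, stdin, stdout } = self;
        drop(stdin);
        drop(stdout);
        child.wait().map(|_| ())
    }
}
```

Setting a shell variable with `send` in one step and reading it back with `exchange` in a later step demonstrates the point of the list above: state lives in the session, not in the automation.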
The sandbox
Offense stays in a box. Operations are designed to run inside an isolated Kali sandbox on a dedicated operational network, kept separate from the management plane that drives the engagement.
Two planes, on purpose
- ›Management plane — where you read plans, give approvals, and read results. It never touches a target.
- ›Operational plane — the Kali sandbox on its own network, where the agents and their tools actually run.
The split means the machine you sit at is not the machine that runs exploits, and a target can never reach back past the sandbox to the place where decisions are made.
Attack → defend → verify
The reason to build a disciplined attacker is to make defense better. Kerox is designed around a loop that turns each finding into a concrete defensive improvement and then proves the improvement actually holds.
The loop
- ›Attack — the chain reaches the objective, or gets as far as it can, and records exactly how
- ›Defend — each step becomes a candidate fix: a detection, a control, a config change
- ›Verify — re-run the same step against the hardened system and confirm it now fails
A finding you cannot reproduce is a rumor; a fix you cannot verify is a hope. Closing the loop is what makes the offense worth doing.
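The verify stage reduces to comparing two runs of the same step. A minimal sketch, with hypothetical names (`StepResult`, `Verdict`, `verify`) and the assumption that each run can be summarized as succeeded or blocked:

```rust
/// Result of running one step of the chain against a target state.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StepResult {
    Succeeded,
    Blocked,
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    /// The step worked before hardening and is blocked now.
    FixHolds,
    /// The step still works after hardening.
    FixFailed,
    /// The step never worked, so there is nothing to verify.
    NotReproduced,
}

/// Compare the original run with the re-run against the hardened system.
pub fn verify(before: StepResult, after: StepResult) -> Verdict {
    match (before, after) {
        (StepResult::Succeeded, StepResult::Blocked) => Verdict::FixHolds,
        (StepResult::Succeeded, StepResult::Succeeded) => Verdict::FixFailed,
        (StepResult::Blocked, _) => Verdict::NotReproduced,
    }
}
```

Keeping `NotReproduced` separate from `FixHolds` matters: a step that was blocked both times says nothing about whether the fix did the blocking.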
The krx CLI
The whole thing is driven from one terminal command: krx. The verbs below are a design sketch — they will change — but they show the intended shape: plan, review, approve, run, report.
- ›krx plan: generate the engagement package and the dry-run attack chain
- ›krx scope: validate a target against the authorized scope file
- ›krx spearhead: point the LLM agent at a target and map findings to ATLAS
- ›krx approve: arm a specific step for live execution
- ›krx run: execute approved steps, or the whole chain in dry-run
- ›krx verify: re-run a step against the hardened system to confirm the fix
- ›krx report: write the engagement up, ATT&CK- and ATLAS-mapped
There is no krx yet. The commands here describe where it is headed, not something you can install today.
Tech stack
Planned, and subject to change as the build teaches us things. The bias is toward a small number of well-understood pieces over a large framework.
- ›Rust — the control plane, the orchestrator, and the agent harness; one static binary
- ›Terminal-first — a TUI for plans, approvals, and live session output
- ›Vendor-neutral LLM access — Spearhead is not tied to a single model provider
- ›Kali — the operational sandbox image and its tooling
- ›Existing offensive tools — driven, not reinvented: msfconsole, sliver-client, evil-winrm, and friends
- ›MITRE ATT&CK + ATLAS, OWASP LLM Top 10 — the mapping vocabulary, not dependencies
Authorization & ethics
Kerox is an offensive tool, and offensive tools have to be honest about what they are for. It is built to be run by people with permission, against systems they are allowed to test, inside a scope they wrote down. The discipline chapters above are not decoration — they are the whole design constraint.
What that means in practice
- ›Authorized scope only. No scope file, no run.
- ›Dry-run by default. Live actions need an explicit human yes.
- ›Operations stay in the sandbox, on the operational network.
- ›Findings exist to be fixed, not collected.
Reporting an issue in Kerox itself
When there is something to report, send it to security@kerox.dev. A PGP key is published at kerox.dev/.well-known/pgp-key.txt. Responsible disclosure, the same way we would want it.
Contributing & RFCs
KeroxLabs builds in the open. Patches, bug reports, and hard questions about doing offense responsibly are all welcome. The bar is technical; the response is fast.
RFC process
- ›Anything that touches the agent protocol, the engagement-package format, or the gate requires an RFC
- ›RFCs live in docs/rfcs/RFC-XXXX-<slug>.md
- ›Discuss first in the forum (coming soon) or via issue; open a draft PR with the RFC
- ›Maintainer sign-off + a comment window before merge
Code style
- ›cargo clippy -- -D warnings is mandatory
- ›cargo fmt --check is mandatory
- ›No unsafe without a // SAFETY: comment
- ›Anything that can take a live action gets a test that proves it cannot without approval
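For the last rule, one shape such a test could take is to make approval a value that only the gate can construct, and then test both paths. Everything here is hypothetical (the `gate` module, `Approval`, `approve`, `execute`); it illustrates the style of test, not the real gate.

```rust
mod gate {
    /// Proof that a human approved a specific step. The field is private,
    /// so code outside this module cannot construct one by hand.
    pub struct Approval {
        step: u32,
    }

    /// The only way to obtain an Approval in this sketch. A real gate
    /// would also record who approved and when.
    pub fn approve(step: u32, operator_said_yes: bool) -> Option<Approval> {
        if operator_said_yes {
            Some(Approval { step })
        } else {
            None
        }
    }

    /// A live action takes the Approval by value, so it cannot be called
    /// without one and cannot reuse one.
    pub fn execute(approval: Approval) -> String {
        format!("step {:02} executed", approval.step)
    }
}

#[cfg(test)]
mod tests {
    use super::gate;

    #[test]
    fn live_action_cannot_run_without_approval() {
        // No approval value means there is nothing to pass to `execute`.
        assert!(gate::approve(3, false).is_none());
    }

    #[test]
    fn approved_step_runs_once() {
        let approval = gate::approve(3, true).expect("operator approved");
        assert_eq!(gate::execute(approval), "step 03 executed");
    }
}
```

Part of the guarantee is enforced by the compiler rather than the test: outside `gate`, writing `Approval { step: 3 }` does not compile, so the test only has to cover the refusal path of `approve`.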
Glossary
- ›orchestrator: the component that reads the plan, sequences the attack chain, and owns the human gate.
- ›agent: a specialist that runs one domain of the work (spearhead, network, report; web and appsec later).
- ›attack chain: an ordered sequence of techniques that move from entry toward an objective.
- ›RoE: Rules of Engagement. What is in scope, what is off-limits, when, and the hard stops.
- ›ConOps: Concept of Operations. What the engagement is trying to do, in plain language.
- ›deconfliction: telling authorized test activity apart from a real incident.
- ›OPPLAN: the operational plan, with each intended action mapped to MITRE ATT&CK.
- ›dry-run: plan and display an action without executing it. The default mode.
- ›MITRE ATT&CK: the framework of adversary techniques used to tag the conventional chain.
- ›MITRE ATLAS: the same idea for attacks against AI/ML systems; Spearhead maps to it.
- ›OWASP LLM Top 10: the common catalog of LLM application risks Spearhead reports against.
That is the whole book — for now.
The rest is being written as we build. Subscribe to the forum when it opens, follow the org, or just check back. Notes that sharpen something here are welcome — this book is in the repo too.