KEROXLABS · RESEARCH

How to build an autonomous red team, in Rust, by hand.

This is the working notebook for Kerox — a Rust-native, terminal-first, multi-agent autonomous red team. None of it is finished; this book is the design thinking as it happens. Sections that describe something still being built are marked ETA so you know what is real and what is planned.

EDITION · LIVE
STARTED · JAN 2026
STATUS · WORK IN PROGRESS

CHAPTER 00

Overview

Kerox is a Rust-native, terminal-first, vendor-neutral autonomous red team. The short version: an orchestrator reads an engagement plan and pursues an objective the way an adversary would — not the way a scanner does. It is open, it is in progress, and it is built by hand.

Most automated "offensive" tooling runs a fixed battery of checks and prints a report. That is useful, but it is not what an attacker does. An attacker has a goal, improvises a path toward it, and chains small wins into a big one. Kerox is an attempt to build a system that works that second way — under tight discipline, against authorized scope only.

What you will find in this book

›The agent architecture — one orchestrator, a roster of specialists
›How the orchestrator turns an engagement plan into a real attack chain
›spearhead, the LLM/AI red-team agent, and how findings map to OWASP LLM Top 10 and MITRE ATLAS
›The engagement package — RoE, ConOps, Deconfliction, OPPLAN — and why it comes before any packet
›The human gate, the dry-run default, and the sandbox the whole thing runs inside
›The attack → defend → verify loop that points offense back at defense
›The krx CLI, the planned tech stack, and how to contribute

How to read it

Read top to bottom for the first 30 minutes to get the shape of the system. After that, jump by chapter — each section stands on its own. Where a chapter shows config or commands, treat it as a design sketch, not a shipped interface; the names will move before they settle.

NOTE · Nothing here is released, downloadable, or production-ready yet. This is a vision document with running notes. If a sentence reads like a promise, read it as an intention.

CHAPTER 01

Architecture

ETA · Q1 2026

The architecture is deliberately small: one orchestrator that owns the plan and the decisions, and a set of specialist agents it dispatches to do the actual work. The orchestrator never touches a target directly — it reasons, it sequences, and it hands concrete tasks to agents that know one domain well.

            engagement plan (authorized scope)
                          │
                  ┌───────▼────────┐
                  │  ORCHESTRATOR  │  reads plan · sequences chain
                  │  (human gate)  │  holds every live action
                  └───────┬────────┘
                          │ dispatch
        ┌─────────┬───────┼────────┬──────────┐
        ▼         ▼       ▼        ▼          ▼
   spearhead   network  report    web      (more…)
    LLM/AI     surface  write-up  apps
        │         │       │        │          │
        └─────────┴───────┴────────┴──────────┘
                          │
                    isolated Kali sandbox
                    (dedicated op network)

Why split it this way

›The orchestrator carries the goal and the rules; agents carry the craft. Neither leaks into the other.
›Agents are replaceable. A better network agent drops in without the orchestrator noticing.
›Every live action funnels back through one place — the gate — so there is exactly one chokepoint to audit.
›Vendor-neutral by design: nothing is tied to a single model provider or a single C2.

Written in Rust, run from a terminal

The control plane is Rust — for the type system, the error discipline, and a single static binary that is easy to reason about. The interface is a terminal, because that is where this work actually happens and because a TUI is honest about what it is doing.

CHAPTER 02

The orchestrator

ETA · Q1 2026

The orchestrator is the brain. It reads an engagement plan, fixes on an objective, and works toward it through whatever path opens up — chaining reconnaissance, exploitation, privilege escalation, lateral movement, and C2. When one route closes, it backs up and tries another. This is the part that makes Kerox an adversary and not a checklist.

The loop

›Orient — read the plan, the scope, and whatever the agents have learned so far
›Decide — pick the next technique that moves the objective forward
›Gate — if the step is a live action, stop and ask a human
›Act — dispatch the approved task to the right specialist agent
›Observe — fold the result back in, then orient again
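The loop above can be sketched as a small state machine. This is a minimal sketch, not the Kerox implementation: `Phase`, `GateDecision`, and `next_phase` are hypothetical names, and sending a denied step back to Orient is an assumption drawn from "when one route closes, it backs up and tries another".

```rust
/// Phases of the orchestrator loop, in the order the list above gives them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Orient,
    Decide,
    Gate,
    Act,
    Observe,
}

/// Hypothetical answer from the human gate for the step under review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GateDecision {
    /// The step is not a live action, so no approval is needed.
    NotLive,
    Approved,
    Denied,
}

/// Advance the loop by one phase. `gate` is only consulted in `Phase::Gate`.
pub fn next_phase(current: Phase, gate: GateDecision) -> Phase {
    match current {
        Phase::Orient => Phase::Decide,
        Phase::Decide => Phase::Gate,
        Phase::Gate => match gate {
            GateDecision::NotLive | GateDecision::Approved => Phase::Act,
            // A denied step is never dispatched; re-orient and pick another route.
            GateDecision::Denied => Phase::Orient,
        },
        Phase::Act => Phase::Observe,
        Phase::Observe => Phase::Orient,
    }
}
```

The property worth noticing is that `Phase::Act` is only reachable through `Phase::Gate`, which is the single chokepoint the architecture chapter asks for.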

Mapped to ATT&CK as it goes

Every step the orchestrator plans is tagged with the MITRE ATT&CK technique it corresponds to, so the chain reads like a real operation and so the eventual report speaks the language a blue team already uses.

# illustrative — names and IDs will move
recon     T1595  active scanning        [plan]
access    T1190  exploit public app     [plan]
privesc   T1068  exploit for escalation [plan]
lateral   T1021  remote services        [plan]
collect   T1119  automated collection   [plan]
c2        T1071  application-layer C2    [plan]
# nothing fires until a human approves the step

NOTE · The chain above is a plan, not a run. The orchestrator is designed to produce the full chain first, in dry-run, and only execute steps a human has explicitly signed off on.
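One way to hold that plan in memory is a list of steps that all start as planned, with a filter that only ever returns approved ones. This is a sketch under assumptions: `PlannedStep`, `StepStatus`, `plan_chain`, and `executable` are hypothetical names, and the rows are copied from the illustrative table above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StepStatus {
    Planned,
    Approved,
}

#[derive(Debug, Clone)]
pub struct PlannedStep {
    pub phase: &'static str,
    pub technique: &'static str, // MITRE ATT&CK technique ID
    pub name: &'static str,
    pub status: StepStatus,
}

/// Build the illustrative chain; every step starts as `Planned`.
pub fn plan_chain() -> Vec<PlannedStep> {
    const ROWS: [(&str, &str, &str); 6] = [
        ("recon", "T1595", "active scanning"),
        ("access", "T1190", "exploit public app"),
        ("privesc", "T1068", "exploit for escalation"),
        ("lateral", "T1021", "remote services"),
        ("collect", "T1119", "automated collection"),
        ("c2", "T1071", "application-layer C2"),
    ];
    ROWS.iter()
        .map(|&(phase, technique, name)| PlannedStep {
            phase,
            technique,
            name,
            status: StepStatus::Planned,
        })
        .collect()
}

/// Only steps a human has approved are eligible to run.
pub fn executable(chain: &[PlannedStep]) -> Vec<&PlannedStep> {
    chain
        .iter()
        .filter(|step| step.status == StepStatus::Approved)
        .collect()
}
```

A freshly planned chain yields an empty `executable` list, which is the dry-run default expressed as data.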

CHAPTER 03

Spearhead — LLM red team

ETA · Q2 2026

Spearhead is the lead agent and the reason Kerox exists in the shape it does. It is pointed at the AI in the stack — the chatbots, copilots, and tool-using agents that are now wired into real systems — and it probes the failure modes that are unique to language models.

What it probes

›Prompt injection — getting the model to follow attacker text instead of its instructions
›System-prompt leakage — pulling the hidden instructions and configuration back out
›Guardrail bypass — routing around the safety layer to reach restricted behavior
›Tool-call exfiltration — abusing the model's tools to move data it should never move

Mapped to the frameworks defenders use

A finding that nobody can act on is noise. Spearhead is designed to report every result against the OWASP LLM Top 10 and MITRE ATLAS, so it lands in vocabulary a security team already has policies and detections for.

# illustrative mapping (OWASP LLM Top 10, 2025 edition · MITRE ATLAS)
prompt injection      LLM01   AML.T0051
system-prompt leak    LLM07   AML.T0056
guardrail bypass      LLM01   AML.T0054
tool-call exfil       LLM06   AML.T0057
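Since an unmapped finding is treated as noise, the report path could refuse any finding that lacks either tag. The following is a sketch only: `Finding`, `MappingError`, and `ensure_mapped` are hypothetical names, and the tags are carried as plain strings so the code does not hard-code any framework version.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub title: String,
    /// OWASP LLM Top 10 entry, e.g. "LLM01".
    pub owasp_llm: Option<String>,
    /// MITRE ATLAS technique ID, e.g. "AML.T0051".
    pub atlas: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum MappingError {
    MissingOwasp,
    MissingAtlas,
}

/// Return both tags, or say which one is missing.
pub fn ensure_mapped(finding: &Finding) -> Result<(&str, &str), MappingError> {
    match (finding.owasp_llm.as_deref(), finding.atlas.as_deref()) {
        (None, _) => Err(MappingError::MissingOwasp),
        (_, None) => Err(MappingError::MissingAtlas),
        (Some(owasp), Some(atlas)) => Ok((owasp, atlas)),
    }
}
```

Keeping the tags optional at construction time and mandatory at report time lets an agent record a raw observation first and map it later.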

Spearhead leads; the supporting agents follow it onto the rest of the attack surface once it has found a way in — or once it has proven there isn't one.

CHAPTER 04

Supporting agents

ETA · Q2 2026

Spearhead handles the model. The rest of the roster handles everything around it — the conventional surface, and turning the run into something a defender can act on. Each one is a specialist with a narrow brief.

network

Maps the attack surface and works it. Recon and enumeration first — the handful of openings that matter, not a thousand low-signal findings — then services, trust paths, and lateral movement once a foothold lands.

report

Turns the engagement into a deliverable. Narrative plus findings, mapped to MITRE ATT&CK and ATLAS, rendered as Markdown, JSON, or SARIF. The report also records the paths that were dropped or ruled out, and why, which is half the value.
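As a rough illustration of the Markdown output only (JSON and SARIF are left out), a renderer might list confirmed findings and ruled-out paths side by side. `ReportFinding` and `to_markdown` are hypothetical names, and the layout is an assumption rather than a settled report format.

```rust
pub struct ReportFinding {
    pub title: String,
    /// ATT&CK or ATLAS technique ID the finding is mapped to.
    pub technique: String,
    /// `Some(reason)` when the path was dropped or ruled out.
    pub ruled_out: Option<String>,
}

/// Render a heading followed by one bullet per finding.
pub fn to_markdown(title: &str, findings: &[ReportFinding]) -> String {
    let mut out = format!("# {}\n\n", title);
    for finding in findings {
        let line = match &finding.ruled_out {
            None => format!("- **{}** ({})\n", finding.title, finding.technique),
            Some(reason) => format!(
                "- {} ({}), ruled out: {}\n",
                finding.title, finding.technique, reason
            ),
        };
        out.push_str(&line);
    }
    out
}
```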

web, appsec — later

The web surface (injection, access-control, logic flaws) and the source-code surface are designed into the roster and stubbed for now. They ship once Spearhead (the wedge), the network agent, and the report agent are solid.

NOTE · The agent roster is open-ended on purpose. The orchestrator does not care how many agents exist or what they are called; it only cares that each one advertises what techniques it can run.
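A roster that only cares about advertised techniques could look like the sketch below. The `Agent` trait, `Roster`, and `NetworkAgent` are hypothetical names, not the Kerox agent protocol, and `run` only returns a dry-run description instead of doing anything.

```rust
use std::collections::HashMap;

pub trait Agent {
    fn name(&self) -> &str;
    /// Technique IDs this agent advertises it can run.
    fn techniques(&self) -> Vec<&'static str>;
    /// Describe what would be run; this sketch never touches a target.
    fn run(&self, technique: &str, target: &str) -> String;
}

pub struct NetworkAgent;

impl Agent for NetworkAgent {
    fn name(&self) -> &str {
        "network"
    }
    fn techniques(&self) -> Vec<&'static str> {
        vec!["T1595", "T1021"]
    }
    fn run(&self, technique: &str, target: &str) -> String {
        format!("[dry-run] {} would run {} against {}", self.name(), technique, target)
    }
}

pub struct Roster {
    agents: Vec<Box<dyn Agent>>,
    by_technique: HashMap<&'static str, usize>,
}

impl Roster {
    pub fn new() -> Self {
        Roster { agents: Vec::new(), by_technique: HashMap::new() }
    }

    /// Index the agent under every technique it advertises (first one wins).
    pub fn register(&mut self, agent: Box<dyn Agent>) {
        let index = self.agents.len();
        for technique in agent.techniques() {
            self.by_technique.entry(technique).or_insert(index);
        }
        self.agents.push(agent);
    }

    /// Dispatch by technique ID; `None` when no agent advertises it.
    pub fn dispatch(&self, technique: &str, target: &str) -> Option<String> {
        let index = *self.by_technique.get(technique)?;
        Some(self.agents[index].run(technique, target))
    }
}
```

Because dispatch is keyed on the technique and not on the agent's name, a replacement network agent can be registered without any change on the orchestrator side.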

CHAPTER 05

Engagement discipline

ETA · Q1 2026

This is the part that separates a red team from a vandal. Before a packet goes on the wire, Kerox writes the engagement down, and it is built to refuse to step outside what it wrote.

The engagement package

›Rules of Engagement (RoE) — what is in scope, what is off-limits, the hours, the hard stops
›ConOps — the concept of operations: what we are trying to achieve and how, in plain language
›Deconfliction Plan — how to tell our activity apart from a real incident, and who to call
›OPPLAN — the operational plan, with each intended action mapped to MITRE ATT&CK

The package is generated first, reviewed by a human, and then treated as the boundary for everything that follows. An action that is not covered by the package does not run.

# illustrative scope file — the boundary, not a suggestion
scope:
  in:   ["10.10.0.0/24", "app.example-lab.internal"]
  out:  ["*.prod.example.com", "anything not listed"]
  hours: "Mon–Fri 09:00–17:00, operator timezone"
  hard_stops:
    - "any sign of real user impact"
    - "any host outside 'in'"

NOTE · Authorization is not a formality here. Kerox is meant to be run against systems you are explicitly permitted to test, inside the scope you wrote down. That constraint is load-bearing.

CHAPTER 06

The human gate

ETA · Q1 2026

Autonomy without a brake is just a liability. Kerox is autonomous in how it reasons and plans, but every action is dry-run by default, and a live action stops for explicit human approval before it touches anything real.

Dry-run is the default, not an option

In its planning mode, Kerox produces the full chain — every technique, every command, every target — without sending a thing. You read the plan. You approve the steps you want. Only then does a live action go out, and only the steps you approved.

$ krx run --plan engagement.kx        # build the chain, send nothing
  → 11 steps planned · 0 executed · review required

$ krx approve --step 03               # arm a single live action
  → step 03 armed · scope check passed

$ krx run --execute --step 03         # fire only what was approved
  → [HOLD] confirm: live action against 10.10.0.14 ? [y/N]

›Scope is re-checked at execution time, not just at plan time
›Approvals are per-step, not a single blanket "go"
›The default answer to every prompt is no
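The three rules above can be sketched as one authorization function. `Gate`, `Refusal`, and `confirmed` are hypothetical names, and the scope result is passed in as a boolean so the example stays self-contained; it is an illustration of the intended behavior, not the shipped gate.

```rust
use std::collections::HashSet;

/// Only an explicit "y" or "yes" counts; empty input and anything else mean no.
pub fn confirmed(answer: &str) -> bool {
    let answer = answer.trim();
    answer.eq_ignore_ascii_case("y") || answer.eq_ignore_ascii_case("yes")
}

#[derive(Debug, PartialEq, Eq)]
pub enum Refusal {
    NotApproved,
    OutOfScope,
    NotConfirmed,
}

pub struct Gate {
    approved: HashSet<u32>,
}

impl Gate {
    pub fn new() -> Self {
        Gate { approved: HashSet::new() }
    }

    /// Approvals are recorded per step; there is no blanket "go".
    pub fn approve(&mut self, step: u32) {
        self.approved.insert(step);
    }

    /// Checked at execution time: approval, then scope, then the final prompt.
    pub fn authorize(&self, step: u32, in_scope_now: bool, answer: &str) -> Result<(), Refusal> {
        if !self.approved.contains(&step) {
            return Err(Refusal::NotApproved);
        }
        if !in_scope_now {
            return Err(Refusal::OutOfScope);
        }
        if !confirmed(answer) {
            return Err(Refusal::NotConfirmed);
        }
        Ok(())
    }
}
```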

CHAPTER 07

Interactive tooling

ETA · Q3 2026

Real offensive tools are interactive. They drop you into a console, ask questions, and keep state across a session. A lot of automation pretends this isn't true and scripts around it; Kerox is being built to sit inside it instead.

Persistent terminal sessions

The agents drive tools like msfconsole, sliver-client, and evil-winrm inside persistent terminal sessions — sending input, reading output, and answering interactive prompts the way a person at the keyboard would. A session that is established stays established; a pivot does not drop because the automation forgot to hold the handle.

›Tools run in their native console, not behind a brittle one-shot wrapper
›Interactive prompts are handled, not avoided
›Session state survives across steps in the chain
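The idea of holding one handle across several steps can be shown with the standard library alone. This is a deliberately reduced sketch: `Session` is a hypothetical type, it uses plain pipes and a line-per-reply protocol, and `cat` stands in for a real console. Tools such as msfconsole generally expect a pseudo-terminal, which `std` does not provide, so a real harness would need a PTY layer on top of this.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

/// A long-lived child process whose stdin and stdout stay open between steps.
pub struct Session {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
}

impl Session {
    pub fn spawn(program: &str, args: &[&str]) -> io::Result<Session> {
        let mut child = Command::new(program)
            .args(args)
            .stdin(Stdio::piped())
            .stdout(Stdio::piped())
            .spawn()?;
        let stdin = child
            .stdin
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdin unavailable"))?;
        let stdout = child
            .stdout
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdout unavailable"))?;
        Ok(Session { child, stdin, stdout: BufReader::new(stdout) })
    }

    /// Send one line and read one line back; the process keeps running afterwards.
    pub fn send_line(&mut self, line: &str) -> io::Result<String> {
        writeln!(self.stdin, "{}", line)?;
        self.stdin.flush()?;
        let mut reply = String::new();
        self.stdout.read_line(&mut reply)?;
        Ok(reply.trim_end().to_string())
    }
}

impl Drop for Session {
    /// Tear the process down only when the session itself is dropped.
    fn drop(&mut self) {
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}
```

On a Unix-like system, `Session::spawn("cat", &[])` followed by two `send_line` calls answers both over the same process, which is the "session state survives across steps" property in miniature.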

NOTE · This is harder than shelling out once and parsing stdout, and that is the point. The tools were built to be driven by a human at a prompt; Kerox is built to be that driver.

CHAPTER 08

The sandbox

ETA · Q3 2026

Offense stays in a box. Operations are designed to run inside an isolated Kali sandbox on a dedicated operational network, kept separate from the management plane that drives the engagement.

Two planes, on purpose

›Management plane — where you read plans, give approvals, and read results. It never touches a target.
›Operational plane — the Kali sandbox on its own network, where the agents and their tools actually run.

The split means the machine you sit at is not the machine that runs exploits, and the design goal is that a target cannot reach back past the sandbox to the place where decisions are made.

NOTE · Isolation is a design goal, not a finished guarantee. Treat the sandbox like any other lab boundary: necessary, and not a substitute for running only against scope you are authorized to touch.

CHAPTER 09

Attack → defend → verify

ETA · Q4 2026

The reason to build a disciplined attacker is to make defense better. Kerox is designed around a loop that turns each finding into a concrete defensive improvement and then proves the improvement actually holds.

The loop

›Attack — the chain reaches the objective, or gets as far as it can, and records exactly how
›Defend — each step becomes a candidate fix: a detection, a control, a config change
›Verify — re-run the same step against the hardened system and confirm it now fails

A finding you cannot reproduce is a rumor; a fix you cannot verify is a hope. Closing the loop is what makes the offense worth doing.
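The verify step reduces to comparing the same step before and after the fix. A minimal sketch, with `Outcome`, `Verdict`, and `verify` as hypothetical names:

```rust
/// Result of running one step of the chain.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Succeeded,
    Blocked,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    /// The step worked before the fix and fails after it.
    FixVerified,
    /// The step still works against the hardened system.
    FixIneffective,
    /// The step never worked, so there is nothing to verify.
    NotReproduced,
}

pub fn verify(before: Outcome, after: Outcome) -> Verdict {
    match (before, after) {
        (Outcome::Succeeded, Outcome::Blocked) => Verdict::FixVerified,
        (Outcome::Succeeded, Outcome::Succeeded) => Verdict::FixIneffective,
        (Outcome::Blocked, _) => Verdict::NotReproduced,
    }
}
```

`NotReproduced` is kept separate on purpose: a step that was already blocked says nothing about whether the fix did anything.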

CHAPTER 10

The krx CLI

ETA · Q4 2026

The whole thing is driven from one terminal command: krx. The verbs below are a design sketch — they will change — but they show the intended shape: plan, review, approve, run, report.

›krx plan — generate the engagement package and the dry-run attack chain
›krx scope — validate a target against the authorized scope file
›krx spearhead — point the LLM agent at a target and map findings to ATLAS
›krx approve — arm a specific step for live execution
›krx run — execute approved steps, or the whole chain in dry-run
›krx verify — re-run a step against the hardened system to confirm the fix
›krx report — write the engagement up, ATT&CK- and ATLAS-mapped
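The verb list above maps naturally onto an enum. The sketch below uses only the standard library and parses nothing but the verb; `Verb` and `parse_verb` are hypothetical names, and flags such as `--step` are ignored because their final shape is not settled.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Verb {
    Plan,
    Scope,
    Spearhead,
    Approve,
    Run,
    Verify,
    Report,
}

/// Parse the first argument after the program name into a verb.
pub fn parse_verb(args: &[&str]) -> Result<Verb, String> {
    match args.first().copied() {
        Some("plan") => Ok(Verb::Plan),
        Some("scope") => Ok(Verb::Scope),
        Some("spearhead") => Ok(Verb::Spearhead),
        Some("approve") => Ok(Verb::Approve),
        Some("run") => Ok(Verb::Run),
        Some("verify") => Ok(Verb::Verify),
        Some("report") => Ok(Verb::Report),
        // Unknown verbs are rejected instead of being guessed at.
        Some(other) => Err(format!("unknown verb: {}", other)),
        None => Err("usage: krx <verb> [options]".to_string()),
    }
}
```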

NOTE · There is no public release of krx yet. The commands here describe where it is headed, not something you can install today.

CHAPTER 11

Tech stack

Planned, and subject to change as the build teaches us things. The bias is toward a small number of well-understood pieces over a large framework.

›Rust — the control plane, the orchestrator, and the agent harness; one static binary
›Terminal-first — a TUI for plans, approvals, and live session output
›Vendor-neutral LLM access — Spearhead is not tied to a single model provider
›Kali — the operational sandbox image and its tooling
›Existing offensive tools — driven, not reinvented: msfconsole, sliver-client, evil-winrm, and friends
›MITRE ATT&CK + ATLAS, OWASP LLM Top 10 — the mapping vocabulary, not dependencies

CHAPTER 12

Authorization & ethics

Kerox is an offensive tool, and offensive tools have to be honest about what they are for. It is built to be run by people with permission, against systems they are allowed to test, inside a scope they wrote down. The discipline chapters above are not decoration — they are the whole design constraint.

What that means in practice

›Authorized scope only. No scope file, no run.
›Dry-run by default. Live actions need an explicit human yes.
›Operations stay in the sandbox, on the operational network.
›Findings exist to be fixed, not collected.

Reporting an issue in Kerox itself

When there is something to report, send it to security@kerox.dev. A PGP key is published at kerox.dev/.well-known/pgp-key.txt. Responsible disclosure, the same way we would want it.

CHAPTER 13

Contributing & RFCs

KeroxLabs builds in the open. Patches, bug reports, and hard questions about doing offense responsibly are all welcome. The bar is technical; the response is fast.

RFC process

›Anything that touches the agent protocol, the engagement-package format, or the gate requires an RFC
›RFCs live in docs/rfcs/RFC-XXXX-<slug>.md
›Discuss first in the forum (coming soon) or via an issue, then open a draft PR with the RFC
›Maintainer sign-off + a comment window before merge

Code style

›cargo clippy -- -D warnings is mandatory
›cargo fmt --check is mandatory
›No unsafe without a // SAFETY: comment
›Anything that can take a live action gets a test that proves it cannot without approval
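One way to satisfy the last rule is to make approval a value the type system demands. In this sketch (hypothetical module and names, not the Kerox codebase) `execute` only accepts an `Approved` token, and the token's field is private to the module, so code outside it cannot build one without going through `approve`.

```rust
pub mod gate {
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Step {
        pub id: u32,
        pub target: String,
    }

    /// Proof that a human approved the step. The inner field is private.
    #[derive(Debug)]
    pub struct Approved(Step);

    /// The only way to obtain an `Approved` token.
    pub fn approve(step: Step, human_said_yes: bool) -> Option<Approved> {
        if human_said_yes {
            Some(Approved(step))
        } else {
            None
        }
    }

    /// Stand-in for a live action: it can only be called with a token.
    pub fn execute(approved: Approved) -> String {
        format!("step {:02} fired against {}", approved.0.id, approved.0.target)
    }
}

#[cfg(test)]
mod tests {
    use super::gate::{approve, execute, Step};

    #[test]
    fn live_action_requires_approval() {
        let step = Step { id: 3, target: "10.10.0.14".to_string() };
        // Without a yes there is no token, and without a token `execute` cannot be called.
        assert!(approve(step.clone(), false).is_none());
        let token = approve(step, true).expect("approved step yields a token");
        assert_eq!(execute(token), "step 03 fired against 10.10.0.14");
    }
}
```

The runtime test covers the refusal path; the compile-time half is that `gate::execute(step)` with a bare `Step` does not type-check at all.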

CHAPTER 14

Glossary

›orchestrator — the component that reads the plan, sequences the attack chain, and owns the human gate.
›agent — a specialist that runs one domain of the work (spearhead, network, report; web and appsec later).
›attack chain — an ordered sequence of techniques that move from entry toward an objective.
›RoE — Rules of Engagement. What is in scope, what is off-limits, when, and the hard stops.
›ConOps — Concept of Operations. What the engagement is trying to do, in plain language.
›deconfliction — telling authorized test activity apart from a real incident.
›OPPLAN — the operational plan, with each intended action mapped to MITRE ATT&CK.
›dry-run — plan and display an action without executing it. The default mode.
›MITRE ATT&CK — the framework of adversary techniques used to tag the conventional chain.
›MITRE ATLAS — the same idea for attacks against AI/ML systems; Spearhead maps to it.
›OWASP LLM Top 10 — the common catalog of LLM application risks Spearhead reports against.

That is the whole book — for now.

The rest is being written as we build. Subscribe to the forum when it opens, follow the org, or just check back. Notes that sharpen something here are welcome — this book is in the repo too.

CONTACT · CONTACT@KEROX.DEV