— Product · Runtime Defense

The node between your agent
and the model.

RAE runs on-prem as a Docker container inside your network. Between your application and your LLM provider. Raw prompts never leave. Only metadata reaches the cloud.

On-prem

Runs in your network

5

Detector heads

<50ms

p95 overhead

4

Actions

— Section I · Request Flow

Every prompt passes
through RAE.

RAE sits in-line as a Docker node inside your network boundary. Every prompt your application sends — and every response the model returns — passes through the three-tier pipeline before reaching its destination.

Request topology

Step 01

End user

passes through

Step 02

Customer application

passes through

Step 03

RAE node — on-prem Docker

Your network
boundary

passes through

Step 04

LLM provider

Fig. 01 — in-line proxy topology

Because RAE is an HTTP proxy, integration takes minutes. Your application points at the RAE node instead of the LLM provider — everything else stays the same. Raw prompts stay inside your perimeter; the cloud control plane receives only metadata: category, confidence, detector votes, timestamp, latency.

Three integration paths

OpenAI-compatible proxy

Change your base URL. No other code changes.

Primary

TypeScript / Python SDK

Wraps your agent calls directly.

Sidecar mode

Non-HTTP agents and custom orchestration.

— Section II · Detection Tiers

Three tiers. One decision.

Most traffic exits at the hot tier without touching the model. The cold tier is used sparingly — only when the warm tier explicitly escalates.

1

Hot

Rule engine

Known attack signatures. Most traffic exits here.

μs

microseconds

2

Warm

Small LLM · 5 detectors

Consensus check. Acts when 3 of 5 detector heads agree.

ms

milliseconds

3

Cold

Full LLM reasoning

Edge cases the warm tier escalates. Used sparingly.

s

seconds

— Section III · Actions

Four actions. One layer.

RAE acts on every prompt and response. What it does depends on what the detectors find — and your configuration.

No. 01

Observe.

every request.

Let it through, log metadata. Shadow rollout mode — RAE watches before it acts, building a baseline of your agent's normal behaviour.

No. 02

Block.

the threat.

Reject the prompt, return a safe refusal before it reaches your model. 3 of 5 detector consensus required before RAE acts.

No. 03

Live-time correction

Correct.

in real time.

Rewrite the prompt or response to neutralize the attack while preserving legitimate intent. The live-time correction that firewalls cannot do — your user gets a response, not a refusal.

No. 04

Harden.

for next time.

Generate a defense overlay prepended to your system prompt at runtime. Stored separately, versioned, reversible. Every blocked attack makes your agent stronger.

— Section IV · Flywheel

Each blocked attack
makes the next
stronger.

Every blocked attack generates a metadata signal — not the raw prompt, never your data. That signal feeds back into the detector training pipeline.

01

RAE blocks an attack

Metadata recorded — attack category, confidence, timestamp. Raw prompt stays on-prem. Your data never leaves your network.

02

Anonymised metadata joins the next training run

No customer data shared. No raw prompts leave your infrastructure. Only signal: category, confidence, detector votes.

03

Five detector heads retrain on cadence

Coverage expands. New attack variants are absorbed automatically. The warm tier gets sharper on every cycle.

04

Every deployment benefits

Your RAE node learns from every threat every customer has ever faced — without any of their data leaving their network.

— Get started

Run a free audit on your agent.

Run free audit →Book a demo