Every prompt passes
through RAE.
RAE sits in-line as a Docker node inside your network boundary. Every prompt your application sends — and every response the model returns — passes through the three-tier pipeline before reaching its destination.
Request topology
Step 01
End user
Step 02
Customer application
Step 03
RAE node — on-prem Docker
Your network
boundary
Step 04
LLM provider
Fig. 01 — in-line proxy topology
Because RAE is an HTTP proxy, integration takes minutes. Your application points at the RAE node instead of the LLM provider — everything else stays the same. Raw prompts stay inside your perimeter; the cloud control plane receives only metadata: category, confidence, detector votes, timestamp, latency.
Three integration paths
OpenAI-compatible proxy
Change your base URL. No other code changes.
TypeScript / Python SDK
Wraps your agent calls directly.
Sidecar mode
Non-HTTP agents and custom orchestration.
Three tiers. One decision.
Most traffic exits at the hot tier without touching the model. The cold tier is used sparingly — only when the warm tier explicitly escalates.
1
Hot
Rule engine
Known attack signatures. Most traffic exits here.
μs
microseconds
2
Warm
Small LLM · 5 detectors
Consensus check. Acts when 3 of 5 detector heads agree.
ms
milliseconds
3
Cold
Full LLM reasoning
Edge cases the warm tier escalates. Used sparingly.
s
seconds
Four actions. One layer.
RAE acts on every prompt and response. What it does depends on what the detectors find — and your configuration.
No. 01
Observe.
every request.
Let it through, log metadata. Shadow rollout mode — RAE watches before it acts, building a baseline of your agent's normal behaviour.
No. 02
Block.
the threat.
Reject the prompt, return a safe refusal before it reaches your model. 3 of 5 detector consensus required before RAE acts.
No. 03
Live-time correctionCorrect.
in real time.
Rewrite the prompt or response to neutralize the attack while preserving legitimate intent. The live-time correction that firewalls cannot do — your user gets a response, not a refusal.
No. 04
Harden.
for next time.
Generate a defense overlay prepended to your system prompt at runtime. Stored separately, versioned, reversible. Every blocked attack makes your agent stronger.
Each blocked attack
makes the next
stronger.
Every blocked attack generates a metadata signal — not the raw prompt, never your data. That signal feeds back into the detector training pipeline.
01
RAE blocks an attack
Metadata recorded — attack category, confidence, timestamp. Raw prompt stays on-prem. Your data never leaves your network.
02
Anonymised metadata joins the next training run
No customer data shared. No raw prompts leave your infrastructure. Only signal: category, confidence, detector votes.
03
Five detector heads retrain on cadence
Coverage expands. New attack variants are absorbed automatically. The warm tier gets sharper on every cycle.
04
Every deployment benefits
Your RAE node learns from every threat every customer has ever faced — without any of their data leaving their network.