Gateway proof

Fortress puts a boundary in front of the model.

podman-llama-fortress exists because local models still need a gate. Prompt handling, policy checks, and audit visibility should not be left to the model runtime itself.

Why it exists

A local-first model stack is still a software boundary problem. The useful question is not whether the model is smart but whether the path into it is controlled.

Threat model

Unbounded prompts create invisible risk.

Direct model access makes it too easy to blur application intent, operator intent, and model behavior. Fortress exists to separate those concerns before any output is trusted.

Countermeasure

Put a governed gateway in the middle.

Requests come through one path, get checked, and either pass or stop. The point is not theatre; it is to make the control point visible and testable.
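
As a rough illustration of "checked, and either pass or stop", here is a minimal Python sketch of a policy check. It is not Fortress's actual rule set: the `BLOCKED_PATTERNS` list, the `Decision` type, and the `check_prompt` function are hypothetical names chosen for the example, and a simple substring deny-list is assumed only because it is easy to test.

```python
from dataclasses import dataclass

# Hypothetical deny-list for illustration; the real policy rules are not
# described in this document.
BLOCKED_PATTERNS = (
    "ignore previous instructions",
    "reveal the system prompt",
)


@dataclass(frozen=True)
class Decision:
    """Outcome of a single policy check: pass or stop, with a reason."""
    allowed: bool
    reason: str


def check_prompt(prompt: str) -> Decision:
    """Return a pass/stop decision for one prompt (case-insensitive match)."""
    lowered = prompt.lower()
    for pattern in BLOCKED_PATTERNS:
        if pattern in lowered:
            return Decision(allowed=False, reason="unsafe_prompt_pattern")
    return Decision(allowed=True, reason="ok")
```

Because the check is a pure function that returns an explicit decision, the control point can be exercised in a unit test without a model running at all.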

Architecture block

The important shape is simple: prompt in, gate first, model second, audit always.

User or system prompt: The request enters through one visible ingress.
Fortress gateway: Policy checks, metadata capture, and routing logic happen here.
Local model via Podman: Only approved requests reach the runtime.
Audited response: Allowed output returns with a trace; blocked output stops with evidence.
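
The four stages above can be sketched as one function, assuming the gate and the model are passed in as plain callables. Everything named here (`handle_request`, `is_allowed`, `call_model`, `audit_log`) is hypothetical and stands in for whatever Fortress actually uses; the event fields are borrowed from the sample audit event on this page.

```python
from typing import Callable, Optional


def handle_request(
    request_id: str,
    prompt: str,
    is_allowed: Callable[[str], bool],   # the gate (assumed interface)
    call_model: Callable[[str], str],    # the local runtime (assumed interface)
    audit_log: list,
) -> Optional[str]:
    """Prompt in, gate first, model second, audit always."""
    # Start from a fail-closed record: blocked until the gate says otherwise.
    event = {
        "request_id": request_id,
        "route": "/v1/generate",
        "policy_result": "blocked",
        "model_called": False,
    }
    try:
        if not is_allowed(prompt):
            event["reason"] = "unsafe_prompt_pattern"
            return None  # stop: the model is never reached
        event["policy_result"] = "allowed"
        event["model_called"] = True  # recorded before the call, in case it raises
        return call_model(prompt)
    finally:
        # Runs on every path, including exceptions, so no request goes unaudited.
        audit_log.append(event)
```

The `try`/`finally` is the design choice worth noting: the audit record is written whether the request passes, stops, or fails, which is what "audit always" asks for.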

Concrete artifacts

The point of a gateway is that it produces evidence you can actually inspect.

Representative audit event
{
  "request_id": "fortress-demo-014",
  "route": "/v1/generate",
  "policy_result": "blocked",
  "reason": "unsafe_prompt_pattern",
  "model_called": false
}
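
One way to inspect evidence like this is to scan the audit trail for events that contradict the boundary, for example a request marked blocked where the model was called anyway. The sketch below assumes the events are stored one JSON object per line, which this page does not state, and `boundary_violations` is a hypothetical helper name.

```python
import json


def boundary_violations(audit_lines):
    """Return request IDs of events marked blocked where the model still ran."""
    violations = []
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        if event.get("policy_result") == "blocked" and event.get("model_called"):
            violations.append(event.get("request_id"))
    return violations


# The representative event above, as a single log line.
sample = (
    '{"request_id": "fortress-demo-014", "route": "/v1/generate", '
    '"policy_result": "blocked", "reason": "unsafe_prompt_pattern", '
    '"model_called": false}'
)
assert boundary_violations([sample]) == []
```

An empty result means every blocked request in the scanned lines stopped before the runtime.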

Operator outcome

The value is not abstract “safety”. It is the ability to prove what was allowed, what was refused, and where the boundary lives in the stack.