Guardrail Types
Building robust guardrails around AI agents requires a combination of precise, deterministic, continuously updated rules and AI-evaluated, intent-based policies. Simple, one-off command blacklists cannot provide the nuanced, context-aware risk management that modern agentic workflows demand, so we built our own library of operations and matchers to cover the most common risks, while exposing a flexible, purely AI-inference-based "policy" interface for broader, more complex cases.
Deterministic Rules
Rules determine whether to require approval for or deny a tool call based on whether any of a set of deterministic conditions is satisfied. Because a single business intent can surface through many different tools, good rules expand into multiple execution paths so agents cannot trivially route around one blocked command. To help our system recommend robust rules, Stoplight's operation library packages common intents like file access, uploads, database changes, and destructive actions into reusable coverage matchers that serve as the preferred building block.
The examples below show what Stoplight's recommendation engine might suggest based on real agent telemetry. Each rule bundles multiple matchers that target the same business risk.
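As a minimal sketch (not Stoplight's actual engine), a rule can be modeled as a set of matcher predicates over a tool call, where any single satisfied matcher triggers the rule's action. All names below are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A matcher is a predicate over a tool call (hypothetical representation).
Matcher = Callable[[dict], bool]

@dataclass
class Rule:
    matchers: list   # list of Matcher predicates covering one business intent
    action: str      # "approve", "ask", or "deny"

def evaluate(rule: Rule, tool_call: dict) -> Optional[str]:
    """Return the rule's action if ANY matcher fires, else None (no opinion)."""
    if any(matcher(tool_call) for matcher in rule.matchers):
        return rule.action
    return None

# Two matchers covering the same upload intent through different CLI tools.
starts_with_curl = lambda call: call.get("command", "").startswith("curl")
starts_with_wget = lambda call: call.get("command", "").startswith("wget")

upload_rule = Rule(matchers=[starts_with_curl, starts_with_wget], action="ask")
```

Because the rule fires on any matcher, adding coverage for a new tool is just appending another predicate rather than authoring a new rule.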
You can view the full list of matchers in the reference page.
A single blocked command is rarely enough
Agents can adapt unpredictably. If a narrow control blocks only one command prefix, the model can often pivot to a different CLI, a web tool, or an MCP integration that accomplishes the same result.
Naive rule
Blocking only `curl` is not a reliable way to stop network egress. An agent can reach for other tools immediately.
command_starts_with: "curl"
action: "deny"

Not covered:
- wget and other download clients
- scp, rsync, or SSH-based uploads
- Language runtimes like `python -c` or `node -e`
- Provider-native web tools and MCP integrations
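The gap is easy to demonstrate. In this hypothetical sketch, a curl-only prefix check lets every alternative egress path straight through:

```python
def naive_blocklist(command: str) -> str:
    # Mirrors the naive rule above: deny only commands starting with "curl".
    return "deny" if command.startswith("curl") else "allow"

# Each of these achieves the same network egress but evades the prefix check.
bypasses = [
    "wget https://api.example.com/upload",
    "scp secrets.txt user@api.example.com:/tmp",
    'python -c "import urllib.request; ..."',
]
results = [naive_blocklist(cmd) for cmd in bypasses]
```

Every command in `bypasses` comes back `"allow"`, even though each one performs the upload the rule was meant to stop.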
Broad rule
A reusable operation backed by multiple matchers covers the higher-level intent instead of one literal command prefix.
operation: "net.upload"
target: "api.example.com"
action: "ask"

Intent-based Policies
Policies are natural-language guardrails evaluated by AI at runtime. Each policy carries a prompt that an LLM uses to decide whether a particular tool call should be approved, escalated, or denied, drawing on a short history of prior tool calls to grasp the agent's overall intent. This guardrail type is most useful when deterministic matching is too brittle, too narrow, or too expensive to maintain for every edge case.
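One way to picture the runtime evaluation: the policy prompt, a short window of prior tool calls, and the pending call are assembled into a single judgment request for the LLM. The function below is an illustrative sketch, not Stoplight's actual API:

```python
def build_policy_prompt(policy_text: str, history: list, tool_call: dict) -> str:
    """Assemble the context an LLM judge would see: the policy text, the
    last few tool calls (for intent), and the pending call. All field
    names here are hypothetical."""
    recent = "\n".join(f"- {h['tool']}: {h['args']}" for h in history[-5:])
    return (
        f"Policy: {policy_text}\n"
        f"Recent tool calls:\n{recent}\n"
        f"Pending call: {tool_call['tool']}: {tool_call['args']}\n"
        "Answer with one of: approve, ask, deny."
    )
```

Capping the history window keeps the judgment request small and fast while still giving the model enough trajectory to infer intent.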
To keep policies from dragging down agent performance, particularly as they grow in number over time, each policy can be scoped to trigger only on specific operations or events, using a trigger library analogous to the matcher library for rules. Policy evaluation also runs on fast inference, so it is nearly invisible to end users.
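Scoping can be sketched as a cheap pre-check that skips the expensive LLM evaluation entirely when a policy's triggers do not match the current operation; the `triggers` field and its semantics here are assumptions for illustration:

```python
def should_evaluate(policy: dict, operation: str) -> bool:
    """Run the (slow) LLM judgment only when the tool call's operation
    matches one of the policy's triggers. A policy without triggers is
    treated as global and always evaluated."""
    triggers = policy.get("triggers")
    return triggers is None or operation in triggers
```

With this pre-filter, adding more policies only adds LLM latency for the operations those policies actually watch.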
Improved guardrail recommendations over time
Stoplight's AI learns from the guardrails you reject and uses that feedback to make more targeted recommendations over time. This feedback loop helps the AI understand your agent's scope and your organization's risk tolerance, minimizing noisy, repeated guardrail suggestions.
Stoplight selects the best guardrail type for you
The recommendation pipeline automatically decides whether a control belongs in a deterministic rule or in an AI policy. Over time, rules may be merged or resurfaced as policy recommendations as the system explores better ways to cover your agent's behavior.