Security Model
CKP applies a defense-in-depth approach to agent security. Seven layers work together so that no single compromise leads to full agent takeover. Each layer narrows what the agent can do, and permissions can only flow downward.
The 7 layers
    Layer 1  Channel   ── Access control (who can talk to the agent)
    Layer 2  Policy    ── Rule engine (what is allowed / denied / requires approval)
    Layer 3  Sandbox   ── Isolation (where tools execute)
    Layer 4  Provider  ── Token governance (how much the agent can spend)
    Layer 5  Memory    ── Data scoping (what the agent remembers and where)
    Layer 6  Swarm     ── Trust boundaries (which agents can collaborate)
    Layer 7  Identity  ── Autonomy level (how much freedom the agent has)

Each layer can grant only the permissions that its parent allows. Tool permissions cannot exceed Sandbox permissions, Sandbox cannot exceed Policy, Policy cannot exceed Identity, and so on.
Trust hierarchy
    Human (highest trust)
    └── Channel (authenticated session)
        └── Identity (declared autonomy level)
            └── Policy (behavioral rules)
                └── Sandbox (execution constraints)
                    └── Tool (lowest trust, most constrained)

A Tool can never do more than its Sandbox allows. A Sandbox can never permit more than its Policy allows. This ensures that adding a new tool cannot accidentally bypass security boundaries.
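One way to picture the narrowing rule is as set intersection down the trust chain. The sketch below is only an illustration under that assumption, not CKP's implementation: the function name `effective_permissions` and the permission strings are hypothetical.

```python
from functools import reduce


def effective_permissions(chain: list[set[str]]) -> set[str]:
    """Intersect permission sets from the most trusted level down.

    Each level keeps only what every level above it already allows,
    so a lower level can never widen the result.
    """
    if not chain:
        return set()
    return reduce(lambda allowed, level: allowed & level, chain)


# Hypothetical permission names, ordered Identity -> Policy -> Sandbox -> Tool.
identity = {"fs:read", "fs:write", "net:fetch"}
policy = {"fs:read", "net:fetch"}
sandbox = {"fs:read"}
tool = {"fs:read", "fs:write"}  # the tool asks for more than its Sandbox allows

granted = effective_permissions([identity, policy, sandbox, tool])
print(granted)
```

Under this model the tool's request for `fs:write` is dropped, because neither the Policy nor the Sandbox set includes it.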
Threat model
| Threat | Layer | Mitigation |
|---|---|---|
| Unauthorized human access | Channel | Allowlists, authentication, rate limiting |
| Prompt injection | Policy | Pattern-based detection + LLM-based detection |
| Secret exfiltration | Sandbox + Policy | Leak scanning, network restrictions |
| SSRF attacks | Sandbox | DNS pinning, IP blocking, allowlisted domains |
| Destructive execution | Policy | Approval gates (require-approval action) |
| Cost runaway | Provider | Token limits, cost caps, daily quotas |
| Cross-agent contamination | Swarm + Memory | Memory scoping per agent, trust boundaries |
| Malicious imported skills | Skill + Sandbox | Skill permissions declaration, sandboxed execution |
| Privilege escalation | Sandbox | Container/WASM isolation, resource limits |
Key mechanisms
Access control (Channel)
Channels define who can interact with the agent. Each channel supports an access_control block:
    channels:
      - name: slack-team
        kind: slack
        access_control:
          mode: allowlist
          identifiers: ["U12345", "U67890"]

Modes: allowlist (explicit allow), denylist (explicit deny), or implementation-defined defaults.
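A check against such a block might look like the following sketch. The function name `is_allowed` is hypothetical, and failing closed on an unrecognized mode is an assumption, since the defaults are left to the implementation.

```python
def is_allowed(access_control: dict, sender_id: str) -> bool:
    """Return True if sender_id may talk to the agent on this channel."""
    mode = access_control.get("mode")
    identifiers = set(access_control.get("identifiers", []))
    if mode == "allowlist":
        return sender_id in identifiers
    if mode == "denylist":
        return sender_id not in identifiers
    # Assumption: unknown or missing modes fail closed. The source leaves
    # this default to the implementation.
    return False


slack_team = {"mode": "allowlist", "identifiers": ["U12345", "U67890"]}
print(is_allowed(slack_team, "U12345"))
print(is_allowed(slack_team, "U99999"))
```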
Rule engine (Policy)
Policies define what the agent can do. Rules match patterns and apply actions:
    policies:
      - name: security
        rules:
          - pattern: "tool:file-delete"
            action: deny
          - pattern: "tool:web-fetch"
            action: require-approval
          - pattern: "tool:calculator"
            action: allow
        prompt_injection:
          detection: pattern
          action: deny

Actions: allow, deny, require-approval, audit-only.
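Rule evaluation could be sketched as below. The source does not specify the pattern syntax, the match order, or the fallback action, so three things here are assumptions: glob-style matching via `fnmatchcase`, first-match-wins ordering, and a default of deny when no rule matches. The name `evaluate` is hypothetical.

```python
from fnmatch import fnmatchcase

# The rules from the example policy above.
RULES = [
    {"pattern": "tool:file-delete", "action": "deny"},
    {"pattern": "tool:web-fetch", "action": "require-approval"},
    {"pattern": "tool:calculator", "action": "allow"},
]


def evaluate(rules: list[dict], subject: str, default: str = "deny") -> str:
    """Return the action of the first rule whose pattern matches subject."""
    for rule in rules:
        if fnmatchcase(subject, rule["pattern"]):
            return rule["action"]
    # Assumption: anything not explicitly covered by a rule is denied.
    return default


print(evaluate(RULES, "tool:web-fetch"))
print(evaluate(RULES, "tool:shell"))
```

Defaulting to deny keeps a newly added tool blocked until a rule explicitly covers it, which matches the document's goal that new tools cannot bypass existing boundaries.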
Execution isolation (Sandbox)
Sandboxes define where and how tools execute:
    sandbox:
      level: container
      limits:
        memory_mb: 512
        cpu_shares: 256
        timeout_ms: 30000
      network:
        mode: restricted
        allowed_domains: ["api.example.com"]

Levels (ascending isolation): none, process, wasm, container, vm.
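Two checks a runtime might derive from this block are sketched below: comparing isolation levels by their position in the ordering, and testing an outbound URL against `allowed_domains`. Both helper names are hypothetical, and the semantics of `restricted` (exact hostname match, everything else refused) are an assumption rather than documented behavior.

```python
from urllib.parse import urlsplit

# Ascending isolation, in the order given in the text.
LEVELS = ["none", "process", "wasm", "container", "vm"]


def meets_minimum(level: str, minimum: str) -> bool:
    """True if level isolates at least as strongly as minimum."""
    return LEVELS.index(level) >= LEVELS.index(minimum)


def host_allowed(url: str, network: dict) -> bool:
    """Assumed semantics: 'restricted' permits only exact listed hostnames."""
    if network.get("mode") != "restricted":
        # Other modes are not described in the source; fail closed here.
        return False
    host = urlsplit(url).hostname
    return host is not None and host in network.get("allowed_domains", [])


network = {"mode": "restricted", "allowed_domains": ["api.example.com"]}
print(meets_minimum("container", "wasm"))
print(host_allowed("https://api.example.com/v1/items", network))
print(host_allowed("https://evil.example.net/", network))
```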
Token governance (Provider)
Providers enforce how much the agent can spend:
    providers:
      - name: claude
        kind: anthropic
        model: claude-sonnet-4-20250514
        limits:
          max_tokens_per_request: 4096
          max_tokens_per_day: 100000
          max_cost_per_day_usd: 10.00

Approval flow
When a policy rule specifies require-approval, the runtime pauses tool execution and requests human confirmation:
1. Agent decides to call tool
2. Policy evaluates → require-approval
3. Runtime sends approval request to Channel
4. Human approves (claw.tool.approve) or denies (claw.tool.deny)
5. Runtime proceeds or aborts
6. If no response within timeout → error -32012 (Approval timeout)

This ensures humans remain in the loop for sensitive operations without blocking autonomous execution of safe tools.
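The pause-and-wait part of this flow can be sketched with a blocking queue. This is a simplified illustration, not the runtime's actual mechanism: `run_with_approval`, `ApprovalTimeout`, and the queue standing in for the Channel are all hypothetical. Only the method names and the error code come from the steps above.

```python
import queue

APPROVAL_TIMEOUT_CODE = -32012  # error code quoted in the flow above


class ApprovalTimeout(Exception):
    """Raised when no human decision arrives before the timeout."""

    code = APPROVAL_TIMEOUT_CODE


def run_with_approval(tool, decisions: queue.Queue, timeout_s: float):
    """Wait for a human decision, then run the tool or abort.

    `decisions` stands in for the Channel: a hypothetical queue receiving
    the method name the human sent, "claw.tool.approve" or "claw.tool.deny".
    """
    try:
        decision = decisions.get(timeout=timeout_s)
    except queue.Empty:
        raise ApprovalTimeout("Approval timeout") from None
    if decision == "claw.tool.approve":
        return tool()
    return None  # denied: abort without running the tool


inbox = queue.Queue()
inbox.put("claw.tool.approve")
print(run_with_approval(lambda: "fetched", inbox, timeout_s=1.0))
```

Only the tool call that needs approval waits on the queue, so tools whose rule resolves to allow never enter this path.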
Defense composition
The layers compose multiplicatively. An agent with:
- A Channel allowlist of 3 users
- A Policy that denies file deletion
- A Sandbox with no network access
- A Provider capped at 10K tokens/day
…has a very narrow attack surface. Each layer independently reduces risk, and an attacker must compromise all layers simultaneously to gain full control.
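As a rough illustration of this composition, the configuration in the list can be modeled as a chain of independent checks where a request proceeds only if every one passes. The `Request` fields, the user IDs, and the function name are hypothetical, and the Provider check is simplified to a single running daily total.

```python
from typing import Callable, NamedTuple, Optional


class Request(NamedTuple):
    user: str
    tool: str
    needs_network: bool
    tokens_today: int  # day's total including this request (simplification)


Check = Callable[[Request], bool]

# One check per layer, mirroring the example configuration above.
LAYERS: list[tuple[str, Check]] = [
    ("Channel", lambda r: r.user in {"U1", "U2", "U3"}),  # allowlist of 3 users
    ("Policy", lambda r: r.tool != "file-delete"),        # file deletion denied
    ("Sandbox", lambda r: not r.needs_network),           # no network access
    ("Provider", lambda r: r.tokens_today <= 10_000),     # 10K tokens/day cap
]


def first_blocking_layer(request: Request) -> Optional[str]:
    """Return the name of the first layer that rejects, or None if all pass."""
    for name, check in LAYERS:
        if not check(request):
            return name
    return None


print(first_blocking_layer(Request("U1", "calculator", False, 500)))
print(first_blocking_layer(Request("U9", "calculator", False, 500)))
```

Because each check is evaluated independently, bypassing one of them still leaves the request subject to all the others.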