LM Review Response: AIP-1, ADP-1, PVS-1, CTX-1
Reviewer: Claude (Opus 4) Date: 2025-12-10 Scope: Design review of ACL's emerging standards layer
Overall
The four specs (AIP-1, ADP-1, PVS-1, CTX-1) form a coherent and well-architected foundation for an agent control plane. The decision to build on X.509 rather than inventing a novel identity scheme is strategically sound - it provides immediate interoperability with existing PKI tooling and enterprise infrastructure. The short-lived certificate model (5-15 minute lifetimes) elegantly sidesteps the revocation problem that plagues traditional PKI deployments.
The Action -> Observation -> Reflection model in ADP-1 is framework-agnostic enough to accommodate LangChain, CrewAI, AutoGen, and custom implementations without forcing awkward mappings. The explicit linkage between ADP-1 runs and AIP-1 identity (agent.aip.cert_fingerprint) creates a clean audit trail from "who is this agent" to "what did they do."
The biggest risk I see is scope creep before launch. The specs are deliberately minimal - this is correct. The temptation will be to add "just one more field" repeatedly. I recommend hard constraints on what ships in v1 vs. what gets explicitly deferred to v2. The recommendations below focus on ~5-7 small, high-leverage changes that significantly improve the specs without expanding scope.
AIP-1 - Agent Identity Protocol
Strengths
- Pragmatic PKI reuse: X.509 is battle-tested, widely understood, and supported by every TLS library. Smart choice.
- Blast radius minimization: 5-15 minute cert lifetimes make CRL infrastructure unnecessary and limit exposure from key compromise.
- Blockchain anchoring as optional verification: The Root CA hash can be verified against Ethereum without requiring on-chain operations in the hot path. Good separation of concerns.
- Clean OID structure: The
1.3.6.1.4.1.59999.1.*arc is well-organized and leaves room for future extensions.
Gaps / Risks
-
Missing
audience/intended_verifierfield: When Agent A presents its cert to Service B, there's no way to express "this cert is specifically for Service B." This opens the door to replay attacks where a cert issued for one service is presented to another. OIDC solved this withaudclaims. -
No environment indicator: Production vs. staging vs. development certs are indistinguishable. An agent could accidentally (or maliciously) use a dev cert against production infrastructure.
-
Clock skew vulnerability: With 5-minute lifetimes, even modest clock drift (>30 seconds) between issuer and verifier can cause spurious rejections. The spec should document recommended NTP synchronization and acceptable skew tolerances.
-
Algorithm agility not specified: The spec mentions RSA/ECDSA and future Dilithium/Kyber, but doesn't specify how algorithm negotiation works or how verifiers know which algorithms are acceptable.
-
Partial implementation gap: The code in
ca.tsdoesn't yet populate all AIP fields consistently. For example,agentRoleis only set when explicitly provided via metadata, but the spec implies it should always be present.
Concrete Recommendations
1. Add AIP-Audience extension (OID .6):
| OID | Name | Data Type | Description |
| 1.3.6.1.4.1.59999.1.6 | AIP-Audience | UTF8String | Intended verifier service/domain (e.g., "api.acme.com") |Implementation in ca.ts:
if (metadata?.audience) {
extensions.push({
id: `${AIP_OID_BASE}.6`,
value: metadata.audience,
});
}2. Add AIP-Environment extension (OID .7):
| OID | Name | Data Type | Description |
| 1.3.6.1.4.1.59999.1.7 | AIP-Environment | UTF8String | Deployment environment ("production", "staging", "development") |This is cheap to add now and prevents cross-environment confusion.
3. Document clock skew tolerance in spec:
Add to Section 2.1.3:
Clock Skew Tolerance: Verifiers SHOULD accept certificates with a grace period of ±60 seconds beyond the stated validity period. Issuers and verifiers MUST synchronize to NTP with drift <30 seconds.
4. Start populating Agent-Role consistently:
In ca.ts, default to the agent key when no explicit role is provided:
extensions.push({
id: AIP_OIDS.agentRole,
value: metadata?.agentRole || agentId, // Default to agentId if no role specified
});ADP-1 - Agent Data Protocol
Strengths
- Framework-agnostic by design: The Action/Observation/Reflection model maps cleanly to diverse paradigms - LangChain's agent loops, CrewAI's task delegation, AutoGen's conversation turns.
- Explicit AIP linkage: The
agent.aipblock provides a clear bridge to identity without requiring deep PKI knowledge. - Streaming-friendly structure: Steps can be emitted incrementally, enabling real-time monitoring.
- Sensible defaults for optional fields: Required fields are minimal; rich metadata is optional but encouraged.
Gaps / Risks
-
No parent/child step relationships: Multi-agent orchestration (e.g., CrewAI manager delegating to workers) cannot be represented. You need
parent_step_indexor similar to show "Step 5 was spawned by Step 2." -
No span/trace ID for distributed tracing: Integration with OpenTelemetry/Jaeger/Zipkin requires a
trace_idandspan_id. Without these, correlating ADP runs with infrastructure traces is manual. -
Cancellation reason not captured:
status: "cancelled"exists, but there's no field explaining why (user abort, timeout, budget exceeded, policy rejection). -
framework_run_idsemantics unclear: Is this the ID from the framework's perspective, or ACL's internal mapping? Document the expected semantics. -
Token counting inconsistency:
metadata.total_tokensat run level vs. per-stepprompt_tokens/completion_tokens. Should run-level be the sum, or can they diverge?
Concrete Recommendations
1. Add parent_step_index to AgentStep:
{
"index": 5,
"parent_step_index": 2, // NEW: This step was spawned by step 2
"timestamp": "...",
"action": { ... }
}This enables tree visualization of multi-agent delegation without breaking the linear step array.
2. Add trace context to AgentRun:
{
"version": "adp-1",
"run_id": "...",
"trace": { // NEW: OpenTelemetry-compatible trace context
"trace_id": "abc123def456...",
"span_id": "789xyz...",
"parent_span_id": "parent..." // optional, for sub-runs
},
...
}3. Add cancellation details:
{
"status": "cancelled",
"cancellation": { // NEW: Only present when status=cancelled
"reason": "budget_exceeded", // "user_abort" | "timeout" | "budget_exceeded" | "policy_rejection" | "other"
"message": "Run exceeded $0.50 budget limit",
"cancelled_by": "system" // "user" | "system" | "policy"
}
}4. Clarify framework_run_id in spec:
Add to Section 2.1:
framework_run_id— Optional identifier from the underlying framework (e.g., LangChain'srun_id, CrewAI'stask_id). This is the framework's native identifier, not ACL's. Useful for correlating ADP records with framework-specific logs.
PVS-1 - Policy Verdict Schema
Strengths
- Simple and actionable: The
approvedboolean makes enforcement trivial. No ambiguity. - Human-readable reasoning: The
reasoningfield enables audit and explanation without requiring policy expertise. - Fail-closed default: The Gavel implementation correctly rejects on error - security-first design.
- Clean embedding in ADP-1: Verdicts fit naturally as
observation.outputin policy_judge steps.
Gaps / Risks
-
No severity levels: A PII leak and a minor formatting guideline violation both result in
approved: false. Downstream systems can't triage. -
No violation IDs:
policy_violations: ["No PII"]is human-readable but not machine-actionable. You can't build dashboards that track "Violation X occurred Y times this week." -
No remediation hints: When rejected, agents have no guidance on how to fix the issue. This leads to retry loops or giving up.
-
Confidence score semantics unclear: Is 0.97 confidence in "approved" the same as 0.97 confidence in "rejected"? Or is it always confidence in the specific verdict?
Concrete Recommendations
1. Add severity to violations:
{
"policy_violations": [
{
"id": "pii_leak", // NEW: Machine-readable ID
"description": "No PII leakage",
"severity": "critical", // NEW: "critical" | "high" | "medium" | "low"
"category": "privacy" // NEW: "privacy" | "safety" | "compliance" | "business"
}
]
}Backward compatibility: Implementations MAY return string arrays for v1, but SHOULD migrate to objects.
2. Add remediation hints:
{
"remediation": { // NEW: Optional guidance for fixing violations
"suggestion": "Redact the SSN pattern (XXX-XX-XXXX) before retrying",
"auto_fixable": true, // Hint that automated remediation is possible
"retry_recommended": true
}
}3. Clarify confidence semantics in spec:
Add to Section 1.1:
confidence_score— Number between 0.0 and 1.0 representing the engine's confidence that the verdict is correct. A score of 0.95 onapproved: truemeans "95% confident this should be approved." A score of 0.95 onapproved: falsemeans "95% confident this should be rejected."
CTX-1 - Capability & Trust Context
Strengths
- Simple, composable naming:
<namespace>:<name>:<qualifier>is easy to understand and parse. - Reserved prefixes are well-chosen:
agent:,tenant:,perm:,budget:cover the main use cases. - RBAC mapping is straightforward:
{ resource: "workflows", action: "read" }->perm:workflows:readis mechanical. - Fail-safe semantics: "Unknown capabilities MUST be ignored" is the right default.
Gaps / Risks
-
No environment scoping:
perm:workflows:readdoesn't distinguish between production and staging workflows. An agent with this capability can read workflows in any environment. -
No data classification: There's no way to express "can read PII" vs. "can read non-PII data" vs. "can read aggregate metrics only."
-
Static capabilities vs. dynamic context: Capabilities are burned into the cert at issuance time. If an agent's permissions change mid-run (e.g., elevated for a specific task), there's no mechanism to reflect this.
-
Revocation foot-gun: Capabilities in a cert cannot be revoked without revoking the entire cert. The spec should warn against encoding too-granular permissions.
-
deny:capabilities not specified: The future extensions mention negative capabilities but don't define semantics. Are denies evaluated before allows? Can you have both?
Concrete Recommendations
1. Add environment qualifier convention:
Document in Section 1:
perm:<resource>:<action>[@<environment>]
Examples:
- perm:workflows:read@production — Can read production workflows
- perm:workflows:read@* — Can read workflows in any environment
- perm:workflows:read — (Default) Equivalent to @* for backward compatibility2. Add data classification prefix:
data:<classification>:<operation>
Reserved classifications:
- data:pii:read — Can access PII
- data:sensitive:read — Can access sensitive business data
- data:public:read — Can access public data only3. Document the revocation foot-gun explicitly:
Add to Section 1:
Warning: Capabilities encoded in AIP certificates cannot be individually revoked. If granular, dynamic permissions are required, implementations SHOULD encode only coarse-grained identity capabilities (e.g.,
agent:coach,tenant:acme) in the certificate and resolve fine-grained permissions via an external policy engine at runtime. The certificate'sCapability-Setis best used for static, long-lived permissions that rarely change.
4. Define deny: semantics now (even if implementation is deferred):
Add to Section 6:
Deny Capabilities (Future): When implemented, deny capabilities MUST be evaluated after allow capabilities. A
deny:files:writeoverrides aperm:files:write. Implementations MUST NOT issue certificates with bothperm:X:Yanddeny:X:Yfor the same resource/action pair.
Cross-Cutting / Big Picture
Cohesion Assessment
The four specs are well-aligned conceptually:
- AIP-1 answers "who"
- ADP-1 answers "what"
- PVS-1 answers "is it allowed"
- CTX-1 answers "what can they do"
Minor terminology inconsistencies to fix:
| Location | Current | Suggested | Reason |
|---|---|---|---|
AIP-1 Tenant-ID OID | tenantId | tenant_id | Match ADP-1's tenant_id |
ADP-1 agent.aip.tenant_id | tenant_id | Keep | Already correct |
PVS-1 metadata.tenant_id | tenant_id | Keep | Already correct |
CTX-1 tenant: prefix | tenant:<id> | Keep | Already correct |
The only real inconsistency is the OID extension name using camelCase (tenantId in the OID description) while JSON fields use snake_case. Recommend standardizing on snake_case everywhere since that's what the JSON payloads use.
Highest-Leverage Pre-Launch Additions
In priority order:
- AIP Audience field (prevents cross-service replay attacks) - 2 hours of work
- ADP parent_step_index (enables multi-agent visualization) - 1 hour of work
- PVS violation severity (enables triage and prioritization) - 2 hours of work
- AIP Environment field (prevents prod/staging confusion) - 1 hour of work
- ADP trace context (enables OpenTelemetry integration) - 2 hours of work
Total: ~8 hours of spec + implementation work for significant capability uplift.
Explicit v2 Deferrals
The following should be explicitly marked as "Not in v1, Roadmap for v2" to prevent scope creep:
-
Signed ADP Streams: Signing individual steps with the agent's cert for tamper-evidence. Complex key management implications.
-
Full mTLS between agents: AIP currently enables verification, but actual mTLS handshakes between agents require significant infrastructure (service mesh, cert distribution).
-
Global Policy Language: A declarative DSL for policies (like OPA/Rego). The Gavel's natural language approach is sufficient for v1.
-
Capability Bundles / Roles:
role:analystexpanding to multipleperm:*capabilities. Keep it flat for v1. -
Cross-Organization Trust Federations: Multiple Root CAs trusting each other. Single-CA model is sufficient for v1.
-
ZK-Proofs in Capability-Set: Privacy-preserving capability verification. Interesting but complex.
-
Real-time Capability Revocation: Individual capability revocation without cert rotation. Short cert lifetimes make this less critical.
Implementation Alignment Check
Reviewing ca.ts, verifier.ts, the-gavel.ts, and agent-executor.ts:
Good alignment:
- Certificate issuance follows AIP-1 profile
- Fingerprint calculation is consistent
- The Gavel output matches PVS-1 structure (minus
versionfield) - Capability-Set encoding follows CTX-1 conventions
Gaps to address:
- The Gavel should add
version: "pvs-1"to its output (trivial fix) verifyAgentIdentitydoesn't extract/validate AIP extension OIDs - it only checks signature and extracts CNagent-executor.tsdoesn't emit full ADP-1AgentRunrecords yet - it logs individual executions but not the structured format
Summary Table
| Spec | Maturity | Blockers for Launch | Recommended Adds |
|---|---|---|---|
| AIP-1 | Ready | None | Audience, Environment fields |
| ADP-1 | Ready | None | parent_step_index, trace context |
| PVS-1 | Ready | None | Violation severity |
| CTX-1 | Ready | None | Environment qualifier docs |
Overall verdict: Ship it. The specs are solid foundations. The recommendations above are improvements, not blockers. Launch with what you have, iterate based on real-world feedback.
Review completed by Claude (Opus 4) on 2025-12-10