Disclaimer: This document is an internal self-assessment mapping, NOT a validated certification or third-party audit. It documents how the toolkit's capabilities align with the referenced standard. Organizations must perform their own compliance assessments with qualified auditors.
The Agent Governance Toolkit provides runtime governance infrastructure that addresses SOC 2 Type II controls across Security, Availability, and Processing Integrity criteria. The toolkit's strongest coverage is in Security (CC1–CC9), where the policy engine, RBAC, cryptographic identity, execution rings, and audit logging provide a defense-in-depth enforcement stack. Availability (A1) is well-supported through circuit breakers, SLO enforcement, and chaos testing primitives. Processing Integrity (PI1) benefits from deterministic policy evaluation, Merkle audit chains, and input validation — though several audit chain implementations have integrity defects.
Confidentiality (C1) has partial coverage through egress controls, PII pattern detection, and cryptographic identity — but lacks at-rest encryption, key rotation, and audit log redaction. Privacy (P1–P8) is the largest gap area: the toolkit detects only 2 PII patterns (SSN, credit card), has no consent management, no data subject access request support, and no retention enforcement. Organizations deploying this toolkit in SOC 2 scope must supplement Privacy controls with external tooling.
Important: This mapping documents what the toolkit provides as infrastructure. SOC 2 Type II requires evidence of operating effectiveness over a review period — policies followed, controls monitored, exceptions investigated. The toolkit provides the enforcement mechanisms; the operating procedures, organizational policies, and evidence collection are the deployer's responsibility. "Partial" coverage means the toolkit provides building blocks but does not satisfy the control independently.
Security (CC1–CC9)
The system is protected against unauthorized access, unauthorized disclosure of information, and damage to systems that could compromise the availability, integrity, confidentiality, and privacy of information or systems and affect the entity's ability to meet its objectives.
What Exists
CC1: Control Environment
| Control | Feature | Location | Coverage |
| --- | --- | --- | --- |
| CC1.1 Commitment to integrity | STRIDE-oriented threat model | docs/THREAT_MODEL.md | ⚠️ Partial: documents threats, no control ownership |
| CC1.4 Accountability | RBAC with 4 roles (READER, WRITER, ADMIN, AUDITOR) | | |
# CC7.1 in action: Governance Audit Logging
from agent_os.audit_logger import GovernanceAuditLogger, JsonlFileBackend

audit = GovernanceAuditLogger()
audit.add_backend(JsonlFileBackend("governance_audit.jsonl"))
audit.log_decision(
    agent_id="finance-bot",
    action="transfer",
    decision="deny",
    reason="Policy violation: amount exceeds role limit",
)
Reference implementation: The finance-soc2 example demonstrates CC6.1, CC6.3, CC7.1, CC7.2, CC7.3, and CC8.1 using real Agent OS governance APIs with role-based separation of duties, approval workflows, and immutable audit trails.
Security Gaps
Kill switch is a placeholder (CC7.4): KillSwitch.kill() at kill_switch.py:86 returns structured KillResult objects but does not terminate agent processes. handoff_success_count is hardcoded to 0, and all in-flight saga steps are auto-marked COMPENSATED without any actual compensation being run.
Detection modules not wired to enforcement (CC6.8): PromptInjectionDetector, RateLimiter, BoundedSemaphore, ScopeGuard, SupplyChainGuard, and MCPSecurityScanner exist as standalone utilities but are not auto-wired into the BaseIntegration enforcement lifecycle. 6 of 10 OWASP risks share this structural gap.
MCP scanner acknowledges incompleteness (CC7.3): Line 287 of mcp_security.py warns it "uses built-in sample rules that may not cover all MCP tool poisoning techniques."
Regex-only prompt injection detection (CC6.8): No semantic or multilingual detection. English-only regex patterns can be bypassed via paraphrasing.
No network-level security enforcement (CC6.6): TLS enforcement and certificate pinning are deferred to deployment configuration.
Organizational controls not addressed (CC1–CC4): Board oversight, personnel policies, risk assessment governance, and monitoring activities are organizational obligations outside the toolkit's scope.
Recommended Controls
Wire detection modules into BaseIntegration.pre_execute() via GovernancePolicy flags (closes CC6.8 gaps across multiple OWASP risks).
Implement actual process termination in KillSwitch (CC7.4).
Deploy each agent in a separate container with governance middleware inside for defense-in-depth (CC6.7).
Add network policies for cross-agent communication control (CC6.6).
Integrate LlamaFirewall for semantic prompt injection detection (CC6.8).
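The first recommendation above can be sketched roughly as follows. Every name in this block (PolicyFlags, detect_injection, pre_execute, the single regex) is a hypothetical stand-in rather than the toolkit's actual BaseIntegration or GovernancePolicy API. The sketch only illustrates the shape of the fix: a detector result gates execution when its policy flag is set.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


# Hypothetical stand-in for GovernancePolicy flags; not the toolkit's real schema.
@dataclass
class PolicyFlags:
    block_prompt_injection: bool = False


# A detector takes input text and returns a reason string when it fires.
Detector = Callable[[str], Optional[str]]

# Illustrative pattern only; a single English regex is easy to bypass.
_INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)


def detect_injection(text: str) -> Optional[str]:
    return "possible prompt injection" if _INJECTION.search(text) else None


def pre_execute(text: str, flags: PolicyFlags) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason). A detector blocks only when its flag is set."""
    wired: List[Tuple[bool, Detector]] = [
        (flags.block_prompt_injection, detect_injection),
    ]
    for enabled, detector in wired:
        if not enabled:
            continue
        reason = detector(text)
        if reason is not None:
            return False, reason
    return True, None
```

With the flag left at its default, the same input passes through unchanged, which mirrors the current advisory-only behavior described in the gaps above.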
Availability (A1)
The system is available for operation and use as committed or agreed.
What Exists
| Control | Feature | Location | Coverage |
| --- | --- | --- | --- |
| A1.1 System capacity | Policy enforcement at sub-millisecond latency; 47K ops/sec at 1,000 concurrent agents | | |
Chaos engine is framework-only (A1.2): ChaosExperiment.inject_fault() records that a fault was injected but does not modify system behavior. Callers must implement actual fault injection externally.
RateLimiter not wired (A1.2): RateLimiter at rate_limiter.py:93-101 has a correct token-bucket algorithm but is not imported by any adapter or interceptor. BoundedSemaphore for concurrency limiting is similarly unwired.
No health check endpoints (A1.1): No liveness or readiness probes exposed for container orchestration.
No disaster recovery automation (A1.3): Replay engine is designed for debugging and failure reproduction, not automated recovery. Saga compensation in the kill switch is placeholder-only.
No backup/restore for audit data (A1.3): Audit backends write to JSONL files or in-memory stores with no backup, replication, or archival mechanism.
Recommended Controls
Wire RateLimiter and BoundedSemaphore into BaseIntegration with blocking behavior controlled by policy flags.
Implement health check endpoints for Kubernetes liveness/readiness probes.
Add pluggable fault injection hooks in the chaos engine for real resilience testing.
Deploy audit logs to an external append-only sink (Azure Monitor, write-once storage) for durability.
Implement automated backup and retention for audit data stores.
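For readers unfamiliar with the token-bucket algorithm mentioned in the gaps above, here is a minimal sketch. It does not reproduce rate_limiter.py; the class name, parameters, and injectable clock are assumptions made for illustration and testability.

```python
import time
from typing import Callable


class TokenBucket:
    """Minimal token bucket: refills `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # Start full so an initial burst is allowed.
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Wiring a limiter like this into an enforcement hook would mean returning a blocking verdict whenever allow() is False, rather than only recording the event.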
Processing Integrity (PI1)
System processing is complete, valid, accurate, timely, and authorized.
What Exists
| Control | Feature | Location | Coverage |
| --- | --- | --- | --- |
| PI1.1 Input validation | PolicyEvaluator validates every action against declarative rules before execution | | |
DeltaEngine chain verification is a stub (PI1.5): verify_chain() at packages/agent-hypervisor/src/hypervisor/audit/delta.py:99 always returns True with comment "Public Preview: no chain verification." The hypervisor's entire audit trail has zero tamper evidence.
FlightRecorder hash covers only INSERT-time state (PI1.5): The hash is computed at insert time with policy_verdict='pending', but the verdict is later updated to 'allowed'/'blocked'. Because the stored hash never covers the resolved verdict, tampering with the verdict field is undetectable by integrity verification.
Anomaly detections outside tamper-evident chain (PI1.5): RogueAgentDetector stores assessments in an in-memory list, not in the integrity-protected audit chain.
post_execute() never blocks (PI1.3): base.py:977-1038 computes drift scores and emits DRIFT_DETECTED events but always returns (True, None) — advisory only, no enforcement on output integrity.
Python-only code validation (PI1.3): CodeSecurityValidator raises ValueError for any language other than Python at secure_codegen.py:193.
No output text sanitization (PI1.4): Tool argument scanning exists; LLM response text is not scanned for dangerous content, PII, or secrets.
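The FlightRecorder gap above can be illustrated with a small sketch. The row shape, function names, and the assumed verifier (one that re-derives the insert-time snapshot) are all hypothetical; only the policy_verdict field and its values come from the description above. It is not the FlightRecorder code.

```python
import hashlib
import json


def digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Hypothetical row; only the policy_verdict field name comes from the text above.
inserted = {"agent_id": "finance-bot", "action": "transfer", "policy_verdict": "pending"}
insert_time_hash = digest(inserted)

# The verdict is resolved after the hash was stored.
resolved = dict(inserted, policy_verdict="blocked")
# Someone later flips the resolved verdict.
tampered = dict(resolved, policy_verdict="allowed")


def verify_insert_time(row: dict, stored_hash: str) -> bool:
    """Assumed verifier: checks the insert-time snapshot, so it ignores the verdict."""
    return digest(dict(row, policy_verdict="pending")) == stored_hash


def verify_final(row: dict, stored_hash: str) -> bool:
    """Alternative: the stored hash is computed once the verdict is final."""
    return digest(row) == stored_hash


final_hash = digest(resolved)
print(verify_insert_time(tampered, insert_time_hash))  # True: tampering goes unnoticed
print(verify_final(tampered, final_hash))              # False: tampering is detected
```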
Recommended Controls
Fix DeltaEngine verify_chain() stub — replace with real SHA-256 chain verification (same algorithm as MerkleAuditChain).
Fix FlightRecorder hash — compute hash over final state including resolved verdict, not INSERT-time state.
Wire anomaly detections into the tamper-evident audit chain.
Add GovernancePolicy.block_on_drift flag to enable enforcement in post_execute().
Use only MerkleAuditChain (the sound implementation) for SOC 2 audit evidence until other implementations are fixed.
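As a rough illustration of what a non-stub verify_chain() could do, the sketch below implements a generic linear SHA-256 hash chain. It is not the MerkleAuditChain algorithm, whose details are not given here. The entry layout, the GENESIS sentinel, and the helper names are assumptions.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # Assumed sentinel used as the first entry's previous hash.


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: List[dict], payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every link; an edited payload or a reordered entry fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A plain hash chain like this detects edits to stored entries but not a full rewrite by someone who can recompute every hash, so it would still need external anchoring or signing to serve as audit evidence.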
Confidentiality (C1)
Information designated as confidential is protected as committed or agreed.
What Exists
| Control | Feature | Location | Coverage |
| --- | --- | --- | --- |
| C1.1 Confidential data identification | PII detection: SSN (\b\d{3}-\d{2}-\d{4}\b) and credit card regex patterns in tool parameters | | |
# C1.2 in action: Egress Policy with Default-Deny
from agent_os.egress_policy import EgressPolicy

policy = EgressPolicy(default_action="deny")
policy.add_rule("*.internal.corp.com", action="allow")
policy.add_rule("api.openai.com", action="allow")

# All other domains are blocked, which prevents data exfiltration
assert not policy.is_allowed("evil-exfil-server.com")  # Denied
assert policy.is_allowed("api.openai.com")  # Allowed
Confidentiality Gaps
HMAC uses symmetric keys (C1.2): Any insider with the HMAC key can forge the entire audit chain. Nothing prevents a full chain rewrite: there is no external commitment (such as anchoring the Merkle root to a timestamping service) and no asymmetric signing.
No at-rest encryption (C1.1): Audit logs, policy documents, and configuration files are stored in plaintext. No encryption for data at rest.
No key rotation mechanism (C1.2): No mechanism for rotating Ed25519 keys, HMAC secrets, or SPIFFE certificates on a schedule.
Audit logs store unredacted parameters (C1.1): mcp_gateway.py:165 stores raw parameters=params with no redaction. Every tool call's full parameters — including any PII, credentials, or tokens passed as arguments — are stored verbatim in AuditEntry and exposed via logger.info(). The toolkit's own security logging is a data leak pathway.
Only 2 PII patterns (C1.1): SSN and credit card number. No email, phone, IP address, JWT token, or other sensitive data patterns.
retention_days not enforced (C1.3): The schema field exists but no code preserves or deletes logs based on this value. A deployer can set retention_days: 1 without validation error.
No TLS enforcement (C1.2): Network encryption deferred entirely to deployment configuration.
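The symmetric-key gap at the top of this list can be shown with a short sketch using Python's standard hmac module. The chain layout and function names are assumptions, not the toolkit's audit format. The point is that verification accepts any chain built with the shared key, including one with rewritten history.

```python
import hashlib
import hmac
from typing import List


def sign(key: bytes, prev_tag: str, message: str) -> str:
    """HMAC-SHA256 over the previous tag plus the message."""
    return hmac.new(key, (prev_tag + message).encode("utf-8"), hashlib.sha256).hexdigest()


def build_chain(key: bytes, messages: List[str]) -> List[dict]:
    chain, prev_tag = [], ""
    for message in messages:
        prev_tag = sign(key, prev_tag, message)
        chain.append({"message": message, "tag": prev_tag})
    return chain


def verify(key: bytes, chain: List[dict]) -> bool:
    prev_tag = ""
    for entry in chain:
        expected = sign(key, prev_tag, entry["message"])
        if not hmac.compare_digest(expected, entry["tag"]):
            return False
        prev_tag = entry["tag"]
    return True


key = b"shared-secret"  # Illustrative key only.
original = build_chain(key, ["transfer denied", "transfer denied"])
# An insider holding the same key rebuilds the whole chain with altered history.
forged = build_chain(key, ["transfer allowed", "transfer denied"])
```

Both original and forged verify under the shared key. With asymmetric signing, the verifier would hold only a public key and could not produce a chain that verifies.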
Recommended Controls
Add GovernancePolicy.redact_audit_pii flag for pattern-based redaction of AuditEntry.parameters before persistence.
Expand PII patterns to cover the OWASP-recommended set (email, phone, IP address, JWT tokens).
Implement asymmetric signing for audit entries to prevent insider forgery.
Add key rotation tooling for Ed25519 and HMAC credentials.
Enforce retention_days at runtime with actual log deletion and archival.
Deploy audit logs to encrypted storage (e.g., Azure Blob with SSE, S3 with KMS).
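Runtime enforcement of retention_days could follow the pattern sketched below. The function name, the entry shape, and the 'timestamp' field are assumptions for illustration. The sketch separates entries into kept and expired sets so that expired entries can be archived before deletion.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


def apply_retention(entries: List[dict], retention_days: int,
                    now: datetime) -> Tuple[List[dict], List[dict]]:
    """Split entries into (kept, expired) using each entry's 'timestamp' field."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    cutoff = now - timedelta(days=retention_days)
    kept = [e for e in entries if e["timestamp"] >= cutoff]
    expired = [e for e in entries if e["timestamp"] < cutoff]
    return kept, expired


# Usage: archive `expired` to durable storage first, then persist only `kept`.
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
entries = [
    {"id": 1, "timestamp": now - timedelta(days=100)},
    {"id": 2, "timestamp": now - timedelta(days=5)},
]
kept, expired = apply_retention(entries, retention_days=90, now=now)
```

A deployment in SOC 2 scope would likely also want a configurable lower bound above 1 day, since a very short retention window can destroy evidence needed for the review period.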
Privacy (P1–P8)
Personal information is collected, used, retained, disclosed, and disposed to meet the entity's objectives.
⚠️ Privacy is the largest gap area. The toolkit is a runtime governance framework for AI agent actions. It was not designed as a privacy management platform. Organizations in SOC 2 scope with Privacy criteria must supplement with dedicated privacy tooling.
What Exists
| Control | Feature | Location | Coverage |
| --- | --- | --- | --- |
| P1 Notice | No privacy notice mechanism | - | ❌ Gap |
| P2 Choice and consent | No consent management | - | ❌ Gap |
| P3 Collection limitation | blocked_patterns can restrict sensitive data in tool arguments (regex, substring, glob) | | |
| | No data quality or accuracy verification for personal data | - | ❌ Gap |
| P8 Monitoring and enforcement | No privacy-specific monitoring or enforcement mechanisms | - | ❌ Gap |
Privacy Gaps
No consent management (P2): No opt-in/opt-out mechanism, consent tracking, purpose limitation, or consent withdrawal support. This is a fundamental Privacy criteria requirement.
No data subject access requests (P5): No DSAR workflow, data export mechanism, or right-to-erasure support.
No data minimization (P3): No mechanism to limit data collection to what is necessary for a specific purpose. blocked_patterns is a negative control (block known-bad) rather than a positive control (allow only known-good).
No retention enforcement (P4): The retention_days field exists in the policy schema, but no code preserves or deletes data based on its value. The default is 90 days and the schema minimum is 1, so no meaningful retention floor is enforced.
Only 2 PII patterns (P6): SSN (\b\d{3}-\d{2}-\d{4}\b) and credit card number regex in mcp_gateway.py:34-42. No detection for email addresses, phone numbers, IP addresses, physical addresses, dates of birth, or other PII categories.
No output PII scanning (P6): PII patterns check tool input arguments only. LLM response text is not scanned — an agent can freely output personal data in its responses.
Audit logs record full parameters (P6): Every tool call's complete arguments are stored verbatim in AuditEntry and logged via logger.info(). PII in tool arguments becomes PII in audit logs with no redaction. This makes the audit system itself a privacy risk.
No privacy notice mechanism (P1): No feature generates or delivers privacy notices to end users interacting with governed agents.
No privacy impact assessment tooling (P8): No DPIA/PIA workflow or template generation.
Recommended Controls
Implement audit parameter redaction — apply PII pattern detection to AuditEntry.parameters before persistence. This is the highest-leverage single fix.
Expand PII detection from 2 patterns to the OWASP-recommended set (email, phone, IP, JWT, passport, driver's license numbers).
Apply PII scanning to LLM outputs via post_execute() or a dedicated output interceptor.
Deploy dedicated privacy management tooling (e.g., OneTrust, BigID, Transcend) for consent, DSAR, and data mapping.
Enforce retention_days at runtime with automated log deletion.
Add GovernancePolicy.data_classification metadata to categorize agents by data sensitivity.
Document the scope boundary: the toolkit governs agent actions, not personal data lifecycle management.
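The audit parameter redaction recommended above could be approached along these lines. The SSN pattern is the one quoted in this document. The credit card and email patterns, the redact() helper, and the placeholder format are illustrative assumptions, not the toolkit's implementation.

```python
import re
from typing import Any

# The SSN pattern is quoted from this document; the other two are
# illustrative assumptions, not the toolkit's patterns.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(value: Any) -> Any:
    """Recursively replace PII matches in strings, dicts, and lists."""
    if isinstance(value, str):
        for name, pattern in PII_PATTERNS.items():
            value = pattern.sub(f"[REDACTED:{name}]", value)
        return value
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value  # Numbers, booleans, and None pass through unchanged.
```

Applying a function like this to tool parameters before they are stored or logged keeps the audit trail useful for reviewing decisions while reducing the amount of personal data it retains. Pattern-based redaction remains best-effort and will miss PII that does not match a known format.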
Evidence Sources
All file paths referenced in this document, organized by package: