13 tests:
- Allowlist linter: every entry has 4 required fields, no duplicates,
  justification length > 20 chars
- Deny-list verification: dangerous methods (Runtime.evaluate, Page.navigate,
  Network.getResponseBody, Browser.close, Target.attachToTarget, etc.) are
  NOT allowed (Codex T2 categories 4-7)
- Per-tab mutex serializes ops on same tab
- Per-tab mutex allows parallel ops across different tabs
- Global lock blocks tab locks; tab locks block global lock
- Acquire timeout yields CDPMutexAcquireTimeout (no silent hang)
- Timeout error names the tab id and the timeout budget
Also extends Network.disable justification to satisfy linter.
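The linter rules listed above can be sketched roughly as follows. This is
an illustration only: the entry shape, the field names (method, scope,
output, justification), and the lintAllowlist helper are assumptions
inferred from this message, not the code in cdp-allowlist.ts.

```typescript
// Hypothetical entry shape; the four field names are assumed, not confirmed.
interface AllowlistEntry {
  method: string;
  scope: "tab" | "browser";
  output: "trusted" | "untrusted";
  justification: string;
}

const REQUIRED_FIELDS = ["method", "scope", "output", "justification"] as const;
const MIN_JUSTIFICATION_LENGTH = 20; // justification must be longer than this

// Returns human-readable problems; an empty array means the allowlist passes.
function lintAllowlist(entries: ReadonlyArray<Partial<AllowlistEntry>>): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  entries.forEach((entry, index) => {
    const label = entry.method ?? `entry #${index}`;
    // Rule 1: every required field is present and non-empty.
    for (const field of REQUIRED_FIELDS) {
      const value = entry[field];
      if (typeof value !== "string" || value.length === 0) {
        problems.push(`${label}: missing required field "${field}"`);
      }
    }
    // Rule 2: no method appears twice.
    if (entry.method !== undefined) {
      if (seen.has(entry.method)) problems.push(`${label}: duplicate entry`);
      seen.add(entry.method);
    }
    // Rule 3: a present justification must exceed the minimum length.
    const justification = entry.justification;
    if (typeof justification === "string" && justification.length > 0 &&
        justification.length <= MIN_JUSTIFICATION_LENGTH) {
      problems.push(`${label}: justification must exceed ${MIN_JUSTIFICATION_LENGTH} chars`);
    }
  });
  return problems;
}
```

A test would then assert that lintAllowlist(...) returns an empty array for
the real allowlist and a non-empty one for deliberately broken fixtures.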
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex T2: flip CDP posture to deny-default. Allowed methods enumerated in
cdp-allowlist.ts with (scope: tab|browser, output: trusted|untrusted,
justification) per entry.
Initial allowlist (~25 methods) covers:
- Accessibility tree extraction (read-only)
- DOM/CSS inspection (read-only)
- Performance metrics
- Tracing
- Emulation viewport/UA override
- Page screenshot/PDF capture (output is binary, no marker injection vector)
- Network.enable/disable (no bodies/cookies — those are exfil surfaces)
- Runtime.getProperties (NO evaluate/callFunctionOn — those would be RCE)
Page.navigate is INTENTIONALLY NOT allowed; agents use $B goto which
goes through the URL blocklist.
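A minimal sketch of the deny-default lookup described above, for
illustration. The resolveMethod helper, the CDPMethodDenied error, and the
scope/output values given to each entry are assumptions; only the method
names come from this message.

```typescript
type Scope = "tab" | "browser";
type Output = "trusted" | "untrusted";

interface AllowlistEntry {
  method: string;
  scope: Scope;
  output: Output;
  justification: string;
}

// Illustrative subset; scope/output values here are guesses, not the real table.
const ENTRIES: AllowlistEntry[] = [
  { method: "Network.enable", scope: "tab", output: "trusted",
    justification: "Enables network events; bodies and cookies stay unavailable." },
  { method: "Network.disable", scope: "tab", output: "trusted",
    justification: "Stops network event delivery for the current tab." },
  { method: "Runtime.getProperties", scope: "tab", output: "untrusted",
    justification: "Reads object properties without evaluating any script." },
];

const ALLOWLIST = new Map<string, AllowlistEntry>(
  ENTRIES.map((entry): [string, AllowlistEntry] => [entry.method, entry]),
);

// Hypothetical error type for methods that are not enumerated.
class CDPMethodDenied extends Error {
  constructor(readonly method: string) {
    super(`CDP method "${method}" is not on the allowlist`);
    this.name = "CDPMethodDenied";
  }
}

// Deny-default: anything not explicitly enumerated is rejected, so
// Runtime.evaluate, Runtime.callFunctionOn, and Page.navigate fail here
// without needing a separate deny-list.
function resolveMethod(method: string): AllowlistEntry {
  const entry = ALLOWLIST.get(method);
  if (entry === undefined) throw new CDPMethodDenied(method);
  return entry;
}
```

The caller can use the returned entry's scope to pick a lock and its output
field to decide how the result is marked before it reaches the agent.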
Codex T7: two-tier mutex. Tab-scoped methods take a per-tab lock;
browser-scoped methods take a global lock that blocks all tab locks. A 5s
acquire timeout yields CDPMutexAcquireTimeout (no silent hangs). All lock
acquires use try/finally so errors don't leak the lock.
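One way the two-tier locking could look, as a sketch under assumptions. The
TwoTierMutex class, its method names, and the FIFO queue are hypothetical;
only the CDPMutexAcquireTimeout name, the 5s budget, and the try/finally
release come from this message.

```typescript
class CDPMutexAcquireTimeout extends Error {
  constructor(readonly target: string, readonly timeoutMs: number) {
    // The message names both the tab id (or "browser") and the budget.
    super(`Timed out after ${timeoutMs}ms acquiring CDP lock for ${target}`);
    this.name = "CDPMutexAcquireTimeout";
  }
}

interface Waiter {
  tabId: string | null; // null means the global (browser-scoped) lock
  grant: () => void;
}

class TwoTierMutex {
  private globalHeld = false;
  private readonly heldTabs = new Set<string>();
  private readonly queue: Waiter[] = [];

  constructor(private readonly timeoutMs = 5000) {}

  withTabLock<T>(tabId: string, fn: () => Promise<T>): Promise<T> {
    return this.run(tabId, fn);
  }

  withGlobalLock<T>(fn: () => Promise<T>): Promise<T> {
    return this.run(null, fn);
  }

  private async run<T>(tabId: string | null, fn: () => Promise<T>): Promise<T> {
    await this.acquire(tabId);
    try {
      return await fn();
    } finally {
      this.release(tabId); // released even when fn throws
    }
  }

  private canTake(tabId: string | null): boolean {
    if (this.globalHeld) return false;
    return tabId === null ? this.heldTabs.size === 0 : !this.heldTabs.has(tabId);
  }

  private acquire(tabId: string | null): Promise<void> {
    return new Promise<void>((resolve, reject) => {
      const waiter: Waiter = {
        tabId,
        grant: () => { clearTimeout(timer); resolve(); },
      };
      const timer = setTimeout(() => {
        const index = this.queue.indexOf(waiter);
        if (index >= 0) this.queue.splice(index, 1);
        reject(new CDPMutexAcquireTimeout(tabId ?? "browser", this.timeoutMs));
        this.drain(); // a departed global waiter may unblock tabs queued behind it
      }, this.timeoutMs);
      this.queue.push(waiter);
      this.drain();
    });
  }

  private release(tabId: string | null): void {
    if (tabId === null) this.globalHeld = false;
    else this.heldTabs.delete(tabId);
    this.drain();
  }

  // Grants queued waiters in FIFO order wherever the lock state permits.
  private drain(): void {
    let i = 0;
    while (i < this.queue.length) {
      const waiter = this.queue[i];
      if (this.canTake(waiter.tabId)) {
        this.queue.splice(i, 1);
        if (waiter.tabId === null) this.globalHeld = true;
        else this.heldTabs.add(waiter.tabId);
        waiter.grant();
      } else if (waiter.tabId === null) {
        break; // a waiting global lock holds back everything queued after it
      } else {
        i++;
      }
    }
  }
}
```

In this sketch a blocked global waiter stops later tab waiters from jumping
ahead of it, which is a design choice of the example rather than something
this commit states.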
Path A from spike: uses Playwright's newCDPSession() per page. No second
WebSocket, no need for --remote-debugging-port. CDPSession is cached
per page in a WeakMap and cleared on page close.
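The per-page session cache could be sketched like this. PageLike and
CDPSessionLike are hypothetical structural stand-ins for Playwright's Page
and CDPSession so the snippet stays self-contained, and getCDPSession is an
assumed helper name; in Playwright the session is created through
BrowserContext.newCDPSession(page).

```typescript
// Minimal structural stand-ins for the Playwright types (assumed shapes).
interface CDPSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

interface PageLike {
  context(): { newCDPSession(page: PageLike): Promise<CDPSessionLike> };
  once(event: "close", listener: () => void): unknown;
}

// Keyed by page object, so entries never keep a discarded page alive.
// Caching the promise (not the resolved session) means two concurrent
// callers share one session instead of racing to create two.
const sessions = new WeakMap<PageLike, Promise<CDPSessionLike>>();

function getCDPSession(page: PageLike): Promise<CDPSessionLike> {
  const cached = sessions.get(page);
  if (cached !== undefined) return cached;

  const created = page.context().newCDPSession(page);
  sessions.set(page, created);

  // Clear the entry when the page closes.
  page.once("close", () => sessions.delete(page));

  // Do not keep a failed creation cached; the caller still sees the rejection.
  created.catch(() => {
    if (sessions.get(page) === created) sessions.delete(page);
  });

  return created;
}
```

Because the session rides on Playwright's existing connection, nothing in
this path opens a second WebSocket or needs --remote-debugging-port.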
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>