mirror of https://github.com/garrytan/gstack.git
---
name: ios-qa
preamble-tier: 3
version: 1.0.0
description: |
  Live-device iOS QA for SwiftUI apps. Connects to a real iPhone via USB
  CoreDevice IPv6 tunnel, reads Swift source to understand every screen, then
  runs a vision-driven agent loop: screenshot → analyze → decide → act →
  verify → repeat. All interaction happens via HTTP to an embedded
  StateServer in the app under test. Optionally exposes the device over
  Tailscale so remote agents (OpenClaw, Codex, any HTTP-capable agent) can
  run iOS QA from anywhere without touching the hardware.
  Use when asked to "ios qa", "test my iPhone app", "find bugs on the device",
  or "qa the iOS app". (gstack)
voice-triggers:
  - "iOS quality check"
  - "test the iPhone app"
  - "run iOS QA"
allowed-tools:
  - Bash
  - Read
  - Write
  - Edit
  - Grep
  - Glob
  - AskUserQuestion
triggers:
  - ios qa
  - test the iphone app
  - test my ios app
  - find bugs on the device
  - qa the ios app
---

{{PREAMBLE}}

# Live-device iOS QA

This skill drives a real iPhone via USB. The agent reads your Swift source,
generates typed state accessors, deploys a debug bridge, and runs a closed
find→fix→verify loop. No simulator, no XCTest, no WebDriverAgent.

## Architecture

```
┌──────────────────────┐   USB CoreDevice (IPv6)   ┌──────────────────┐
│ gstack-ios-qa daemon │ ────────────────────────▶ │ iOS app          │
│ (Mac, bun/TS)        │   bearer + X-Session-Id   │ StateServer      │
│                      │                           │ (loopback only)  │
│ - boot token rotate  │                           │ - /tap /swipe    │
│ - session minting    │                           │ - /type /state   │
│ - audit + redact     │                           │ - /snapshot      │
└──────────────────────┘                           └──────────────────┘
           ▲
           │ Tailscale (optional, --tailnet)
           │
┌──────────────────────┐
│ Remote agent         │
│ (OpenClaw, etc.)     │
└──────────────────────┘
```

The iOS app's `StateServer` binds loopback only (`::1` + `127.0.0.1`). Tailnet
ingress is exclusively the Mac daemon's job. The daemon validates Tailscale
identities via the local `tailscaled` socket and mints short-lived session
tokens (default 1h) for remote agents.

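
To make the one-hour default concrete, here is a minimal sketch of an expiry check. The helper name `session_expired` and the epoch-seconds representation are assumptions for illustration; the daemon's actual token format and bookkeeping are not described in this document.

```bash
# Hypothetical helper: returns 0 (true) when a session token minted at
# epoch second $1 has outlived its TTL as of epoch second $2.
# The TTL ($3) defaults to 3600 s, matching the 1h default described above.
session_expired() {
  minted_at=$1
  now=$2
  ttl=${3:-3600}
  [ $((now - minted_at)) -ge "$ttl" ]
}
```

A caller might run `session_expired "$MINTED_AT" "$(date +%s)"` before reusing a token and request a new mint when it returns true.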
## Prerequisites

- macOS (the daemon uses `devicectl` from Xcode).
- iPhone connected via USB, paired and trusted.
- Xcode + Swift toolchain installed (`swift --version` reports >= 5.9).
- App source available on disk, with at least one `@Observable` class.
- For remote-control mode: Tailscale installed and the user logged in.

## Phase 0: Session warm-start (optional)

If `~/.gstack/ios-qa-session.json` exists and the device is still connected,
skip Phases 1-2 and jump to Phase 3. The session cache holds the rotated token,
UDID, tunnel address, and accessor hash. Invalidate the cache when:

- The user passes `--cold` to force a full bootstrap.
- An accessor hash mismatch is detected on the first state query.
- The daemon reports that the cached UDID is no longer connected.

```bash
SESSION="$HOME/.gstack/ios-qa-session.json"
if [ -f "$SESSION" ] && [ "${COLD:-0}" != "1" ]; then
  # Pass the path as an argument instead of interpolating it into Python source.
  CACHED_UDID=$(python3 -c 'import json,sys; print(json.load(open(sys.argv[1]))["udid"])' "$SESSION")
  CACHED_PORT=$(python3 -c 'import json,sys; print(json.load(open(sys.argv[1]))["daemon_port"])' "$SESSION")
  if curl -sf "http://127.0.0.1:$CACHED_PORT/healthz" > /dev/null; then
    echo "Warm start: daemon alive, device $CACHED_UDID connected"
  fi
fi
```

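
The three invalidation rules above can be sketched as a single decision function. This is an illustration only: `warm_start_decision` and its argument layout are hypothetical, and comparing hashes up front is an assumption (the text says the mismatch surfaces on the first state query).

```bash
# Hypothetical helper: decide whether the cached session can be reused.
#   $1 = "1" if --cold was passed
#   $2 = accessor hash stored in the session cache
#   $3 = accessor hash of the current build
#   $4 = "1" if the cached UDID is still connected
# Prints "warm", or "cold: <reason>" when a full bootstrap is needed.
warm_start_decision() {
  if [ "$1" = "1" ]; then
    echo "cold: --cold requested"
  elif [ "$2" != "$3" ]; then
    echo "cold: accessor hash mismatch"
  elif [ "$4" != "1" ]; then
    echo "cold: device not connected"
  else
    echo "warm"
  fi
}
```

For example, `warm_start_decision 0 abc123 abc123 1` prints `warm`, while any single failing condition yields a `cold:` reason that can be shown to the user.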
## Phase 1: Read source, plan codegen

1. Walk the app source (passed as `--source <dir>`) and identify all `@Observable`
   classes. Note any property marked with the `@Snapshotable` wrapper; those
   are the snapshot-eligible fields.
2. Run `swift run --package-path $GSTACK_HOME/ios-qa/scripts/gen-accessors-tool gen-accessors --input <source-dir>`.
   First invocation builds the swift-syntax dependency tree (cold: 2-5 min).
   Subsequent runs are content-hash-cached and finish in ~50ms.
3. Show the user the accessor list and ask whether to install the DebugBridge
   SPM dependency into their `Package.swift` (one AskUserQuestion).

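
The content-hash caching mentioned in step 2 can be illustrated with a small sketch. How the real tool derives its cache key is not documented here; `swift_source_hash` is a hypothetical helper that shows one way to get a stable digest over every `.swift` file in a directory.

```bash
# Hypothetical sketch: derive one content hash for every .swift file under $1.
# File paths are sorted so the result does not depend on traversal order.
# Only file contents are hashed, so non-Swift files never affect the digest.
swift_source_hash() {
  if command -v sha256sum > /dev/null 2>&1; then
    hasher="sha256sum"
  else
    hasher="shasum -a 256"   # macOS ships shasum rather than sha256sum
  fi
  find "$1" -type f -name '*.swift' | LC_ALL=C sort | while IFS= read -r f; do
    $hasher < "$f"
  done | $hasher | cut -d ' ' -f 1
}
```

Comparing this digest with the one saved from the previous run is enough to decide whether regeneration can be skipped.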
## Phase 2: Bootstrap the device bridge

1. Add the `DebugBridge` SPM dependency to the app's `Package.swift`. The package
   ships three Debug-config-only library products:
   - `DebugBridgeCore` (Swift, cross-platform): StateServer + bridge protocols.
   - `DebugBridgeTouch` (Objective-C, iOS-only): KIF-derived in-process touch
     synthesis with iOS 18+ `_UIHitTestContext` SwiftUI hit-testing.
   - `DebugBridgeUI` (Swift, iOS-only): Screenshot / Elements / Mutation
     bridge implementations.

   The app target depends on `DebugBridgeUI` with `.when(configuration: .debug)`
   (transitively pulls in Core + Touch). Release builds refuse to link these
   targets.
2. Wire the bridges from the `@main` App init, gated on `#if DEBUG`:
   ```swift
   // File scope: imports compile only in Debug builds.
   #if DEBUG
   import DebugBridgeCore
   #if canImport(UIKit)
   import DebugBridgeUI
   #endif
   #endif

   // Inside the @main App's init():
   #if DEBUG
   StateServer.shared.start()
   #if canImport(UIKit)
   DebugBridgeUIWiring.installAll()
   #endif
   #endif
   ```
3. Build + deploy to the device with `xcodebuild -scheme <SchemeName>
   -destination 'platform=iOS,id=<UDID>' build install`.
4. Launch via `devicectl device process launch --device <UDID> --console <bundle-id>`.
   Capture the boot token printed to `os_log` on first run.
5. Spawn the Mac-side daemon, `gstack-ios-qa-daemon`, on demand. The daemon
   acquires an exclusive flock on `~/.gstack/ios-qa-daemon.pid`. If another
   daemon is alive, the second invocation discovers its port and connects.
6. The daemon immediately calls `POST /auth/rotate` on the iOS StateServer with a
   fresh in-memory-only token. The boot token becomes useless ~5s later.
   Anything scraping `os_log` past this point sees a dead credential.

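
Step 5 relies on detecting whether another daemon is already alive. The daemon itself is described as using an exclusive flock; the sketch below swaps in a simpler, different technique (a `kill -0` liveness probe) purely for illustration. It assumes the pid file contains a bare process ID, which this document does not confirm, and `daemon_alive` is a hypothetical name.

```bash
# Hypothetical liveness probe (not the flock mechanism the daemon uses).
# Returns 0 when the pid file at $1 names a running process that the
# current user is allowed to signal.
daemon_alive() {
  [ -f "$1" ] || return 1
  pid=$(cat "$1")
  case $pid in
    '' | *[!0-9]*) return 1 ;;   # empty or non-numeric content
  esac
  kill -0 "$pid" 2> /dev/null    # signal 0 checks existence without killing
}
```

A wrapper script could call `daemon_alive "$HOME/.gstack/ios-qa-daemon.pid"` to choose between spawning a daemon and connecting to the existing one.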
## Phase 3: Vision-driven agent loop

Each iteration:

1. `GET /screenshot` (via daemon) → save PNG.
2. `GET /elements` → accessibility tree.
3. `GET /state/snapshot` (only `@Snapshotable` fields) → current state.
4. Decide the next action based on what's on the screen vs the test goal.
5. `POST /session/acquire` to grab the device lock.
6. Execute `POST /tap`, `/swipe`, `/type`, or `POST /state/<key>` write.
7. Re-screenshot, compare with the earlier capture, and record a finding if
   the behavior is buggy.
8. `POST /session/release` once the iteration is done.

Each authenticated mutating request through the tailnet listener (if remote
mode is active) writes an audit row to
`~/.gstack/security/ios-qa-audit.jsonl`.

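
The request order of one iteration can be sketched as a shell function. Everything beyond the endpoint paths is an assumption: the `DAEMON_URL`, `SESSION_TOKEN`, `SESSION_ID`, `QA_OUT_DIR`, and `DRY_RUN` variables, the `Authorization: Bearer` header format, the output file names, and the JSON body of the action are all hypothetical.

```bash
# Send one request to the daemon. With DRY_RUN=1 the request is only logged
# to stderr, which makes the ordering easy to inspect without a device.
qa_request() {
  method=$1; path=$2; body=${3:-}
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$method $path" >&2
    return 0
  fi
  set -- -sf -X "$method" "$DAEMON_URL$path" \
    -H "Authorization: Bearer $SESSION_TOKEN" -H "X-Session-Id: $SESSION_ID"
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi
  curl "$@"
}

# One iteration. $1 is the action endpoint chosen by the agent (/tap, /swipe,
# /type, or /state/<key>); $2 is its JSON body, left to the caller.
qa_iteration() {
  out=${QA_OUT_DIR:-.}
  qa_request GET /screenshot     > "$out/before.png"     # step 1
  qa_request GET /elements       > "$out/elements.out"   # step 2
  qa_request GET /state/snapshot > "$out/state.out"      # step 3
  # Step 4 (deciding the action) happens in the agent, outside this function.
  qa_request POST /session/acquire > /dev/null || return 1   # step 5
  qa_request POST "$1" "${2:-}" > /dev/null                  # step 6
  status=$?
  qa_request GET /screenshot > "$out/after.png"              # step 7
  qa_request POST /session/release > /dev/null               # step 8
  return "$status"
}
```

Running `DRY_RUN=1 qa_iteration /tap '{}'` lists the seven requests in order on stderr, and the lock is released even when the action itself fails.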
## Modes

**Local-USB mode (default).** Daemon binds loopback only; no Tailscale
required. The spawning skill gets full-surface access. Best for solo
development.

**Tailnet mode (`--tailnet`).** Daemon additionally binds the Tailscale
interface (never `0.0.0.0`). Requires `tailscaled` to be running locally and
the daemon to be able to read `/var/run/tailscale.sock`. Fails closed if the
socket is missing, permission-denied, or returns an unparseable WhoIs
response. Remote agents call `POST /auth/mint` over the tailnet; the daemon
canonicalizes the identity via WhoIs, checks the allowlist file, and mints a
session token. See `ios-qa/docs/tailscale-acl-example.md`.

**Capability tiers (tailnet mode).** Minted tokens default to `interact`
(taps, swipes, typing). Higher tiers require an explicit owner mint:

- **observe:** `/screenshot`, `/elements`, `GET /state/*`, `/healthz`,
  `/session/heartbeat`.
- **interact:** observe + `/tap`, `/swipe`, `/type`.
- **mutate:** interact + `POST /state/<key>`.
- **restore:** mutate + `POST /state/restore`.

Owner mints via `gstack-ios-qa-mint --remote <identity> --capability <tier>`
on the Mac. Self-service mint over tailnet only succeeds for already-allowlisted
identities.

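
Because each tier includes everything below it, an authorization check reduces to comparing ranks. The sketch below is an illustration under assumptions: the function names are hypothetical, the HTTP methods for the observe endpoints are taken from Phase 3 where stated and assumed to be `GET` otherwise, and unknown routes are denied.

```bash
# Rank of each tier; higher tiers include everything below them.
tier_rank() {
  case $1 in
    observe)  echo 1 ;;
    interact) echo 2 ;;
    mutate)   echo 3 ;;
    restore)  echo 4 ;;
    *)        echo 0 ;;   # unknown tier grants nothing
  esac
}

# Minimum tier needed for method $1 and path $2, following the list above.
required_tier() {
  case "$1 $2" in
    "POST /state/restore") echo restore ;;
    "POST /state/"*) echo mutate ;;
    "POST /tap" | "POST /swipe" | "POST /type") echo interact ;;
    "GET /screenshot" | "GET /elements" | "GET /state/"* | "GET /healthz") echo observe ;;
    *" /session/heartbeat") echo observe ;;
    *) echo none ;;       # unlisted routes are denied
  esac
}

# Returns 0 when a token of tier $1 may perform method $2 on path $3.
tier_allows() {
  need=$(tier_rank "$(required_tier "$2" "$3")")
  have=$(tier_rank "$1")
  [ "$need" -gt 0 ] && [ "$have" -ge "$need" ]
}
```

With this layout, `tier_allows interact POST /tap` succeeds while `tier_allows interact POST /state/cart` fails, which mirrors the default token being unable to write state.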
**Recording mode (`--recording`).** DebugOverlay renders a small diagonal
"AGENT DEMO" watermark in a corner so screencasts are unambiguous about the
device being agent-driven.

## Demo mode

If the user says "demo", "demo mode", "show me", or "I want to see it
working", run in **DEMO MODE**. This changes how the agent interacts with
the app:

**DEMO MODE OVERRIDES ALL OTHER RULES.** When demo mode is active, the
agent MUST drive every action through visible UI (`/tap`, `/swipe`, `/type`)
and NEVER use `POST /state/*` writes to skip steps. Viewers see the agent
type every key, tap every button. The on-device DebugOverlay attribution
chip shows "Driven by Claude Code (demo)" or the remote agent identity.

In demo mode, the screencap rate is bumped to 4fps so the recording feels
live.

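
The demo-mode rule can be expressed as a small pre-flight guard on each planned request. This is a sketch only; `demo_allows` is a hypothetical helper, and the skill does not say where such a check would live.

```bash
# Hypothetical guard: in demo mode, reject state writes so that every step
# goes through visible UI. $1 = "1" when demo mode is active, $2 = HTTP
# method, $3 = request path. Returns 0 when the request may proceed.
demo_allows() {
  if [ "$1" = "1" ]; then
    case "$2 $3" in
      "POST /state/"*) return 1 ;;   # covers /state/<key> and /state/restore
    esac
  fi
  return 0
}
```

Reads such as `GET /state/snapshot` still pass, so the agent can keep verifying state while driving the app through taps, swipes, and typing.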
## Failure modes + recovery

| Symptom | Likely cause | Action |
|---|---|---|
| `curl: connection refused` to daemon | daemon crashed | Re-run `/ios-qa`; spawn-race lock will fail closed |
| `403 identity_not_allowed` from `/auth/mint` | identity missing from allowlist | Run `gstack-ios-qa-mint --remote <identity>` on the Mac |
| `409 schema_mismatch` on `/state/restore` | snapshot from older app build | Discard the snapshot; re-capture |
| `503 device_disconnected` from proxy | USB tunnel dropped | Reconnect device; daemon auto-reconnects within 30s |
| `429 rate_limited` from `/auth/mint` | >10 mints/min from one identity | Wait 60s; check audit log for anomalies |
| `413 body_too_large` on `/state/restore` | snapshot >1MB | Increase `--max-body` or trim snapshot |

## Cleanup

Use `/ios-clean` to remove the DebugBridge SPM dependency and all `#if DEBUG`
wiring before a Release build. This is a convenience flow; the structural
Release-build guard (Package.swift `.when(configuration: .debug)` + CI
`swift build -c release` check) is the safety-critical path.