# ADR-005: Git Sandbox Execution Model

## Status

Accepted

## Context

Tasks that modify source code need a safe execution environment. Running agents directly in the canonical working copy creates several problems:

1. **Concurrent corruption**: multiple agents running in the same directory stomp on each other's changes.
2. **Partial work leaks**: if a task is cancelled mid-run, half-written files remain in the working tree, blocking other work.
3. **No rollback**: a failed agent may leave the codebase in a broken state with no clean way to recover without manual `git reset`.
4. **Audit trail**: changes made by an agent should be visible as discrete, attributable git commits, not as an anonymous blob of working-tree noise.

## Decision

When a task has `agent.project_dir` set, `ClaudeRunner.Run` executes the agent in an isolated git clone (a "sandbox") rather than in the project directory directly.

### Sandbox lifecycle

```
project_dir (canonical working copy)
  |
  | git clone --no-hardlinks /tmp/claudomator-sandbox-*
  |
  v
sandbox (temp clone)
  |
  | agent runs here; commits its changes
  |
  | git push (to bare repo "local" or "origin")
  |
teardown: verify no uncommitted changes, remove sandbox dir
```

### Clone source

`sandboxCloneSource` prefers a remote named `"local"` (a local bare repo). If not present it falls back to the `"origin"` remote. Using a bare repo accepts pushes cleanly; pushing to a non-bare working copy fails when the receiving branch is checked out.

### Uncommitted-change enforcement

Before teardown, the runner runs `git status --porcelain` in the sandbox. If any uncommitted changes are detected, the task is failed with an error message listing the files. The sandbox is **preserved** so the operator can inspect or recover the partial work. The error message includes the sandbox path.

### Concurrent push conflicts

If two sandboxes try to push at the same time, the second push is rejected (`"fetch first"` or `"non-fast-forward"` in the error output).
The runner detects these signals and performs a fetch → rebase → retry sequence, up to a fixed retry limit, before giving up.

### BLOCKED state and sandbox preservation

When an agent exits with a `question.json` file (entering the `BLOCKED` state), the sandbox is **not** torn down. Preserving the sandbox is intended to let the resumed execution pick up the same working tree state, including any in-progress file changes made before the agent asked its question.

**Known Risk: Resume skips sandbox.** The current implementation of resume executions (`SubmitResume`) skips sandbox setup entirely and runs directly in `project_dir`, so the preserved sandbox is not yet reused. This is a significant behavioral divergence: if a resumed task makes further changes, they land directly in the canonical working copy, reintroducing the concurrent corruption and partial-work leak risks identified in the Context section. A future iteration should ensure resumed tasks pick up the preserved sandbox instead.

### Session ID propagation on resume

A subtle bug was found and fixed: when a resumed execution is itself blocked again (a second BLOCKED→answer→resume cycle), the new execution record must carry the **original** `ResumeSessionID`, not the new execution's own UUID. If the wrong ID is used, `claude --resume` fails with "No conversation found". The fix is in `ClaudeRunner.Run`: if `e.ResumeSessionID != ""`, use it as `e.SessionID` rather than `e.ID`.

## Rationale

- **`--no-hardlinks`**: git defaults to hardlinking objects between clone and source when both are on the same filesystem. This causes permission errors when the source is owned by a different user (e.g. `www-data` vs. `root`). The flag forces a full copy.
- **Bare repo for push target**: non-bare repos reject pushes to checked-out branches. A bare repo (`git init --bare`) accepts all pushes safely.
- **Preserve sandbox on failure**: partial agent work may be valuable for debugging or resumption. Destroying it immediately on failure was considered and rejected.
- **Agent must commit**: requiring the agent to commit all changes before exiting ensures git history is always the source of truth. The enforcement check (uncommitted files → FAILED) makes this invariant observable.

## Alternatives Considered

- **Run directly in working copy**: rejected because of concurrent corruption and partial-work leakage.
- **Copy files instead of git clone**: rejected because the agent needs a working git history (for `git log`, `git blame`, and to push commits back).
- **Docker/container isolation**: considered for stronger isolation but rejected due to operational complexity, dependency on container runtime, and inability to use the host's claude/gemini credentials.

## Consequences

- Tasks without `project_dir` are unaffected; they run in whatever working directory the server process inherited.
- If a sandbox's push repeatedly fails (e.g. due to a bare repo that is itself broken), the task is failed with the sandbox preserved.
- **If `/tmp` runs out of space** (many large sandboxes), tasks will fail at clone time. This is a known operational risk. Mitigations such as periodic cleanup of old sandboxes (cron) or pre-clone disk space checks are required as follow-up items.
- **The `project_dir` field in task YAML** must point to a git repository with a configured `"local"` or `"origin"` remote that accepts pushes. If neither remote exists or the push is rejected for other reasons, the task will be marked as `FAILED` and the sandbox will be preserved for manual recovery.

## Relevant Code Locations

| Concern | File |
|---|---|
| Sandbox setup/teardown | `internal/executor/claude.go` |
| `setupSandbox`, `teardownSandbox` | `internal/executor/claude.go` |
| `sandboxCloneSource` | `internal/executor/claude.go` |
| Resume skips sandbox | `internal/executor/claude.go` (Run) |
| Session ID propagation fix | `internal/executor/claude.go` (Run) |
| Sandbox tests | `internal/executor/claude_test.go` |