claudomator.git - claudomator — task automation server

Age	Commit message	Author
2026-03-16	fix: clean up activePerAgent before sending to resultCh	Claudomator Agent
	Move activePerAgent decrement/deletion out of execute() and executeResume() defers and into the code paths immediately before each resultCh send (handleRunResult and early-return paths). This guarantees that when a result consumer reads from the channel the map is already clean, eliminating a race between defer and result receipt. Remove the polling loop from TestPool_ActivePerAgent_DeletesZeroEntries and check the map state immediately after reading the result instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
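The ordering described above (clean the map first, then send on the result channel) can be sketched as follows. This is a minimal illustration, not the project's code: the `pool` type, `acquire`, `release`, `finish`, and `tracked` are hypothetical names, and only `activePerAgent` and `resultCh` come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical, stripped-down stand-in for the executor pool.
type pool struct {
	mu             sync.Mutex
	activePerAgent map[string]int
	resultCh       chan string
}

func newPool() *pool {
	return &pool{activePerAgent: map[string]int{}, resultCh: make(chan string)}
}

// acquire records one more running task for the agent.
func (p *pool) acquire(agent string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.activePerAgent[agent]++
}

// release decrements the agent's counter and deletes the entry at zero.
func (p *pool) release(agent string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.activePerAgent[agent]--
	if p.activePerAgent[agent] <= 0 {
		delete(p.activePerAgent, agent)
	}
}

// finish cleans up the map BEFORE publishing the result, so a consumer
// that has received the result can never observe a stale entry.
func (p *pool) finish(agent, result string) {
	p.release(agent)
	p.resultCh <- result
}

// tracked reports whether the agent still has an entry in the map.
func (p *pool) tracked(agent string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	_, ok := p.activePerAgent[agent]
	return ok
}

func main() {
	p := newPool()
	p.acquire("claude")
	go p.finish("claude", "done")
	res := <-p.resultCh
	fmt.Println(res, p.tracked("claude")) // prints "done false"
}
```

Because the channel send happens after `release` returns, the receive is ordered after the cleanup and no polling is needed on the consumer side.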
2026-03-16	fix: eliminate flaky race in TestPool_ActivePerAgent_DeletesZeroEntries	Peter Stone
	The deferred activePerAgent cleanup in execute() runs after resultCh is sent, so a consumer reading Results() could observe the map entry before it was removed. Poll briefly (100ms max) instead of checking immediately. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16	fix: repair test regressions and add pre-commit/pre-push verification gates	Peter Stone
	Fix four pre-existing bugs exposed after resolving a build failure:
	- sandboxCloneSource: accept any URL scheme for origin remote (was filtering out https://)
	- setupSandbox callers: fix := shadow variable so sandboxDir is set on BlockedError
	- parseGeminiStream: parse result lines to return execution errors and cost
	- TestElaborateTask_InvalidJSONFromClaude: stub Gemini fallback so test is hermetic
	Add verification infrastructure:
	- scripts/verify: runs go build + go test -race, used by hooks and deploy
	- scripts/hooks/pre-commit: blocks commits that don't compile
	- scripts/hooks/pre-push: blocks pushes where tests fail
	- scripts/install-hooks: symlinks version-controlled hooks into .git/hooks/
	- scripts/deploy: runs scripts/verify before building the binary
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16	feat: add GitHub webhook endpoint for automatic CI failure task creation	Claudomator Agent
	Adds POST /api/webhooks/github that receives check_run and workflow_run events and creates a Claudomator task to investigate and fix the failure. - Config: new webhook_secret and [[projects]] fields in config.toml - HMAC-SHA256 validation when webhook_secret is configured - Ignores non-failure events (success, skipped, etc.) with 204 - Matches repo name to configured project dirs (case-insensitive) - Falls back to single project when no name match found - 11 new tests covering all acceptance criteria Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
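The HMAC-SHA256 check mentioned above might look like the sketch below. It assumes the signature arrives in GitHub's `X-Hub-Signature-256` header in the form `sha256=<hex digest>`; the function names `sign` and `validSignature` are illustrative and not taken from the repository.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

const sigPrefix = "sha256="

// sign computes the header value a sender would attach for body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// validSignature reports whether header is a valid HMAC-SHA256 signature
// of body under secret. hmac.Equal compares in constant time.
func validSignature(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret") // hypothetical webhook_secret value
	body := []byte(`{"action":"completed"}`)
	header := sign(secret, body)

	fmt.Println(validSignature(secret, body, header))               // true
	fmt.Println(validSignature(secret, []byte("tampered"), header)) // false
}
```

The signature must be computed over the raw request bytes, so a handler would read the body once and verify it before decoding any JSON.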
2026-03-16	feat: add deployment status endpoint for tasks	Peter Stone
	Adds GET /api/tasks/{id}/deployment-status which checks whether the currently-deployed server binary includes the fix commits from the task's latest execution. Uses git merge-base --is-ancestor to compare commit hashes against the running version. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
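A rough sketch of the ancestry check follows. `git merge-base --is-ancestor A B` exits with status 0 when A is an ancestor of B and 1 when it is not. The names `isAncestorFunc`, `gitIsAncestor`, and `deployed` are assumptions for illustration, as is the choice to treat an empty commit list as deployed.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// isAncestorFunc reports whether commit is an ancestor of head.
type isAncestorFunc func(commit, head string) (bool, error)

// gitIsAncestor returns a checker that shells out to git inside repoDir.
// Exit status 0 means "is an ancestor", 1 means "is not"; anything else
// (for example an unknown commit hash) is reported as an error.
func gitIsAncestor(repoDir string) isAncestorFunc {
	return func(commit, head string) (bool, error) {
		cmd := exec.Command("git", "merge-base", "--is-ancestor", commit, head)
		cmd.Dir = repoDir
		err := cmd.Run()
		if err == nil {
			return true, nil
		}
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) && exitErr.ExitCode() == 1 {
			return false, nil
		}
		return false, err
	}
}

// deployed reports whether every fix commit is contained in the running
// version. An empty list counts as deployed here, which is an assumption.
func deployed(check isAncestorFunc, fixCommits []string, running string) (bool, error) {
	for _, c := range fixCommits {
		ok, err := check(c, running)
		if err != nil {
			return false, err
		}
		if !ok {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// A fake checker stands in for git so the example runs anywhere.
	contained := map[string]bool{"abc123": true, "def456": false}
	fake := func(commit, head string) (bool, error) { return contained[commit], nil }

	ok, err := deployed(fake, []string{"abc123"}, "running-version")
	fmt.Println(ok, err) // true <nil>
	ok, err = deployed(fake, []string{"abc123", "def456"}, "running-version")
	fmt.Println(ok, err) // false <nil>
}
```

In a real server the checker would be `gitIsAncestor(projectDir)`. Injecting it as a function keeps the decision logic testable without a repository.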
2026-03-16	fix: permission denied and host key verification errors; add gemini elaboration fallback	Peter Stone

2026-03-16	feat: add elaboration_input field to tasks for richer subtask placeholder	Claudomator Agent
	- Add ElaborationInput field to Task struct (task.go)
	- Add DB migration and update CREATE/SELECT/scan in storage/db.go
	- Update handleCreateTask to accept elaboration_input from API
	- Update renderSubtaskRollup in app.js to prefer elaboration_input over description
	- Capture elaborate prompt in createTask() form submission
	- Update subtask-placeholder tests to cover elaboration_input priority
	- Fix missing io import in gemini.go
	When a task card is waiting for subtasks, it now shows:
	1. The raw user prompt from elaboration (if stored)
	2. The task description truncated at word boundary (~120 chars)
	3. The task name as fallback
	4. 'Waiting for subtasks…' only when all fields are empty
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
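The four-step placeholder fallback can be expressed compactly. The commit implements it in app.js; the version below is a Go sketch of the same priority order, with `placeholder` and `truncateAtWord` as hypothetical names and the ellipsis suffix as an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtWord shortens s to at most max runes, cutting at the last
// space before the limit so that no word is split. If the text has no
// space before the limit, it falls back to a hard cut.
func truncateAtWord(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	cut := max
	for cut > 0 && r[cut] != ' ' {
		cut--
	}
	if cut == 0 {
		cut = max
	}
	return strings.TrimRight(string(r[:cut]), " ") + "…"
}

// placeholder picks the text shown while a task waits for subtasks:
// elaboration input, then truncated description, then name, then a
// generic message.
func placeholder(elaborationInput, description, name string) string {
	if s := strings.TrimSpace(elaborationInput); s != "" {
		return s
	}
	if s := strings.TrimSpace(description); s != "" {
		return truncateAtWord(s, 120)
	}
	if s := strings.TrimSpace(name); s != "" {
		return s
	}
	return "Waiting for subtasks…"
}

func main() {
	fmt.Println(placeholder("", "Refactor the executor pool", "pool-refactor"))
	fmt.Println(truncateAtWord("hello world foo", 8)) // hello…
}
```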
2026-03-15	fix: promote stale BLOCKED parent tasks to READY on server startup	Peter Stone
	When the server restarts after all subtasks complete, the parent task was left stuck in BLOCKED state because maybeUnblockParent only fires during a live executor run. RecoverStaleBlocked() scans all BLOCKED tasks on startup and re-evaluates them using the existing logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15	feat: overhaul auto-refresh system with intelligent polling and differential updates	Peter Stone

2026-03-15	feat: run build (Makefile, gradlew, or go build) before sandbox autocommit	Peter Stone

2026-03-15	feat: fix task failures via sandbox improvements and display commits in Web UI	Peter Stone
	- Fix ephemeral sandbox deletion issue by passing $CLAUDOMATOR_PROJECT_DIR to agents and using it for subtask project_dir. - Implement sandbox autocommit in teardown to prevent task failures from uncommitted work. - Track git commits created during executions and persist them in the DB. - Display git commits and changestats badges in the Web UI execution history. - Add badge counts to Web UI tabs for Interrupted, Ready, and Running states. - Improve scripts/next-task to handle QUEUED tasks and configurable DB path.
2026-03-14	feat(Phase4): add file changes for changestats executor wiring	Claude Sonnet 4.6
	Files changed: CLAUDE.md, internal/api/changestats.go, internal/executor/executor.go, internal/executor/executor_test.go, internal/task/changestats.go (new) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14	feat: expose changestats in API responses	Claudomator Agent
	- Add parseChangestatFromOutput/File helpers in internal/api/changestats.go to parse git diff --stat summary lines from execution stdout logs - Wire parser in processResult: after each execution completes, scan the stdout log for git diff stats and persist via UpdateExecutionChangestats - Tests: TestGetTask_IncludesChangestats (verifies processResult wiring), TestListExecutions_IncludesChangestats (verifies storage round-trip) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
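Parsing a `git diff --stat` summary line could be sketched as below. The `Changestats` fields follow the struct named elsewhere in this log. The regular expression and the `parseChangestats` name are assumptions, and this version reads only the first summary line it finds.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Changestats mirrors the field names mentioned in the commit log.
type Changestats struct {
	FilesChanged int
	LinesAdded   int
	LinesRemoved int
}

// statLine matches summaries such as
// " 3 files changed, 10 insertions(+), 2 deletions(-)".
// git omits the insertions or deletions part when the count is zero.
var statLine = regexp.MustCompile(
	`(\d+) files? changed(?:, (\d+) insertions?\(\+\))?(?:, (\d+) deletions?\(-\))?`)

// parseChangestats extracts the first summary line found in output.
func parseChangestats(output string) (Changestats, bool) {
	m := statLine.FindStringSubmatch(output)
	if m == nil {
		return Changestats{}, false
	}
	// Missing optional groups are empty strings and count as zero.
	atoi := func(s string) int {
		n, _ := strconv.Atoi(s)
		return n
	}
	return Changestats{
		FilesChanged: atoi(m[1]),
		LinesAdded:   atoi(m[2]),
		LinesRemoved: atoi(m[3]),
	}, true
}

func main() {
	stats, ok := parseChangestats(" 3 files changed, 10 insertions(+), 2 deletions(-)")
	fmt.Printf("%+v %v\n", stats, ok)
}
```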
2026-03-14	feat: add Changestats struct and storage support	Claudomator Agent
	- Add task.Changestats{FilesChanged, LinesAdded, LinesRemoved} - Add changestats_json column to executions via additive migration - Add Changestats field to storage.Execution struct - Add UpdateExecutionChangestats(execID, *task.Changestats) method - Update all SELECT/INSERT/scan paths for executions - Test: TestExecution_StoreAndRetrieveChangestats (was red, now green) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14	fix: surface agent stderr, auto-retry restart-killed tasks, handle stale sandboxes	Peter Stone
	#1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and appends to error message when claude/gemini exits non-zero. Previously all exit-1 failures were opaque; now the error_msg carries the actual subprocess output.
	#4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after marking them FAILED, so tasks killed by a server restart automatically retry on the next boot rather than staying permanently FAILED.
	#2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of failing immediately with "no such file or directory".
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
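The stderr diagnostics in item #1 amount to "read the last N lines and append them to the error". A minimal sketch follows, assuming the log is small enough to read whole. `tailFile` is named in the commit, but its signature here, `tailLines`, and `withStderrTail` are illustrative.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// tailLines returns the last n lines of content, ignoring trailing newlines.
func tailLines(content string, n int) []string {
	content = strings.TrimRight(content, "\n")
	if content == "" || n <= 0 {
		return nil
	}
	lines := strings.Split(content, "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return lines
}

// tailFile reads path and returns its last n lines joined by newlines.
// It returns "" when the file cannot be read, so the caller can fall
// back to the original error unchanged.
func tailFile(path string, n int) string {
	data, err := os.ReadFile(path)
	if err != nil {
		return ""
	}
	return strings.Join(tailLines(string(data), n), "\n")
}

// withStderrTail appends the last 20 stderr lines to err, if any exist.
func withStderrTail(err error, stderrPath string) error {
	tail := tailFile(stderrPath, 20)
	if tail == "" {
		return err
	}
	return fmt.Errorf("%w\nstderr (last lines):\n%s", err, tail)
}

func main() {
	fmt.Println(tailLines("a\nb\nc\nd\n", 2)) // [c d]
}
```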
2026-03-14	fix: trust all directory owners in sandbox git commands	Peter Stone
	Sandbox setup runs git commands against project_dir which may be owned by a different OS user, triggering git's 'dubious ownership' error. Fix by passing -c safe.directory=* on all git commands that touch project directories. Also add wildcard to global config for immediate effect on the running server. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14	feat: persist agent assignment before task execution	Claudomator Agent
	- Add UpdateTaskAgent to Store interface and DB implementation - Call UpdateTaskAgent in Pool.execute to persist assigned agent/model to database before the runner starts - Update runTask in app.js to pass selected agent as query param Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14	test	Claudomator Agent

2026-03-14	feat: add agent selector to UI and support direct agent assignment	Peter Stone
	- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button. - Updated the backend to pass query parameters as environment variables to scripts. - Modified the executor pool to skip classification when a specific agent is requested. - Added --agent flag to claudomator start command. - Updated tests to cover the new functionality.
2026-03-14	fix: cancel blocked tasks + auto-complete completion reports	Peter Stone
	Two fixes for BLOCKED task issues: 1. Allow BLOCKED → CANCELLED state transition so users can cancel tasks stuck waiting for input. Adds Cancel button to BLOCKED task cards in the UI alongside the question/answer controls. 2. Detect when agents write completion reports to $CLAUDOMATOR_QUESTION_FILE instead of real questions. If the question JSON has no options and no "?" in the text, treat it as a summary (stored on the execution) and fall through to normal completion + sandbox teardown rather than blocking. Also tightened the preamble to make the distinction explicit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
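The completion-report heuristic in item 2 (no options and no "?" in the text) is simple enough to sketch directly. The JSON field names `text` and `options` are assumptions about the question file's shape, and `isCompletionReport` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// question is an assumed shape for the JSON written to the question file.
type question struct {
	Text    string   `json:"text"`
	Options []string `json:"options"`
}

// isCompletionReport reports whether raw looks like a summary rather
// than a real question: it offers no options and contains no "?".
func isCompletionReport(raw []byte) (bool, error) {
	var q question
	if err := json.Unmarshal(raw, &q); err != nil {
		return false, err
	}
	return len(q.Options) == 0 && !strings.Contains(q.Text, "?"), nil
}

func main() {
	report, _ := isCompletionReport([]byte(`{"text":"All tests pass. Work complete."}`))
	ask, _ := isCompletionReport([]byte(`{"text":"Which branch should I use?"}`))
	fmt.Println(report, ask) // true false
}
```

A payload classified as a report would be stored as the execution summary and the task would complete normally. Anything else keeps the task BLOCKED until the user answers.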
2026-03-13	test: add storage tests for UpdateTaskSummary and AppendTaskInteraction	Claudomator Agent
	The summary+interactions feature was already fully implemented but lacked storage-layer tests. Added tests covering round-trip persistence of task summaries, accumulation of Q&A interactions, and error handling for nonexistent tasks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13	fix: only write RAW_NARRATIVE.md when user explicitly provides project_dir	Peter Stone
	Previously appendRawNarrative was called with the server's default workDir (os.Getwd()) when no project_dir was in the request, causing test runs and any elaboration without a project to pollute the repo's own RAW_NARRATIVE.md. The narrative is per-project human input — only write it when the caller explicitly specifies which project they're working in. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13	fix: enable Gemini file writing by passing --yolo and -p flags	Peter Stone
	GeminiRunner.buildArgs was missing --yolo (auto-approve all tools) so the gemini CLI only registered 3 tools (read_file, write_todos, cli_help) and write_file was not available. Agents that needed to create files silently failed (exit 0, no files written). Also switch instructions from bare positional arg to -p flag, which is required for non-interactive headless mode. Update preamble tests to match file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) kept from the merge conflict resolution. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13	merge: resolve conflicts with local/master (stats tab + summary styles)	Peter Stone
	Keep file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) from HEAD. Combine Q&A History and Stats tab CSS from both branches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13	fix: resubmit QUEUED tasks on server startup to prevent them getting stuck	Peter Stone
	Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on startup and re-submits them to the in-memory pool. Previously, tasks that were QUEUED when the server restarted would remain stuck indefinitely since only RUNNING tasks were recovered (and marked FAILED). Called in serve.go immediately after RecoverStaleRunning(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13	feat: post-elaboration sanity check for tools, acceptance criteria, and dev practices	Peter Stone
	Add sanitizeElaboratedTask() called after every elaboration response:
	- Infers missing allowed_tools from instruction keywords (Write/Edit/Read/Bash/Grep/Glob)
	- Auto-adds Read when Edit is present
	- Appends Acceptance Criteria section if none present
	- Appends TDD reminder for coding tasks without test mention
	Also tighten buildElaboratePrompt to require acceptance criteria and list concrete tool examples, reducing how often the model omits tools. Fixes class of failures where agents couldn't create files because the elaborator omitted Write from allowed_tools.
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13	feat: resume support, summary extraction, and task state improvements	Peter Stone
	- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks - Add summary extraction from agent stdout stream-json output - Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution - Clear question_json on ResetTaskForRetry - Resume BLOCKED tasks in preserved sandbox so Claude finds its session - Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step - Update ADR-002 with new state transitions - UI style improvements Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12	feat: add Resume support for CANCELLED, FAILED, and BUDGET_EXCEEDED tasks	Claudomator Agent
	Interrupted tasks (CANCELLED, FAILED, BUDGET_EXCEEDED) now support session resume in addition to restart. Both buttons are shown on the task card. - executor: extend resumablePoolStates to include CANCELLED, FAILED, BUDGET_EXCEEDED - api: extend handleResumeTimedOutTask to accept all resumable states with state-specific resume messages; replace hard-coded TIMED_OUT check with a resumableStates map - web: add RESUME_STATES set; render Resume + Restart buttons for interrupted states; TIMED_OUT keeps Resume only - tests: 5 new Go tests (TestResumeInterrupted_*); updated task-actions.test.mjs with 17 tests covering dual-button behaviour Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11	feat: add Stats tab with task distribution and execution health metrics	Claudomator Agent
	- Export computeTaskStats and computeExecutionStats from app.js - Add renderStatsPanel with state count grid, KPI row (total/success-rate/cost/avg-duration), and outcome bar chart - Wire stats tab into switchTab and poll for live refresh - Add Stats tab button and panel to index.html - Add CSS for .stats-counts, .stats-kpis, .stats-bar-chart using existing state color variables - Add docs/stats-tab-plan.md with component structure and data flow - 14 new unit tests in web/test/stats.test.mjs (140 total, all passing) No backend changes — derives all metrics from existing /api/tasks and /api/executions endpoints. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11	feat: require agents to write a final summary before exiting	Claudomator Agent
	Add a mandatory '## Final Summary' section to planningPreamble instructing agents to output a 2-5 sentence summary paragraph (headed by '## Summary') as their last output before exiting. Adds three tests to verify the section and its required content are present in the preamble. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11	fix: clear question_json when restarting a task via ResetTaskForRetry	Peter Stone
	A BLOCKED task that fails on resume would keep its stale question_json after being restarted. The frontend then showed "waiting for your input" with the old prompt even though the task was running fresh. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11	fix: resume BLOCKED tasks in preserved sandbox so Claude finds its session	Peter Stone
	When a task ran in a sandbox (/tmp/claudomator-sandbox-*) and went BLOCKED, Claude stored its session under the sandbox path as the project slug. The resume execution was running in project_dir, causing Claude to look for the session in the wrong project directory and fail with "No conversation found". Fix: carry SandboxDir through BlockedError → Execution → resume execution, and run the resume in that directory so the session lookup succeeds. - BlockedError gains SandboxDir field; claude.go sets it on BLOCKED exit - storage.Execution gains SandboxDir (persisted via new sandbox_dir column) - executor.go stores blockedErr.SandboxDir in the execution record - server.go copies SandboxDir from latest execution to the resume execution - claude.go uses e.SandboxDir as working dir for resume when set Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10	test: sandbox coverage + fix WebSocket races	Peter Stone
	executor: add 7 tests for sandboxCloneSource, setupSandbox, and teardownSandbox (uncommitted-changes error, clean-no-commits removal). api: fix two data races in WebSocket tests — wsPingInterval/Deadline are now captured as locals before goroutine start; maxWsClients is moved from a package-level var into Hub.maxClients (with SetMaxClients method) so concurrent tests don't stomp each other. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10	task: promote validTransitions to package-level var; fix ADR	Peter Stone
	Hoists the map out of ValidTransition so it's not reallocated on every call. Adds missing CANCELLED→QUEUED and BUDGET_EXCEEDED→QUEUED entries to the ADR transition table to match the implemented state machine. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
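Hoisting the table to package level might look like the sketch below. It lists only transitions that this log mentions explicitly, so it is a partial table and not the project's full state machine. The `State` type and the nested-map layout are assumptions.

```go
package main

import "fmt"

// State is an assumed string type for task states.
type State string

// validTransitions is allocated once at package initialization instead
// of on every ValidTransition call. Only transitions named in this log
// are listed; the real table has more entries.
var validTransitions = map[State]map[State]bool{
	"BLOCKED":         {"READY": true, "CANCELLED": true},
	"CANCELLED":       {"QUEUED": true},
	"BUDGET_EXCEEDED": {"QUEUED": true},
}

// ValidTransition reports whether moving from one state to another is
// allowed. Looking up a missing inner map yields false.
func ValidTransition(from, to State) bool {
	return validTransitions[from][to]
}

func main() {
	fmt.Println(ValidTransition("BLOCKED", "CANCELLED")) // true
	fmt.Println(ValidTransition("CANCELLED", "BLOCKED")) // false
}
```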
2026-03-10	executor: fix session ID on second block-and-resume cycle	Peter Stone
	When a resumed execution is blocked again, SessionID was set to the new exec's own UUID instead of the original ResumeSessionID. The next resume would then pass the wrong --resume argument to claude and fail. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10	executor: explicit load balancing — code picks agent, classifier picks model	Peter Stone
	pickAgent() deterministically selects the agent with the fewest active tasks, skipping rate-limited agents. The classifier now only selects the model for the pre-assigned agent, so Gemini gets tasks from the start rather than only as a fallback when Claude's quota is exhausted. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
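A least-loaded selection of this kind can be sketched as follows. The signature is an assumption: the commit only states that `pickAgent()` chooses the agent with the fewest active tasks and skips rate-limited ones. Here ties are broken by the order of the `agents` slice, which keeps the result deterministic.

```go
package main

import "fmt"

// pickAgent returns the non-rate-limited agent with the fewest active
// tasks. Ties go to the agent listed first. The second result is false
// when every agent is rate-limited or the list is empty.
func pickAgent(agents []string, active map[string]int, rateLimited map[string]bool) (string, bool) {
	best, found := "", false
	for _, a := range agents {
		if rateLimited[a] {
			continue
		}
		if !found || active[a] < active[best] {
			best, found = a, true
		}
	}
	return best, found
}

func main() {
	agents := []string{"claude", "gemini"}
	active := map[string]int{"claude": 2, "gemini": 1}

	fmt.Println(pickAgent(agents, active, nil))                            // gemini true
	fmt.Println(pickAgent(agents, active, map[string]bool{"gemini": true})) // claude true
}
```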
2026-03-10	feat: append verbatim user input to docs/RAW_NARRATIVE.md	Peter Stone
	The elaborator now logs every user prompt to docs/RAW_NARRATIVE.md within the project directory. This is done in a background goroutine to ensure it doesn't delay the response.
2026-03-10	feat: include project context in elaborator prompt	Peter Stone
	The elaborator now reads CLAUDE.md and SESSION_STATE.md from the project directory (if they exist) and prepends their content to the user prompt. This allows the AI to generate tasks that are more context-aware.
2026-03-10	fix: ensure tasks are re-classified on manual restart	Peter Stone
	Updated handleRunTask to use ResetTaskForRetry, which clears the agent type and model. This ensures that manually restarted tasks are always re-classified, allowing the system to switch to a different agent if the previous one is rate-limited. Also improved Claude quota-exhaustion detection.
2026-03-10	cli: implement --config flag to load TOML config file	Claudomator Agent
	The --config flag was registered but silently ignored. Now: - config.LoadFile loads a TOML file on top of defaults - PersistentPreRunE applies the file when --config is set - Explicit CLI flags (--data-dir, --claude-bin) take precedence over the file Tests: TestLoadFile_OverridesDefaults, TestLoadFile_MissingFile_ReturnsError, TestRootCmd_ConfigFile_Loaded, TestRootCmd_ConfigFile_CLIFlagOverrides, TestRootCmd_ConfigFile_Missing_ReturnsError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10	executor: extract handleRunResult to deduplicate error-classification logic	Claudomator Agent
	Both execute() and executeResume() shared ~80% identical post-run logic: error classification (BLOCKED, TIMED_OUT, CANCELLED, BUDGET_EXCEEDED, FAILED), state transitions, result emission, and UpdateExecution. Extract this into handleRunResult(ctx, t, exec, err, agentType) on *Pool. Both functions now call it after runner.Run() returns. Also adds TestHandleRunResult_SharedPath which directly exercises the new function via a minimalMockStore, covering FAILED, READY, COMPLETED, and TIMED_OUT classification paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10	Remove legacy claude field and working_dir backward compat	Claudomator Agent
	- Remove Claude field alias from Task struct (already removed in sandbox) - Remove UnmarshalJSON from AgentConfig that silently accepted working_dir - Remove legacy claude fallback in scanTask (db.go) - Remove TestGetTask_BackwardCompatibility test that validated removed behavior Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	executor: unblock parent task when all subtasks complete	Claudomator Agent
	Add maybeUnblockParent helper that transitions a BLOCKED parent task to READY once every subtask is in the COMPLETED state. Called in both execute() and executeResume() immediately after a subtask is marked COMPLETED. Any non-COMPLETED sibling (RUNNING, FAILED, etc.) keeps the parent BLOCKED. Tests added: - TestPool_Submit_LastSubtask_UnblocksParent - TestPool_Submit_NotLastSubtask_ParentStaysBlocked - TestPool_Submit_ParentNotBlocked_NoTransition
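The unblock rule reads naturally as a small function over an in-memory store. In the sketch below, `task` and `store` are hypothetical simplifications of the real Store interface, and refusing to promote a parent that has no subtasks is an added assumption.

```go
package main

import "fmt"

// task is a simplified stand-in for the real task record.
type task struct {
	ID       string
	ParentID string
	State    string
}

// store is a hypothetical in-memory replacement for the Store interface.
type store map[string]*task

// subtasks returns every task whose parent is parentID.
func (s store) subtasks(parentID string) []*task {
	var out []*task
	for _, t := range s {
		if t.ParentID == parentID {
			out = append(out, t)
		}
	}
	return out
}

// maybeUnblockParent moves a BLOCKED parent to READY once every subtask
// is COMPLETED, and reports whether it made the transition.
func maybeUnblockParent(s store, parentID string) bool {
	parent, ok := s[parentID]
	if !ok || parent.State != "BLOCKED" {
		return false
	}
	subs := s.subtasks(parentID)
	if len(subs) == 0 {
		return false // assumption: nothing to wait for means no transition
	}
	for _, t := range subs {
		if t.State != "COMPLETED" {
			return false // RUNNING, FAILED, etc. keep the parent BLOCKED
		}
	}
	parent.State = "READY"
	return true
}

func main() {
	s := store{
		"p":  {ID: "p", State: "BLOCKED"},
		"c1": {ID: "c1", ParentID: "p", State: "COMPLETED"},
		"c2": {ID: "c2", ParentID: "p", State: "RUNNING"},
	}
	fmt.Println(maybeUnblockParent(s, "p"), s["p"].State) // false BLOCKED
	s["c2"].State = "COMPLETED"
	fmt.Println(maybeUnblockParent(s, "p"), s["p"].State) // true READY
}
```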
2026-03-09	executor: BLOCKED→READY for top-level tasks with subtasks	Claudomator Agent
	When a top-level task (ParentTaskID == "") finishes successfully, check for subtasks before deciding the next state: - subtasks exist → BLOCKED (waiting for subtasks to complete) - no subtasks → READY (existing behavior, unchanged) This applies to both execute() and executeResume(). Adds ListSubtasks to the Store interface. Tests: - TestPool_Submit_TopLevel_WithSubtasks_GoesBlocked - TestPool_Submit_TopLevel_NoSubtasks_GoesReady Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	executor: log errors from all unchecked UpdateTaskState/UpdateTaskQuestion calls	Claudomator Agent
	All previously ignored errors from p.store.UpdateTaskState() and p.store.UpdateTaskQuestion() in execute() and executeResume() now log with structured context (taskID, state, error). Introduces a Store interface so tests can inject a failing mock store. Adds TestPool_UpdateTaskState_DBError_IsLoggedAndResultDelivered to verify that a DB write failure is logged and the result is still delivered to resultCh. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	storage: fix DeleteTask atomicity and use recursive CTE	Claudomator Agent
	Replace BFS loop with a single recursive CTE to collect all descendant task IDs in one query, and wrap all DELETE statements in a transaction so a partial failure cannot leave orphaned executions. Add TestDeleteTask_DeepSubtaskCascadeAtomic: creates a 3-level task hierarchy with executions at each level, deletes the root, and verifies all tasks and executions are removed with an explicit orphan-row check. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	storage: add missing indexes and ListRecentExecutions correctness tests	Claudomator Agent
	Add two schema indexes that were missing: - idx_executions_start_time on executions(start_time): speeds up ListRecentExecutions WHERE start_time >= ? ORDER BY start_time DESC - idx_tasks_parent_task_id on tasks(parent_task_id): speeds up ListSubtasks WHERE parent_task_id = ? Both use CREATE INDEX IF NOT EXISTS so they are safe to apply on existing databases without a migration version bump. Add TestListRecentExecutions_LargeDataset (100 rows, two tasks) covering: - returns all rows in descending start_time order - respects the limit parameter - filters correctly by since time - filters correctly by task_id Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	api: validate ?state= param in handleListTasks; standardize operation response shapes	Claudomator Agent
	- handleListTasks: validate ?state= against known states, return 400 with clear error for unrecognized values (e.g. ?state=BOGUS)
	- handleCancelTask: replace {"status":"cancelling"|"cancelled"} with {"message":"...","task_id":"..."} to match run/resume shape
	- handleAnswerQuestion: replace {"status":"queued"} with {"message":"task queued for resume","task_id":"..."}
	- Tests: add TestListTasks_InvalidState_Returns400, TestListTasks_ValidState_Returns200, TestCancelTask_ResponseShape, TestAnswerQuestion_ResponseShape, TestRunTask_ResponseShape, TestResumeTimedOut_ResponseShape
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	api: add configurable mockRunner and async error-path tests	Claudomator Agent
	Replace the no-op mockRunner in server_test.go with a configurable version that supports err and sleep fields. Add testServerWithRunner helper and a pollState utility for async assertions. Add three new tests that exercise the pool's error paths end-to-end: - TestRunTask_AgentFails_TaskSetToFailed - TestRunTask_AgentTimesOut_TaskSetToTimedOut - TestRunTask_AgentCancelled_TaskSetToCancelled Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	executor: document kill-goroutine safety and add goroutine-leak test	Claudomator Agent
	The pgid-kill goroutine in execOnce() uses a select with both ctx.Done() and the killDone channel. Add a detailed comment explaining why the goroutine cannot block indefinitely: the killDone arm fires unconditionally when cmd.Wait() returns (whether the process exited naturally or was killed), so the goroutine always exits before execOnce() returns. Add TestExecOnce_NoGoroutineLeak_OnNaturalExit to verify this: it samples runtime.NumGoroutine() before and after execOnce() with a no-op binary ("true") and a background context (never cancelled), asserting no net goroutine growth. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
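The two-armed select described above can be illustrated without a real subprocess. In this sketch `wait` stands in for `cmd.Wait()` and `kill` for the process-group kill; `runWithKill` and the extra `exited` channel are assumptions added so the example can demonstrate that the goroutine has finished before the function returns.

```go
package main

import (
	"context"
	"fmt"
)

// runWithKill waits for a process while a helper goroutine watches for
// cancellation. done is typically ctx.Done(). kill must tolerate being
// called for a process that has already exited.
func runWithKill(done <-chan struct{}, wait func() error, kill func()) error {
	killDone := make(chan struct{})
	exited := make(chan struct{})

	go func() {
		defer close(exited)
		select {
		case <-done:
			kill() // cancelled: terminate the process so wait() returns
		case <-killDone:
			// process finished on its own: nothing to kill
		}
	}()

	err := wait()
	close(killDone) // fires unconditionally once wait() has returned
	<-exited        // the goroutine is guaranteed to be gone here
	return err
}

func main() {
	killed := false
	// A background context is never cancelled, so only killDone can fire.
	err := runWithKill(context.Background().Done(),
		func() error { return nil },
		func() { killed = true })
	fmt.Println(err, killed) // <nil> false
}
```

Closing `killDone` after `wait()` returns is what prevents the leak: whichever arm the goroutine takes, it cannot outlive the call.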