path: root/internal/executor
Age | Commit message | Author
12 days | fix: clean up activePerAgent before sending to resultCh | Claudomator Agent
Move activePerAgent decrement/deletion out of execute() and executeResume() defers and into the code paths immediately before each resultCh send (handleRunResult and early-return paths). This guarantees that when a result consumer reads from the channel the map is already clean, eliminating a race between defer and result receipt. Remove the polling loop from TestPool_ActivePerAgent_DeletesZeroEntries and check the map state immediately after reading the result instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
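The ordering described above can be sketched as follows. This is a minimal illustration, not the project's code: the `pool` type, `release`, `finish`, and `activeCount` are hypothetical names, and only the "clean the map, then send" ordering is taken from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical, stripped-down stand-in for the executor pool.
type pool struct {
	mu             sync.Mutex
	activePerAgent map[string]int
	resultCh       chan string
}

func newPool() *pool {
	return &pool{activePerAgent: map[string]int{}, resultCh: make(chan string)}
}

// release decrements the agent's counter and deletes the key once it hits zero.
func (p *pool) release(agent string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.activePerAgent[agent]--
	if p.activePerAgent[agent] <= 0 {
		delete(p.activePerAgent, agent)
	}
}

// finish cleans up the bookkeeping first and only then publishes the result,
// so a consumer that has received the result never observes a stale entry.
func (p *pool) finish(agent, result string) {
	p.release(agent)
	p.resultCh <- result
}

// activeCount reports how many agent keys are currently tracked.
func (p *pool) activeCount() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return len(p.activePerAgent)
}

func main() {
	p := newPool()
	p.activePerAgent["claude"] = 1
	go p.finish("claude", "done")
	res := <-p.resultCh
	fmt.Println(res, p.activeCount())
}
```

Had `release` been deferred so that it ran after the send, the reader could see the entry still present, which is the race the earlier polling workaround papered over.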
12 days | fix: eliminate flaky race in TestPool_ActivePerAgent_DeletesZeroEntries | Peter Stone
The deferred activePerAgent cleanup in execute() runs after resultCh is sent, so a consumer reading Results() could observe the map entry before it was removed. Poll briefly (100ms max) instead of checking immediately. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: repair test regressions and add pre-commit/pre-push verification gates | Peter Stone
Fix four pre-existing bugs exposed after resolving a build failure:
- sandboxCloneSource: accept any URL scheme for origin remote (was filtering out https://)
- setupSandbox callers: fix := shadow variable so sandboxDir is set on BlockedError
- parseGeminiStream: parse result lines to return execution errors and cost
- TestElaborateTask_InvalidJSONFromClaude: stub Gemini fallback so test is hermetic

Add verification infrastructure:
- scripts/verify: runs go build + go test -race, used by hooks and deploy
- scripts/hooks/pre-commit: blocks commits that don't compile
- scripts/hooks/pre-push: blocks pushes where tests fail
- scripts/install-hooks: symlinks version-controlled hooks into .git/hooks/
- scripts/deploy: runs scripts/verify before building the binary

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: permission denied and host key verification errors; add gemini elaboration fallback | Peter Stone
13 days | feat: add elaboration_input field to tasks for richer subtask placeholder | Claudomator Agent
- Add ElaborationInput field to Task struct (task.go)
- Add DB migration and update CREATE/SELECT/scan in storage/db.go
- Update handleCreateTask to accept elaboration_input from API
- Update renderSubtaskRollup in app.js to prefer elaboration_input over description
- Capture elaborate prompt in createTask() form submission
- Update subtask-placeholder tests to cover elaboration_input priority
- Fix missing io import in gemini.go

When a task card is waiting for subtasks, it now shows:
1. The raw user prompt from elaboration (if stored)
2. The task description truncated at word boundary (~120 chars)
3. The task name as fallback
4. 'Waiting for subtasks…' only when all fields are empty

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: promote stale BLOCKED parent tasks to READY on server startup | Peter Stone
When the server restarts after all subtasks complete, the parent task was left stuck in BLOCKED state because maybeUnblockParent only fires during a live executor run. RecoverStaleBlocked() scans all BLOCKED tasks on startup and re-evaluates them using the existing logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | feat: run build (Makefile, gradlew, or go build) before sandbox autocommit | Peter Stone
13 days | feat: fix task failures via sandbox improvements and display commits in Web UI | Peter Stone
- Fix ephemeral sandbox deletion issue by passing $CLAUDOMATOR_PROJECT_DIR to agents and using it for subtask project_dir.
- Implement sandbox autocommit in teardown to prevent task failures from uncommitted work.
- Track git commits created during executions and persist them in the DB.
- Display git commits and changestats badges in the Web UI execution history.
- Add badge counts to Web UI tabs for Interrupted, Ready, and Running states.
- Improve scripts/next-task to handle QUEUED tasks and configurable DB path.
14 days | feat(Phase4): add file changes for changestats executor wiring | Claude Sonnet 4.6
Files changed: CLAUDE.md, internal/api/changestats.go, internal/executor/executor.go, internal/executor/executor_test.go, internal/task/changestats.go (new) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: surface agent stderr, auto-retry restart-killed tasks, handle stale sandboxes | Peter Stone
#1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and appends to error message when claude/gemini exits non-zero. Previously all exit-1 failures were opaque; now the error_msg carries the actual subprocess output.

#4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after marking them FAILED, so tasks killed by a server restart automatically retry on the next boot rather than staying permanently FAILED.

#2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of failing immediately with "no such file or directory".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
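A rough sketch of the diagnostics idea in #1, assuming the stderr log is small enough to read whole. The commit only names tailFile(); its real signature is not shown, so the signature below, the `lastLines` helper, and the log path are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lastLines returns the final n lines of text, without a trailing newline.
func lastLines(text string, n int) string {
	if n <= 0 {
		return ""
	}
	lines := strings.Split(strings.TrimRight(text, "\n"), "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return strings.Join(lines, "\n")
}

// tailFile reads the file at path and returns its last n lines.
// An unreadable file yields an empty string so the caller can still
// report the original exit error.
func tailFile(path string, n int) string {
	data, err := os.ReadFile(path)
	if err != nil {
		return ""
	}
	return lastLines(string(data), n)
}

func main() {
	// Hypothetical path; a missing file simply produces an empty tail.
	tail := tailFile("/tmp/agent-stderr.log", 20)
	err := fmt.Errorf("exit status 1: %s", tail)
	fmt.Println(err)
	fmt.Println(lastLines("a\nb\nc\n", 2))
}
```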
2026-03-14 | fix: trust all directory owners in sandbox git commands | Peter Stone
Sandbox setup runs git commands against project_dir which may be owned by a different OS user, triggering git's 'dubious ownership' error. Fix by passing -c safe.directory=* on all git commands that touch project directories. Also add wildcard to global config for immediate effect on the running server. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
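One way to apply the flag uniformly is to route every git invocation through a single helper. The sketch below assumes that approach; `gitArgs` and `gitCommand` are hypothetical names, and the use of `-C` to select the directory is an illustration rather than something the commit states.

```go
package main

import (
	"fmt"
	"os/exec"
)

// gitArgs prepends the per-invocation safe.directory override so that git
// does not reject repositories owned by a different OS user.
func gitArgs(dir string, args ...string) []string {
	out := []string{"-c", "safe.directory=*", "-C", dir}
	return append(out, args...)
}

// gitCommand builds (but does not run) the git command for dir.
func gitCommand(dir string, args ...string) *exec.Cmd {
	return exec.Command("git", gitArgs(dir, args...)...)
}

func main() {
	cmd := gitCommand("/workspace/claudomator", "status", "--porcelain")
	fmt.Println(cmd.Args)
}
```

Passing the setting with `-c` scopes it to the single command, whereas the global-config wildcard mentioned above affects every git call made by that user.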
2026-03-14 | feat: persist agent assignment before task execution | Claudomator Agent
- Add UpdateTaskAgent to Store interface and DB implementation
- Call UpdateTaskAgent in Pool.execute to persist assigned agent/model to database before the runner starts
- Update runTask in app.js to pass selected agent as query param

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | test | Claudomator Agent
2026-03-14 | feat: add agent selector to UI and support direct agent assignment | Peter Stone
- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button.
- Updated the backend to pass query parameters as environment variables to scripts.
- Modified the executor pool to skip classification when a specific agent is requested.
- Added --agent flag to claudomator start command.
- Updated tests to cover the new functionality.
2026-03-14 | fix: cancel blocked tasks + auto-complete completion reports | Peter Stone
Two fixes for BLOCKED task issues:
1. Allow BLOCKED → CANCELLED state transition so users can cancel tasks stuck waiting for input. Adds Cancel button to BLOCKED task cards in the UI alongside the question/answer controls.
2. Detect when agents write completion reports to $CLAUDOMATOR_QUESTION_FILE instead of real questions. If the question JSON has no options and no "?" in the text, treat it as a summary (stored on the execution) and fall through to normal completion + sandbox teardown rather than blocking. Also tightened the preamble to make the distinction explicit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
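The heuristic in the second fix might look like the sketch below. The JSON field names (`text`, `options`) and the function name are assumptions made for illustration; the commit does not show the actual question schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// question is a hypothetical shape for the contents of the question file.
type question struct {
	Text    string   `json:"text"`
	Options []string `json:"options"`
}

// isCompletionReport reports whether raw looks like a summary rather than a
// real question: no options offered and no "?" anywhere in the text.
// Unparseable input is conservatively treated as a real question.
func isCompletionReport(raw string) bool {
	var q question
	if err := json.Unmarshal([]byte(raw), &q); err != nil {
		return false
	}
	return len(q.Options) == 0 && !strings.Contains(q.Text, "?")
}

func main() {
	fmt.Println(isCompletionReport(`{"text":"All done. Tests pass."}`))
	fmt.Println(isCompletionReport(`{"text":"Which database should I use?"}`))
	fmt.Println(isCompletionReport(`{"text":"Pick one","options":["a","b"]}`))
}
```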
2026-03-13 | fix: enable Gemini file writing by passing --yolo and -p flags | Peter Stone
GeminiRunner.buildArgs was missing --yolo (auto-approve all tools) so the gemini CLI only registered 3 tools (read_file, write_todos, cli_help) and write_file was not available. Agents that needed to create files silently failed (exit 0, no files written). Also switch instructions from bare positional arg to -p flag, which is required for non-interactive headless mode. Update preamble tests to match file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) kept from the merge conflict resolution. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | merge: resolve conflicts with local/master (stats tab + summary styles) | Peter Stone
Keep file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) from HEAD. Combine Q&A History and Stats tab CSS from both branches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | fix: resubmit QUEUED tasks on server startup to prevent them getting stuck | Peter Stone
Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on startup and re-submits them to the in-memory pool. Previously, tasks that were QUEUED when the server restarted would remain stuck indefinitely since only RUNNING tasks were recovered (and marked FAILED). Called in serve.go immediately after RecoverStaleRunning(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: resume support, summary extraction, and task state improvements | Peter Stone
- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 | feat: add Resume support for CANCELLED, FAILED, and BUDGET_EXCEEDED tasks | Claudomator Agent
Interrupted tasks (CANCELLED, FAILED, BUDGET_EXCEEDED) now support session resume in addition to restart. Both buttons are shown on the task card.
- executor: extend resumablePoolStates to include CANCELLED, FAILED, BUDGET_EXCEEDED
- api: extend handleResumeTimedOutTask to accept all resumable states with state-specific resume messages; replace hard-coded TIMED_OUT check with a resumableStates map
- web: add RESUME_STATES set; render Resume + Restart buttons for interrupted states; TIMED_OUT keeps Resume only
- tests: 5 new Go tests (TestResumeInterrupted_*); updated task-actions.test.mjs with 17 tests covering dual-button behaviour

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | feat: require agents to write a final summary before exiting | Claudomator Agent
Add a mandatory '## Final Summary' section to planningPreamble instructing agents to output a 2-5 sentence summary paragraph (headed by '## Summary') as their last output before exiting. Adds three tests to verify the section and its required content are present in the preamble. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | fix: resume BLOCKED tasks in preserved sandbox so Claude finds its session | Peter Stone
When a task ran in a sandbox (/tmp/claudomator-sandbox-*) and went BLOCKED, Claude stored its session under the sandbox path as the project slug. The resume execution was running in project_dir, causing Claude to look for the session in the wrong project directory and fail with "No conversation found".

Fix: carry SandboxDir through BlockedError → Execution → resume execution, and run the resume in that directory so the session lookup succeeds.
- BlockedError gains SandboxDir field; claude.go sets it on BLOCKED exit
- storage.Execution gains SandboxDir (persisted via new sandbox_dir column)
- executor.go stores blockedErr.SandboxDir in the execution record
- server.go copies SandboxDir from latest execution to the resume execution
- claude.go uses e.SandboxDir as working dir for resume when set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 | test: sandbox coverage + fix WebSocket races | Peter Stone
executor: add 7 tests for sandboxCloneSource, setupSandbox, and teardownSandbox (uncommitted-changes error, clean-no-commits removal). api: fix two data races in WebSocket tests — wsPingInterval/Deadline are now captured as locals before goroutine start; maxWsClients is moved from a package-level var into Hub.maxClients (with SetMaxClients method) so concurrent tests don't stomp each other. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 | executor: fix session ID on second block-and-resume cycle | Peter Stone
When a resumed execution is blocked again, SessionID was set to the new exec's own UUID instead of the original ResumeSessionID. The next resume would then pass the wrong --resume argument to claude and fail. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
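The rule behind this fix can be stated in a few lines. The sketch assumes a simplified `execution` record with hypothetical field names; only the preference for the original resume session ID over the new execution's own ID comes from the commit message.

```go
package main

import "fmt"

// execution is a hypothetical, reduced execution record.
type execution struct {
	ID              string // UUID of this execution
	ResumeSessionID string // session being resumed, empty on a first run
}

// sessionIDOnBlock picks the session ID to store when an execution goes
// BLOCKED. A resumed execution keeps pointing at the original session so the
// next resume passes the right --resume argument.
func sessionIDOnBlock(e execution) string {
	if e.ResumeSessionID != "" {
		return e.ResumeSessionID
	}
	return e.ID
}

func main() {
	first := execution{ID: "exec-1"}
	second := execution{ID: "exec-2", ResumeSessionID: sessionIDOnBlock(first)}
	fmt.Println(sessionIDOnBlock(first), sessionIDOnBlock(second))
}
```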
2026-03-10 | executor: explicit load balancing - code picks agent, classifier picks model | Peter Stone
pickAgent() deterministically selects the agent with the fewest active tasks, skipping rate-limited agents. The classifier now only selects the model for the pre-assigned agent, so Gemini gets tasks from the start rather than only as a fallback when Claude's quota is exhausted. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
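A minimal sketch of that selection rule follows. The signature is an assumption (the real pickAgent() presumably reads pool state rather than taking maps as arguments), and the alphabetical tie-break is added here only to keep the result deterministic.

```go
package main

import "fmt"

// pickAgent returns the non-rate-limited agent with the fewest active tasks.
// Ties are broken by name so the choice is deterministic. It returns ""
// when every agent is rate-limited.
func pickAgent(agents []string, active map[string]int, limited map[string]bool) string {
	best := ""
	for _, a := range agents {
		if limited[a] {
			continue
		}
		if best == "" || active[a] < active[best] ||
			(active[a] == active[best] && a < best) {
			best = a
		}
	}
	return best
}

func main() {
	agents := []string{"claude", "gemini"}
	active := map[string]int{"claude": 2, "gemini": 1}
	fmt.Println(pickAgent(agents, active, nil))
	fmt.Println(pickAgent(agents, active, map[string]bool{"gemini": true}))
}
```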
2026-03-10 | fix: ensure tasks are re-classified on manual restart | Peter Stone
Updated handleRunTask to use ResetTaskForRetry, which clears the agent type and model. This ensures that manually restarted tasks are always re-classified, allowing the system to switch to a different agent if the previous one is rate-limited. Also improved Claude quota-exhaustion detection.
2026-03-10 | executor: extract handleRunResult to deduplicate error-classification logic | Claudomator Agent
Both execute() and executeResume() shared ~80% identical post-run logic: error classification (BLOCKED, TIMED_OUT, CANCELLED, BUDGET_EXCEEDED, FAILED), state transitions, result emission, and UpdateExecution. Extract this into handleRunResult(ctx, t, exec, err, agentType) on *Pool. Both functions now call it after runner.Run() returns. Also adds TestHandleRunResult_SharedPath which directly exercises the new function via a minimalMockStore, covering FAILED, READY, COMPLETED, and TIMED_OUT classification paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: unblock parent task when all subtasks complete | Claudomator Agent
Add maybeUnblockParent helper that transitions a BLOCKED parent task to READY once every subtask is in the COMPLETED state. Called in both execute() and executeResume() immediately after a subtask is marked COMPLETED. Any non-COMPLETED sibling (RUNNING, FAILED, etc.) keeps the parent BLOCKED.

Tests added:
- TestPool_Submit_LastSubtask_UnblocksParent
- TestPool_Submit_NotLastSubtask_ParentStaysBlocked
- TestPool_Submit_ParentNotBlocked_NoTransition
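The decision at the heart of maybeUnblockParent can be sketched as a pure function. The `task` type and `shouldUnblockParent` are hypothetical; the real helper also talks to the store, which is omitted here.

```go
package main

import "fmt"

// task is a hypothetical, reduced task record.
type task struct {
	ID    string
	State string
}

// shouldUnblockParent reports whether a parent may move BLOCKED -> READY:
// it must currently be BLOCKED and every subtask must be COMPLETED.
func shouldUnblockParent(parent task, subtasks []task) bool {
	if parent.State != "BLOCKED" || len(subtasks) == 0 {
		return false
	}
	for _, s := range subtasks {
		if s.State != "COMPLETED" {
			return false
		}
	}
	return true
}

func main() {
	parent := task{ID: "p", State: "BLOCKED"}
	done := []task{{ID: "a", State: "COMPLETED"}, {ID: "b", State: "COMPLETED"}}
	mixed := []task{{ID: "a", State: "COMPLETED"}, {ID: "b", State: "RUNNING"}}
	fmt.Println(shouldUnblockParent(parent, done), shouldUnblockParent(parent, mixed))
}
```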
2026-03-09 | executor: BLOCKED→READY for top-level tasks with subtasks | Claudomator Agent
When a top-level task (ParentTaskID == "") finishes successfully, check for subtasks before deciding the next state:
- subtasks exist → BLOCKED (waiting for subtasks to complete)
- no subtasks → READY (existing behavior, unchanged)

This applies to both execute() and executeResume(). Adds ListSubtasks to the Store interface.

Tests:
- TestPool_Submit_TopLevel_WithSubtasks_GoesBlocked
- TestPool_Submit_TopLevel_NoSubtasks_GoesReady

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: log errors from all unchecked UpdateTaskState/UpdateTaskQuestion calls | Claudomator Agent
All previously ignored errors from p.store.UpdateTaskState() and p.store.UpdateTaskQuestion() in execute() and executeResume() now log with structured context (taskID, state, error). Introduces a Store interface so tests can inject a failing mock store. Adds TestPool_UpdateTaskState_DBError_IsLoggedAndResultDelivered to verify that a DB write failure is logged and the result is still delivered to resultCh. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: document kill-goroutine safety and add goroutine-leak test | Claudomator Agent
The pgid-kill goroutine in execOnce() uses a select with both ctx.Done() and the killDone channel. Add a detailed comment explaining why the goroutine cannot block indefinitely: the killDone arm fires unconditionally when cmd.Wait() returns (whether the process exited naturally or was killed), so the goroutine always exits before execOnce() returns. Add TestExecOnce_NoGoroutineLeak_OnNaturalExit to verify this: it samples runtime.NumGoroutine() before and after execOnce() with a no-op binary ("true") and a background context (never cancelled), asserting no net goroutine growth. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
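The select pattern described above can be shown without spawning a real process. In this sketch `wait` stands in for cmd.Wait() and `kill` for the process-group kill; both, along with `waitWithKill` and the two demo helpers, are hypothetical. The extra `watcherExited` channel is added here only to make the "goroutine exits before the function returns" guarantee explicit.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// waitWithKill runs wait() while a watcher goroutine stands ready to call
// kill() if ctx is cancelled. The watcher cannot outlive this function.
func waitWithKill(ctx context.Context, wait func() error, kill func()) error {
	killDone := make(chan struct{})
	watcherExited := make(chan struct{})
	go func() {
		defer close(watcherExited)
		select {
		case <-ctx.Done():
			kill() // context cancelled: terminate the process
		case <-killDone:
			// wait() already returned; nothing left to kill
		}
	}()
	err := wait()
	close(killDone) // fires unconditionally once wait() returns
	<-watcherExited // the watcher has exited by this point
	return err
}

// naturalExit uses a never-cancelled context and a no-op "process".
func naturalExit() (killed bool, err error) {
	err = waitWithKill(context.Background(),
		func() error { return nil },
		func() { killed = true })
	return killed, err
}

// cancelledExit cancels up front; the fake process exits only once killed.
func cancelledExit() (killed bool, err error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	stop := make(chan struct{})
	err = waitWithKill(ctx,
		func() error { <-stop; return errors.New("signal: killed") },
		func() { killed = true; close(stop) })
	return killed, err
}

func main() {
	fmt.Println(naturalExit())
	fmt.Println(cancelledExit())
}
```

With a background context the `ctx.Done()` arm can never fire, so the `killDone` arm is the only way out of the select; that is why closing it after `wait()` returns is enough to prevent a leak.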
2026-03-09 | executor: strengthen rate-limit avoidance in classifier | Peter Stone
Updated isQuotaExhausted to detect more Claude quota messages. Added 'rate limit reached (rejected)' to quota exhausted checks. Strengthened classifier prompt to explicitly forbid selecting rate-limited agents. Improved Pool to set 5h rate limit on quota exhaustion.
2026-03-09 | executor: fix sandbox teardown - remove working copy pull, retry push on concurrent rejection | Peter Stone
- Remove git pull into project_dir: working copy is the developer workspace and should be pulled manually; www-data can't write to root-owned .git/objects
- On non-fast-forward push rejection (concurrent task pushed first), fetch and rebase then retry once instead of failing the entire task
2026-03-09 | executor: fix map leaks in activePerAgent and rateLimited | Claudomator Agent
activePerAgent: delete zero-count entries after decrement so the map doesn't accumulate stale keys for agent types that are no longer active. rateLimited: delete entries whose deadline has passed when reading them (in both the classifier block and the execute() pre-flight), so stale entries are cleaned up on the next check rather than accumulating forever. Both fixes are covered by new regression tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: update classifier model list to Claude 4.x and current Gemini models | Peter Stone
2026-03-09 | executor: recover stale RUNNING tasks on server startup | Peter Stone
On restart, any tasks in RUNNING state have no active goroutine. RecoverStaleRunning() marks them FAILED (retryable) and closes their open execution records with an appropriate error message. Called once from serve.go after the pool is created.
2026-03-09 | executor: fix Claude rate-limit detection and prioritize Gemini when limited | Peter Stone
Updated parseStream to detect 'rate_limit_event' and 'assistant' error:rate_limit messages from the Claude CLI. Updated Classifier to strongly prefer non-rate-limited agents. Added logging to Pool to track rate-limit status during classification.
2026-03-08 | executor: update gemini model to 2.5-flash-lite and fix classifier parsing | Peter Stone
Update the default Gemini model and classification prompt to use gemini-2.5-flash-lite, which is the current available model. Improved the classifier's parsing logic to correctly handle the JSON envelope returned by the gemini CLI (stripping 'response' wrapper and 'Loaded cached credentials' noise).
2026-03-08 | executor: push sandbox commits via bare repo, pull into working copy | Peter Stone
Instead of git fetch/merge INTO the working copy (which fails with mixed-owner .git/objects), clone FROM a bare repo, push BACK to it, then pull into the working copy:

  sandbox clone ← bare repo (local remote or origin)
  agent commits in sandbox
  git push sandbox → bare repo
  git pull bare repo → working copy

sandboxCloneSource() prefers a remote named "local" (local bare repo), then "origin", then falls back to the working copy path.

Set up: git remote add local /site/git.terst.org/repos/claudomator.git
The bare repo was created with: git clone --bare /workspace/claudomator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
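The preference order for the clone source can be sketched as a lookup over the configured remotes. The function name and the map-based input are assumptions; the real sandboxCloneSource() presumably queries git for the remote URLs itself.

```go
package main

import "fmt"

// cloneSource picks where a sandbox should clone from: the "local" remote if
// configured, otherwise "origin", otherwise the working copy path itself.
// remotes maps remote name to URL.
func cloneSource(remotes map[string]string, projectDir string) string {
	for _, name := range []string{"local", "origin"} {
		if url := remotes[name]; url != "" {
			return url
		}
	}
	return projectDir
}

func main() {
	remotes := map[string]string{
		"local":  "/site/git.terst.org/repos/claudomator.git",
		"origin": "https://example.com/claudomator.git", // placeholder URL
	}
	fmt.Println(cloneSource(remotes, "/workspace/claudomator"))
	fmt.Println(cloneSource(nil, "/workspace/claudomator"))
}
```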
2026-03-08 | executor: fix sandbox git fetch + inject prior failure history | Peter Stone
Fix: use file:// prefix in git fetch during sandbox teardown to force pack-protocol transfer. The local optimization uses hard links which fail across devices and with mixed-owner object stores. Feature: before running a task, query prior failed/timed-out executions and prepend their error messages to the agent's --append-system-prompt. This tells the agent what went wrong in previous attempts so it doesn't repeat the same mistakes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | executor: add git discipline section to agent preamble | Peter Stone
Agents running in a sandbox must commit all changes before exiting. The teardown rejects any dirty working tree. Add an explicit section to the planning preamble making this requirement clear. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | fix(executor): use --no-hardlinks for sandbox git clone | Peter Stone
git clone --local fails with "Invalid cross-device link" when /workspace and /tmp are on different filesystems. --no-hardlinks forces object copying instead, which works across devices. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | merge: pull latest from master and resolve conflicts | Peter Stone
- Resolve conflicts in API server, CLI, and executor.
- Maintain Gemini classification and assignment logic.
- Update UI to use generic agent config and project_dir.
- Fix ProjectDir/WorkingDir inconsistencies in Gemini runner.
- All tests passing after merge.
2026-03-08 | feat(executor): implement Gemini-based task classification and load balancing | Peter Stone
- Add Classifier using gemini-2.0-flash-lite to automatically select agent/model.
- Update Pool to track per-agent active tasks and rate limit status.
- Enable classification for all tasks (top-level and subtasks).
- Refine SystemStatus to be dynamic across all supported agents.
- Add unit tests for the classifier and updated pool logic.
- Minor UI improvements for project selection and 'Start Next' action.
2026-03-08 | executor: internal dispatch queue; remove at-capacity rejection | Peter Stone
Replace the at-capacity error return from Submit/SubmitResume with an internal workCh/doneCh channel pair. A dispatch() goroutine blocks waiting for a free slot and launches the worker goroutine, so tasks are buffered up to 10x pool capacity instead of being rejected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
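A compact sketch of such a dispatch queue is shown below. Only the workCh/doneCh pairing, the dispatch() goroutine, and the 10x buffer come from the commit message; the `pool` type, `submit`, and `runJobs` are hypothetical, and the sketch never shuts the dispatcher down.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pool buffers submitted jobs and runs at most maxConcurrent at a time.
type pool struct {
	workCh        chan func()
	doneCh        chan struct{}
	maxConcurrent int
}

func newPool(n int) *pool {
	p := &pool{
		workCh:        make(chan func(), n*10), // queue up to 10x capacity
		doneCh:        make(chan struct{}, n),  // one slot per possible worker
		maxConcurrent: n,
	}
	go p.dispatch()
	return p
}

// dispatch blocks until a slot is free, then launches the next job.
func (p *pool) dispatch() {
	active := 0
	for job := range p.workCh {
		for active >= p.maxConcurrent {
			<-p.doneCh
			active--
		}
		active++
		go func(job func()) {
			defer func() { p.doneCh <- struct{}{} }()
			job()
		}(job)
	}
}

// submit enqueues a job; it fails only when the queue itself is full.
func (p *pool) submit(job func()) bool {
	select {
	case p.workCh <- job:
		return true
	default:
		return false
	}
}

// runJobs submits the given number of no-op jobs and returns how many ran.
func runJobs(capacity, jobs int) int {
	p := newPool(capacity)
	var wg sync.WaitGroup
	var completed int64
	for i := 0; i < jobs; i++ {
		wg.Add(1)
		if !p.submit(func() { defer wg.Done(); atomic.AddInt64(&completed, 1) }) {
			wg.Done() // rejected: queue full
		}
	}
	wg.Wait()
	return int(atomic.LoadInt64(&completed))
}

func main() {
	fmt.Println(runJobs(2, 5))
}
```

Callers above the pool's capacity wait in the queue instead of receiving an error, which is the behaviour change the commit describes.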
2026-03-08 | feat: rename working_dir→project_dir; git sandbox execution | Peter Stone
- ClaudeConfig.WorkingDir → ProjectDir (json: project_dir)
- UnmarshalJSON fallback reads legacy working_dir from DB records
- New executions with project_dir clone into a temp sandbox via git clone --local
- Non-git project_dirs get git init + initial commit before clone
- After success: verify clean working tree, merge --ff-only back to project_dir, remove sandbox
- On failure/BLOCKED: sandbox preserved, path included in error message
- Resume executions run directly in project_dir (no re-clone)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
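The legacy-field fallback from the second bullet could be implemented roughly as follows. The struct here is reduced to the single renamed field, which is an assumption; the real ClaudeConfig presumably carries more. `parseProjectDir` is a hypothetical helper for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClaudeConfig is reduced to the renamed field for illustration.
type ClaudeConfig struct {
	ProjectDir string `json:"project_dir"`
}

// UnmarshalJSON accepts both the new project_dir key and the legacy
// working_dir key, preferring project_dir when both are present.
func (c *ClaudeConfig) UnmarshalJSON(data []byte) error {
	var raw struct {
		ProjectDir string `json:"project_dir"`
		WorkingDir string `json:"working_dir"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	c.ProjectDir = raw.ProjectDir
	if c.ProjectDir == "" {
		c.ProjectDir = raw.WorkingDir
	}
	return nil
}

// parseProjectDir decodes a config document and returns its project dir.
func parseProjectDir(doc string) (string, error) {
	var c ClaudeConfig
	if err := json.Unmarshal([]byte(doc), &c); err != nil {
		return "", err
	}
	return c.ProjectDir, nil
}

func main() {
	fmt.Println(parseProjectDir(`{"working_dir":"/old/path"}`))
	fmt.Println(parseProjectDir(`{"project_dir":"/new/path","working_dir":"/old/path"}`))
}
```

Decoding into an anonymous struct rather than into ClaudeConfig itself avoids re-entering the custom UnmarshalJSON method.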
2026-03-08 | refactor: address code review notes (backward compat, Gemini tests, unknown agent test) | Peter Stone
2026-03-08 | refactor(executor): update runners and tests for generic agents | Peter Stone
2026-03-08 | fix: detect quota exhaustion from stream; map to BUDGET_EXCEEDED not FAILED | Peter Stone
When claude hits the 5-hour usage limit it exits 1. execOnce was returning the generic "exit status 1" error, hiding the real cause from the retry loop and the task state machine.

Fix:
- execOnce now surfaces streamErr when it indicates rate limiting or quota exhaustion, so callers see the actual message.
- New isQuotaExhausted() detects "hit your limit" messages. These are not retried (retrying against a depleted 5h bucket is pointless) and map to BUDGET_EXCEEDED in both execute/executeResume.
- isRateLimitError() remains for transient throttling (429/overloaded), which continues to trigger exponential backoff retries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
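The split between the two error classes can be sketched with simple substring checks. The function names match the commit message, but the string-based signatures, the exact substrings beyond "hit your limit", "429", and "overloaded", and the `classify` helper with its "RETRY" label are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isQuotaExhausted detects the hard usage-limit message; retrying is useless.
func isQuotaExhausted(msg string) bool {
	return strings.Contains(strings.ToLower(msg), "hit your limit")
}

// isRateLimitError detects transient throttling that is worth retrying.
func isRateLimitError(msg string) bool {
	l := strings.ToLower(msg)
	return strings.Contains(l, "429") || strings.Contains(l, "overloaded")
}

// classify maps an error message to the action the executor would take.
// Quota exhaustion is checked first so it is never retried.
func classify(msg string) string {
	switch {
	case isQuotaExhausted(msg):
		return "BUDGET_EXCEEDED"
	case isRateLimitError(msg):
		return "RETRY"
	default:
		return "FAILED"
	}
}

func main() {
	fmt.Println(classify("You've hit your limit for this period"))
	fmt.Println(classify("HTTP 429: too many requests"))
	fmt.Println(classify("exit status 1"))
}
```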
2026-03-08 | feat(executor): implement GeminiRunner | Peter Stone