path: root/internal/executor/executor_test.go
Age | Commit message | Author
4 days | feat: merge story branch to master before deploy, add doot project to registry | Peter Stone
- triggerStoryDeploy: fetch/checkout/merge --no-ff/push before running deploy script (ADR-007)
- executor_test: TestPool_StoryDeploy_MergesStoryBranch proves merge happens
- seed.go: add doot project with deploy script; wire claudomator deploy script
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 days | Merge branch 'master' of /site/git.terst.org/repos/claudomator | Peter Stone
4 days | feat: validation result transitions story to REVIEW_READY or NEEDS_FIX (ADR-007) | Claudomator Agent
Add checkValidationResult which inspects the final task.State of a completed validation task and updates the story to REVIEW_READY (pass) or NEEDS_FIX (fail). Wire into handleRunResult so stories in VALIDATING state are dispatched to checkValidationResult instead of checkStoryCompletion, covering both success and FAILED terminal paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 days | feat: auto-create validation task on story DEPLOYED (ADR-007) | Claudomator Agent
4 days | feat: trigger deploy script on SHIPPABLE → DEPLOYED (ADR-007) | Claudomator Agent
Add triggerStoryDeploy to Pool: fetches story's project, runs its DeployScript via exec.CommandContext, and advances story to DEPLOYED on success. Wire into checkStoryCompletion with go p.triggerStoryDeploy after the SHIPPABLE transition. Covered by TestPool_StoryDeploy_RunsDeployScript.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 days | fix: resolve merge conflict - integrate agent's story-aware ContainerRunner | Peter Stone
Agent added: Store on ContainerRunner (direct story/project lookup), --reference clone for speed, explicit story branch push, checkStoryCompletion → SHIPPABLE.
My additions: BranchName on Task as fallback when Store is nil, tests updated to match checkout-after-clone approach.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 days | feat: clone story branch in ContainerRunner (ADR-007) | Peter Stone
- Add BranchName field to task.Task (populated from story at execution time)
- Add GetStory to executor Store interface; resolve BranchName from story in both execute() and executeResume() parallel to RepositoryURL resolution
- Pass --branch <name> to git clone when BranchName is set; default clone otherwise
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
5 days | feat: Phase 4 - story-aware execution, branch clone, story completion check, deployment status | Claudomator Agent
- ContainerRunner: add Store field; clone with --reference when story has a local project path; checkout story branch after clone; push to story branch instead of HEAD
- executor.Store interface: add GetStory, ListTasksByStory, UpdateStoryStatus
- Pool.handleRunResult: trigger checkStoryCompletion when a story task succeeds
- Pool.checkStoryCompletion: transitions story to SHIPPABLE when all tasks done
- serve.go: wire Store into each ContainerRunner
- stories.go: update createStoryBranch to fetch+checkout from origin/master base; add GET /api/stories/{id}/deployment-status endpoint
- server.go: register deployment-status route
- Tests: TestPool_CheckStoryCompletion_AllComplete/PartialComplete, TestHandleStoryDeploymentStatus
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
6 days | feat: populate RepositoryURL from project registry in executor (ADR-007) | Peter Stone
- Add GetProject to Store interface used by executor
- Resolve RepositoryURL from project registry when task.RepositoryURL is empty
- Call SeedProjects at server startup so the project registry is populated
- Add GetProject stub to minimalMockStore in executor tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7 days | fix: make requeueDelay configurable to fix test timeout in TestPool_MaxPerAgent_BlocksSecondTask | Peter Stone
7 days | feat: executor reliability - per-agent limit, drain gate, pre-flight creds, auth recovery | Claudomator Agent
- maxPerAgent=1: only 1 in-flight execution per agent type at a time; excess tasks are requeued after 30s
- Drain gate: after 2 consecutive failures the agent is drained and a question is set on the task; reset on first success; POST /api/pool/agents/{agent}/undrain to acknowledge
- Pre-flight credential check: verify .credentials.json and .claude.json exist in agentHome before spinning up a container
- Auth error auto-recovery: detect auth errors (Not logged in, OAuth token has expired, etc.) and retry once after running sync-credentials and re-copying fresh credentials
- Extracted runContainer() helper from ContainerRunner.Run() to support the retry flow
- Wire CredentialSyncCmd in serve.go for all three ContainerRunner instances
- Tests: TestPool_MaxPerAgent_*, TestPool_ConsecutiveFailures_*, TestPool_Undrain_*, TestContainerRunner_Missing{Credentials,Settings}_FailsFast, TestIsAuthError_*, TestContainerRunner_AuthError_SyncsAndRetries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
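The drain gate in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from executor.go: `drainGate`, `Record`, and `Undrain` are hypothetical names, and the sketch assumes that a success clears only the failure streak while a drained agent stays drained until it is acknowledged. It is not safe for concurrent use without a mutex.

```go
package main

import "fmt"

// drainGate tracks consecutive failures per agent type.
// Hypothetical type: names and fields are illustrative only.
type drainGate struct {
	threshold int             // failures in a row that trigger a drain
	failures  map[string]int  // current failure streak per agent
	drained   map[string]bool // agents waiting for acknowledgement
}

func newDrainGate(threshold int) *drainGate {
	return &drainGate{
		threshold: threshold,
		failures:  map[string]int{},
		drained:   map[string]bool{},
	}
}

// Record notes one run outcome and reports whether the agent is drained.
func (g *drainGate) Record(agent string, succeeded bool) bool {
	if succeeded {
		delete(g.failures, agent) // a success resets the streak (assumed scope of "reset")
		return g.drained[agent]
	}
	g.failures[agent]++
	if g.failures[agent] >= g.threshold {
		g.drained[agent] = true
	}
	return g.drained[agent]
}

// Undrain acknowledges the drain so the agent can take work again.
func (g *drainGate) Undrain(agent string) {
	delete(g.drained, agent)
	delete(g.failures, agent)
}

func main() {
	g := newDrainGate(2)
	fmt.Println(g.Record("claude", false)) // first failure
	fmt.Println(g.Record("claude", false)) // second failure in a row
	g.Undrain("claude")
	fmt.Println(g.Record("claude", true))
}
```

Deleting map entries on reset, rather than storing zero or false, matches the map-leak fix recorded further down this log.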
9 days | feat: agent status dashboard with availability timeline and Gemini quota detection | Peter Stone
- Detect Gemini TerminalQuotaError (daily quota) as BUDGET_EXCEEDED, not generic FAILED
- Surface container stderr tail in error so quota/rate-limit classifiers can match it
- Add agent_events table to persist rate-limit start/recovery events across restarts
- Add GET /api/agents/status endpoint returning live agent state + 24h event history
- Stats dashboard: agent status cards, 24h availability timeline, per-run execution table
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | fix: comprehensive addressing of container execution review feedback | Peter Stone
- Fix Critical Bug 1: Only remove workspace on success, preserve on failure/BLOCKED.
- Fix Critical Bug 2: Use correct Claude flag (--resume) and pass instructions via file.
- Fix Critical Bug 3: Actually mount and use the instructions file in the container.
- Address Design Issue 4: Implement Resume/BLOCKED detection and host-side workspace re-use.
- Address Design Issue 5: Consolidate RepositoryURL to Task level and fix API fallback.
- Address Design Issue 6: Make agent images configurable per runner type via CLI flags.
- Address Design Issue 7: Secure API keys via .claudomator-env file and --env-file flag.
- Address Code Quality 8: Add unit tests for ContainerRunner arg construction.
- Address Code Quality 9: Fix indentation regression in app.js.
- Address Code Quality 10: Clean up orphaned Claude/Gemini runner files and move helpers.
- Fix tests: Update server_test.go and executor_test.go to work with new model.
12 days | fix: clean up activePerAgent before sending to resultCh | Claudomator Agent
Move activePerAgent decrement/deletion out of execute() and executeResume() defers and into the code paths immediately before each resultCh send (handleRunResult and early-return paths). This guarantees that when a result consumer reads from the channel the map is already clean, eliminating a race between defer and result receipt.
Remove the polling loop from TestPool_ActivePerAgent_DeletesZeroEntries and check the map state immediately after reading the result instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: eliminate flaky race in TestPool_ActivePerAgent_DeletesZeroEntries | Peter Stone
The deferred activePerAgent cleanup in execute() runs after resultCh is sent, so a consumer reading Results() could observe the map entry before it was removed. Poll briefly (100ms max) instead of checking immediately.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: promote stale BLOCKED parent tasks to READY on server startup | Peter Stone
When the server restarts after all subtasks complete, the parent task was left stuck in BLOCKED state because maybeUnblockParent only fires during a live executor run. RecoverStaleBlocked() scans all BLOCKED tasks on startup and re-evaluates them using the existing logic.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | feat(Phase4): add file changes for changestats executor wiring | Claude Sonnet 4.6
Files changed: CLAUDE.md, internal/api/changestats.go, internal/executor/executor.go, internal/executor/executor_test.go, internal/task/changestats.go (new)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: surface agent stderr, auto-retry restart-killed tasks, handle stale sandboxes | Peter Stone
#1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and appends to error message when claude/gemini exits non-zero. Previously all exit-1 failures were opaque; now the error_msg carries the actual subprocess output.
#4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after marking them FAILED, so tasks killed by a server restart automatically retry on the next boot rather than staying permanently FAILED.
#2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of failing immediately with "no such file or directory".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | test | Claudomator Agent
2026-03-14 | feat: add agent selector to UI and support direct agent assignment | Peter Stone
- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button.
- Updated the backend to pass query parameters as environment variables to scripts.
- Modified the executor pool to skip classification when a specific agent is requested.
- Added --agent flag to claudomator start command.
- Updated tests to cover the new functionality.
2026-03-13 | fix: resubmit QUEUED tasks on server startup to prevent them getting stuck | Peter Stone
Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on startup and re-submits them to the in-memory pool. Previously, tasks that were QUEUED when the server restarted would remain stuck indefinitely since only RUNNING tasks were recovered (and marked FAILED). Called in serve.go immediately after RecoverStaleRunning().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 | executor: explicit load balancing - code picks agent, classifier picks model | Peter Stone
pickAgent() deterministically selects the agent with the fewest active tasks, skipping rate-limited agents. The classifier now only selects the model for the pre-assigned agent, so Gemini gets tasks from the start rather than only as a fallback when Claude's quota is exhausted.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 | executor: extract handleRunResult to deduplicate error-classification logic | Claudomator Agent
Both execute() and executeResume() shared ~80% identical post-run logic: error classification (BLOCKED, TIMED_OUT, CANCELLED, BUDGET_EXCEEDED, FAILED), state transitions, result emission, and UpdateExecution. Extract this into handleRunResult(ctx, t, exec, err, agentType) on *Pool. Both functions now call it after runner.Run() returns.
Also adds TestHandleRunResult_SharedPath which directly exercises the new function via a minimalMockStore, covering FAILED, READY, COMPLETED, and TIMED_OUT classification paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: unblock parent task when all subtasks complete | Claudomator Agent
Add maybeUnblockParent helper that transitions a BLOCKED parent task to READY once every subtask is in the COMPLETED state. Called in both execute() and executeResume() immediately after a subtask is marked COMPLETED. Any non-COMPLETED sibling (RUNNING, FAILED, etc.) keeps the parent BLOCKED.
Tests added:
- TestPool_Submit_LastSubtask_UnblocksParent
- TestPool_Submit_NotLastSubtask_ParentStaysBlocked
- TestPool_Submit_ParentNotBlocked_NoTransition
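The rule behind maybeUnblockParent can be written as a small pure function. The sketch below is illustrative only: `nextParentState` and the `State` constants are hypothetical stand-ins for the project's task types, and leaving a parent with no subtasks unchanged is an assumption the commit message does not confirm.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the task state type.
type State string

const (
	Blocked   State = "BLOCKED"
	Ready     State = "READY"
	Running   State = "RUNNING"
	Failed    State = "FAILED"
	Completed State = "COMPLETED"
)

// nextParentState returns READY only when the parent is BLOCKED and every
// subtask is COMPLETED; otherwise the parent state is returned unchanged.
func nextParentState(parent State, subtasks []State) State {
	if parent != Blocked || len(subtasks) == 0 {
		return parent // assumption: nothing to unblock without subtasks
	}
	for _, s := range subtasks {
		if s != Completed {
			return parent // any unfinished sibling keeps the parent BLOCKED
		}
	}
	return Ready
}

func main() {
	fmt.Println(nextParentState(Blocked, []State{Completed, Completed}))
	fmt.Println(nextParentState(Blocked, []State{Completed, Running}))
	fmt.Println(nextParentState(Ready, []State{Completed}))
}
```

The three calls in `main` mirror the three test names listed in the commit: last subtask done, not the last subtask, and a parent that was not BLOCKED to begin with.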
2026-03-09 | executor: BLOCKED→READY for top-level tasks with subtasks | Claudomator Agent
When a top-level task (ParentTaskID == "") finishes successfully, check for subtasks before deciding the next state:
- subtasks exist → BLOCKED (waiting for subtasks to complete)
- no subtasks → READY (existing behavior, unchanged)
This applies to both execute() and executeResume(). Adds ListSubtasks to the Store interface.
Tests:
- TestPool_Submit_TopLevel_WithSubtasks_GoesBlocked
- TestPool_Submit_TopLevel_NoSubtasks_GoesReady
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: log errors from all unchecked UpdateTaskState/UpdateTaskQuestion calls | Claudomator Agent
All previously ignored errors from p.store.UpdateTaskState() and p.store.UpdateTaskQuestion() in execute() and executeResume() now log with structured context (taskID, state, error). Introduces a Store interface so tests can inject a failing mock store.
Adds TestPool_UpdateTaskState_DBError_IsLoggedAndResultDelivered to verify that a DB write failure is logged and the result is still delivered to resultCh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: fix map leaks in activePerAgent and rateLimited | Claudomator Agent
activePerAgent: delete zero-count entries after decrement so the map doesn't accumulate stale keys for agent types that are no longer active.
rateLimited: delete entries whose deadline has passed when reading them (in both the classifier block and the execute() pre-flight), so stale entries are cleaned up on the next check rather than accumulating forever.
Both fixes are covered by new regression tests.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 | executor: recover stale RUNNING tasks on server startup | Peter Stone
On restart, any tasks in RUNNING state have no active goroutine. RecoverStaleRunning() marks them FAILED (retryable) and closes their open execution records with an appropriate error message. Called once from serve.go after the pool is created.
2026-03-08 | executor: fix sandbox git fetch + inject prior failure history | Peter Stone
Fix: use file:// prefix in git fetch during sandbox teardown to force pack-protocol transfer. The local optimization uses hard links which fail across devices and with mixed-owner object stores.
Feature: before running a task, query prior failed/timed-out executions and prepend their error messages to the agent's --append-system-prompt. This tells the agent what went wrong in previous attempts so it doesn't repeat the same mistakes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 | merge: pull latest from master and resolve conflicts | Peter Stone
- Resolve conflicts in API server, CLI, and executor.
- Maintain Gemini classification and assignment logic.
- Update UI to use generic agent config and project_dir.
- Fix ProjectDir/WorkingDir inconsistencies in Gemini runner.
- All tests passing after merge.
2026-03-08 | executor: internal dispatch queue; remove at-capacity rejection | Peter Stone
Replace the at-capacity error return from Submit/SubmitResume with an internal workCh/doneCh channel pair. A dispatch() goroutine blocks waiting for a free slot and launches the worker goroutine, so tasks are buffered up to 10x pool capacity instead of being rejected.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
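One way to build such a queue from a workCh/doneCh pair is sketched below. It is a guess at the shape, not the Pool's code: the `queue` type, the `func()` job signature, and the choice to have `Submit` return false when the buffer itself is full are all assumptions. `runAll` is a demo helper that measures peak concurrency.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical stand-in for the pool's dispatch machinery.
type queue struct {
	workCh chan func()   // buffered backlog of submitted jobs
	doneCh chan struct{} // one signal per finished worker
}

func newQueue(capacity int) *queue {
	q := &queue{
		workCh: make(chan func(), capacity*10),
		doneCh: make(chan struct{}, capacity), // workers never block on send
	}
	go q.dispatch(capacity)
	return q
}

// dispatch launches one goroutine per job, blocking while no slot is free.
func (q *queue) dispatch(free int) {
	for job := range q.workCh {
		if free == 0 {
			<-q.doneCh // wait for a running worker to finish
			free++
		}
		free--
		go func(job func()) {
			job()
			q.doneCh <- struct{}{}
		}(job)
	}
}

// Submit enqueues a job; false means the backlog is full (assumed behavior).
func (q *queue) Submit(job func()) bool {
	select {
	case q.workCh <- job:
		return true
	default:
		return false
	}
}

// runAll submits n jobs and returns how many ran and the peak concurrency.
func runAll(capacity, n int) (ran, peak int) {
	q := newQueue(capacity)
	var mu sync.Mutex
	var wg sync.WaitGroup
	running := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		ok := q.Submit(func() {
			defer wg.Done()
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()
			mu.Lock()
			running--
			ran++
			mu.Unlock()
		})
		if !ok {
			wg.Done() // job was not accepted
		}
	}
	wg.Wait()
	close(q.workCh) // lets the dispatch goroutine exit
	return ran, peak
}

func main() {
	ran, peak := runAll(2, 5)
	fmt.Println(ran, peak <= 2)
}
```

Sizing doneCh to the pool capacity means a finishing worker can always post its signal, even while dispatch is idle waiting for the next job.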
2026-03-08 | refactor: address code review notes (backward compat, Gemini tests, unknown agent test) | Peter Stone
2026-03-08 | refactor(executor): update runners and tests for generic agents | Peter Stone
2026-03-06 | fix: implement cancel endpoint and pool cancel mechanism | Peter Stone
POST /api/tasks/{id}/cancel now works. Pool tracks a cancel func per running task ID; Cancel(taskID) calls it and returns false if the task isn't running. The execute goroutine registers/deregisters the cancel func around the runner call.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 | executor: persist log paths at execution create time, not just at end | Peter Stone
Add LogPather interface; ClaudeRunner implements it via ExecLogDir(). Pool pre-populates stdout_path/stderr_path/artifact_dir on the execution record before CreateExecution, so paths are in the DB from the moment a task starts running. ClaudeRunner.Run() skips path assignment when already set by the pool.
Also update scripts/debug-execution to derive paths from the known convention (<data-dir>/executions/<exec-id>/) as a fallback for historical records that predate this change.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 | Add READY state for human-in-the-loop verification | Peter Stone
Top-level tasks now land in READY after successful execution instead of going directly to COMPLETED. Subtasks (with parent_task_id) skip the gate and remain COMPLETED. Users accept or reject via new API endpoints:
POST /api/tasks/{id}/accept → READY → COMPLETED
POST /api/tasks/{id}/reject → READY → PENDING (with rejection_comment)
- task: add StateReady, RejectionComment field, update ValidTransition
- storage: migrate rejection_comment column, add RejectTask method
- executor: route top-level vs subtask to READY vs COMPLETED
- api: /accept and /reject handlers with 409 on invalid state
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-08 | Rename Go module to github.com/thepeterstone/claudomator | Peter Stone
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 | Initial project: task model, executor, API server, CLI, storage, reporter | Peter Stone
Claudomator automation toolkit for Claude Code with:
- Task model with YAML parsing, validation, state machine (49 tests, 0 races)
- SQLite storage for tasks and executions
- Executor pool with bounded concurrency, timeout, cancellation
- REST API + WebSocket for mobile PWA integration
- Webhook/multi-notifier system
- CLI: init, run, serve, list, status commands
- Console, JSON, HTML reporters with cost tracking
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>