Age | Commit message | Author
12 days | fix: deploy script skips scripts/hooks/ subdirectory when copying | Peter Stone
cp without -r fails on directories. Use find -maxdepth 1 -type f to copy only files, since hooks/ is for local dev only.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: eliminate flaky race in TestPool_ActivePerAgent_DeletesZeroEntries | Peter Stone
The deferred activePerAgent cleanup in execute() runs after resultCh is sent, so a consumer reading Results() could observe the map entry before it was removed. Poll briefly (100ms max) instead of checking immediately.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: repair test regressions and add pre-commit/pre-push verification gates | Peter Stone
Fix four pre-existing bugs exposed after resolving a build failure:
- sandboxCloneSource: accept any URL scheme for origin remote (was filtering out https://)
- setupSandbox callers: fix := shadow variable so sandboxDir is set on BlockedError
- parseGeminiStream: parse result lines to return execution errors and cost
- TestElaborateTask_InvalidJSONFromClaude: stub Gemini fallback so test is hermetic
Add verification infrastructure:
- scripts/verify: runs go build + go test -race, used by hooks and deploy
- scripts/hooks/pre-commit: blocks commits that don't compile
- scripts/hooks/pre-push: blocks pushes where tests fail
- scripts/install-hooks: symlinks version-controlled hooks into .git/hooks/
- scripts/deploy: runs scripts/verify before building the binary
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: add GitHub webhook endpoint for automatic CI failure task creation | Claudomator Agent
Adds POST /api/webhooks/github that receives check_run and workflow_run events and creates a Claudomator task to investigate and fix the failure.
- Config: new webhook_secret and [[projects]] fields in config.toml
- HMAC-SHA256 validation when webhook_secret is configured
- Ignores non-failure events (success, skipped, etc.) with 204
- Matches repo name to configured project dirs (case-insensitive)
- Falls back to single project when no name match found
- 11 new tests covering all acceptance criteria
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
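The HMAC-SHA256 validation step could look roughly like the sketch below. It assumes the handler reads GitHub's X-Hub-Signature-256 header (formatted "sha256=<hex>"); validSignature and sign are hypothetical names, not functions from this repository.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// validSignature reports whether header (the X-Hub-Signature-256 value)
// matches the HMAC-SHA256 of body computed with secret.
func validSignature(secret, body []byte, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	// hmac.Equal compares in constant time to avoid timing leaks.
	return hmac.Equal(got, mac.Sum(nil))
}

// sign produces a header value for body; used here only to build a demo input.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret")
	body := []byte(`{"action":"completed"}`)
	fmt.Println(validSignature(secret, body, sign(secret, body)))
	fmt.Println(validSignature(secret, body, "sha256=deadbeef"))
}
```

The signature must be computed over the raw request body, so a handler built this way would read the body bytes before decoding the JSON.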
13 days | feat: add deployment status endpoint for tasks | Peter Stone
Adds GET /api/tasks/{id}/deployment-status which checks whether the currently-deployed server binary includes the fix commits from the task's latest execution. Uses git merge-base --is-ancestor to compare commit hashes against the running version.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
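A minimal sketch of the ancestry check follows, assuming the server shells out to git. The function names are hypothetical; the exit-code convention (0 for ancestor, 1 for not an ancestor, anything else an error) is git's documented behavior for merge-base --is-ancestor.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// ancestorCmd builds `git -C repo merge-base --is-ancestor commit deployed`.
func ancestorCmd(repo, commit, deployed string) *exec.Cmd {
	return exec.Command("git", "-C", repo, "merge-base", "--is-ancestor", commit, deployed)
}

// isAncestor reports whether commit is contained in the history of deployed.
func isAncestor(repo, commit, deployed string) (bool, error) {
	err := ancestorCmd(repo, commit, deployed).Run()
	if err == nil {
		return true, nil // exit 0: commit is an ancestor
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) && exitErr.ExitCode() == 1 {
		return false, nil // exit 1: valid commits, but not an ancestor
	}
	return false, err // unknown commit, missing repo, git not installed, ...
}

func main() {
	// Inside a git checkout, HEAD is trivially an ancestor of itself.
	ok, err := isAncestor(".", "HEAD", "HEAD")
	fmt.Println(ok, err)
}
```

Under this reading, a task counts as deployed when every fix commit from its latest execution passes the check against the running version's commit.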
13 days | feat: improve next-task selection and rejection UX | Peter Stone
- next-task script: exclude rejected tasks from fallback selection; only pick PENDING tasks with no rejection comment and no prior executions, or QUEUED tasks (e.g. BUDGET_EXCEEDED retries)
- web/app.js: prompt for optional rejection comment when rejecting a task, passing it through to the API instead of always sending an empty string
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: permission denied and host key verification errors; add gemini elaboration fallback | Peter Stone
13 days | feat: add elaboration_input field to tasks for richer subtask placeholder | Claudomator Agent
- Add ElaborationInput field to Task struct (task.go)
- Add DB migration and update CREATE/SELECT/scan in storage/db.go
- Update handleCreateTask to accept elaboration_input from API
- Update renderSubtaskRollup in app.js to prefer elaboration_input over description
- Capture elaborate prompt in createTask() form submission
- Update subtask-placeholder tests to cover elaboration_input priority
- Fix missing io import in gemini.go
When a task card is waiting for subtasks, it now shows:
1. The raw user prompt from elaboration (if stored)
2. The task description truncated at word boundary (~120 chars)
3. The task name as fallback
4. 'Waiting for subtasks…' only when all fields are empty
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | feat: replace static subtask placeholder with task description | Claudomator Agent
When a BLOCKED/READY task has no subtasks yet, show the task description (truncated to ~120 chars at a word boundary) instead of the generic 'Waiting for subtasks…' text. Falls back to task.name if no description, and finally to the original generic text if neither is present.
- Add truncateToWordBoundary(text, maxLen=120) helper
- Update renderSubtaskRollup(task, footer) to use task object instead of taskId
- Update both READY and BLOCKED call sites
- Add web/test/subtask-placeholder.test.mjs with 11 tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
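The word-boundary truncation can be illustrated with the Go sketch below. The real helper lives in web/app.js and is JavaScript; this port only shows the idea, and the trailing ellipsis is an assumption rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// truncateToWordBoundary shortens text to at most maxLen runes, cutting at the
// last whitespace before the limit so no word is split. Appending "…" is an
// assumption made for this sketch.
func truncateToWordBoundary(text string, maxLen int) string {
	r := []rune(text)
	if len(r) <= maxLen {
		return text
	}
	cut := string(r[:maxLen])
	// Back up to the previous whitespace only if the limit falls mid-word.
	if !unicode.IsSpace(r[maxLen]) {
		if i := strings.LastIndexFunc(cut, unicode.IsSpace); i > 0 {
			cut = cut[:i]
		}
	}
	return strings.TrimRightFunc(cut, unicode.IsSpace) + "…"
}

func main() {
	fmt.Println(truncateToWordBoundary("hello world foo", 8))
	fmt.Println(truncateToWordBoundary("short", 120))
}
```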
13 days | fix: restore running tab rendering and throttle history fetch | Peter Stone
- poll() now calls renderActiveTab(cache) on early-return so switching tabs always renders immediately instead of leaving the panel blank
- renderRunningView unchanged check now requires running.length > 0, fixing the empty-state message never appearing when no tasks run
- Extract renderActiveTab() to avoid duplicating the tab switch logic
- Throttle execution history fetch to once per 60s (was every poll)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: promote stale BLOCKED parent tasks to READY on server startup | Peter Stone
When the server restarts after all subtasks complete, the parent task was left stuck in BLOCKED state because maybeUnblockParent only fires during a live executor run. RecoverStaleBlocked() scans all BLOCKED tasks on startup and re-evaluates them using the existing logic.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
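The startup scan might be sketched as follows. The Task type, the in-memory slice, and the "all subtasks COMPLETED" rule are assumptions for illustration; the actual RecoverStaleBlocked() reuses maybeUnblockParent against the database.

```go
package main

import "fmt"

// Task is a simplified, hypothetical stand-in for the stored task record.
type Task struct {
	ID, ParentID, State string
}

// recoverStaleBlocked promotes every BLOCKED task whose subtasks are all
// COMPLETED to READY and returns the IDs it promoted. BLOCKED tasks without
// subtasks (for example, those waiting on a question) are left alone.
func recoverStaleBlocked(tasks []*Task) []string {
	total := map[string]int{}
	done := map[string]int{}
	for _, t := range tasks {
		if t.ParentID == "" {
			continue
		}
		total[t.ParentID]++
		if t.State == "COMPLETED" {
			done[t.ParentID]++
		}
	}
	var promoted []string
	for _, t := range tasks {
		if t.State == "BLOCKED" && total[t.ID] > 0 && done[t.ID] == total[t.ID] {
			t.State = "READY"
			promoted = append(promoted, t.ID)
		}
	}
	return promoted
}

func main() {
	tasks := []*Task{
		{ID: "parent", State: "BLOCKED"},
		{ID: "sub-1", ParentID: "parent", State: "COMPLETED"},
		{ID: "sub-2", ParentID: "parent", State: "COMPLETED"},
	}
	fmt.Println(recoverStaleBlocked(tasks))
}
```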
13 days | feat: overhaul auto-refresh system with intelligent polling and differential updates | Peter Stone
13 days | feat: run build (Makefile, gradlew, or go build) before sandbox autocommit | Peter Stone
13 days | feat: show subtask rollup on READY task cards | Claudomator Agent
READY tasks now call renderSubtaskRollup identically to BLOCKED tasks (without a question). The rollup appears above Accept/Reject buttons.
New test: web/test/ready-subtasks.test.mjs (10 assertions, all pass).
13 days | Merge remote-tracking branch 'local/master' | Peter Stone
14 days | feat: fix task failures via sandbox improvements and display commits in Web UI | Peter Stone
- Fix ephemeral sandbox deletion issue by passing $CLAUDOMATOR_PROJECT_DIR to agents and using it for subtask project_dir.
- Implement sandbox autocommit in teardown to prevent task failures from uncommitted work.
- Track git commits created during executions and persist them in the DB.
- Display git commits and changestats badges in the Web UI execution history.
- Add badge counts to Web UI tabs for Interrupted, Ready, and Running states.
- Improve scripts/next-task to handle QUEUED tasks and configurable DB path.
14 days | feat: add task count badges to interrupted, ready, and running tabs | Claudomator Agent
- Add computeTabBadgeCounts(tasks) exported pure function
- Add updateTabBadges(tasks) that updates badge spans in tab buttons
- Call updateTabBadges on every poll regardless of active tab
- Add .tab-count-badge spans to interrupted/ready/running tab buttons in HTML
- Add CSS for .tab-count-badge pill styling (hidden when count is zero)
- Add 11 tests in web/test/tab-badges.test.mjs covering all states
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat(Phase4): add file changes for changestats executor wiring | Claude Sonnet 4.6
Files changed: CLAUDE.md, internal/api/changestats.go, internal/executor/executor.go, internal/executor/executor_test.go, internal/task/changestats.go (new)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat: extract and store changestats from execution output | Claude Sonnet 4.6
- Move git diff stat parser to internal/task/changestats.go (shared)
- Add UpdateExecutionChangestats to executor.Store interface
- Extract changestats in Pool.handleRunResult after every execution
- Add three TDD tests: ExtractAndStore, NoChangestats, MalformedChangestats
- Update CLAUDE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat: expose changestats in API responses | Claudomator Agent
- Add parseChangestatFromOutput/File helpers in internal/api/changestats.go to parse git diff --stat summary lines from execution stdout logs
- Wire parser in processResult: after each execution completes, scan the stdout log for git diff stats and persist via UpdateExecutionChangestats
- Tests: TestGetTask_IncludesChangestats (verifies processResult wiring), TestListExecutions_IncludesChangestats (verifies storage round-trip)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
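Parsing a git diff --stat summary line might look like the sketch below. The struct fields mirror the Changestats{FilesChanged, LinesAdded, LinesRemoved} fields named elsewhere in this log; the regular expression, the function name, and the choice to keep the last matching line are assumptions, not the repository's parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Changestats mirrors the field names given in this log.
type Changestats struct {
	FilesChanged, LinesAdded, LinesRemoved int
}

// statRe matches summary lines such as
// " 3 files changed, 10 insertions(+), 2 deletions(-)".
// git omits the insertions or deletions part when its count is zero.
var statRe = regexp.MustCompile(
	`(\d+) files? changed(?:, (\d+) insertions?\(\+\))?(?:, (\d+) deletions?\(-\))?`)

// parseChangestats returns the stats from the last summary line in output,
// or nil when no summary line is present.
func parseChangestats(output string) *Changestats {
	var last []string
	for _, line := range strings.Split(output, "\n") {
		if m := statRe.FindStringSubmatch(line); m != nil {
			last = m
		}
	}
	if last == nil {
		return nil
	}
	// An unmatched optional group is the empty string; treat it as zero.
	atoi := func(s string) int { n, _ := strconv.Atoi(s); return n }
	return &Changestats{
		FilesChanged: atoi(last[1]),
		LinesAdded:   atoi(last[2]),
		LinesRemoved: atoi(last[3]),
	}
}

func main() {
	out := "building...\n 3 files changed, 10 insertions(+), 2 deletions(-)\n"
	fmt.Printf("%+v\n", *parseChangestats(out))
}
```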
2026-03-14 | feat: add Changestats struct and storage support | Claudomator Agent
- Add task.Changestats{FilesChanged, LinesAdded, LinesRemoved}
- Add changestats_json column to executions via additive migration
- Add Changestats field to storage.Execution struct
- Add UpdateExecutionChangestats(execID, *task.Changestats) method
- Update all SELECT/INSERT/scan paths for executions
- Test: TestExecution_StoreAndRetrieveChangestats (was red, now green)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: surface agent stderr, auto-retry restart-killed tasks, handle stale sandboxes | Peter Stone
#1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and appends to error message when claude/gemini exits non-zero. Previously all exit-1 failures were opaque; now the error_msg carries the actual subprocess output.
#4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after marking them FAILED, so tasks killed by a server restart automatically retry on the next boot rather than staying permanently FAILED.
#2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of failing immediately with "no such file or directory".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: trust all directory owners in sandbox git commands | Peter Stone
Sandbox setup runs git commands against project_dir which may be owned by a different OS user, triggering git's 'dubious ownership' error. Fix by passing -c safe.directory=* on all git commands that touch project directories. Also add wildcard to global config for immediate effect on the running server.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | feat: persist agent assignment before task execution | Claudomator Agent
- Add UpdateTaskAgent to Store interface and DB implementation
- Call UpdateTaskAgent in Pool.execute to persist assigned agent/model to database before the runner starts
- Update runTask in app.js to pass selected agent as query param
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | test | Claudomator Agent
2026-03-14 | feat: add agent selector to UI and support direct agent assignment | Peter Stone
- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button.
- Updated the backend to pass query parameters as environment variables to scripts.
- Modified the executor pool to skip classification when a specific agent is requested.
- Added --agent flag to claudomator start command.
- Updated tests to cover the new functionality.
2026-03-14 | feat: show subtask rollup on BLOCKED tasks waiting for subtasks | Peter Stone
When a task is BLOCKED due to spawned subtasks (no question), the card footer now fetches and renders a list of subtask names with their state emoji instead of showing the question/answer input UI. The Cancel button remains in both cases.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: cancel blocked tasks + auto-complete completion reports | Peter Stone
Two fixes for BLOCKED task issues:
1. Allow BLOCKED → CANCELLED state transition so users can cancel tasks stuck waiting for input. Adds Cancel button to BLOCKED task cards in the UI alongside the question/answer controls.
2. Detect when agents write completion reports to $CLAUDOMATOR_QUESTION_FILE instead of real questions. If the question JSON has no options and no "?" in the text, treat it as a summary (stored on the execution) and fall through to normal completion + sandbox teardown rather than blocking. Also tightened the preamble to make the distinction explicit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
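The completion-report heuristic from the second fix can be sketched as below. The JSON field names text and options and the function name are assumptions; only the rule itself (no options and no "?" means it is a summary) comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// question models the JSON an agent writes to $CLAUDOMATOR_QUESTION_FILE.
// The field names are hypothetical.
type question struct {
	Text    string   `json:"text"`
	Options []string `json:"options"`
}

// isRealQuestion reports whether raw is a genuine question. A payload with no
// options and no "?" in its text is treated as a completion summary instead.
func isRealQuestion(raw []byte) (bool, error) {
	var q question
	if err := json.Unmarshal(raw, &q); err != nil {
		return false, err
	}
	return len(q.Options) > 0 || strings.Contains(q.Text, "?"), nil
}

func main() {
	for _, raw := range []string{
		`{"text":"Which database should I target?","options":["sqlite","postgres"]}`,
		`{"text":"Implemented the endpoint and added tests."}`,
	} {
		ok, err := isRealQuestion([]byte(raw))
		fmt.Println(ok, err)
	}
}
```

When the check returns false, the text would be stored as the execution summary and the task would proceed to normal completion rather than entering BLOCKED.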
2026-03-13 | test: add storage tests for UpdateTaskSummary and AppendTaskInteraction | Claudomator Agent
The summary+interactions feature was already fully implemented but lacked storage-layer tests. Added tests covering round-trip persistence of task summaries, accumulation of Q&A interactions, and error handling for nonexistent tasks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 | fix: space tabs equally across full tab bar width | Peter Stone
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: replace tab labels with emoji icons | Peter Stone
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: reorganize web UI to 7-tab layout (Queue, Interrupted, Ready, Running, All, Stats, Settings) | Claudomator Agent
- Replace Tasks/Active tabs with Queue (QUEUED+PENDING), Interrupted, Ready top-level tabs
- Add All tab (COMPLETED, TIMED_OUT, BUDGET_EXCEEDED within last 24h) and Settings placeholder
- Export filterQueueTasks, filterReadyTasks, filterAllDoneTasks from app.js
- Refactor poll() to dispatch to active tab's render function instead of always rendering all panels
- Add renderQueuePanel, renderInterruptedPanel, renderReadyPanel, renderAllPanel helpers
- Add tests in web/test/tab-filters.test.mjs covering all new filter functions (16 tests)
- All 165 JS tests and all Go tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | fix: only write RAW_NARRATIVE.md when user explicitly provides project_dir | Peter Stone
Previously appendRawNarrative was called with the server's default workDir (os.Getwd()) when no project_dir was in the request, causing test runs and any elaboration without a project to pollute the repo's own RAW_NARRATIVE.md. The narrative is per-project human input, so only write it when the caller explicitly specifies which project they're working in.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | Merge branch 'master' of /site/git.terst.org/repos/claudomator | Peter Stone
2026-03-13 | fix: enable Gemini file writing by passing --yolo and -p flags | Peter Stone
GeminiRunner.buildArgs was missing --yolo (auto-approve all tools) so the gemini CLI only registered 3 tools (read_file, write_todos, cli_help) and write_file was not available. Agents that needed to create files silently failed (exit 0, no files written).
Also switch instructions from bare positional arg to -p flag, which is required for non-interactive headless mode.
Update preamble tests to match file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) kept from the merge conflict resolution.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | fix: add .btn-resume CSS styling matching .btn-restart | Claudomator Agent
2026-03-13 | feat: show New Task button on all tabs | Claudomator Agent
Removed the switchTab() logic that hid btn-new-task on non-tasks tabs. The button lives in the global header so no structural changes were needed. Added new-task-button.test.mjs to contract-test the always-visible behavior.
2026-03-13 | merge: resolve conflicts with local/master (stats tab + summary styles) | Peter Stone
Keep file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) from HEAD. Combine Q&A History and Stats tab CSS from both branches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | fix: resubmit QUEUED tasks on server startup to prevent them getting stuck | Peter Stone
Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on startup and re-submits them to the in-memory pool. Previously, tasks that were QUEUED when the server restarted would remain stuck indefinitely since only RUNNING tasks were recovered (and marked FAILED). Called in serve.go immediately after RecoverStaleRunning().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: post-elaboration sanity check for tools, acceptance criteria, and dev practices | Peter Stone
Add sanitizeElaboratedTask() called after every elaboration response:
- Infers missing allowed_tools from instruction keywords (Write/Edit/Read/Bash/Grep/Glob)
- Auto-adds Read when Edit is present
- Appends Acceptance Criteria section if none present
- Appends TDD reminder for coding tasks without test mention
Also tighten buildElaboratePrompt to require acceptance criteria and list concrete tool examples, reducing how often the model omits tools.
Fixes class of failures where agents couldn't create files because the elaborator omitted Write from allowed_tools.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
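The tool-inference part of this check might work roughly as sketched below. The keyword table and the inferTools name are invented for illustration; only the tool names and the "Edit implies Read" rule come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// toolKeywords is a hypothetical keyword table; the real mapping is not shown
// in this log.
var toolKeywords = map[string][]string{
	"Write": {"create", "write"},
	"Edit":  {"edit", "modify"},
	"Read":  {"read"},
	"Bash":  {"run ", "execute"},
	"Grep":  {"search"},
	"Glob":  {"find files"},
}

// toolOrder fixes the iteration order so the result is deterministic.
var toolOrder = []string{"Write", "Edit", "Read", "Bash", "Grep", "Glob"}

// inferTools returns allowed plus any tools implied by keywords in the
// instructions, and adds Read whenever Edit is present.
func inferTools(instructions string, allowed []string) []string {
	have := map[string]bool{}
	for _, t := range allowed {
		have[t] = true
	}
	out := append([]string(nil), allowed...)
	add := func(t string) {
		if !have[t] {
			have[t] = true
			out = append(out, t)
		}
	}
	lower := strings.ToLower(instructions)
	for _, tool := range toolOrder {
		for _, kw := range toolKeywords[tool] {
			if strings.Contains(lower, kw) {
				add(tool)
				break
			}
		}
	}
	if have["Edit"] {
		add("Read") // an agent cannot edit a file it is not allowed to read
	}
	return out
}

func main() {
	fmt.Println(inferTools("Edit the config file", nil))
	fmt.Println(inferTools("Create a new handler", []string{"Bash"}))
}
```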
2026-03-13 | feat: resume support, summary extraction, and task state improvements | Peter Stone
- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 | feat: add Resume support for CANCELLED, FAILED, and BUDGET_EXCEEDED tasks | Claudomator Agent
Interrupted tasks (CANCELLED, FAILED, BUDGET_EXCEEDED) now support session resume in addition to restart. Both buttons are shown on the task card.
- executor: extend resumablePoolStates to include CANCELLED, FAILED, BUDGET_EXCEEDED
- api: extend handleResumeTimedOutTask to accept all resumable states with state-specific resume messages; replace hard-coded TIMED_OUT check with a resumableStates map
- web: add RESUME_STATES set; render Resume + Restart buttons for interrupted states; TIMED_OUT keeps Resume only
- tests: 5 new Go tests (TestResumeInterrupted_*); updated task-actions.test.mjs with 17 tests covering dual-button behaviour
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
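Replacing a hard-coded state check with a lookup map could look like the following sketch. The map name matches the commit message, but its value type and the message texts are invented placeholders.

```go
package main

import "fmt"

// resumableStates maps each state that accepts a resume request to the message
// sent to the agent on resume. The message texts are placeholders.
var resumableStates = map[string]string{
	"TIMED_OUT":       "Your previous run timed out. Continue where you left off.",
	"CANCELLED":       "Your previous run was cancelled. Continue where you left off.",
	"FAILED":          "Your previous run failed. Review the error and continue.",
	"BUDGET_EXCEEDED": "Your previous run exceeded its budget. Continue where you left off.",
}

// resumeMessage returns the resume message for state, or ok=false when the
// state cannot be resumed.
func resumeMessage(state string) (msg string, ok bool) {
	msg, ok = resumableStates[state]
	return msg, ok
}

func main() {
	for _, s := range []string{"FAILED", "COMPLETED"} {
		_, ok := resumeMessage(s)
		fmt.Printf("%s resumable: %v\n", s, ok)
	}
}
```

With a map, adding another resumable state becomes a one-line change instead of a new branch in the handler.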
2026-03-11 | feat: add Stats tab with task distribution and execution health metrics | Claudomator Agent
- Export computeTaskStats and computeExecutionStats from app.js
- Add renderStatsPanel with state count grid, KPI row (total/success-rate/cost/avg-duration), and outcome bar chart
- Wire stats tab into switchTab and poll for live refresh
- Add Stats tab button and panel to index.html
- Add CSS for .stats-counts, .stats-kpis, .stats-bar-chart using existing state color variables
- Add docs/stats-tab-plan.md with component structure and data flow
- 14 new unit tests in web/test/stats.test.mjs (140 total, all passing)
No backend changes: all metrics are derived from the existing /api/tasks and /api/executions endpoints.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | feat: add Summary and Q&A History sections to task detail panel | Claudomator Agent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | feat: require agents to write a final summary before exiting | Claudomator Agent
Add a mandatory '## Final Summary' section to planningPreamble instructing agents to output a 2-5 sentence summary paragraph (headed by '## Summary') as their last output before exiting. Adds three tests to verify the section and its required content are present in the preamble.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | Merge remote-tracking branch 'local/master' | Peter Stone
2026-03-11 | fix: clear question_json when restarting a task via ResetTaskForRetry | Peter Stone
A BLOCKED task that fails on resume would keep its stale question_json after being restarted. The frontend then showed "waiting for your input" with the old prompt even though the task was running fresh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | docs: add discovery notes and implementation plan for summary/QA feature | Claudomator Agent
2026-03-11 | fix: resume BLOCKED tasks in preserved sandbox so Claude finds its session | Peter Stone
When a task ran in a sandbox (/tmp/claudomator-sandbox-*) and went BLOCKED, Claude stored its session under the sandbox path as the project slug. The resume execution was running in project_dir, causing Claude to look for the session in the wrong project directory and fail with "No conversation found".
Fix: carry SandboxDir through BlockedError → Execution → resume execution, and run the resume in that directory so the session lookup succeeds.
- BlockedError gains SandboxDir field; claude.go sets it on BLOCKED exit
- storage.Execution gains SandboxDir (persisted via new sandbox_dir column)
- executor.go stores blockedErr.SandboxDir in the execution record
- server.go copies SandboxDir from latest execution to the resume execution
- claude.go uses e.SandboxDir as working dir for resume when set
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | docs: add executor package documentation | Claudomator Agent