Age | Commit message | Author
13 days | feat: overhaul auto-refresh system with intelligent polling and differential updates | Peter Stone
13 days | feat: run build (Makefile, gradlew, or go build) before sandbox autocommit | Peter Stone
13 days | feat: show subtask rollup on READY task cards | Claudomator Agent
READY tasks now call renderSubtaskRollup the same way BLOCKED tasks without a question do. The rollup appears above the Accept/Reject buttons. New test: web/test/ready-subtasks.test.mjs (10 assertions, all pass).
13 days | Merge remote-tracking branch 'local/master' | Peter Stone
14 days | feat: fix task failures via sandbox improvements and display commits in Web UI | Peter Stone
- Fix ephemeral sandbox deletion issue by passing $CLAUDOMATOR_PROJECT_DIR to agents and using it for subtask project_dir.
- Implement sandbox autocommit in teardown to prevent task failures from uncommitted work.
- Track git commits created during executions and persist them in the DB.
- Display git commits and changestats badges in the Web UI execution history.
- Add badge counts to Web UI tabs for Interrupted, Ready, and Running states.
- Improve scripts/next-task to handle QUEUED tasks and configurable DB path.
14 days | feat: add task count badges to interrupted, ready, and running tabs | Claudomator Agent
- Add computeTabBadgeCounts(tasks) exported pure function
- Add updateTabBadges(tasks) that updates badge spans in tab buttons
- Call updateTabBadges on every poll regardless of active tab
- Add .tab-count-badge spans to interrupted/ready/running tab buttons in HTML
- Add CSS for .tab-count-badge pill styling (hidden when count is zero)
- Add 11 tests in web/test/tab-badges.test.mjs covering all states
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat(Phase4): add file changes for changestats executor wiring | Claude Sonnet 4.6
Files changed: CLAUDE.md, internal/api/changestats.go, internal/executor/executor.go, internal/executor/executor_test.go, internal/task/changestats.go (new)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat: extract and store changestats from execution output | Claude Sonnet 4.6
- Move git diff stat parser to internal/task/changestats.go (shared)
- Add UpdateExecutionChangestats to executor.Store interface
- Extract changestats in Pool.handleRunResult after every execution
- Add three TDD tests: ExtractAndStore, NoChangestats, MalformedChangestats
- Update CLAUDE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat: expose changestats in API responses | Claudomator Agent
- Add parseChangestatFromOutput/File helpers in internal/api/changestats.go to parse git diff --stat summary lines from execution stdout logs
- Wire parser in processResult: after each execution completes, scan the stdout log for git diff stats and persist via UpdateExecutionChangestats
- Tests: TestGetTask_IncludesChangestats (verifies processResult wiring), TestListExecutions_IncludesChangestats (verifies storage round-trip)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
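The parsing step described above could look roughly like the following. This is a minimal sketch, not the repository's code: `parseChangestats` and the regular expression are hypothetical, and only the `Changestats` field names are taken from the commit log.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Changestats uses the field names listed in the commit log.
type Changestats struct {
	FilesChanged int
	LinesAdded   int
	LinesRemoved int
}

// summaryRe matches the summary line that `git diff --stat` prints last,
// e.g. " 3 files changed, 10 insertions(+), 2 deletions(-)".
// The insertions and deletions parts are optional because git omits a zero count.
var summaryRe = regexp.MustCompile(
	`(\d+) files? changed(?:, (\d+) insertions?\(\+\))?(?:, (\d+) deletions?\(-\))?`)

// parseChangestats is a hypothetical stand-in for the real parser.
// It returns nil when the output contains no summary line.
func parseChangestats(output string) *Changestats {
	m := summaryRe.FindStringSubmatch(output)
	if m == nil {
		return nil
	}
	toInt := func(s string) int {
		n, _ := strconv.Atoi(s) // an unmatched optional group is "" and yields 0
		return n
	}
	return &Changestats{
		FilesChanged: toInt(m[1]),
		LinesAdded:   toInt(m[2]),
		LinesRemoved: toInt(m[3]),
	}
}

func main() {
	cs := parseChangestats(" 3 files changed, 10 insertions(+), 2 deletions(-)")
	fmt.Printf("%+v\n", *cs)
}
```

Returning nil for output without a summary line keeps the "no changestats" case distinct from "zero files changed".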
14 days | feat: add Changestats struct and storage support | Claudomator Agent
- Add task.Changestats{FilesChanged, LinesAdded, LinesRemoved}
- Add changestats_json column to executions via additive migration
- Add Changestats field to storage.Execution struct
- Add UpdateExecutionChangestats(execID, *task.Changestats) method
- Update all SELECT/INSERT/scan paths for executions
- Test: TestExecution_StoreAndRetrieveChangestats (was red, now green)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: surface agent stderr, auto-retry restart-killed tasks, handle stale sandboxes | Peter Stone
#1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and appends to error message when claude/gemini exits non-zero. Previously all exit-1 failures were opaque; now the error_msg carries the actual subprocess output.
#4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after marking them FAILED, so tasks killed by a server restart automatically retry on the next boot rather than staying permanently FAILED.
#2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of failing immediately with "no such file or directory".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
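The stderr-tail diagnostic in #1 can be illustrated as below. This is a sketch under assumptions: the signature of `tailFile` and the helper `lastLines` are hypothetical, and reading the whole file into memory is a simplification that may not match the real implementation.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lastLines returns the final n lines of text (hypothetical helper).
func lastLines(text string, n int) string {
	lines := strings.Split(strings.TrimRight(text, "\n"), "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return strings.Join(lines, "\n")
}

// tailFile reads path and returns its last n lines, or "" if it cannot be read.
// The signature is an assumption; the commit log only names the function.
func tailFile(path string, n int) string {
	data, err := os.ReadFile(path)
	if err != nil {
		return ""
	}
	return lastLines(string(data), n)
}

func main() {
	// Simulate a subprocess stderr log and attach its tail to an error message.
	f, err := os.CreateTemp("", "stderr-*.log")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	fmt.Fprintln(f, "warning: retrying")
	fmt.Fprintln(f, "fatal: quota exhausted")
	f.Close()

	msg := "agent exited with status 1"
	if tail := tailFile(f.Name(), 20); tail != "" {
		msg += "\nstderr tail:\n" + tail
	}
	fmt.Println(msg)
}
```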
2026-03-14 | fix: trust all directory owners in sandbox git commands | Peter Stone
Sandbox setup runs git commands against project_dir which may be owned by a different OS user, triggering git's 'dubious ownership' error. Fix by passing -c safe.directory=* on all git commands that touch project directories. Also add wildcard to global config for immediate effect on the running server.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
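One way to apply the flag uniformly is to route every git invocation through a single helper. The sketch below assumes hypothetical helpers `gitArgs` and `gitCommand`; only the `-c safe.directory=*` option comes from the commit message.

```go
package main

import (
	"fmt"
	"os/exec"
)

// gitArgs prepends the per-command config override and the target directory
// to the caller's git arguments (hypothetical helper).
func gitArgs(dir string, args ...string) []string {
	return append([]string{"-c", "safe.directory=*", "-C", dir}, args...)
}

// gitCommand builds an *exec.Cmd that runs git against dir with the override set.
func gitCommand(dir string, args ...string) *exec.Cmd {
	return exec.Command("git", gitArgs(dir, args...)...)
}

func main() {
	cmd := gitCommand("/srv/project", "status", "--porcelain")
	fmt.Println(cmd.Args)
}
```

Centralizing the override in one builder avoids the case where a newly added git call forgets the flag and reintroduces the ownership error.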
2026-03-14 | feat: persist agent assignment before task execution | Claudomator Agent
- Add UpdateTaskAgent to Store interface and DB implementation
- Call UpdateTaskAgent in Pool.execute to persist assigned agent/model to database before the runner starts
- Update runTask in app.js to pass selected agent as query param
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | test | Claudomator Agent
2026-03-14 | feat: add agent selector to UI and support direct agent assignment | Peter Stone
- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button.
- Updated the backend to pass query parameters as environment variables to scripts.
- Modified the executor pool to skip classification when a specific agent is requested.
- Added --agent flag to claudomator start command.
- Updated tests to cover the new functionality.
2026-03-14 | feat: show subtask rollup on BLOCKED tasks waiting for subtasks | Peter Stone
When a task is BLOCKED due to spawned subtasks (no question), the card footer now fetches and renders a list of subtask names with their state emoji instead of showing the question/answer input UI. The Cancel button remains in both cases.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 | fix: cancel blocked tasks + auto-complete completion reports | Peter Stone
Two fixes for BLOCKED task issues:
1. Allow BLOCKED → CANCELLED state transition so users can cancel tasks stuck waiting for input. Adds Cancel button to BLOCKED task cards in the UI alongside the question/answer controls.
2. Detect when agents write completion reports to $CLAUDOMATOR_QUESTION_FILE instead of real questions. If the question JSON has no options and no "?" in the text, treat it as a summary (stored on the execution) and fall through to normal completion + sandbox teardown rather than blocking. Also tightened the preamble to make the distinction explicit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
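The heuristic in fix 2 is simple enough to sketch. The struct, its JSON field names (`text`, `options`), and the function name `isCompletionReport` are assumptions for illustration; the commit message states only the rule itself.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// question models the JSON an agent writes to the question file.
// The field names are assumed, not taken from the repository.
type question struct {
	Text    string   `json:"text"`
	Options []string `json:"options"`
}

// isCompletionReport applies the rule from the commit message: no options and
// no "?" in the text means the agent wrote a summary, not a real question.
func isCompletionReport(q question) bool {
	return len(q.Options) == 0 && !strings.Contains(q.Text, "?")
}

func main() {
	raw := `{"text": "All subtasks finished; tests pass."}`
	var q question
	if err := json.Unmarshal([]byte(raw), &q); err != nil {
		panic(err)
	}
	if isCompletionReport(q) {
		fmt.Println("treat as summary and complete normally")
	} else {
		fmt.Println("block and wait for the user's answer")
	}
}
```

The rule errs toward blocking: any text containing a question mark still reaches the user, so a misclassified report costs one extra interaction, never a lost question.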
2026-03-13 | test: add storage tests for UpdateTaskSummary and AppendTaskInteraction | Claudomator Agent
The summary+interactions feature was already fully implemented but lacked storage-layer tests. Added tests covering round-trip persistence of task summaries, accumulation of Q&A interactions, and error handling for nonexistent tasks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 | fix: space tabs equally across full tab bar width | Peter Stone
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: replace tab labels with emoji icons | Peter Stone
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: reorganize web UI to 7-tab layout (Queue, Interrupted, Ready, Running, All, Stats, Settings) | Claudomator Agent
- Replace Tasks/Active tabs with Queue (QUEUED+PENDING), Interrupted, Ready top-level tabs
- Add All tab (COMPLETED, TIMED_OUT, BUDGET_EXCEEDED within last 24h) and Settings placeholder
- Export filterQueueTasks, filterReadyTasks, filterAllDoneTasks from app.js
- Refactor poll() to dispatch to active tab's render function instead of always rendering all panels
- Add renderQueuePanel, renderInterruptedPanel, renderReadyPanel, renderAllPanel helpers
- Add tests in web/test/tab-filters.test.mjs covering all new filter functions (16 tests)
- All 165 JS tests and all Go tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | fix: only write RAW_NARRATIVE.md when user explicitly provides project_dir | Peter Stone
Previously appendRawNarrative was called with the server's default workDir (os.Getwd()) when no project_dir was in the request, causing test runs and any elaboration without a project to pollute the repo's own RAW_NARRATIVE.md. The narrative is per-project human input, so it is now written only when the caller explicitly specifies which project they're working in.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | Merge branch 'master' of /site/git.terst.org/repos/claudomator | Peter Stone
2026-03-13 | fix: enable Gemini file writing by passing --yolo and -p flags | Peter Stone
GeminiRunner.buildArgs was missing --yolo (auto-approve all tools) so the gemini CLI only registered 3 tools (read_file, write_todos, cli_help) and write_file was not available. Agents that needed to create files silently failed (exit 0, no files written).
Also switch instructions from bare positional arg to -p flag, which is required for non-interactive headless mode.
Update preamble tests to match file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) kept from the merge conflict resolution.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
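A reduced picture of the argument change described above, assuming a free function rather than the real `GeminiRunner.buildArgs` method. The real method presumably sets further options (model selection, for example) that are omitted here; only `--yolo` and `-p` come from the commit message.

```go
package main

import "fmt"

// buildArgs is a hypothetical, simplified version of the runner's argument
// builder. It always enables auto-approval and passes the instructions via -p
// instead of as a bare positional argument.
func buildArgs(instructions string) []string {
	return []string{"--yolo", "-p", instructions}
}

func main() {
	fmt.Printf("%q\n", buildArgs("Create hello.txt containing 'hi'"))
}
```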
2026-03-13 | fix: add .btn-resume CSS styling matching .btn-restart | Claudomator Agent
2026-03-13 | feat: show New Task button on all tabs | Claudomator Agent
Removed the switchTab() logic that hid btn-new-task on non-tasks tabs. The button lives in the global header so no structural changes were needed. Added new-task-button.test.mjs to contract-test the always-visible behavior.
2026-03-13 | merge: resolve conflicts with local/master (stats tab + summary styles) | Peter Stone
Keep file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) from HEAD. Combine Q&A History and Stats tab CSS from both branches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | fix: resubmit QUEUED tasks on server startup to prevent them getting stuck | Peter Stone
Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on startup and re-submits them to the in-memory pool. Previously, tasks that were QUEUED when the server restarted would remain stuck indefinitely since only RUNNING tasks were recovered (and marked FAILED). Called in serve.go immediately after RecoverStaleRunning().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 | feat: post-elaboration sanity check for tools, acceptance criteria, and dev practices | Peter Stone
Add sanitizeElaboratedTask() called after every elaboration response:
- Infers missing allowed_tools from instruction keywords (Write/Edit/Read/Bash/Grep/Glob)
- Auto-adds Read when Edit is present
- Appends Acceptance Criteria section if none present
- Appends TDD reminder for coding tasks without test mention
Also tighten buildElaboratePrompt to require acceptance criteria and list concrete tool examples, reducing how often the model omits tools.
Fixes class of failures where agents couldn't create files because the elaborator omitted Write from allowed_tools.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
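The first two sanitization rules above (keyword inference and the Edit-implies-Read rule) might be sketched as follows. The tool names come from the commit message; the keyword lists, the `inferTools` name, and the naive substring matching are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// toolKeywords maps each tool to trigger words. The keyword lists are
// illustrative guesses, not the repository's actual lists.
var toolKeywords = []struct {
	tool  string
	words []string
}{
	{"Write", []string{"create", "write"}},
	{"Edit", []string{"edit", "modify"}},
	{"Read", []string{"read"}},
	{"Bash", []string{"run ", "bash"}},
	{"Grep", []string{"grep", "search"}},
	{"Glob", []string{"glob"}},
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// inferTools returns allowed plus any tool whose keyword appears in the
// instructions, then adds Read whenever Edit is present.
func inferTools(instructions string, allowed []string) []string {
	text := strings.ToLower(instructions)
	out := append([]string{}, allowed...) // copy so the caller's slice is untouched
	for _, tk := range toolKeywords {
		if contains(out, tk.tool) {
			continue
		}
		for _, w := range tk.words {
			if strings.Contains(text, w) {
				out = append(out, tk.tool)
				break
			}
		}
	}
	if contains(out, "Edit") && !contains(out, "Read") {
		out = append(out, "Read")
	}
	return out
}

func main() {
	fmt.Println(inferTools("Edit the config file", nil))
}
```

Substring matching over-approximates (a word like "thread" would trigger Read), which suits this use: granting an unneeded tool is cheaper than a task failing because a needed one was omitted.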
2026-03-13 | feat: resume support, summary extraction, and task state improvements | Peter Stone
- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 | feat: add Resume support for CANCELLED, FAILED, and BUDGET_EXCEEDED tasks | Claudomator Agent
Interrupted tasks (CANCELLED, FAILED, BUDGET_EXCEEDED) now support session resume in addition to restart. Both buttons are shown on the task card.
- executor: extend resumablePoolStates to include CANCELLED, FAILED, BUDGET_EXCEEDED
- api: extend handleResumeTimedOutTask to accept all resumable states with state-specific resume messages; replace hard-coded TIMED_OUT check with a resumableStates map
- web: add RESUME_STATES set; render Resume + Restart buttons for interrupted states; TIMED_OUT keeps Resume only
- tests: 5 new Go tests (TestResumeInterrupted_*); updated task-actions.test.mjs with 17 tests covering dual-button behaviour
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | feat: add Stats tab with task distribution and execution health metrics | Claudomator Agent
- Export computeTaskStats and computeExecutionStats from app.js
- Add renderStatsPanel with state count grid, KPI row (total/success-rate/cost/avg-duration), and outcome bar chart
- Wire stats tab into switchTab and poll for live refresh
- Add Stats tab button and panel to index.html
- Add CSS for .stats-counts, .stats-kpis, .stats-bar-chart using existing state color variables
- Add docs/stats-tab-plan.md with component structure and data flow
- 14 new unit tests in web/test/stats.test.mjs (140 total, all passing)
No backend changes: derives all metrics from existing /api/tasks and /api/executions endpoints.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | feat: add Summary and Q&A History sections to task detail panel | Claudomator Agent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | feat: require agents to write a final summary before exiting | Claudomator Agent
Add a mandatory '## Final Summary' section to planningPreamble instructing agents to output a 2-5 sentence summary paragraph (headed by '## Summary') as their last output before exiting. Adds three tests to verify the section and its required content are present in the preamble.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | Merge remote-tracking branch 'local/master' | Peter Stone
2026-03-11 | fix: clear question_json when restarting a task via ResetTaskForRetry | Peter Stone
A BLOCKED task that fails on resume would keep its stale question_json after being restarted. The frontend then showed "waiting for your input" with the old prompt even though the task was running fresh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | docs: add discovery notes and implementation plan for summary/QA feature | Claudomator Agent
2026-03-11 | fix: resume BLOCKED tasks in preserved sandbox so Claude finds its session | Peter Stone
When a task ran in a sandbox (/tmp/claudomator-sandbox-*) and went BLOCKED, Claude stored its session under the sandbox path as the project slug. The resume execution was running in project_dir, causing Claude to look for the session in the wrong project directory and fail with "No conversation found".
Fix: carry SandboxDir through BlockedError → Execution → resume execution, and run the resume in that directory so the session lookup succeeds.
- BlockedError gains SandboxDir field; claude.go sets it on BLOCKED exit
- storage.Execution gains SandboxDir (persisted via new sandbox_dir column)
- executor.go stores blockedErr.SandboxDir in the execution record
- server.go copies SandboxDir from latest execution to the resume execution
- claude.go uses e.SandboxDir as working dir for resume when set
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 | docs: add executor package documentation | Claudomator Agent
2026-03-11 | docs: add system architecture overview | Claudomator Agent
2026-03-10 | test: sandbox coverage + fix WebSocket races | Peter Stone
executor: add 7 tests for sandboxCloneSource, setupSandbox, and teardownSandbox (uncommitted-changes error, clean-no-commits removal).
api: fix two data races in WebSocket tests. wsPingInterval/Deadline are now captured as locals before goroutine start; maxWsClients is moved from a package-level var into Hub.maxClients (with SetMaxClients method) so concurrent tests don't stomp each other.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 | docs: add development narrative and ADRs 004-005 | Peter Stone
RAW_NARRATIVE.md: comprehensive chronological engineering history reconstructed from the git log covering all 45 major milestones.
ADR-004 (multi-agent routing): explicit load balancing in code (pickAgent) plus Gemini-based model classification (Classifier), and why the two decisions are intentionally separated.
ADR-005 (git sandbox execution model): clone isolation, bare-repo push, uncommitted-change enforcement, BLOCKED preservation, and session ID propagation on second resume cycle.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 | task: promote validTransitions to package-level var; fix ADR | Peter Stone
Hoists the map out of ValidTransition so it's not reallocated on every call. Adds missing CANCELLED→QUEUED and BUDGET_EXCEEDED→QUEUED entries to the ADR transition table to match the implemented state machine.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
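The hoisting described above follows a common Go pattern, sketched here. The table is a deliberately partial subset containing only transitions named elsewhere in this log; the real state machine has more entries, and the `State` type and map shape are assumptions.

```go
package main

import "fmt"

// State is a task state name (the type is assumed for this sketch).
type State string

// validTransitions is built once at package initialization instead of inside
// ValidTransition on every call. Only transitions mentioned in the commit log
// are listed; the real table is larger.
var validTransitions = map[State]map[State]bool{
	"BLOCKED":         {"CANCELLED": true},
	"CANCELLED":       {"QUEUED": true},
	"BUDGET_EXCEEDED": {"QUEUED": true},
}

// ValidTransition reports whether from → to is allowed. Indexing a missing
// inner map yields a nil map, and reading from a nil map returns false.
func ValidTransition(from, to State) bool {
	return validTransitions[from][to]
}

func main() {
	fmt.Println(ValidTransition("CANCELLED", "QUEUED"))
	fmt.Println(ValidTransition("CANCELLED", "BLOCKED"))
}
```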
2026-03-10 | executor: fix session ID on second block-and-resume cycle | Peter Stone
When a resumed execution is blocked again, SessionID was set to the new exec's own UUID instead of the original ResumeSessionID. The next resume would then pass the wrong --resume argument to claude and fail.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
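The corrected rule reduces to "keep the original session ID once there is one". A sketch with a hypothetical function name and plain string parameters, which are stand-ins for whatever fields the executor actually reads:

```go
package main

import "fmt"

// sessionIDForBlocked chooses the session ID to store when an execution blocks.
// A first-time execution has no resume session, so its own ID starts the chain;
// a resumed execution must keep pointing at the original session.
func sessionIDForBlocked(execID, resumeSessionID string) string {
	if resumeSessionID != "" {
		return resumeSessionID
	}
	return execID
}

func main() {
	first := sessionIDForBlocked("exec-1", "")     // first block
	second := sessionIDForBlocked("exec-2", first) // blocked again after resume
	fmt.Println(first, second)
}
```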
2026-03-10 | executor: explicit load balancing — code picks agent, classifier picks model | Peter Stone
pickAgent() deterministically selects the agent with the fewest active tasks, skipping rate-limited agents. The classifier now only selects the model for the pre-assigned agent, so Gemini gets tasks from the start rather than only as a fallback when Claude's quota is exhausted.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
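The selection rule can be sketched as below. The `agent` struct, the return signature, and the tie-break (first agent in slice order wins) are assumptions; the commit message specifies only "fewest active tasks, skipping rate-limited agents".

```go
package main

import "fmt"

// agent is a hypothetical snapshot of one agent's load.
type agent struct {
	Name        string
	ActiveTasks int
	RateLimited bool
}

// pickAgent returns the non-rate-limited agent with the fewest active tasks.
// Ties go to the earlier entry, which keeps the choice deterministic.
// The second result is false when every agent is rate-limited.
func pickAgent(agents []agent) (string, bool) {
	best := -1
	for i, a := range agents {
		if a.RateLimited {
			continue
		}
		if best == -1 || a.ActiveTasks < agents[best].ActiveTasks {
			best = i
		}
	}
	if best == -1 {
		return "", false
	}
	return agents[best].Name, true
}

func main() {
	name, ok := pickAgent([]agent{
		{Name: "claude", ActiveTasks: 2},
		{Name: "gemini", ActiveTasks: 1},
	})
	fmt.Println(name, ok)
}
```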
2026-03-10 | feat: append verbatim user input to docs/RAW_NARRATIVE.md | Peter Stone
The elaborator now logs every user prompt to docs/RAW_NARRATIVE.md within the project directory. This is done in a background goroutine to ensure it doesn't delay the response.
2026-03-10 | feat: include project context in elaborator prompt | Peter Stone
The elaborator now reads CLAUDE.md and SESSION_STATE.md from the project directory (if they exist) and prepends their content to the user prompt. This allows the AI to generate tasks that are more context-aware.
2026-03-10 | fix: ensure tasks are re-classified on manual restart | Peter Stone
Updated handleRunTask to use ResetTaskForRetry, which clears the agent type and model. This ensures that manually restarted tasks are always re-classified, allowing the system to switch to a different agent if the previous one is rate-limited. Also improved Claude quota-exhaustion detection.
2026-03-10 | docs: add comprehensive documentation plan (tasks_plan.md) | Claudomator Agent
2026-03-10 | cli: implement --config flag to load TOML config file | Claudomator Agent
The --config flag was registered but silently ignored. Now:
- config.LoadFile loads a TOML file on top of defaults
- PersistentPreRunE applies the file when --config is set
- Explicit CLI flags (--data-dir, --claude-bin) take precedence over the file
Tests: TestLoadFile_OverridesDefaults, TestLoadFile_MissingFile_ReturnsError, TestRootCmd_ConfigFile_Loaded, TestRootCmd_ConfigFile_CLIFlagOverrides, TestRootCmd_ConfigFile_Missing_ReturnsError
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
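The precedence order above (defaults, then file, then explicit flags) can be shown without a TOML parser. In this sketch the `Config` fields mirror the two flags named in the commit message, while `overlay`, `resolve`, the sample values, and the "empty string means not set" convention are all assumptions; the real code may instead check whether each flag was changed on the command line.

```go
package main

import "fmt"

// Config holds the two settings named in the commit message.
type Config struct {
	DataDir   string
	ClaudeBin string
}

// overlay returns base with every non-empty field of over applied on top.
func overlay(base, over Config) Config {
	if over.DataDir != "" {
		base.DataDir = over.DataDir
	}
	if over.ClaudeBin != "" {
		base.ClaudeBin = over.ClaudeBin
	}
	return base
}

// resolve applies the layers in increasing priority: defaults < file < flags.
func resolve(defaults, file, flags Config) Config {
	return overlay(overlay(defaults, file), flags)
}

func main() {
	defaults := Config{DataDir: "/var/lib/app", ClaudeBin: "claude"} // illustrative values
	file := Config{DataDir: "/srv/data", ClaudeBin: "/opt/claude"}   // as if loaded from TOML
	flags := Config{DataDir: "/tmp/override"}                        // only --data-dir given
	fmt.Printf("%+v\n", resolve(defaults, file, flags))
}
```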