Age | Commit message | Author
8 days | docs: update ADR-007 with validation pipeline and nav project | Peter Stone
    - Story state machine: SHIPPABLE → DEPLOYED → VALIDATING → REVIEW_READY | NEEDS_FIX
    - Merge-first strategy: no branch review phase, tests are the confidence mechanism
    - Elaborator owns validation spec (type, steps, success_criteria)
    - Validation types: curl | tests | playwright | gradle
    - Nav project (Android): deploy = push to GitHub, validate = gradle test/lint
    - Project registry: type + deploy_script fields, initial claudomator + nav entries
    - Out of scope: branch review deferred, CI polling out of band for nav
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
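The story state machine in this commit can be sketched as a transition table. This is a minimal illustration only: the state names come from the commit message, while the `StoryState` type, the `allowed` map, and `Advance` are hypothetical names, not code from the repository.

```go
package main

import "fmt"

// StoryState is a hypothetical type; only the state names come from the log.
type StoryState string

const (
	Shippable   StoryState = "SHIPPABLE"
	Deployed    StoryState = "DEPLOYED"
	Validating  StoryState = "VALIDATING"
	ReviewReady StoryState = "REVIEW_READY"
	NeedsFix    StoryState = "NEEDS_FIX"
)

// allowed lists only the forward transitions named in the commit message.
// Any transition out of REVIEW_READY or NEEDS_FIX is not described there,
// so none is modeled here.
var allowed = map[StoryState][]StoryState{
	Shippable:  {Deployed},
	Deployed:   {Validating},
	Validating: {ReviewReady, NeedsFix},
}

// Advance returns the new state, or the unchanged state and an error when
// the requested transition is not in the table.
func Advance(from, to StoryState) (StoryState, error) {
	for _, next := range allowed[from] {
		if next == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("invalid story transition %s -> %s", from, to)
}
```

For example, `Advance(Validating, NeedsFix)` succeeds, while `Advance(Shippable, Validating)` returns an error because it skips DEPLOYED.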
9 days | chore: improve debug-execution script and add ADR-007 | Peter Stone
    - debug-execution: default to most recent execution when no ID given
    - docs/adr/007: planning layer and story model design decisions
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
9 days | feat: add errors, throughput, and billing sections to stats dashboard | Peter Stone
    - GET /api/stats?window=7d: pre-aggregated SQL queries for errors, throughput, billing
    - Errors section: category summary (quota/rate_limit/timeout/git/failed) + failure table
    - Throughput section: stacked hourly bar chart (completed/failed/other) over 7d
    - Billing section: KPIs (7d total, avg/day, cost/run) + daily cost bar chart
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
9 days | feat: agent status dashboard with availability timeline and Gemini quota detection | Peter Stone
    - Detect Gemini TerminalQuotaError (daily quota) as BUDGET_EXCEEDED, not generic FAILED
    - Surface container stderr tail in error so quota/rate-limit classifiers can match it
    - Add agent_events table to persist rate-limit start/recovery events across restarts
    - Add GET /api/agents/status endpoint returning live agent state + 24h event history
    - Stats dashboard: agent status cards, 24h availability timeline, per-run execution table
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
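The first two bullets above describe classifying a run by matching on its error text plus the tail of container stderr. A rough sketch of that idea follows, assuming a plain substring match on `TerminalQuotaError` and a 20-line tail. The helper names `tail` and `classify` and the tail length are illustrative assumptions; only the status names and the error name come from the commit message.

```go
package main

import "strings"

// Status names as they appear in the log; the constants are hypothetical.
const (
	StatusFailed         = "FAILED"
	StatusBudgetExceeded = "BUDGET_EXCEEDED"
)

// tail returns at most the last n lines of s, ignoring trailing newlines.
func tail(s string, n int) string {
	lines := strings.Split(strings.TrimRight(s, "\n"), "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	return strings.Join(lines, "\n")
}

// classify inspects the error message together with the stderr tail, so a
// quota error printed only to stderr can still be recognized.
func classify(errMsg, stderr string) string {
	haystack := errMsg + "\n" + tail(stderr, 20) // 20 lines is an assumed size
	if strings.Contains(haystack, "TerminalQuotaError") {
		return StatusBudgetExceeded
	}
	return StatusFailed
}
```

Without the stderr tail, a generic message such as "exit status 1" would always fall through to FAILED, which appears to be the situation the commit addresses.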
10 days | Merge branch 'master' of /site/git.terst.org/repos/claudomator | Peter Stone
10 days | Merge feat/container-execution into master | Peter Stone
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | feat: containerized execution with agent tooling and deployment fixes | Peter Stone
    - ContainerRunner replaces ClaudeRunner/GeminiRunner; all agent types run in Docker containers via claudomator-agent:latest
    - Writable agentHome staging dir (/home/agent) satisfies home-dir requirements for both claude and gemini CLIs without exposing host creds
    - Copy .credentials.json and .claude.json into staging dir at run time; GEMINI_API_KEY passed via env file
    - Fix git clone: remove MkdirTemp-created dir before cloning (git rejects pre-existing dirs even when empty)
    - Replace localhost with host.docker.internal in APIURL so container can reach host API; add --add-host=host.docker.internal:host-gateway
    - Run container as --user=$(uid):$(gid) so host-owned workspace files are readable; chmod workspace 0755 and instructions file 0644 after clone
    - Pre-create .gemini/ in staging dir to avoid atomic-rename ENOENT on first gemini-cli run
    - Add ct CLI tool to container image: pre-built Bash wrapper for Claudomator API (ct task submit/create/run/wait/status/list)
    - Document ct tool in CLAUDE.md agent instructions section
    - Add drain-failed-tasks script: retries failed tasks on a 5-minute interval
    - Update Dockerfile: Node 22 via NodeSource, Go 1.24, gemini-cli, git safe.directory=*, default ~/.claude.json
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | docs: add REFACTOR_PLAN.md with codebase analysis findings | Claudomator Agent
    Analyzed claudomator for architectural integrity, test coverage gaps, and bugs. Documents 1 critical race condition (QuestionRegistry.Answer panics on closed channel), 2 medium issues (sandbox leak, VAPID private key validation), and 8 minor issues covering error handling, test coverage gaps, and code duplication. 11 discrete subtasks created in Claudomator for each actionable item.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 days | fix: address final container execution issues and cleanup review docs | Peter Stone
10 days | fix: address round 3 review feedback for container execution | Peter Stone
    - Fix push failure swallowing and ensure workspace preservation on push error
    - Fix wrong session ID in --resume flag and BlockedError
    - Implement safer shell quoting for instructions in buildInnerCmd
    - Capture and propagate actual Claude session ID from stream init message
    - Clean up redundant image resolution and stale TODOs
    - Mark ADR-005 as Superseded
    - Consolidate RepositoryURL to Task level (removed from AgentConfig)
    - Add unit test for session ID extraction in parseStream
10 days | fix: address round 2 review feedback for container execution | Peter Stone
    - Fix host/container path confusion for --env-file
    - Fix --resume flag to only be used during resumptions
    - Fix instruction passing to Claude CLI via shell-wrapped cat
    - Restore streamErr return logic to detect task-level failures
    - Improve success flag logic for workspace preservation
    - Remove duplicate RepositoryURL from AgentConfig
    - Fix app.js indentation and reformat DOMContentLoaded block
    - Restore behavioral test coverage in container_test.go
10 days | fix: comprehensive addressing of container execution review feedback | Peter Stone
    - Fix Critical Bug 1: Only remove workspace on success, preserve on failure/BLOCKED.
    - Fix Critical Bug 2: Use correct Claude flag (--resume) and pass instructions via file.
    - Fix Critical Bug 3: Actually mount and use the instructions file in the container.
    - Address Design Issue 4: Implement Resume/BLOCKED detection and host-side workspace re-use.
    - Address Design Issue 5: Consolidate RepositoryURL to Task level and fix API fallback.
    - Address Design Issue 6: Make agent images configurable per runner type via CLI flags.
    - Address Design Issue 7: Secure API keys via .claudomator-env file and --env-file flag.
    - Address Code Quality 8: Add unit tests for ContainerRunner arg construction.
    - Address Code Quality 9: Fix indentation regression in app.js.
    - Address Code Quality 10: Clean up orphaned Claude/Gemini runner files and move helpers.
    - Fix tests: Update server_test.go and executor_test.go to work with new model.
10 days | feat: implement containerized repository-based execution model | Peter Stone
    This commit implements the architectural shift from local directory-based sandboxing to containerized execution using canonical repository URLs.
    Key changes:
    - Data Model: Added RepositoryURL and ContainerImage to task/agent configs.
    - Storage: Updated SQLite schema and queries to handle new fields.
    - Executor: Implemented ContainerRunner using Docker/Podman for isolation.
    - API/UI: Overhauled task creation to use Repository URLs and Image selection.
    - Webhook: Updated GitHub webhook to derive Repository URLs automatically.
    - Docs: Updated ADR-005 with risk feedback and added ADR-006 to document the new containerized model.
    - Defaults: Updated serve command to use ContainerRunner for all agents.
    This fixes systemic task failures caused by build dependency and permission issues on the host system.
10 days | fix: unsubscribe stale push subscription before re-subscribing | Claudomator Agent
    When the VAPID key changes (e.g. after the key-swap fix), the browser's cached PushSubscription was created with the old key. Calling PushManager.subscribe() with a different applicationServerKey then throws "The provided applicationServerKey is not valid". Fix by calling getSubscription()/unsubscribe() before subscribe() so any stale subscription is cleared. Adds web test covering both the stale and fresh subscription paths.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
11 days | feat: persist active main tab to localStorage | Claudomator Agent
    On tab click, store the tab name under 'activeMainTab' in localStorage. On DOMContentLoaded, restore the previously active tab instead of always defaulting to 'queue'. Exported getActiveMainTab/setActiveMainTab for testability, following the same pattern as getTaskFilterTab/setTaskFilterTab.
    Tests: web/test/tab-persistence.test.mjs (6 tests, all green).
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
11 days | chore: autocommit uncommitted changes | Claudomator Agent
11 days | fix: unsubscribe stale push subscription before re-subscribing | Claudomator Agent
    When the VAPID key changes (e.g. after the key-swap fix), the browser's cached PushSubscription was created with the old key. Calling PushManager.subscribe() with a different applicationServerKey then throws "The provided applicationServerKey is not valid". Fix by calling getSubscription()/unsubscribe() before subscribe() so any stale subscription is cleared. Adds web test covering both the stale and fresh subscription paths.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
11 days | fix: validate VAPID public key on load, regenerate if swapped | Claudomator Agent
    The DB may contain keys generated before the swap fix, with the private key stored as the public key. Add ValidateVAPIDPublicKey() and use it in serve.go to detect and regenerate invalid stored keys on startup.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: swap VAPID key return order in GenerateVAPIDKeys wrapper | Claudomator Agent
    webpush.GenerateVAPIDKeys() returns (privateKey, publicKey) but the claudomator wrapper declared (publicKey, privateKey), causing the 32-byte private key to be sent to browsers as the applicationServerKey. Browsers require a 65-byte uncompressed P256 point, so they rejected it with "The provided applicationServerKey is not valid." Adds a regression test that asserts public key is 87 chars/65 bytes with 0x04 prefix and private key is 43 chars/32 bytes.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: wait for service worker to activate before subscribing to push | Peter Stone
    navigator.serviceWorker.register() returns before the SW is active. Use navigator.serviceWorker.ready which resolves only once a SW is controlling the page, so pushManager.subscribe() always has an active SW.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: prefix SW registration path with BASE_PATH | Peter Stone
    The app is served at /claudomator/ so the SW and scope must use BASE_PATH + '/api/push/sw.js' and BASE_PATH + '/' respectively.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: hide deployment badge when not deployed and trim notification button label | Peter Stone
    - Deployment badge now returns null (hidden) when includes_fix is false instead of showing "Not deployed" noise
    - Badge also suppressed when fix_commits is empty (no tracked commits to check)
    - Notification button label trimmed to just the bell emoji
    - Preamble: warn agents not to use absolute paths in git commands (sandbox bypass)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | docs: update SESSION_STATE.md for project field feature | Claudomator Agent
12 days | feat: expose project field in API and CLI | Claudomator Agent
    - POST /api/tasks now reads and stores the project field from request body
    - GET /api/tasks/{id} returns project in response (via Task struct json tags)
    - list command: adds PROJECT column to tabwriter output
    - status command: prints Project line when non-empty
    - Tests: TestProject_RoundTrip (API), TestListTasks_ShowsProject, TestStatusCmd_ShowsProject (CLI)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: display project field in web UI | Claudomator Agent
    Show task.project as a badge in task card meta row and as a field in the task detail overview grid. Both display conditionally only when project is non-empty.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: serve sw.js from /api/push/sw.js to bypass Apache static file routing | Peter Stone
    Apache fronts the Go service and only proxies /api/ paths; /sw.js hits Apache's filesystem and 404s. Serve the service worker from /api/push/sw.js with Service-Worker-Allowed: / so the browser allows it to control the full origin scope. Update SW registration URL.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: persist VAPID keys in DB instead of config file | Peter Stone
    The service runs as www-data which cannot write to the root-owned config file. VAPID keys are now stored in the settings table in SQLite (which is writable), loaded on startup, and generated once. Removes saveVAPIDToConfig and the stale warning on every restart.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: add web push notifications and file drop | Peter Stone
    Web Push:
    - WebPushNotifier with VAPID auth; urgency mapped to event type (BLOCKED=urgent, FAILED=high, COMPLETED=low)
    - Auto-generates VAPID keys on first serve, persists to config file
    - push_subscriptions table in SQLite (upsert by endpoint)
    - GET /api/push/vapid-key, POST/DELETE /api/push/subscribe endpoints
    - Service worker (sw.js) handles push events and notification clicks
    - Notification bell button in web UI; subscribes on click
    File Drop:
    - GET /api/drops, GET /api/drops/{filename}, POST /api/drops
    - Persistent ~/.claudomator/drops/ directory
    - CLAUDOMATOR_DROP_DIR env var passed to agent subprocesses
    - Drops tab (📁) in web UI with file listing and download links
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: add project column to storage | Claudomator Agent
    Adds project TEXT column to tasks table via additive migration, updates CreateTask INSERT, all SELECT queries, and scanTask to persist and retrieve Task.Project.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: display deployment status badge on READY task cards | Claudomator Agent
    Add deployment_status field to task list/get API responses for READY tasks. The field includes deployed_commit, fix_commits, and includes_fix so the UI can show whether the deployed server includes each fix.
    - internal/api/task_view.go: taskView struct + enrichTask() helper
    - handleListTasks/handleGetTask: return enriched taskView responses
    - web/app.js: export renderDeploymentBadge(); add badge to READY cards
    - web/test/deployment-badge.test.mjs: 8 tests for renderDeploymentBadge
    - web/style.css: .deployment-badge--deployed / --pending styles
    - server_test.go: 3 new tests (red→green) for enriched task responses
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: add Project field to Task struct and YAML parsing | Claudomator Agent
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: clean up activePerAgent before sending to resultCh | Claudomator Agent
    Move activePerAgent decrement/deletion out of execute() and executeResume() defers and into the code paths immediately before each resultCh send (handleRunResult and early-return paths). This guarantees that when a result consumer reads from the channel the map is already clean, eliminating a race between defer and result receipt. Remove the polling loop from TestPool_ActivePerAgent_DeletesZeroEntries and check the map state immediately after reading the result instead.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
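The ordering this commit describes (clean up the counter, then send the result) can be illustrated with a small stand-alone sketch. The `pool` type and its methods here are simplified, hypothetical stand-ins for the executor's pool; only the names `activePerAgent` and `resultCh` come from the commit message.

```go
package main

import "sync"

// pool is a simplified stand-in for the executor pool.
type pool struct {
	mu             sync.Mutex
	activePerAgent map[string]int
	resultCh       chan string
}

func newPool() *pool {
	return &pool{activePerAgent: map[string]int{}, resultCh: make(chan string, 1)}
}

// release decrements the agent's counter and deletes the entry at zero.
func (p *pool) release(agent string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.activePerAgent[agent]--
	if p.activePerAgent[agent] <= 0 {
		delete(p.activePerAgent, agent)
	}
}

// run simulates one execution. The cleanup happens before the channel send,
// so a consumer that has received the result can never observe a stale entry.
// With a deferred release, the send would happen first and the map could
// still hold the entry when the consumer looks.
func (p *pool) run(agent, taskID string) {
	p.mu.Lock()
	p.activePerAgent[agent]++
	p.mu.Unlock()

	// ... the actual work would happen here ...

	p.release(agent)     // 1. clean up
	p.resultCh <- taskID // 2. then publish the result
}

// active reports the current count for an agent and whether an entry exists.
func (p *pool) active(agent string) (int, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	n, ok := p.activePerAgent[agent]
	return n, ok
}
```

Because the send happens after `release`, a test can read from `resultCh` and check the map immediately, with no polling loop.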
12 days | fix: deploy script skips scripts/hooks/ subdirectory when copying | Peter Stone
    cp without -r fails on directories. Use find -maxdepth 1 -type f to copy only files, since hooks/ is for local dev only.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: eliminate flaky race in TestPool_ActivePerAgent_DeletesZeroEntries | Peter Stone
    The deferred activePerAgent cleanup in execute() runs after resultCh is sent, so a consumer reading Results() could observe the map entry before it was removed. Poll briefly (100ms max) instead of checking immediately.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | fix: repair test regressions and add pre-commit/pre-push verification gates | Peter Stone
    Fix four pre-existing bugs exposed after resolving a build failure:
    - sandboxCloneSource: accept any URL scheme for origin remote (was filtering out https://)
    - setupSandbox callers: fix := shadow variable so sandboxDir is set on BlockedError
    - parseGeminiStream: parse result lines to return execution errors and cost
    - TestElaborateTask_InvalidJSONFromClaude: stub Gemini fallback so test is hermetic
    Add verification infrastructure:
    - scripts/verify: runs go build + go test -race, used by hooks and deploy
    - scripts/hooks/pre-commit: blocks commits that don't compile
    - scripts/hooks/pre-push: blocks pushes where tests fail
    - scripts/install-hooks: symlinks version-controlled hooks into .git/hooks/
    - scripts/deploy: runs scripts/verify before building the binary
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: add GitHub webhook endpoint for automatic CI failure task creation | Claudomator Agent
    Adds POST /api/webhooks/github that receives check_run and workflow_run events and creates a Claudomator task to investigate and fix the failure.
    - Config: new webhook_secret and [[projects]] fields in config.toml
    - HMAC-SHA256 validation when webhook_secret is configured
    - Ignores non-failure events (success, skipped, etc.) with 204
    - Matches repo name to configured project dirs (case-insensitive)
    - Falls back to single project when no name match found
    - 11 new tests covering all acceptance criteria
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
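The HMAC-SHA256 check mentioned above can be sketched as follows. GitHub sends the signature in the `X-Hub-Signature-256` header as `sha256=` followed by a hex digest of the request body. The function names `validSignature` and `sign` are hypothetical, and treating an empty secret as "validation disabled" is an assumption based on the phrase "when webhook_secret is configured".

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

const sigPrefix = "sha256="

// sign computes the header value for a body (useful in tests).
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// validSignature reports whether header carries the HMAC-SHA256 of body
// under secret. An empty secret skips validation (assumed behavior).
func validSignature(secret, body []byte, header string) bool {
	if len(secret) == 0 {
		return true
	}
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	// hmac.Equal compares in constant time to avoid leaking timing information.
	return hmac.Equal(got, mac.Sum(nil))
}
```

A handler would read the full request body first, then pass it with the header value to `validSignature` before parsing any JSON.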
12 days | feat: add deployment status endpoint for tasks | Peter Stone
    Adds GET /api/tasks/{id}/deployment-status which checks whether the currently-deployed server binary includes the fix commits from the task's latest execution. Uses git merge-base --is-ancestor to compare commit hashes against the running version.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
12 days | feat: improve next-task selection and rejection UX | Peter Stone
    - next-task script: exclude rejected tasks from fallback selection; only pick PENDING tasks with no rejection comment and no prior executions, or QUEUED tasks (e.g. BUDGET_EXCEEDED retries)
    - web/app.js: prompt for optional rejection comment when rejecting a task, passing it through to the API instead of always sending an empty string
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: permission denied and host key verification errors; add gemini elaboration fallback | Peter Stone
13 days | feat: add elaboration_input field to tasks for richer subtask placeholder | Claudomator Agent
    - Add ElaborationInput field to Task struct (task.go)
    - Add DB migration and update CREATE/SELECT/scan in storage/db.go
    - Update handleCreateTask to accept elaboration_input from API
    - Update renderSubtaskRollup in app.js to prefer elaboration_input over description
    - Capture elaborate prompt in createTask() form submission
    - Update subtask-placeholder tests to cover elaboration_input priority
    - Fix missing io import in gemini.go
    When a task card is waiting for subtasks, it now shows:
    1. The raw user prompt from elaboration (if stored)
    2. The task description truncated at word boundary (~120 chars)
    3. The task name as fallback
    4. 'Waiting for subtasks…' only when all fields are empty
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | feat: replace static subtask placeholder with task description | Claudomator Agent
    When a BLOCKED/READY task has no subtasks yet, show the task description (truncated to ~120 chars at a word boundary) instead of the generic 'Waiting for subtasks…' text. Falls back to task.name if no description, and finally to the original generic text if neither is present.
    - Add truncateToWordBoundary(text, maxLen=120) helper
    - Update renderSubtaskRollup(task, footer) to use task object instead of taskId
    - Update both READY and BLOCKED call sites
    - Add web/test/subtask-placeholder.test.mjs with 11 tests
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
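The truncation and fallback behavior described above can be sketched as follows. Note that the real helper lives in web/app.js (JavaScript); this is a Go rendering for illustration only. The trailing ellipsis, the handling of text with no spaces, and the `placeholder` function are assumptions, since the commit message does not spell them out.

```go
package main

import "strings"

// truncateToWordBoundary returns text unchanged when it fits in maxLen runes.
// Otherwise it cuts at maxLen, backs up to the previous space unless the cut
// already falls between words, and appends an ellipsis (assumed behavior).
func truncateToWordBoundary(text string, maxLen int) string {
	runes := []rune(text)
	if len(runes) <= maxLen {
		return text
	}
	cut := string(runes[:maxLen])
	if runes[maxLen] != ' ' {
		// The cut landed mid-word: drop the partial word if there is a space.
		if i := strings.LastIndex(cut, " "); i > 0 {
			cut = cut[:i]
		}
	}
	return strings.TrimRight(cut, " ") + "…"
}

// placeholder applies the fallback order from the commit message:
// description (truncated), then name, then the generic text.
func placeholder(description, name string) string {
	if description != "" {
		return truncateToWordBoundary(description, 120)
	}
	if name != "" {
		return name
	}
	return "Waiting for subtasks…"
}
```

Working on runes rather than bytes keeps the cut from splitting a multi-byte character.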
13 days | fix: restore running tab rendering and throttle history fetch | Peter Stone
    - poll() now calls renderActiveTab(cache) on early-return so switching tabs always renders immediately instead of leaving the panel blank
    - renderRunningView unchanged check now requires running.length > 0, fixing the empty-state message never appearing when no tasks run
    - Extract renderActiveTab() to avoid duplicating the tab switch logic
    - Throttle execution history fetch to once per 60s (was every poll)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | fix: promote stale BLOCKED parent tasks to READY on server startup | Peter Stone
    When the server restarts after all subtasks complete, the parent task was left stuck in BLOCKED state because maybeUnblockParent only fires during a live executor run. RecoverStaleBlocked() scans all BLOCKED tasks on startup and re-evaluates them using the existing logic.
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 days | feat: overhaul auto-refresh system with intelligent polling and differential updates | Peter Stone
13 days | feat: run build (Makefile, gradlew, or go build) before sandbox autocommit | Peter Stone
13 days | feat: show subtask rollup on READY task cards | Claudomator Agent
    READY tasks now call renderSubtaskRollup identical to BLOCKED tasks (without a question). The rollup appears above Accept/Reject buttons. New test: web/test/ready-subtasks.test.mjs (10 assertions, all pass).
13 days | Merge remote-tracking branch 'local/master' | Peter Stone
13 days | feat: fix task failures via sandbox improvements and display commits in Web UI | Peter Stone
    - Fix ephemeral sandbox deletion issue by passing $CLAUDOMATOR_PROJECT_DIR to agents and using it for subtask project_dir.
    - Implement sandbox autocommit in teardown to prevent task failures from uncommitted work.
    - Track git commits created during executions and persist them in the DB.
    - Display git commits and changestats badges in the Web UI execution history.
    - Add badge counts to Web UI tabs for Interrupted, Ready, and Running states.
    - Improve scripts/next-task to handle QUEUED tasks and configurable DB path.
14 days | feat: add task count badges to interrupted, ready, and running tabs | Claudomator Agent
    - Add computeTabBadgeCounts(tasks) exported pure function
    - Add updateTabBadges(tasks) that updates badge spans in tab buttons
    - Call updateTabBadges on every poll regardless of active tab
    - Add .tab-count-badge spans to interrupted/ready/running tab buttons in HTML
    - Add CSS for .tab-count-badge pill styling (hidden when count is zero)
    - Add 11 tests in web/test/tab-badges.test.mjs covering all states
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
14 days | feat(Phase4): add file changes for changestats executor wiring | Claude Sonnet 4.6
    Files changed: CLAUDE.md, internal/api/changestats.go, internal/executor/executor.go, internal/executor/executor_test.go, internal/task/changestats.go (new)
    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>