claudomator.git - claudomator — task automation server

Age	Commit message (Collapse)	Author
2026-05-02	feat(executor): synthesize execution summary via local LLM fallback	Claude
	Phase 4 of "local OSS models as agents" plan. Closes the epic. When an execution finishes and the agent did NOT write a "## Summary" heading in its stdout (so the existing extractSummary path returns empty), and the Pool has a local LLM configured, we now synthesize a 2-4 sentence summary from the assistant text content of the log tail. Behavior: - Primary path unchanged: if the agent wrote "## Summary", that wins byte-for-byte (TestPool_HandleRunResult_ExtractSummaryWins guards). - Fallback path: empty extractSummary + Pool.LLM != nil → synthesize. - All-empty path: when no LLM is configured, summary stays empty — identical to pre-Phase-4 behavior. Implementation: - Pool gains an LLM llm.Client field, wired in serve.go and run.go alongside Classifier.LLM (same localClient used everywhere). - New synthesizeSummary in internal/executor/summary.go: 6s timeout so a slow local model can't stall finalization * 16 KB tail cap on the stdout log * readAssistantTextTail seeks to the last 16 KB and skips the first (likely partial) line, parses each line as a stream-json event, joins assistant `text` blocks (skips system/result/etc). * Returns "" on any error so the caller's behavior never regresses. - handleRunResult: 3-tier summary resolution — exec.Summary set by runner → extractSummary → synthesizeSummary → empty. - minimalMockStore now records UpdateTaskSummary calls (additive; existing tests unaffected) so integration tests can assert. Tests (9 new): - synthesizeSummary nil client / empty path / missing file all return "" without HTTP calls. - empty assistant content short-circuits without LLM call. - success path returns trimmed body, with both assistant texts in the user prompt. - LLM 500 returns "" (caller handles same as no-summary). - readAssistantTextTail seeks past early content in a large file. - Pool integration: ## Summary present → LLM not called, agent text used. ## Summary absent + LLM set → LLM called, synthesized summary recorded against the right task ID. Plan: docs/plans/local-oss-runner.md. Epic complete. Post-epic deep cleanup queue captured in the same plan file for follow-up. https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
2026-04-28	feat(api): route elaboration through local LLM when configured	Claude
	Phase 2 of "local OSS models as agents" plan. Adds a third elaboration path that calls the local OpenAI-compatible LLM via the internal/llm client, and reorders dispatch so the cheap path is tried first: local → claude → gemini, with each next attempt only on hard failure of the prior. Wiring is opt-out, not opt-in: when [local_model].endpoint is set, elaboration prefers local by default. Users with a slow or low-quality local model can disable just elaboration via: [local_model] endpoint = "..." prefer_for_elaborate = false without giving up the runner or the classifier path. Implementation: - Server gains an optional llm.Client field via SetLLM (matches the existing SetNotifier/SetWorkspaceRoot setter pattern, no NewServer signature break). - elaborateWithLocal() reuses buildElaboratePrompt verbatim and asks for response_format=json_object so we skip markdown-fence cleanup. - handleElaborateTask reorders try chain; existing Claude-first behavior is preserved exactly when SetLLM is not called. - LocalModel.UseForElaborate() encapsulates the default-true gating with a bool so explicit-false survives TOML parse. Tests: - elaborateWithLocal: parses valid response, errors on nil client, errors on bad JSON. - handler: local preferred when wired; falls back to claude when local fails; unchanged behavior when no LLM is configured. - config: UseForElaborate gating across empty/default/explicit-true/ explicit-false cases. Pre-existing test failures noted in docs/plans/local-oss-runner.md (post-epic cleanup): TestGeminiLogs_ParsedCorrectly returns 404 for gemini execution log fetch — predates this change. Plan: docs/plans/local-oss-runner.md. https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
2026-04-28	feat(executor): add LocalRunner and OpenAI-compat LLM client	Claude
	Phase 1 of "local OSS models as agents" plan. Adds a third Runner backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio, llama.cpp), and migrates the Gemini-CLI classifier to route through the same client when configured. Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool, no DB) used directly by the classifier and any future internal helper that needs cheap reasoning. internal/executor.LocalRunner is a thin adapter implementing Runner for user-facing tasks. This avoids Pool reentrancy/deadlock when sub-second internal calls fire from inside Pool.execute(). Highlights: - internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter into a shared package reused by executor and llm. - internal/llm: Chat (non-streaming) and ChatStream (SSE) over /chat/completions with optional bearer auth, json_object response format, retry on 429/503, Retry-After parsing. - internal/executor/LocalRunner: streams deltas into stdout.log in the same stream-json envelope ClaudeRunner emits, then writes one consolidated assistant block plus a result terminator so existing parsers (extractSummary, ParseChangestatFromOutput) work unchanged. - internal/executor/Classifier: gains optional LLM field; uses json_object response format (no markdown-fence cleanup needed). Falls back to Gemini-CLI subprocess when LLM is nil. - Pool.skipClassification: now skips only when the requested agent type is registered, so unknown types still reach the load balancer. - Storage: additive tokens_in/tokens_out ALTERs on executions; CLI runners record cost_usd as before, LocalRunner records 0 + tokens. - Config: [local_model] section (endpoint, model, timeout_seconds, default_temperature, api_key). Empty endpoint = no LocalRunner registered, classifier falls back to Gemini. Pre-existing test issues fixed in passing: - claude_test.go setupSandbox callsites updated to current signature. - gemini_test.go TestParseGeminiStream skipped (asserts unimplemented GeminiRunner stream-error parsing; tracked separately). Plan: docs/plans/local-oss-runner.md. https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
2026-03-16	feat: add GitHub webhook endpoint for automatic CI failure task creation	Claudomator Agent
	Adds POST /api/webhooks/github that receives check_run and workflow_run events and creates a Claudomator task to investigate and fix the failure. - Config: new webhook_secret and [[projects]] fields in config.toml - HMAC-SHA256 validation when webhook_secret is configured - Ignores non-failure events (success, skipped, etc.) with 204 - Matches repo name to configured project dirs (case-insensitive) - Falls back to single project when no name match found - 11 new tests covering all acceptance criteria Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-15	fix: promote stale BLOCKED parent tasks to READY on server startup	Peter Stone
	When the server restarts after all subtasks complete, the parent task was left stuck in BLOCKED state because maybeUnblockParent only fires during a live executor run. RecoverStaleBlocked() scans all BLOCKED tasks on startup and re-evaluates them using the existing logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14	fix: surface agent stderr, auto-retry restart-killed tasks, handle stale ↵	Peter Stone
	sandboxes #1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and appends to error message when claude/gemini exits non-zero. Previously all exit-1 failures were opaque; now the error_msg carries the actual subprocess output. #4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after marking them FAILED, so tasks killed by a server restart automatically retry on the next boot rather than staying permanently FAILED. #2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of failing immediately with "no such file or directory". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14	feat: add agent selector to UI and support direct agent assignment	Peter Stone
	- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button. - Updated the backend to pass query parameters as environment variables to scripts. - Modified the executor pool to skip classification when a specific agent is requested. - Added --agent flag to claudomator start command. - Updated tests to cover the new functionality.
2026-03-13	fix: resubmit QUEUED tasks on server startup to prevent them getting stuck	Peter Stone
	Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on startup and re-submits them to the in-memory pool. Previously, tasks that were QUEUED when the server restarted would remain stuck indefinitely since only RUNNING tasks were recovered (and marked FAILED). Called in serve.go immediately after RecoverStaleRunning(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10	cli: implement --config flag to load TOML config file	Claudomator Agent
	The --config flag was registered but silently ignored. Now: - config.LoadFile loads a TOML file on top of defaults - PersistentPreRunE applies the file when --config is set - Explicit CLI flags (--data-dir, --claude-bin) take precedence over the file Tests: TestLoadFile_OverridesDefaults, TestLoadFile_MissingFile_ReturnsError, TestRootCmd_ConfigFile_Loaded, TestRootCmd_ConfigFile_CLIFlagOverrides, TestRootCmd_ConfigFile_Missing_ReturnsError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	api: make workspace root configurable instead of hardcoded /workspace	Claudomator Agent
	- Add workspaceRoot field (default "/workspace") to Server struct - Add SetWorkspaceRoot method on Server - Update handleListWorkspaces to use s.workspaceRoot - Add WorkspaceRoot field to Config with default "/workspace" - Wire cfg.WorkspaceRoot into server in serve.go - Expose --workspace-root flag on the serve command - Add TestListWorkspaces_UsesConfiguredRoot integration test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09	executor: recover stale RUNNING tasks on server startup	Peter Stone
	On restart, any tasks in RUNNING state have no active goroutine. RecoverStaleRunning() marks them FAILED (retryable) and closes their open execution records with an appropriate error message. Called once from serve.go after the pool is created.
2026-03-08	fix: restore task execution broken by add-gemini merge	Peter Stone
	- handleCreateTask: add legacy "claude" key fallback in input struct so old clients and YAML files sending claude:{...} still work - cli/create: send "agent" key instead of "claude"; add --agent-type flag - storage/db_test: fix ClaudeConfig → AgentConfig after rename Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08	fix(cli): register scripts in serve command	Peter Stone
	Restore the script registry in internal/cli/serve.go which was lost during the gemini merge. This fixes the 'Start Next Task' button in the web UI which relies on the /api/scripts/start-next-task endpoint.
2026-03-08	merge: pull latest from master and resolve conflicts	Peter Stone
	- Resolve conflicts in API server, CLI, and executor. - Maintain Gemini classification and assignment logic. - Update UI to use generic agent config and project_dir. - Fix ProjectDir/WorkingDir inconsistencies in Gemini runner. - All tests passing after merge.
2026-03-08	feat(executor): implement Gemini-based task classification and load balancing	Peter Stone
	- Add Classifier using gemini-2.0-flash-lite to automatically select agent/model. - Update Pool to track per-agent active tasks and rate limit status. - Enable classification for all tasks (top-level and subtasks). - Refine SystemStatus to be dynamic across all supported agents. - Add unit tests for the classifier and updated pool logic. - Minor UI improvements for project selection and 'Start Next' action.
2026-03-08	cli: newLogger helper, defaultServerURL, shared http client, report command	Peter Stone
	- Extract newLogger() to remove duplication across run/serve/start - Add defaultServerURL const ("http://localhost:8484") used by all client commands - Move http.Client into internal/cli/http.go with 30s timeout - Add 'report' command for printing execution summaries - Add test coverage for create and serve commands Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08	feat: rename working_dir→project_dir; git sandbox execution	Peter Stone
	- ClaudeConfig.WorkingDir → ProjectDir (json: project_dir) - UnmarshalJSON fallback reads legacy working_dir from DB records - New executions with project_dir clone into a temp sandbox via git clone --local - Non-git project_dirs get git init + initial commit before clone - After success: verify clean working tree, merge --ff-only back to project_dir, remove sandbox - On failure/BLOCKED: sandbox preserved, path included in error message - Resume executions run directly in project_dir (no re-clone) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08	security(cli): validate --parallel flag is positive in run command	Claudomator
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08	refactor(api/cli): update logs and status for generic agents	Peter Stone

2026-03-08	feat(wiring): configure GeminiRunner and update API server	Peter Stone

2026-03-07	feat: add create CLI command with --parent-id; deploy to /usr/local/bin	Peter Stone
	claudomator create <name> -i <instructions> [flags] --parent-id attach as subtask of given task ID --working-dir working directory --model claude model --budget max USD --timeout task timeout --priority high/normal/low --start queue immediately after creating deploy script now also copies binary to /usr/local/bin/claudomator. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05	cli: add start command and version package	Peter Stone
	Adds `claudomator start <task-id>` to queue a task via the running API server. Adds internal/version for embedding VCS commit hash in the server banner. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03	Add elaborate, logs-stream, templates, and subtask-list endpoints	Peter Stone
	- POST /api/tasks/elaborate: calls claude to draft a task config from a natural-language prompt - GET /api/executions/{id}/logs/stream: SSE tail of stdout.log - CRUD /api/templates: create/list/get/update/delete reusable task configs - GET /api/tasks/{id}/subtasks: list child tasks - Server.NewServer accepts claudeBinPath for elaborate; injectable elaborateCmdPath and logStore for test isolation - Valid-transition guard added to POST /api/tasks/{id}/run - CLI passes claude binary path through to the server Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24	Add logs CLI subcommand to tail execution output	Peter Stone
	Adds `claudomator logs <execution-id>` to stream or display stdout logs from a past or running execution. Includes unit tests for the command. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-08	Rename Go module to github.com/thepeterstone/claudomator	Peter Stone
	Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08	Initial project: task model, executor, API server, CLI, storage, reporter	Peter Stone
	Claudomator automation toolkit for Claude Code with: - Task model with YAML parsing, validation, state machine (49 tests, 0 races) - SQLite storage for tasks and executions - Executor pool with bounded concurrency, timeout, cancellation - REST API + WebSocket for mobile PWA integration - Webhook/multi-notifier system - CLI: init, run, serve, list, status commands - Console, JSON, HTML reporters with cost tracking Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>