claudomator.git - claudomator — task automation server

Age	Commit message (Collapse)	Author
2026-05-12	feat(executor): bring GeminiRunner to sandbox-flow parity with Claude	Claude
	All coding tasks now follow the same flow regardless of runner: when project_dir is set, the agent runs in a temp clone, not in the user's working tree. On success, edits are autocommitted and pushed back to origin/master and the sandbox is removed. On failure or BLOCKED, the sandbox is preserved and its path surfaces in the error / BlockedError so the user can inspect partial work or resume in place. Before this commit, GeminiRunner.Run set cmd.Dir to project_dir directly, so an agent run could leave half-done edits in the user's working tree with no rollback. ClaudeRunner has had the full sandbox flow for a while; this commit closes the gap. Reused the existing package-level helpers from claude.go verbatim: setupSandbox, teardownSandbox, sandboxCloneSource, gitSafe, plus the resume/stale-sandbox/blocked-error patterns. No new shared abstraction needed — same package. LocalRunner intentionally not changed. The OpenAI chat path has no tool use, so the agent can't edit files; sandbox would be theater. Tests (6 new): - Run_ProjectDir_RunsInSandbox: cwd captured by fake binary is a sandbox path, not project_dir. - Run_BlockedError_IncludesSandboxDir: when question.json appears, BlockedError.SandboxDir is set and the dir exists. - Run_ExecError_PreservesSandbox: failing exit wraps error with "(sandbox preserved at <path>)" and the path exists on disk. - Run_ResumeUsesStoredSandboxDir: ResumeSessionID + SandboxDir → runs in that dir without re-cloning. - Run_StaleSandboxDir_ClonesAfresh: resume pointing at missing dir falls back to a fresh clone from project_dir. - Run_NoProjectDir_SkipsSandbox: tasks without project_dir don't trigger sandbox setup. https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
2026-05-03	chore: post-epic cleanup — green test suite, no skips	Claude
	Addresses the cleanup queue captured in docs/plans/local-oss-runner.md after the local-OSS-models epic landed. After this commit `go test -race ./...` is green across every package with zero `t.Skip` calls and no excluded tests. Real bugs fixed: - claude.go setupSandbox callsites used `sandboxDir, err := ...` which shadowed the outer variable, so BlockedError.SandboxDir was always empty. Resume-after-block was broken for both new and stale-sandbox paths. TestBlockedError_IncludesSandboxDir now exercises the right invariant. - TestPool_ActivePerAgent_DeletesZeroEntries flake under -race: the cleanup defer in execute()/executeResume() runs AFTER handleRunResult sends on resultCh, so consumers observing a result could see a still-counted activePerAgent entry. Extracted decActiveAgent(agentType, *cleaned) helper; called explicitly before every resultCh send, defer becomes a no-op via the cleaned flag. Verified clean over `go test -race -count=10`. Test infrastructure made hermetic: - gitSafe now also passes -c commit.gpgsign=false / -c tag.gpgsign=false so sandbox tests pass on hosts whose global config requires signing. - Bare repos in tests initialized with `-b main` (HEAD symbolic ref matched to the branch we push) so `git log` after push works. - TestSandboxCloneSource_FallsBackToOrigin uses a local-FS origin URL, matching sandboxCloneSource's intentional filter against network URLs. - TestGeminiLogs_ParsedCorrectly URL fixed to the actual log route (/api/executions/{id}/log). GeminiRunner gap closed (partial): - parseGeminiStream now walks lines for `result` events, surfacing is_error as an error and total_cost_usd as the float return value. - GeminiRunner.Run propagates parsed cost to Execution.CostUSD. - TestParseGeminiStream_ParsesStructuredOutput unskipped. Notes: - GeminiRunner is still simulated end-to-end (Run writes hardcoded stream data instead of execing the binary). The result/cost parser now exists; finishing the runner is a smaller, contained follow-up. Kept on the deferred queue. - Frontend "Local" agent option and a minor storage.db.go logger TODO remain on the deferred queue, both intentionally — neither blocks anything in flight. https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
2026-04-28	feat(executor): add LocalRunner and OpenAI-compat LLM client	Claude
	Phase 1 of "local OSS models as agents" plan. Adds a third Runner backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio, llama.cpp), and migrates the Gemini-CLI classifier to route through the same client when configured. Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool, no DB) used directly by the classifier and any future internal helper that needs cheap reasoning. internal/executor.LocalRunner is a thin adapter implementing Runner for user-facing tasks. This avoids Pool reentrancy/deadlock when sub-second internal calls fire from inside Pool.execute(). Highlights: - internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter into a shared package reused by executor and llm. - internal/llm: Chat (non-streaming) and ChatStream (SSE) over /chat/completions with optional bearer auth, json_object response format, retry on 429/503, Retry-After parsing. - internal/executor/LocalRunner: streams deltas into stdout.log in the same stream-json envelope ClaudeRunner emits, then writes one consolidated assistant block plus a result terminator so existing parsers (extractSummary, ParseChangestatFromOutput) work unchanged. - internal/executor/Classifier: gains optional LLM field; uses json_object response format (no markdown-fence cleanup needed). Falls back to Gemini-CLI subprocess when LLM is nil. - Pool.skipClassification: now skips only when the requested agent type is registered, so unknown types still reach the load balancer. - Storage: additive tokens_in/tokens_out ALTERs on executions; CLI runners record cost_usd as before, LocalRunner records 0 + tokens. - Config: [local_model] section (endpoint, model, timeout_seconds, default_temperature, api_key). Empty endpoint = no LocalRunner registered, classifier falls back to Gemini. Pre-existing test issues fixed in passing: - claude_test.go setupSandbox callsites updated to current signature. - gemini_test.go TestParseGeminiStream skipped (asserts unimplemented GeminiRunner stream-error parsing; tracked separately). Plan: docs/plans/local-oss-runner.md. https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
2026-03-16	fix: permission denied and host key verification errors; add gemini ↵	Peter Stone
	elaboration fallback
2026-03-13	fix: enable Gemini file writing by passing --yolo and -p flags	Peter Stone
	GeminiRunner.buildArgs was missing --yolo (auto-approve all tools) so the gemini CLI only registered 3 tools (read_file, write_todos, cli_help) and write_file was not available. Agents that needed to create files silently failed (exit 0, no files written). Also switch instructions from bare positional arg to -p flag, which is required for non-interactive headless mode. Update preamble tests to match file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) kept from the merge conflict resolution. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08	executor: update gemini model to 2.5-flash-lite and fix classifier parsing	Peter Stone
	Update the default Gemini model and classification prompt to use gemini-2.5-flash-lite, which is the current available model. Improved the classifier's parsing logic to correctly handle the JSON envelope returned by the gemini CLI (stripping 'response' wrapper and 'Loaded cached credentials' noise).
2026-03-08	merge: pull latest from master and resolve conflicts	Peter Stone
	- Resolve conflicts in API server, CLI, and executor. - Maintain Gemini classification and assignment logic. - Update UI to use generic agent config and project_dir. - Fix ProjectDir/WorkingDir inconsistencies in Gemini runner. - All tests passing after merge.
2026-03-08	refactor: address code review notes (backward compat, Gemini tests, unknown ↵	Peter Stone
	agent test)