<feed xmlns='http://www.w3.org/2005/Atom'>
<title>claudomator.git/internal/executor, branch main</title>
<subtitle>claudomator — task automation server
</subtitle>
<id>https://git.terst.org/claudomator.git/atom?h=main</id>
<link rel='self' href='https://git.terst.org/claudomator.git/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/'/>
<updated>2026-05-13T04:02:20+00:00</updated>
<entry>
<title>merge: integrate github/main — LocalRunner, real GeminiRunner, llm client</title>
<updated>2026-05-13T04:02:20+00:00</updated>
<author>
<name>Peter Stone</name>
<email>thepeterstone@gmail.com</email>
</author>
<published>2026-05-13T04:02:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=68399a598924775a3ec22a39c2336ae497fb07f3'/>
<id>urn:sha1:68399a598924775a3ec22a39c2336ae497fb07f3</id>
<content type='text'>
Merges 12 commits from github/main (formerly master) that were developed
independently. Key additions:
- LocalRunner: OpenAI-compatible local LLM execution (Ollama, LM Studio)
- Real GeminiRunner with full sandbox parity to ClaudeRunner
- llm.Client for enriching CI failures and elaboration via local model
- retry.ParseRetryAfter moved to shared package
- tokens_in/tokens_out columns in executions table

Conflict resolutions:
- Kept local main's VAPID/push, stories, projects, agent events schema
- Merged both sets of Config fields (local + LocalModel from github/main)
- Unified activePerAgent accounting (decActiveAgent helper)
- Removed duplicate helpers from claude.go (now in helpers.go)
- Fixed double-decrement bug in handleRunResult vs decActiveAgent

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>feat(executor): bring GeminiRunner to sandbox-flow parity with Claude</title>
<updated>2026-05-12T21:03:30+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-05-12T21:03:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=e7171181fff10c66b2b74eabfb1fc94b3cfbb4fb'/>
<id>urn:sha1:e7171181fff10c66b2b74eabfb1fc94b3cfbb4fb</id>
<content type='text'>
All coding tasks now follow the same flow regardless of runner: when
project_dir is set, the agent runs in a temp clone, not in the user's
working tree. On success, edits are autocommitted and pushed back to
origin/master and the sandbox is removed. On failure or BLOCKED, the
sandbox is preserved and its path surfaces in the error / BlockedError
so the user can inspect partial work or resume in place.

Before this commit, GeminiRunner.Run set cmd.Dir to project_dir
directly, so an agent run could leave half-done edits in the user's
working tree with no rollback. ClaudeRunner has had the full sandbox
flow for a while; this commit closes the gap.

Reused the existing package-level helpers from claude.go verbatim:
setupSandbox, teardownSandbox, sandboxCloneSource, gitSafe, plus the
resume/stale-sandbox/blocked-error patterns. No new shared abstraction
needed — same package.

LocalRunner intentionally not changed. The OpenAI chat path has no
tool use, so the agent can't edit files; sandbox would be theater.

Tests (6 new):
- Run_ProjectDir_RunsInSandbox: cwd captured by fake binary is a
  sandbox path, not project_dir.
- Run_BlockedError_IncludesSandboxDir: when question.json appears,
  BlockedError.SandboxDir is set and the dir exists.
- Run_ExecError_PreservesSandbox: failing exit wraps error with
  "(sandbox preserved at &lt;path&gt;)" and the path exists on disk.
- Run_ResumeUsesStoredSandboxDir: ResumeSessionID + SandboxDir →
  runs in that dir without re-cloning.
- Run_StaleSandboxDir_ClonesAfresh: resume pointing at missing
  dir falls back to a fresh clone from project_dir.
- Run_NoProjectDir_SkipsSandbox: tasks without project_dir don't
  trigger sandbox setup.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
</content>
</entry>
<entry>
<title>test(executor): verify explicit Claude commits are captured in execRecord</title>
<updated>2026-05-07T19:33:44+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-05-07T19:33:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=22ecff1fde5aa17d3053f43a8ac81f9ca49d8d56'/>
<id>urn:sha1:22ecff1fde5aa17d3053f43a8ac81f9ca49d8d56</id>
<content type='text'>
Adds TestTeardownSandbox_CapturesExplicitCommits to cover the case
where the agent explicitly commits changes (no autocommit needed).
Previously only the autocommit path was tested; this confirms
teardownSandbox populates Commits for any commits ahead of origin.

https://claude.ai/code/session_01G4dT9JBWFFb8xGcSHenzRS
</content>
</entry>
<entry>
<title>fix: atomic execution creation + RUNNING state transition</title>
<updated>2026-05-03T17:59:18+00:00</updated>
<author>
<name>Peter Stone</name>
<email>thepeterstone@gmail.com</email>
</author>
<published>2026-04-10T09:17:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=f01231cc45f41ce2dc37072e77428e467ef3fc15'/>
<id>urn:sha1:f01231cc45f41ce2dc37072e77428e467ef3fc15</id>
<content type='text'>
Add CreateExecutionAndSetRunning to storage.DB and Store interface,
replacing the two sequential CreateExecution/UpdateTaskState calls in
executor.go. Eliminates the crash window where a task stays PENDING
with an orphaned RUNNING execution record.

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>fix: prevent SHIPPABLE stories and wrong READY state on failed tasks</title>
<updated>2026-05-03T17:58:25+00:00</updated>
<author>
<name>Peter Stone</name>
<email>thepeterstone@gmail.com</email>
</author>
<published>2026-04-04T21:59:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=25cf4c9d4d6f3c18ee7565bf8e6172896fff00c3'/>
<id>urn:sha1:25cf4c9d4d6f3c18ee7565bf8e6172896fff00c3</id>
<content type='text'>
Three related bugs fixed:

1. maybeUnblockParent: guard against promoting QUEUED leaf tasks (no
   subtasks) to READY. The vacuously-true 'all subtasks done' check was
   advancing tasks that stalled in QUEUED (due to a prior SQLite lock
   error) to READY on server restart via RecoverStaleBlocked, despite
   having only failed executions and no commits.

2. checkStoryCompletion: require COMPLETED (not just READY) for all
   top-level tasks before advancing a story to SHIPPABLE. READY means
   the checker agent is still pending or the task awaits human review;
   a story with READY tasks is not ready to ship.

3. handleAcceptTask: call CheckStoryCompletion after a task is accepted
   so stories with parent tasks (whose subtasks are all done and then
   the parent is manually accepted) can auto-advance to SHIPPABLE.

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>chore: close deferred work — real GeminiRunner, Local UI option, db.go cleanup</title>
<updated>2026-05-03T08:00:20+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-05-03T08:00:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=e7b382bf177cbe518af3d86c3ee6c49344d225f4'/>
<id>urn:sha1:e7b382bf177cbe518af3d86c3ee6c49344d225f4</id>
<content type='text'>
Closes the three items left on the deferred queue after the post-epic
cleanup.

GeminiRunner.execOnce now actually executes the gemini binary instead
of writing hardcoded stream data. Mirrors ClaudeRunner.execOnce:
- exec.CommandContext with the same env vars (CLAUDOMATOR_API_URL etc.)
- process group SIGKILL on context cancel
- stdout piped through parseGeminiStream → stdoutFile
- stderr to file
- exit codes captured, stderr tail surfaced on failure

Test infrastructure bug uncovered in passing: testServerWithGeminiMockRunner's
mock script used double-quoted echo with literal triple-backticks, which
bash interpreted as command substitution. The script always produced
empty output. The bug was invisible until now because GeminiRunner
ignored the script entirely. Switched to a single-quoted heredoc.

Frontend: index.html dropdown gains a "Local" option. No JS branching
needed — the value flows through to agent.type verbatim and downstream
display reads the type string as-is.

storage/db.go: removed stale debug-comment scaffolding (the "TODO:
Replace with proper logger" block) that was tracking a dead
`fmt.Printf` call. The path it commented on is fine without logging —
unmarshal errors are returned wrapped.

Test status: `go test -race ./...` green across every package, zero
skips, zero excluded tests.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
</content>
</entry>
<entry>
<title>chore: post-epic cleanup — green test suite, no skips</title>
<updated>2026-05-03T03:58:19+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-05-03T03:58:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=85c3bf4d28b0903a2005356339e6ea56855b8c80'/>
<id>urn:sha1:85c3bf4d28b0903a2005356339e6ea56855b8c80</id>
<content type='text'>
Addresses the cleanup queue captured in docs/plans/local-oss-runner.md
after the local-OSS-models epic landed. After this commit
`go test -race ./...` is green across every package with zero `t.Skip`
calls and no excluded tests.

Real bugs fixed:
- claude.go setupSandbox callsites used `sandboxDir, err := ...` which
  shadowed the outer variable, so BlockedError.SandboxDir was always
  empty. Resume-after-block was broken for both new and stale-sandbox
  paths. TestBlockedError_IncludesSandboxDir now exercises the right
  invariant.
- TestPool_ActivePerAgent_DeletesZeroEntries flake under -race: the
  cleanup defer in execute()/executeResume() runs AFTER
  handleRunResult sends on resultCh, so consumers observing a result
  could see a still-counted activePerAgent entry. Extracted
  decActiveAgent(agentType, *cleaned) helper; called explicitly before
  every resultCh send, defer becomes a no-op via the cleaned flag.
  Verified clean over `go test -race -count=10`.

Test infrastructure made hermetic:
- gitSafe now also passes -c commit.gpgsign=false / -c tag.gpgsign=false
  so sandbox tests pass on hosts whose global config requires signing.
- Bare repos in tests initialized with `-b main` (HEAD symbolic ref
  matched to the branch we push) so `git log` after push works.
- TestSandboxCloneSource_FallsBackToOrigin uses a local-FS origin URL,
  matching sandboxCloneSource's intentional filter against network URLs.
- TestGeminiLogs_ParsedCorrectly URL fixed to the actual log route
  (/api/executions/{id}/log).

GeminiRunner gap closed (partial):
- parseGeminiStream now walks lines for `result` events, surfacing
  is_error as an error and total_cost_usd as the float return value.
- GeminiRunner.Run propagates parsed cost to Execution.CostUSD.
- TestParseGeminiStream_ParsesStructuredOutput unskipped.

Notes:
- GeminiRunner is still simulated end-to-end (Run writes hardcoded
  stream data instead of execing the binary). The result/cost parser
  now exists; finishing the runner is a smaller, contained follow-up.
  Kept on the deferred queue.
- Frontend "Local" agent option and a minor storage.db.go logger TODO
  remain on the deferred queue, both intentionally — neither blocks
  anything in flight.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
</content>
</entry>
<entry>
<title>feat(executor): synthesize execution summary via local LLM fallback</title>
<updated>2026-05-02T08:00:17+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-05-02T08:00:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=50f8fe8c1ff8b82e0bd399e5776e58bda3e57d1c'/>
<id>urn:sha1:50f8fe8c1ff8b82e0bd399e5776e58bda3e57d1c</id>
<content type='text'>
Phase 4 of "local OSS models as agents" plan. Closes the epic.

When an execution finishes and the agent did NOT write a "## Summary"
heading in its stdout (so the existing extractSummary path returns
empty), and the Pool has a local LLM configured, we now synthesize a
2-4 sentence summary from the assistant text content of the log tail.

Behavior:
- Primary path unchanged: if the agent wrote "## Summary", that wins
  byte-for-byte (TestPool_HandleRunResult_ExtractSummaryWins guards).
- Fallback path: empty extractSummary + Pool.LLM != nil → synthesize.
- All-empty path: when no LLM is configured, summary stays empty —
  identical to pre-Phase-4 behavior.

Implementation:
- Pool gains an LLM *llm.Client field, wired in serve.go and run.go
  alongside Classifier.LLM (same localClient used everywhere).
- New synthesizeSummary in internal/executor/summary.go:
  * 6s timeout so a slow local model can't stall finalization
  * 16 KB tail cap on the stdout log
  * readAssistantTextTail seeks to the last 16 KB and skips the
    first (likely partial) line, parses each line as a stream-json
    event, joins assistant `text` blocks (skips system/result/etc).
  * Returns "" on any error so the caller's behavior never regresses.
- handleRunResult: 3-tier summary resolution — exec.Summary set by
  runner → extractSummary → synthesizeSummary → empty.
- minimalMockStore now records UpdateTaskSummary calls (additive;
  existing tests unaffected) so integration tests can assert.

Tests (9 new):
- synthesizeSummary nil client / empty path / missing file all
  return "" without HTTP calls.
- empty assistant content short-circuits without LLM call.
- success path returns trimmed body, with both assistant texts in
  the user prompt.
- LLM 500 returns "" (caller handles same as no-summary).
- readAssistantTextTail seeks past early content in a large file.
- Pool integration: ## Summary present → LLM not called, agent text
  used. ## Summary absent + LLM set → LLM called, synthesized summary
  recorded against the right task ID.

Plan: docs/plans/local-oss-runner.md.

Epic complete. Post-epic deep cleanup queue captured in the same plan
file for follow-up.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
</content>
</entry>
<entry>
<title>feat(executor): add LocalRunner and OpenAI-compat LLM client</title>
<updated>2026-04-28T09:24:43+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-04-28T09:24:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=0865afc43be562dbe14528e4299b9e213b54cc93'/>
<id>urn:sha1:0865afc43be562dbe14528e4299b9e213b54cc93</id>
<content type='text'>
Phase 1 of "local OSS models as agents" plan. Adds a third Runner
backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio,
llama.cpp), and migrates the Gemini-CLI classifier to route through
the same client when configured.

Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool,
no DB) used directly by the classifier and any future internal helper
that needs cheap reasoning. internal/executor.LocalRunner is a thin
adapter implementing Runner for user-facing tasks. This avoids
Pool reentrancy/deadlock when sub-second internal calls fire from
inside Pool.execute().

Highlights:
- internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter
  into a shared package reused by executor and llm.
- internal/llm: Chat (non-streaming) and ChatStream (SSE) over
  /chat/completions with optional bearer auth, json_object response
  format, retry on 429/503, Retry-After parsing.
- internal/executor/LocalRunner: streams deltas into stdout.log in the
  same stream-json envelope ClaudeRunner emits, then writes one
  consolidated assistant block plus a result terminator so existing
  parsers (extractSummary, ParseChangestatFromOutput) work unchanged.
- internal/executor/Classifier: gains optional LLM field; uses
  json_object response format (no markdown-fence cleanup needed).
  Falls back to Gemini-CLI subprocess when LLM is nil.
- Pool.skipClassification: now skips only when the requested agent
  type is registered, so unknown types still reach the load balancer.
- Storage: additive tokens_in/tokens_out ALTERs on executions; CLI
  runners record cost_usd as before, LocalRunner records 0 + tokens.
- Config: [local_model] section (endpoint, model, timeout_seconds,
  default_temperature, api_key). Empty endpoint = no LocalRunner
  registered, classifier falls back to Gemini.

Pre-existing test issues fixed in passing:
- claude_test.go setupSandbox callsites updated to current signature.
- gemini_test.go TestParseGeminiStream skipped (asserts unimplemented
  GeminiRunner stream-error parsing; tracked separately).

Plan: docs/plans/local-oss-runner.md.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
</content>
</entry>
<entry>
<title>cleanup: remove dead code (QuestionRegistry, changestats wrappers, scanner.Err)</title>
<updated>2026-04-11T18:10:32+00:00</updated>
<author>
<name>Claudomator Agent</name>
<email>agent@claudomator</email>
</author>
<published>2026-04-11T18:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=e94573bb84874eda7d233cafc36f3a21688c0568'/>
<id>urn:sha1:e94573bb84874eda7d233cafc36f3a21688c0568</id>
<content type='text'>
Fix 1: Remove QuestionRegistry and related types (QuestionHandler, PendingQuestion)
from question.go -- nothing reads Pool.Questions or uses the registry. Remove
NewQuestionRegistry() call from NewPool and the Questions field from Pool.
Remove the now-superfluous registry tests; keep stream/parse helpers which are
still used by the claude runner.

Fix 2: Check scanner.Err() after the parseStream loop so I/O errors from the
scanner are not silently swallowed when streamErr is still nil.

Fix 3: Delete internal/api/changestats.go -- the parseChangestatFromFile and
parseChangestatFromOutput wrappers were only needed to support processResult(),
which no longer calls them; they are unreachable dead code.

Co-Authored-By: Claude Sonnet 4.6 &lt;noreply@anthropic.com&gt;
</content>
</entry>
</feed>
