<feed xmlns='http://www.w3.org/2005/Atom'>
<title>claudomator.git/internal/llm/client_test.go, branch main</title>
<subtitle>claudomator — task automation server
</subtitle>
<id>https://git.terst.org/claudomator.git/atom?h=main</id>
<link rel='self' href='https://git.terst.org/claudomator.git/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/'/>
<updated>2026-04-28T09:24:43+00:00</updated>
<entry>
<title>feat(executor): add LocalRunner and OpenAI-compat LLM client</title>
<updated>2026-04-28T09:24:43+00:00</updated>
<author>
<name>Claude</name>
<email>noreply@anthropic.com</email>
</author>
<published>2026-04-28T09:24:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.terst.org/claudomator.git/commit/?id=0865afc43be562dbe14528e4299b9e213b54cc93'/>
<id>urn:sha1:0865afc43be562dbe14528e4299b9e213b54cc93</id>
<content type='text'>
Phase 1 of "local OSS models as agents" plan. Adds a third Runner
backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio,
llama.cpp), and migrates the Gemini-CLI classifier to route through
the same client when configured.

Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool,
no DB) used directly by the classifier and any future internal helper
that needs cheap reasoning. internal/executor.LocalRunner is a thin
adapter implementing Runner for user-facing tasks. This avoids
Pool reentrancy/deadlock when sub-second internal calls fire from
inside Pool.execute().

Highlights:
- internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter
  into a shared package reused by executor and llm.
- internal/llm: Chat (non-streaming) and ChatStream (SSE) over
  /chat/completions with optional bearer auth, json_object response
  format, retry on 429/503, Retry-After parsing.
- internal/executor/LocalRunner: streams deltas into stdout.log in the
  same stream-json envelope ClaudeRunner emits, then writes one
  consolidated assistant block plus a result terminator so existing
  parsers (extractSummary, ParseChangestatFromOutput) work unchanged.
- internal/executor/Classifier: gains optional LLM field; uses
  json_object response format (no markdown-fence cleanup needed).
  Falls back to Gemini-CLI subprocess when LLM is nil.
- Pool.skipClassification: now skips only when the requested agent
  type is registered, so unknown types still reach the load balancer.
- Storage: additive tokens_in/tokens_out ALTERs on executions; CLI
  runners record cost_usd as before, LocalRunner records 0 + tokens.
- Config: [local_model] section (endpoint, model, timeout_seconds,
  default_temperature, api_key). Empty endpoint = no LocalRunner
  registered, classifier falls back to Gemini.

Pre-existing test issues fixed in passing:
- claude_test.go setupSandbox callsites updated to current signature.
- gemini_test.go TestParseGeminiStream skipped (asserts unimplemented
  GeminiRunner stream-error parsing; tracked separately).

Plan: docs/plans/local-oss-runner.md.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
</content>
</entry>
</feed>
