diff options
| author | Claude <noreply@anthropic.com> | 2026-04-28 09:24:43 +0000 |
|---|---|---|
| committer | Claude <noreply@anthropic.com> | 2026-04-28 09:24:43 +0000 |
| commit | 0865afc43be562dbe14528e4299b9e213b54cc93 (patch) | |
| tree | 3ffb11207fb6b9866b5a2477bba7abe38964f83a /internal/executor/classifier.go | |
| parent | c2aa026f6ce1c9e216b99d74f294fc133d5fcddd (diff) | |
feat(executor): add LocalRunner and OpenAI-compat LLM client
Phase 1 of "local OSS models as agents" plan. Adds a third Runner
backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio,
llama.cpp), and migrates the Gemini-CLI classifier to route through
the same client when configured.
Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool,
no DB) used directly by the classifier and any future internal helper
that needs cheap reasoning. internal/executor.LocalRunner is a thin
adapter implementing Runner for user-facing tasks. This avoids
Pool reentrancy/deadlock when sub-second internal calls fire from
inside Pool.execute().
Highlights:
- internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter
into a shared package reused by executor and llm.
- internal/llm: Chat (non-streaming) and ChatStream (SSE) over
/chat/completions with optional bearer auth, json_object response
format, retry on 429/503, Retry-After parsing.
- internal/executor/LocalRunner: streams deltas into stdout.log in the
same stream-json envelope ClaudeRunner emits, then writes one
consolidated assistant block plus a result terminator so existing
parsers (extractSummary, ParseChangestatFromOutput) work unchanged.
- internal/executor/Classifier: gains optional LLM field; uses
json_object response format (no markdown-fence cleanup needed).
Falls back to Gemini-CLI subprocess when LLM is nil.
- Pool.skipClassification: now skips only when the requested agent
type is registered, so unknown types still reach the load balancer.
- Storage: additive tokens_in/tokens_out ALTERs on executions; CLI
runners record cost_usd as before, LocalRunner records 0 + tokens.
- Config: [local_model] section (endpoint, model, timeout_seconds,
default_temperature, api_key). Empty endpoint = no LocalRunner
registered, classifier falls back to Gemini.
Pre-existing test issues fixed in passing:
- claude_test.go setupSandbox callsites updated to current signature.
- gemini_test.go TestParseGeminiStream skipped (asserts unimplemented
GeminiRunner stream-error parsing; tracked separately).
Plan: docs/plans/local-oss-runner.md.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
Diffstat (limited to 'internal/executor/classifier.go')
| -rw-r--r-- | internal/executor/classifier.go | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/internal/executor/classifier.go b/internal/executor/classifier.go index 7a474b6..049dc4f 100644 --- a/internal/executor/classifier.go +++ b/internal/executor/classifier.go @@ -6,6 +6,8 @@ import ( "fmt" "os/exec" "strings" + + "github.com/thepeterstone/claudomator/internal/llm" ) type Classification struct { @@ -19,7 +21,12 @@ type SystemStatus struct { RateLimited map[string]bool } +// Classifier picks a model for an incoming task. When LLM is non-nil the +// classifier routes through the local OpenAI-compatible client (cheap, +// private, fast). Otherwise it falls back to invoking the Gemini CLI +// at GeminiBinaryPath. type Classifier struct { + LLM *llm.Client GeminiBinaryPath string } @@ -62,6 +69,10 @@ func (c *Classifier) Classify(ctx context.Context, taskName, instructions string agentType, taskName, instructions, agentType, ) + if c.LLM != nil { + return c.classifyViaLLM(ctx, prompt, agentType) + } + binary := c.GeminiBinaryPath if binary == "" { binary = "gemini" @@ -123,3 +134,25 @@ func (c *Classifier) Classify(ctx context.Context, taskName, instructions string return &cls, nil } + +// classifyViaLLM routes classification through the local OpenAI-compatible +// client with response_format=json_object, so we get clean JSON without the +// markdown-fence cleanup needed for the Gemini CLI fallback. +func (c *Classifier) classifyViaLLM(ctx context.Context, prompt, agentType string) (*Classification, error) { + resp, err := c.LLM.Chat(ctx, llm.ChatRequest{ + Messages: []llm.Message{{Role: "user", Content: prompt}}, + ResponseJSON: true, + }) + if err != nil { + return nil, fmt.Errorf("classifier (local llm): %w", err) + } + body := strings.TrimSpace(resp.Content) + var cls Classification + if err := json.Unmarshal([]byte(body), &cls); err != nil { + return nil, fmt.Errorf("classifier (local llm): parse JSON: %w\nbody: %s", err, body) + } + if cls.AgentType == "" { + cls.AgentType = agentType + } + return &cls, nil +} |
