path: root/docs/plans/local-oss-runner.md
author: Claude <noreply@anthropic.com>, 2026-04-28 17:10:27 +0000
committer: Claude <noreply@anthropic.com>, 2026-04-28 17:10:27 +0000
commit: ae833b2765c7c8086bf8e1ea8e8ec8ee9b73e656
tree: b2cda4dc982d6c04eb22033e19645091af42224b /docs/plans/local-oss-runner.md
parent: 0865afc43be562dbe14528e4299b9e213b54cc93
feat(api): route elaboration through local LLM when configured
Phase 2 of "local OSS models as agents" plan. Adds a third elaboration path that calls the local OpenAI-compatible LLM via the internal/llm client, and reorders dispatch so the cheap path is tried first: local → claude → gemini, with each next attempt only on hard failure of the prior.

Wiring is opt-out, not opt-in: when [local_model].endpoint is set, elaboration prefers local by default. Users with a slow or low-quality local model can disable just elaboration via:

    [local_model]
    endpoint = "..."
    prefer_for_elaborate = false

without giving up the runner or the classifier path.

Implementation:
- Server gains an optional *llm.Client field via SetLLM (matches the existing SetNotifier/SetWorkspaceRoot setter pattern, no NewServer signature break).
- elaborateWithLocal() reuses buildElaboratePrompt verbatim and asks for response_format=json_object so we skip markdown-fence cleanup.
- handleElaborateTask reorders try chain; existing Claude-first behavior is preserved exactly when SetLLM is not called.
- LocalModel.UseForElaborate() encapsulates the default-true gating with a *bool so explicit-false survives TOML parse.

Tests:
- elaborateWithLocal: parses valid response, errors on nil client, errors on bad JSON.
- handler: local preferred when wired; falls back to claude when local fails; unchanged behavior when no LLM is configured.
- config: UseForElaborate gating across empty/default/explicit-true/explicit-false cases.

Pre-existing test failures noted in docs/plans/local-oss-runner.md (post-epic cleanup): TestGeminiLogs_ParsedCorrectly returns 404 for gemini execution log fetch (predates this change).

Plan: docs/plans/local-oss-runner.md.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
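For readers unfamiliar with the `*bool` trick mentioned in the commit message, the following is a minimal sketch of how default-true gating can be written in Go. It is not the code from this commit: the field names, the `boolPtr` helper, and the choice to return false when the endpoint is empty are assumptions made for illustration. The point is only that a plain `bool` cannot tell "key absent" apart from "explicitly false", whereas a nil pointer can.

```go
package main

import "fmt"

// LocalModel is a hypothetical mirror of the [local_model] config table.
// The real struct in the repository may differ.
type LocalModel struct {
	Endpoint string
	// PreferForElaborate is a pointer so that an absent key (nil) can be
	// told apart from an explicit "false" after parsing.
	PreferForElaborate *bool
}

// UseForElaborate reports whether elaboration should try the local model.
// Assumed rules: no endpoint means no local path; an absent key defaults
// to true; an explicit value is honored as written.
func (m LocalModel) UseForElaborate() bool {
	if m.Endpoint == "" {
		return false
	}
	if m.PreferForElaborate == nil {
		return true
	}
	return *m.PreferForElaborate
}

// boolPtr is a small helper for building literals in examples and tests.
func boolPtr(b bool) *bool { return &b }

func main() {
	cases := []struct {
		name string
		cfg  LocalModel
	}{
		{"empty", LocalModel{}},
		{"default", LocalModel{Endpoint: "http://localhost:8080/v1"}},
		{"explicit-true", LocalModel{Endpoint: "http://localhost:8080/v1", PreferForElaborate: boolPtr(true)}},
		{"explicit-false", LocalModel{Endpoint: "http://localhost:8080/v1", PreferForElaborate: boolPtr(false)}},
	}
	for _, c := range cases {
		fmt.Printf("%s: %v\n", c.name, c.cfg.UseForElaborate())
	}
}
```

The four cases in `main` correspond to the empty/default/explicit-true/explicit-false matrix listed under Tests in the commit message.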
Diffstat (limited to 'docs/plans/local-oss-runner.md')
-rw-r--r-- docs/plans/local-oss-runner.md 64
1 files changed, 64 insertions, 0 deletions
diff --git a/docs/plans/local-oss-runner.md b/docs/plans/local-oss-runner.md
index de16e05..108495b 100644
--- a/docs/plans/local-oss-runner.md
+++ b/docs/plans/local-oss-runner.md
@@ -183,3 +183,67 @@ This is the only phase we execute in this pass. Phases 2–4 will get their own
- Branch pushed to remote
After Phase 1 lands, we stop and decide whether to begin Phase 2 (elaboration). At that point we'll write a Phase 2 focused plan in `docs/plans/local-oss-runner.md`.
+
+---
+
+# Post-epic follow-up: deep cleanup
+
+After all four phases land, plan and execute a deep cleanup pass. Things noticed in flight that we deliberately did not chase mid-epic:
+
+- **Sandbox/git tests fail in this environment** because `git commit` invokes a signing server that returns 400 ("missing source"). Affected: `TestSandboxCloneSource_*`, `TestSetupSandbox_*`, `TestTeardownSandbox_*`, `TestBlockedError_IncludesSandboxDir`, `TestClaudeRunner_Run_StaleSandboxDir_ClonesAfresh`. Fix: set `commit.gpgsign=false` in test setup so sandbox tests run hermetically.
+- **`TestParseGeminiStream_ParsesStructuredOutput` is currently `t.Skip`** as a pre-existing gemini-stub gap. Either implement result-error/cost parsing in `parseGeminiStream` or delete the test until the stub is finished.
+- **`TestPool_ActivePerAgent_DeletesZeroEntries` flakes** under `-race` when run with the full suite (passes in isolation and on `-count=3`). Likely goroutine-ordering in the `activePerAgent` map cleanup path. Audit dispatch/finish ordering.
+- **`setupSandbox` test signature drift** was just fixed; audit other tests for similar staleness from prior refactors.
+- **Pre-existing `executor` tests didn't compile on trunk** until the setupSandbox fix landed. Verify CI reality — is it green via something we're missing, or quietly broken?
+- **GeminiRunner is still simulated** (`gemini.go:107-116`). Decide: finish it (real subprocess + cost parsing + sandbox) or delete it and leave only Claude + Local.
+- **Frontend "Local" agent option** — UI dropdown still says "Auto / Claude / Gemini". Add Local once token telemetry has a place to render.
+- **Audit `*_test.go` for `t.Skip` and other dormant breakage** before shipping more code on top.
+- **`TestGeminiLogs_ParsedCorrectly`** in `internal/api` returns 404 from `GET /log` for a gemini execution — pre-existing on Phase 1 baseline. Some routing or log-path resolution mismatch specific to gemini executions. Likely related to the GeminiRunner stub status above.
+
+Goal: clean `go test -race ./...` with zero skips and zero environmental failures on whatever platform CI runs on.
+
+---
+
+# Phase 2 — Focused Plan (Elaboration)
+
+## Phase 2 scope
+
+`internal/api/elaborate.go` currently has two paths: Claude and Gemini. Add a third (local) and make it the preferred path when local model is configured. Try-order: local → claude → gemini, with each next attempt only on hard failure of the prior.
+
+Second-cheapest, second-highest-volume LLM call after classification (one per task creation, sub-second target). Routing through local removes another cost line and lets elaboration work offline.
+
+## What ships
+
+- `Server` (`internal/api/server.go`) gains `llm *llm.Client` threaded through `NewServer`
+- `internal/api/elaborate.go` gains `elaborateWithLocal(ctx, *llm.Client, input string) (string, error)`
+- Dispatch in `Server.elaborate` reorders to: local → claude → gemini, gated by `PreferLocalForElaborate`
+- `Config` gains `PreferLocalForElaborate bool`, defaulted true when `LocalModel.Endpoint != ""`
+- Wiring in `internal/cli/serve.go` passes the LLM client into `NewServer`
+
+## Explicit non-goals
+
+- No prompt rework — reuse existing elaboration prompt template verbatim
+- No streaming the response into SSE/WebSocket (one-shot RPC)
+- No changes to webhook (Phase 3) or summary (Phase 4)
+- No UI changes — `/elaborate` endpoint signature stays the same
+
+## Task list
+
+1. Read `internal/api/elaborate.go` end-to-end: dispatch site, Claude path, Gemini path, prompt template
+2. Read `internal/api/server.go` `NewServer` signature and `Server` fields
+3. Thread `llm *llm.Client` through `NewServer` and update callers (`internal/cli/serve.go`)
+4. Implement `elaborateWithLocal` using the same prompt template as Claude/Gemini, returning `(string, error)`
+5. Add `PreferLocalForElaborate bool` to `config.Config`, default true when local endpoint configured
+6. Reorder dispatch: `if s.llm != nil && cfg.PreferLocalForElaborate { try local; else fall through }` then existing claude → gemini chain
+7. httptest-based unit test for `elaborateWithLocal`
+8. Dispatch fallback test: local fails → claude attempted
+9. `go build ./... && go test -race ./...`
+10. Commit Phase 2 on the same branch
+11. Push
+
+## Stop conditions
+
+- Tests green under `-race`
+- `prefer_local_for_elaborate=false` short-circuits to Claude path (preserves current behavior when user opts out)
+- Local-failure fallback to Claude verified by test
+- Branch pushed