From 50f8fe8c1ff8b82e0bd399e5776e58bda3e57d1c Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Sat, 2 May 2026 08:00:17 +0000
Subject: feat(executor): synthesize execution summary via local LLM fallback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 4 of "local OSS models as agents" plan. Closes the epic.

When an execution finishes and the agent did NOT write a "## Summary"
heading in its stdout (so the existing extractSummary path returns
empty), and the Pool has a local LLM configured, we now synthesize a
2-4 sentence summary from the assistant text content of the log tail.

Behavior:
- Primary path unchanged: if the agent wrote "## Summary", that wins
  byte-for-byte (TestPool_HandleRunResult_ExtractSummaryWins guards).
- Fallback path: empty extractSummary + Pool.LLM != nil → synthesize.
- All-empty path: when no LLM is configured, summary stays empty —
  identical to pre-Phase-4 behavior.

Implementation:
- Pool gains an LLM *llm.Client field, wired in serve.go and run.go
  alongside Classifier.LLM (same localClient used everywhere).
- New synthesizeSummary in internal/executor/summary.go:
  * 6s timeout so a slow local model can't stall finalization
  * 16 KB tail cap on the stdout log
  * readAssistantTextTail seeks to the last 16 KB and skips the
    first (likely partial) line, parses each line as a stream-json
    event, joins assistant `text` blocks (skips system/result/etc).
  * Returns "" on any error so the caller's behavior never regresses.
- handleRunResult: 3-tier summary resolution — exec.Summary set by
  runner → extractSummary → synthesizeSummary → empty.
- minimalMockStore now records UpdateTaskSummary calls (additive;
  existing tests unaffected) so integration tests can assert.

Tests (9 new):
- synthesizeSummary nil client / empty path / missing file all
  return "" without HTTP calls.
- empty assistant content short-circuits without LLM call.
- success path returns trimmed body, with both assistant texts in
  the user prompt.
- LLM 500 returns "" (caller handles same as no-summary).
- readAssistantTextTail seeks past early content in a large file.
- Pool integration: ## Summary present → LLM not called, agent text
  used. ## Summary absent + LLM set → LLM called, synthesized summary
  recorded against the right task ID.

Plan: docs/plans/local-oss-runner.md.

Epic complete. Post-epic deep cleanup queue captured in the same plan
file for follow-up.

https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
---
 internal/executor/executor_test.go | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

(limited to 'internal/executor/executor_test.go')

diff --git a/internal/executor/executor_test.go b/internal/executor/executor_test.go
index 878a32d..b1173cb 100644
--- a/internal/executor/executor_test.go
+++ b/internal/executor/executor_test.go
@@ -980,6 +980,7 @@ type minimalMockStore struct {
 	executions      map[string]*storage.Execution
 	stateUpdates    []struct{ id string; state task.State }
 	questionUpdates []string
+	summaryUpdates  []struct{ taskID, summary string }
 	changestatCalls []struct {
 		execID string
 		stats  *task.Changestats
@@ -1035,7 +1036,21 @@ func (m *minimalMockStore) UpdateTaskQuestion(taskID, questionJSON string) error
 	m.mu.Unlock()
 	return nil
 }
-func (m *minimalMockStore) UpdateTaskSummary(taskID, summary string) error        { return nil }
+func (m *minimalMockStore) UpdateTaskSummary(taskID, summary string) error {
+	m.mu.Lock()
+	m.summaryUpdates = append(m.summaryUpdates, struct{ taskID, summary string }{taskID, summary})
+	m.mu.Unlock()
+	return nil
+}
+func (m *minimalMockStore) lastSummaryUpdate() (string, string, bool) {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	if len(m.summaryUpdates) == 0 {
+		return "", "", false
+	}
+	last := m.summaryUpdates[len(m.summaryUpdates)-1]
+	return last.taskID, last.summary, true
+}
 func (m *minimalMockStore) AppendTaskInteraction(taskID string, _ task.Interaction) error {
 	return nil
 }
-- 
cgit v1.2.3