author: Peter Stone <thepeterstone@gmail.com> 2026-03-13 03:14:40 +0000
committer: Peter Stone <thepeterstone@gmail.com> 2026-03-13 03:14:40 +0000
commit: 5303a68d67e435da863353cdce09fa2e3a8c2ccd
tree: 2e16b9c17c11cbb3b7c9395e1b3fb119b73ef2ca
parent: f28c22352aa1a8ede7552ee0277f7d60552d9094
feat: resume support, summary extraction, and task state improvements
- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
-rw-r--r--  docs/adr/002-task-state-machine.md   18
-rw-r--r--  docs/packages_old/task.md           262
-rw-r--r--  internal/api/docs/RAW_NARRATIVE.md  117
-rw-r--r--  internal/executor/claude.go           8
-rw-r--r--  internal/executor/preamble.go        11
-rw-r--r--  internal/executor/summary.go         57
-rw-r--r--  internal/executor/summary_test.go    49
-rw-r--r--  internal/storage/db.go               66
-rw-r--r--  internal/task/task.go                51
-rw-r--r--  web/style.css                        38
10 files changed, 641 insertions, 36 deletions
diff --git a/docs/adr/002-task-state-machine.md b/docs/adr/002-task-state-machine.md
index 310c337..6910f6a 100644
--- a/docs/adr/002-task-state-machine.md
+++ b/docs/adr/002-task-state-machine.md
@@ -66,13 +66,13 @@ True terminal state (no outgoing transitions): `COMPLETED`. All other non-succes
| `QUEUED` | `RUNNING` | Pool goroutine starts execution |
| `QUEUED` | `CANCELLED` | `POST /api/tasks/{id}/cancel` |
| `RUNNING` | `READY` | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has no subtasks |
-| `RUNNING` | `BLOCKED` | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has subtasks |
+| `RUNNING` | `BLOCKED` (subtasks) | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has subtasks |
+| `RUNNING` | `BLOCKED` (question) | Runner exits 0 but left a `question.json` file in log dir (any task type) |
| `RUNNING` | `COMPLETED` | Runner exits 0, no question file, subtask (`parent_task_id != ""`) |
| `RUNNING` | `FAILED` | Runner exits non-zero or stream signals `is_error: true` |
| `RUNNING` | `TIMED_OUT` | Context deadline exceeded (`context.DeadlineExceeded`) |
| `RUNNING` | `CANCELLED` | Context cancelled (`context.Canceled`) |
| `RUNNING` | `BUDGET_EXCEEDED` | `--max-budget-usd` exceeded (signalled by runner) |
-| `RUNNING` | `BLOCKED` | Runner exits 0 but left a `question.json` file in log dir |
| `READY` | `COMPLETED` | `POST /api/tasks/{id}/accept` |
| `READY` | `PENDING` | `POST /api/tasks/{id}/reject` (with optional comment) |
| `FAILED` | `QUEUED` | Retry (manual re-run via `POST /api/tasks/{id}/run`) |
@@ -85,7 +85,7 @@ True terminal state (no outgoing transitions): `COMPLETED`. All other non-succes
## Implementation
**Validation:** `task.ValidTransition(from, to State) bool`
-(`internal/task/task.go:93`) — called by API handlers before every state change.
+(`internal/task/task.go:123`) — called by API handlers before every state change.
**State writes:** `storage.DB.UpdateTaskState(id, state)` — single source of
write; called by both API handlers and the executor pool.
@@ -158,9 +158,9 @@ Task lifecycle changes produce WebSocket broadcasts to all connected clients:
## Known Limitations and Edge Cases
-- **`BUDGET_EXCEEDED` transition.** `BUDGET_EXCEEDED` appears in `terminalFailureStates`
- (used by `waitForDependencies`) but has no outgoing transitions in `ValidTransition`,
- making it permanently terminal. There is no `/resume` endpoint for it.
+- **`BUDGET_EXCEEDED` retry.** `BUDGET_EXCEEDED → QUEUED` is a valid transition (retry via
+ `POST /run`), matching `FAILED` and `CANCELLED` behaviour. However, there is no dedicated
+ `/resume` endpoint for it — callers must use the standard `/run` restart path.
- **Retry enforcement.** `RetryConfig.MaxAttempts` is stored but not enforced by
the pool. The API allows unlimited manual retries via `POST /run` from `FAILED`.
@@ -178,9 +178,9 @@ Task lifecycle changes produce WebSocket broadcasts to all connected clients:
| Concern | File | Lines |
|---|---|---|
-| State constants | `internal/task/task.go` | 7–18 |
-| `ValidTransition` | `internal/task/task.go` | 93–109 |
-| State machine tests | `internal/task/task_test.go` | 8–72 |
+| State constants | `internal/task/task.go` | 9–20 |
+| `ValidTransition` | `internal/task/task.go` | 107–130 |
+| State machine tests | `internal/task/task_test.go` | 8–75 |
| Pool execute | `internal/executor/executor.go` | 194–303 |
| Pool executeResume | `internal/executor/executor.go` | 116–185 |
| Dependency wait | `internal/executor/executor.go` | 305–340 |
diff --git a/docs/packages_old/task.md b/docs/packages_old/task.md
new file mode 100644
index 0000000..923cd56
--- /dev/null
+++ b/docs/packages_old/task.md
@@ -0,0 +1,262 @@
+# Package: task
+
+`internal/task` — Task definition, parsing, validation, and state machine.
+
+---
+
+## 1. Overview
+
+A **Task** is the central unit of work in Claudomator. It describes what an agent should do (`agent.instructions`), how it should be run (timeout, retry, priority), and how it relates to other tasks (`depends_on`, `parent_task_id`). Tasks are defined in YAML files, parsed into `Task` structs, persisted in SQLite, and driven through a state machine from `PENDING` to a terminal state.
+
+---
+
+## 2. Task Struct
+
+```go
+type Task struct {
+ ID string // UUID; auto-generated if omitted in YAML
+ ParentTaskID string // ID of parent task (subtask linkage); empty for root tasks
+ Name string // Human-readable label; required
+ Description string // Optional longer description
+ Agent AgentConfig // How to invoke the agent
+ Timeout Duration // Maximum wall-clock run time (e.g. "30m"); 0 = no limit
+ Retry RetryConfig // Retry policy
+ Priority Priority // "high" | "normal" | "low"; default "normal"
+ Tags []string // Arbitrary labels for filtering
+ DependsOn []string // Task IDs that must reach COMPLETED before this queues
+ State State // Current lifecycle state; not read from YAML (yaml:"-")
+ RejectionComment string // Set by RejectTask; not read from YAML (yaml:"-")
+ QuestionJSON string // Pending question from a READY agent; not read from YAML (yaml:"-")
+ CreatedAt time.Time // Set on parse; not read from YAML (yaml:"-")
+ UpdatedAt time.Time // Updated on every state change; not read from YAML (yaml:"-")
+}
+```
+
+Fields tagged `yaml:"-"` are runtime-only and are never parsed from task YAML files.
+
+---
+
+## 3. AgentConfig Struct
+
+```go
+type AgentConfig struct {
+ Type string // Agent implementation: "claude", "gemini", etc.
+ Model string // Model identifier passed to the agent binary
+ ContextFiles []string // Files injected into agent context at start
+ Instructions string // Prompt / task description sent to the agent; required
+ ProjectDir string // Working directory for the agent process
+ MaxBudgetUSD float64 // Spending cap in USD; 0 = unlimited; must be >= 0
+ PermissionMode string // One of: default | acceptEdits | bypassPermissions | plan | dontAsk | delegate
+ AllowedTools []string // Whitelist of tool names the agent may use
+ DisallowedTools []string // Blacklist of tool names the agent may not use
+ SystemPromptAppend string // Text appended to the agent's system prompt
+ AdditionalArgs []string // Extra CLI arguments forwarded to the agent binary
+ SkipPlanning bool // When true, bypass the planning phase
+}
+```
+
+---
+
+## 4. RetryConfig Struct
+
+```go
+type RetryConfig struct {
+ MaxAttempts int // Total attempts including the first; default 1 (no retry)
+ Backoff string // "linear" or "exponential"; default "exponential"
+}
+```
+
+---
+
+## 5. YAML Task File Format — Single Task
+
+```yaml
+# Unique identifier. Optional: auto-generated UUID if omitted.
+id: "fix-login-bug"
+
+# Human-readable label. Required.
+name: "Fix login redirect bug"
+
+# Optional description shown in the UI.
+description: "Users are redirected to /home instead of /dashboard after login."
+
+# Agent configuration.
+agent:
+ # Agent type: "claude", "gemini", etc. Can be omitted for auto-classification.
+ type: "claude"
+
+ # Model to use. Empty = agent default.
+ model: "claude-opus-4-6"
+
+ # Files loaded into the agent's context before execution.
+ context_files:
+ - "src/auth/login.go"
+ - "docs/design/auth.md"
+
+ # Prompt sent to the agent. Required.
+ instructions: |
+ Fix the post-login redirect in src/auth/login.go so that users are
+ sent to /dashboard instead of /home. Add a regression test.
+
+ # Working directory for the agent process. Empty = server working directory.
+ project_dir: "/workspace/myapp"
+
+ # USD spending cap. 0 = no limit.
+ max_budget_usd: 1.00
+
+ # Permission mode. Valid values: default | acceptEdits | bypassPermissions | plan | dontAsk | delegate
+ permission_mode: "acceptEdits"
+
+ # Tool whitelist. Empty = all tools allowed.
+ allowed_tools:
+ - "Edit"
+ - "Read"
+ - "Bash"
+
+ # Tool blacklist.
+ disallowed_tools:
+ - "WebFetch"
+
+ # Appended to the agent's system prompt.
+ system_prompt_append: "Always write tests before implementation."
+
+ # Extra arguments forwarded verbatim to the agent binary.
+ additional_args:
+ - "--verbose"
+
+ # Skip the planning phase.
+ skip_planning: false
+
+# Maximum run time. Accepts Go duration strings: "30m", "1h30m", "45s".
+# 0 or omitted = no limit.
+timeout: "30m"
+
+# Retry policy.
+retry:
+ # Total attempts (including first). Must be >= 1.
+ max_attempts: 3
+ # "linear" or "exponential".
+ backoff: "exponential"
+
+# Scheduling priority: "high", "normal" (default), or "low".
+priority: "normal"
+
+# Arbitrary string labels for filtering.
+tags:
+ - "bug"
+ - "auth"
+
+# Task IDs that must be COMPLETED before this task is queued.
+depends_on:
+ - "setup-test-db"
+```
+
+---
+
+## 6. Batch File Format
+
+A batch file wraps multiple tasks under a `tasks` key. Each entry is a full task definition (same fields as above). All tasks are parsed and initialized together.
+
+```yaml
+tasks:
+ - name: "Step 1 — scaffold"
+ agent:
+ instructions: "Create the initial project structure."
+ priority: "high"
+
+ - name: "Step 2 — implement"
+ agent:
+ instructions: "Implement the feature described in docs/feature.md."
+ depends_on:
+ - "step-1-id"
+
+ - name: "Step 3 — test"
+ agent:
+ instructions: "Write and run integration tests."
+ depends_on:
+ - "step-2-id"
+ retry:
+ max_attempts: 2
+ backoff: "linear"
+```
+
+`ParseFile` tries the batch format first; if no `tasks` key is present it falls back to single-task parsing.
+
+---
+
+## 7. State Constants
+
+| Constant | Value | Meaning |
+|-------------------|------------------|-------------------------------------------------------------------------|
+| `StatePending` | `PENDING` | Newly created; awaiting classification or human approval. |
+| `StateQueued` | `QUEUED` | Accepted and waiting for an available agent slot. |
+| `StateRunning` | `RUNNING` | Agent process is actively executing. |
+| `StateReady` | `READY` | Agent has paused and is awaiting human input (question / approval). |
+| `StateCompleted` | `COMPLETED` | Agent finished successfully. Terminal — no further transitions allowed. |
+| `StateFailed` | `FAILED` | Agent exited with a non-zero code or internal error. |
+| `StateTimedOut` | `TIMED_OUT` | Execution exceeded the configured `timeout`. |
+| `StateCancelled` | `CANCELLED` | Explicitly cancelled by the user or scheduler. |
+| `StateBudgetExceeded` | `BUDGET_EXCEEDED` | Agent hit the `max_budget_usd` cap before finishing. |
+| `StateBlocked` | `BLOCKED` | Waiting on a dependency task that has not yet completed. |
+
+---
+
+## 8. State Machine — Valid Transitions
+
+| From | To | Condition / trigger |
+|--------------------|--------------------|--------------------------------------------------------------|
+| `PENDING` | `QUEUED` | Task approved and eligible for scheduling. |
+| `PENDING` | `CANCELLED` | Cancelled before being queued. |
+| `QUEUED` | `RUNNING` | Agent slot becomes available; execution starts. |
+| `QUEUED` | `CANCELLED` | Cancelled while waiting in the queue. |
+| `RUNNING` | `READY` | Agent pauses and emits a question for human input. |
+| `RUNNING` | `COMPLETED` | Agent exits successfully. |
+| `RUNNING` | `FAILED` | Agent exits with an error. |
+| `RUNNING` | `TIMED_OUT` | Wall-clock time exceeds `timeout`. |
+| `RUNNING` | `CANCELLED` | Cancelled mid-execution. |
+| `RUNNING` | `BUDGET_EXCEEDED` | Cumulative cost exceeds `max_budget_usd`. |
+| `RUNNING` | `BLOCKED` | Runtime dependency check fails. |
+| `READY` | `COMPLETED` | Human answer accepted; task finishes. |
+| `READY` | `PENDING` | Answer rejected; task returns to pending for re-approval. |
+| `FAILED` | `QUEUED` | Retry requested (re-enqueue). |
+| `TIMED_OUT` | `QUEUED` | Retry or resume requested. |
+| `CANCELLED` | `QUEUED` | Restart requested. |
+| `BUDGET_EXCEEDED` | `QUEUED` | Retry with higher or no budget. |
+| `BLOCKED` | `QUEUED` | Dependency became satisfied; task re-queued. |
+| `BLOCKED` | `READY` | Dependency resolved but human review required. |
+| `COMPLETED` | *(none)* | Terminal state. |
+
+---
+
+## 9. Key Functions
+
+### `ParseFile(path string) ([]Task, error)`
+Reads a YAML file at `path` and returns one or more tasks. Tries batch format (`tasks:` key) first; falls back to single-task format. Auto-assigns UUIDs, default priority, default retry config, and sets `State = PENDING` on all returned tasks.
+
+### `Parse(data []byte) ([]Task, error)`
+Same as `ParseFile` but operates on raw bytes instead of a file path.
+
+### `ValidTransition(from, to State) bool`
+Returns `true` if the transition from `from` to `to` is permitted by the state machine. Used by `storage.DB.UpdateTaskState` to enforce transitions atomically inside a transaction.
+
+### `Validate(t *Task) error`
+Validates a task's fields. Returns `*ValidationError` (implementing `error`) with all failures collected, or `nil` if valid.
+
+---
+
+## 10. Validation Rules
+
+The `Validate` function enforces the following rules:
+
+| Rule | Details |
+|------|---------|
+| `name` required | Must be non-empty. |
+| `agent.instructions` required | Must be non-empty. |
+| `agent.max_budget_usd` non-negative | Must be `>= 0`. |
+| `timeout` non-negative | Must be `>= 0` (zero means no limit). |
+| `retry.max_attempts >= 1` | Must be at least 1. |
+| `retry.backoff` valid values | Must be empty, `"linear"`, or `"exponential"`. |
+| `priority` valid values | Must be empty, `"high"`, `"normal"`, or `"low"`. |
+| `agent.permission_mode` valid values | Must be empty or one of: `default`, `acceptEdits`, `bypassPermissions`, `plan`, `dontAsk`, `delegate`. |
+
+Multiple failures are collected and returned together in a single `ValidationError`.
diff --git a/internal/api/docs/RAW_NARRATIVE.md b/internal/api/docs/RAW_NARRATIVE.md
new file mode 100644
index 0000000..7944463
--- /dev/null
+++ b/internal/api/docs/RAW_NARRATIVE.md
@@ -0,0 +1,117 @@
+
+--- 2026-03-10T09:33:34Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T09:33:34Z ---
+do something
+
+--- 2026-03-10T09:33:34Z ---
+do something
+
+--- 2026-03-10T16:46:39Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T16:46:39Z ---
+do something
+
+--- 2026-03-10T16:46:39Z ---
+do something
+
+--- 2026-03-10T17:16:31Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T17:16:31Z ---
+do something
+
+--- 2026-03-10T17:16:31Z ---
+do something
+
+--- 2026-03-10T17:25:16Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T17:25:16Z ---
+do something
+
+--- 2026-03-10T17:25:16Z ---
+do something
+
+--- 2026-03-10T23:54:53Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T23:54:53Z ---
+do something
+
+--- 2026-03-10T23:54:53Z ---
+do something
+
+--- 2026-03-10T23:55:54Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T23:55:54Z ---
+do something
+
+--- 2026-03-10T23:55:54Z ---
+do something
+
+--- 2026-03-10T23:56:06Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T23:56:06Z ---
+do something
+
+--- 2026-03-10T23:56:06Z ---
+do something
+
+--- 2026-03-10T23:57:26Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-10T23:57:26Z ---
+do something
+
+--- 2026-03-10T23:57:26Z ---
+do something
+
+--- 2026-03-11T07:40:17Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-11T07:40:17Z ---
+do something
+
+--- 2026-03-11T07:40:17Z ---
+do something
+
+--- 2026-03-11T08:25:03Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-11T08:25:04Z ---
+do something
+
+--- 2026-03-11T08:25:04Z ---
+do something
+
+--- 2026-03-12T21:00:28Z ---
+generate a report
+
+--- 2026-03-12T21:00:33Z ---
+generate a report
+
+--- 2026-03-12T21:00:34Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-12T21:00:34Z ---
+do something
+
+--- 2026-03-12T21:00:34Z ---
+do something
+
+--- 2026-03-13T02:27:38Z ---
+generate a report
+
+--- 2026-03-13T02:27:38Z ---
+run the Go test suite with race detector and fail if coverage < 80%
+
+--- 2026-03-13T02:27:38Z ---
+do something
+
+--- 2026-03-13T02:27:38Z ---
+do something
diff --git a/internal/executor/claude.go b/internal/executor/claude.go
index 0e29f7f..a58f1ad 100644
--- a/internal/executor/claude.go
+++ b/internal/executor/claude.go
@@ -150,6 +150,13 @@ func (r *ClaudeRunner) Run(ctx context.Context, t *task.Task, e *storage.Executi
return &BlockedError{QuestionJSON: strings.TrimSpace(string(data)), SessionID: e.SessionID, SandboxDir: sandboxDir}
}
+ // Read agent summary if written.
+ summaryFile := filepath.Join(logDir, "summary.txt")
+ if summaryData, readErr := os.ReadFile(summaryFile); readErr == nil {
+ os.Remove(summaryFile) // consumed
+ e.Summary = strings.TrimSpace(string(summaryData))
+ }
+
// Merge sandbox back to project_dir and clean up.
if sandboxDir != "" {
if mergeErr := teardownSandbox(projectDir, sandboxDir, r.Logger); mergeErr != nil {
@@ -261,6 +268,7 @@ func (r *ClaudeRunner) execOnce(ctx context.Context, args []string, workingDir s
"CLAUDOMATOR_API_URL="+r.APIURL,
"CLAUDOMATOR_TASK_ID="+e.TaskID,
"CLAUDOMATOR_QUESTION_FILE="+filepath.Join(e.ArtifactDir, "question.json"),
+ "CLAUDOMATOR_SUMMARY_FILE="+filepath.Join(e.ArtifactDir, "summary.txt"),
)
// Put the subprocess in its own process group so we can SIGKILL the entire
// group (MCP servers, bash children, etc.) on cancellation.
diff --git a/internal/executor/preamble.go b/internal/executor/preamble.go
index e50c16f..bc5c32c 100644
--- a/internal/executor/preamble.go
+++ b/internal/executor/preamble.go
@@ -46,6 +46,17 @@ The sandbox is rejected if there are any uncommitted modifications.
---
+## Final Summary (mandatory)
+
+Before exiting, write a brief summary paragraph (2–5 sentences) describing what you did
+and the outcome. Write it to the path in $CLAUDOMATOR_SUMMARY_FILE:
+
+ echo "Your summary here." > "$CLAUDOMATOR_SUMMARY_FILE"
+
+This summary is displayed in the task UI so the user knows what happened.
+
+---
+
`
func withPlanningPreamble(instructions string) string {
diff --git a/internal/executor/summary.go b/internal/executor/summary.go
new file mode 100644
index 0000000..a942de0
--- /dev/null
+++ b/internal/executor/summary.go
@@ -0,0 +1,57 @@
+package executor
+
+import (
+ "bufio"
+ "encoding/json"
+ "os"
+ "strings"
+)
+
+// extractSummary reads a stream-json stdout log and returns the text following
+// the last "## Summary" heading found in any assistant text block.
+// Returns empty string if the file cannot be read or no summary is found.
+func extractSummary(stdoutPath string) string {
+ f, err := os.Open(stdoutPath)
+ if err != nil {
+ return ""
+ }
+ defer f.Close()
+
+ var last string
+ scanner := bufio.NewScanner(f)
+ scanner.Buffer(make([]byte, 1024*1024), 1024*1024)
+ for scanner.Scan() {
+ if text := summaryFromLine(scanner.Bytes()); text != "" {
+ last = text
+ }
+ }
+ return last
+}
+
+// summaryFromLine parses a single stream-json line and returns the text after
+// "## Summary" if the line is an assistant text block containing that heading.
+func summaryFromLine(line []byte) string {
+ var event struct {
+ Type string `json:"type"`
+ Message struct {
+ Content []struct {
+ Type string `json:"type"`
+ Text string `json:"text"`
+ } `json:"content"`
+ } `json:"message"`
+ }
+ if err := json.Unmarshal(line, &event); err != nil || event.Type != "assistant" {
+ return ""
+ }
+ for _, block := range event.Message.Content {
+ if block.Type != "text" {
+ continue
+ }
+ idx := strings.Index(block.Text, "## Summary")
+ if idx == -1 {
+ continue
+ }
+ return strings.TrimSpace(block.Text[idx+len("## Summary"):])
+ }
+ return ""
+}
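Note on `summaryFromLine`: the heading lookup can be tried in isolation. The sketch below assumes the same stream-json event shape as the diff. `summaryAfterHeading` and the sample line are hypothetical, and `strings.Cut` stands in for the `strings.Index` arithmetic with the same first-occurrence behaviour.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// summaryAfterHeading is a hypothetical stand-alone variant of
// summaryFromLine. It returns the text after "## Summary" in the first
// assistant text block that contains the heading, or "" otherwise.
func summaryAfterHeading(line []byte) string {
	var event struct {
		Type    string `json:"type"`
		Message struct {
			Content []struct {
				Type string `json:"type"`
				Text string `json:"text"`
			} `json:"content"`
		} `json:"message"`
	}
	// Lines that are not JSON, or not assistant events, carry no summary.
	if err := json.Unmarshal(line, &event); err != nil || event.Type != "assistant" {
		return ""
	}
	const heading = "## Summary"
	for _, block := range event.Message.Content {
		if block.Type != "text" {
			continue
		}
		// Cut splits around the first occurrence of the heading.
		if _, after, found := strings.Cut(block.Text, heading); found {
			return strings.TrimSpace(after)
		}
	}
	return ""
}

func main() {
	line := []byte(`{"type":"assistant","message":{"content":[{"type":"text","text":"Done.\n## Summary\nFixed the redirect."}]}}`)
	fmt.Printf("%q\n", summaryAfterHeading(line))
}
```

Because the match is a plain substring search, a heading such as `## Summary of changes` would also match and its trailing words would become part of the returned text.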
diff --git a/internal/executor/summary_test.go b/internal/executor/summary_test.go
new file mode 100644
index 0000000..4a73711
--- /dev/null
+++ b/internal/executor/summary_test.go
@@ -0,0 +1,49 @@
+package executor
+
+import (
+ "os"
+ "path/filepath"
+ "testing"
+)
+
+func TestExtractSummary_WithSummarySection(t *testing.T) {
+ dir := t.TempDir()
+ path := filepath.Join(dir, "stdout.log")
+ content := streamLine(`{"type":"assistant","message":{"content":[{"type":"text","text":"## Summary\nThe task was completed successfully."}]}}`)
+ if err := os.WriteFile(path, []byte(content), 0600); err != nil {
+ t.Fatal(err)
+ }
+ got := extractSummary(path)
+ want := "The task was completed successfully."
+ if got != want {
+ t.Errorf("got %q, want %q", got, want)
+ }
+}
+
+func TestExtractSummary_NoSummary(t *testing.T) {
+ dir := t.TempDir()
+ path := filepath.Join(dir, "stdout.log")
+ content := streamLine(`{"type":"assistant","message":{"content":[{"type":"text","text":"All done, no summary heading."}]}}`)
+ if err := os.WriteFile(path, []byte(content), 0600); err != nil {
+ t.Fatal(err)
+ }
+ got := extractSummary(path)
+ if got != "" {
+ t.Errorf("expected empty string, got %q", got)
+ }
+}
+
+func TestExtractSummary_MultipleSections_PicksLast(t *testing.T) {
+ dir := t.TempDir()
+ path := filepath.Join(dir, "stdout.log")
+ content := streamLine(`{"type":"assistant","message":{"content":[{"type":"text","text":"## Summary\nFirst summary."}]}}`) +
+ streamLine(`{"type":"assistant","message":{"content":[{"type":"text","text":"## Summary\nFinal summary."}]}}`)
+ if err := os.WriteFile(path, []byte(content), 0600); err != nil {
+ t.Fatal(err)
+ }
+ got := extractSummary(path)
+ want := "Final summary."
+ if got != want {
+ t.Errorf("got %q, want %q", got, want)
+ }
+}
diff --git a/internal/storage/db.go b/internal/storage/db.go
index aaf1e09..b8a7085 100644
--- a/internal/storage/db.go
+++ b/internal/storage/db.go
@@ -81,6 +81,8 @@ func (s *DB) migrate() error {
`ALTER TABLE tasks ADD COLUMN question_json TEXT`,
`ALTER TABLE executions ADD COLUMN session_id TEXT`,
`ALTER TABLE executions ADD COLUMN sandbox_dir TEXT`,
+ `ALTER TABLE tasks ADD COLUMN summary TEXT`,
+ `ALTER TABLE tasks ADD COLUMN interactions_json TEXT NOT NULL DEFAULT '[]'`,
}
for _, m := range migrations {
if _, err := s.db.Exec(m); err != nil {
@@ -129,13 +131,13 @@ func (s *DB) CreateTask(t *task.Task) error {
// GetTask retrieves a task by ID.
func (s *DB) GetTask(id string) (*task.Task, error) {
- row := s.db.QueryRow(`SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json FROM tasks WHERE id = ?`, id)
+ row := s.db.QueryRow(`SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json FROM tasks WHERE id = ?`, id)
return scanTask(row)
}
// ListTasks returns tasks matching the given filter.
func (s *DB) ListTasks(filter TaskFilter) ([]*task.Task, error) {
- query := `SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json FROM tasks WHERE 1=1`
+ query := `SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json FROM tasks WHERE 1=1`
var args []interface{}
if filter.State != "" {
@@ -167,7 +169,7 @@ func (s *DB) ListTasks(filter TaskFilter) ([]*task.Task, error) {
// ListSubtasks returns all tasks whose parent_task_id matches the given ID.
func (s *DB) ListSubtasks(parentID string) ([]*task.Task, error) {
- rows, err := s.db.Query(`SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json FROM tasks WHERE parent_task_id = ? ORDER BY created_at ASC`, parentID)
+ rows, err := s.db.Query(`SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json FROM tasks WHERE parent_task_id = ? ORDER BY created_at ASC`, parentID)
if err != nil {
return nil, err
}
@@ -220,7 +222,7 @@ func (s *DB) ResetTaskForRetry(id string) (*task.Task, error) {
}
defer tx.Rollback() //nolint:errcheck
- t, err := scanTask(tx.QueryRow(`SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json FROM tasks WHERE id = ?`, id))
+ t, err := scanTask(tx.QueryRow(`SELECT id, name, description, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json FROM tasks WHERE id = ?`, id))
if err != nil {
if err == sql.ErrNoRows {
return nil, fmt.Errorf("task %q not found", id)
@@ -355,6 +357,8 @@ type Execution struct {
// In-memory only: set when creating a resume execution, not stored in DB.
ResumeSessionID string
ResumeAnswer string
+ // In-memory only: populated by the runner after successful execution.
+ Summary string
}
// CreateExecution inserts an execution record.
@@ -516,6 +520,48 @@ func (s *DB) UpdateTaskQuestion(taskID, questionJSON string) error {
return err
}
+// UpdateTaskSummary stores the agent's final summary paragraph on a task.
+func (s *DB) UpdateTaskSummary(taskID, summary string) error {
+ _, err := s.db.Exec(`UPDATE tasks SET summary = ?, updated_at = ? WHERE id = ?`,
+ summary, time.Now().UTC(), taskID)
+ return err
+}
+
+// AppendTaskInteraction appends a Q&A interaction to the task's interaction history.
+func (s *DB) AppendTaskInteraction(taskID string, interaction task.Interaction) error {
+ tx, err := s.db.Begin()
+ if err != nil {
+ return err
+ }
+ defer tx.Rollback() //nolint:errcheck
+
+ var raw sql.NullString
+ if err := tx.QueryRow(`SELECT interactions_json FROM tasks WHERE id = ?`, taskID).Scan(&raw); err != nil {
+ if err == sql.ErrNoRows {
+ return fmt.Errorf("task %q not found", taskID)
+ }
+ return err
+ }
+ existing := raw.String
+ if existing == "" {
+ existing = "[]"
+ }
+ var interactions []task.Interaction
+ if err := json.Unmarshal([]byte(existing), &interactions); err != nil {
+ return fmt.Errorf("unmarshaling interactions: %w", err)
+ }
+ interactions = append(interactions, interaction)
+ updated, err := json.Marshal(interactions)
+ if err != nil {
+ return fmt.Errorf("marshaling interactions: %w", err)
+ }
+ if _, err := tx.Exec(`UPDATE tasks SET interactions_json = ?, updated_at = ? WHERE id = ?`,
+ string(updated), time.Now().UTC(), taskID); err != nil {
+ return err
+ }
+ return tx.Commit()
+}
+
// UpdateExecution updates a completed execution.
func (s *DB) UpdateExecution(e *Execution) error {
_, err := s.db.Exec(`
@@ -545,11 +591,14 @@ func scanTask(row scanner) (*task.Task, error) {
parentTaskID sql.NullString
rejectionComment sql.NullString
questionJSON sql.NullString
+ summary sql.NullString
+ interactionsJSON sql.NullString
)
- err := row.Scan(&t.ID, &t.Name, &t.Description, &configJSON, &priority, &timeoutNS, &retryJSON, &tagsJSON, &depsJSON, &parentTaskID, &state, &t.CreatedAt, &t.UpdatedAt, &rejectionComment, &questionJSON)
+ err := row.Scan(&t.ID, &t.Name, &t.Description, &configJSON, &priority, &timeoutNS, &retryJSON, &tagsJSON, &depsJSON, &parentTaskID, &state, &t.CreatedAt, &t.UpdatedAt, &rejectionComment, &questionJSON, &summary, &interactionsJSON)
t.ParentTaskID = parentTaskID.String
t.RejectionComment = rejectionComment.String
t.QuestionJSON = questionJSON.String
+ t.Summary = summary.String
if err != nil {
return nil, err
}
@@ -568,6 +617,13 @@ func scanTask(row scanner) (*task.Task, error) {
if err := json.Unmarshal([]byte(depsJSON), &t.DependsOn); err != nil {
return nil, fmt.Errorf("unmarshaling depends_on: %w", err)
}
+ raw := interactionsJSON.String
+ if raw == "" {
+ raw = "[]"
+ }
+ if err := json.Unmarshal([]byte(raw), &t.Interactions); err != nil {
+ return nil, fmt.Errorf("unmarshaling interactions: %w", err)
+ }
return &t, nil
}
diff --git a/internal/task/task.go b/internal/task/task.go
index 9968b15..2c57922 100644
--- a/internal/task/task.go
+++ b/internal/task/task.go
@@ -48,6 +48,14 @@ type RetryConfig struct {
Backoff string `yaml:"backoff" json:"backoff"` // "linear", "exponential"
}
+// Interaction records a single question/answer exchange between an agent and the user.
+type Interaction struct {
+ QuestionText string `json:"question_text"`
+ Options []string `json:"options,omitempty"`
+ Answer string `json:"answer,omitempty"`
+ AskedAt time.Time `json:"asked_at"`
+}
+
type Task struct {
ID string `yaml:"id" json:"id"`
ParentTaskID string `yaml:"parent_task_id" json:"parent_task_id"`
@@ -59,11 +67,13 @@ type Task struct {
Priority Priority `yaml:"priority" json:"priority"`
Tags []string `yaml:"tags" json:"tags"`
DependsOn []string `yaml:"depends_on" json:"depends_on"`
- State State `yaml:"-" json:"state"`
- RejectionComment string `yaml:"-" json:"rejection_comment,omitempty"`
- QuestionJSON string `yaml:"-" json:"question,omitempty"`
- CreatedAt time.Time `yaml:"-" json:"created_at"`
- UpdatedAt time.Time `yaml:"-" json:"updated_at"`
+ State State `yaml:"-" json:"state"`
+ RejectionComment string `yaml:"-" json:"rejection_comment,omitempty"`
+ QuestionJSON string `yaml:"-" json:"question,omitempty"`
+ Summary string `yaml:"-" json:"summary,omitempty"`
+ Interactions []Interaction `yaml:"-" json:"interactions,omitempty"`
+ CreatedAt time.Time `yaml:"-" json:"created_at"`
+ UpdatedAt time.Time `yaml:"-" json:"updated_at"`
}
// Duration wraps time.Duration for YAML unmarshaling from strings like "30m".
@@ -94,27 +104,24 @@ type BatchFile struct {
}
// validTransitions maps each state to the set of states it may transition into.
-// Terminal state COMPLETED has no outgoing edges.
+// COMPLETED is the only true terminal state (no outgoing edges).
// CANCELLED, FAILED, TIMED_OUT, and BUDGET_EXCEEDED all allow re-entry at QUEUED
// (restart or retry).
-var validTransitions = map[State][]State{
- StatePending: {StateQueued, StateCancelled},
- StateQueued: {StateRunning, StateCancelled},
- StateRunning: {StateReady, StateCompleted, StateFailed, StateTimedOut, StateCancelled, StateBudgetExceeded, StateBlocked},
- StateReady: {StateCompleted, StatePending},
- StateFailed: {StateQueued}, // retry
- StateTimedOut: {StateQueued}, // retry or resume
- StateCancelled: {StateQueued}, // restart
- StateBudgetExceeded: {StateQueued}, // retry
- StateBlocked: {StateQueued, StateReady},
+// READY may go back to PENDING on user rejection.
+// BLOCKED may advance to READY when all subtasks complete, or back to QUEUED on user answer.
+var validTransitions = map[State]map[State]bool{
+ StatePending: {StateQueued: true, StateCancelled: true},
+ StateQueued: {StateRunning: true, StateCancelled: true},
+ StateRunning: {StateReady: true, StateCompleted: true, StateFailed: true, StateTimedOut: true, StateCancelled: true, StateBudgetExceeded: true, StateBlocked: true},
+ StateReady: {StateCompleted: true, StatePending: true},
+ StateFailed: {StateQueued: true}, // retry
+ StateTimedOut: {StateQueued: true}, // retry or resume
+ StateCancelled: {StateQueued: true}, // restart
+ StateBudgetExceeded: {StateQueued: true}, // retry
+ StateBlocked: {StateQueued: true, StateReady: true},
}
// ValidTransition returns true if moving from the current state to next is allowed.
func ValidTransition(from, to State) bool {
- for _, allowed := range validTransitions[from] {
- if allowed == to {
- return true
- }
- }
- return false
+ return validTransitions[from][to]
}
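Note on the new `ValidTransition`: the one-line body works because indexing a Go map with a missing key yields the zero value, and that also holds when the inner map is nil. A state with no entry, such as `COMPLETED`, therefore reports `false` for every target. The sketch below uses a reduced, hypothetical state set rather than the full table from the commit.

```go
package main

import "fmt"

type State string

const (
	StateQueued    State = "QUEUED"
	StateRunning   State = "RUNNING"
	StateCompleted State = "COMPLETED"
	StateFailed    State = "FAILED"
)

// transitions is a reduced illustration of the nested-map layout.
// StateCompleted has no entry, which makes it terminal.
var transitions = map[State]map[State]bool{
	StateQueued:  {StateRunning: true},
	StateRunning: {StateCompleted: true, StateFailed: true},
	StateFailed:  {StateQueued: true}, // retry
}

// allowed reports whether from -> to is permitted. A missing outer key
// yields a nil inner map, and reading from a nil map returns false.
func allowed(from, to State) bool {
	return transitions[from][to]
}

func main() {
	fmt.Println(allowed(StateFailed, StateQueued))    // true: retry is listed
	fmt.Println(allowed(StateCompleted, StateQueued)) // false: no outgoing edges
}
```

Compared with the previous slice scan, the lookup is constant-time and leaves no loop to maintain, at the cost of writing `: true` on every edge.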
diff --git a/web/style.css b/web/style.css
index 2b872fe..342e0b3 100644
--- a/web/style.css
+++ b/web/style.css
@@ -1176,3 +1176,41 @@ dialog label select:focus {
margin-top: 0.25rem;
text-align: right;
}
+
+/* ── Task Summary + Q&A History ────────────────────────────── */
+
+.task-summary {
+ color: var(--text);
+ line-height: 1.6;
+ margin: 0;
+ white-space: pre-wrap;
+}
+
+.qa-list {
+ display: flex;
+ flex-direction: column;
+ gap: 0.75rem;
+}
+
+.qa-item {
+ border-left: 3px solid var(--border);
+ padding: 0.5rem 0.75rem;
+ display: flex;
+ flex-direction: column;
+ gap: 0.25rem;
+}
+
+.qa-question {
+ font-weight: 500;
+ color: var(--text);
+}
+
+.qa-options {
+ font-size: 0.82rem;
+ color: var(--text-muted, #94a3b8);
+}
+
+.qa-answer {
+ color: var(--accent, #60a5fa);
+ font-style: italic;
+}