feat: resume support, summary extraction, and task state improvements

- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
author: Peter Stone <thepeterstone@gmail.com> 2026-03-13 03:14:40 +0000
committer: Peter Stone <thepeterstone@gmail.com> 2026-03-13 03:14:40 +0000
commit: 5303a68d67e435da863353cdce09fa2e3a8c2ccd (patch)
tree: 2e16b9c17c11cbb3b7c9395e1b3fb119b73ef2ca /docs
parent: f28c22352aa1a8ede7552ee0277f7d60552d9094 (diff)
2 files changed, 271 insertions, 9 deletions
diff --git a/docs/adr/002-task-state-machine.md b/docs/adr/002-task-state-machine.md
index 310c337..6910f6a 100644
--- a/docs/adr/002-task-state-machine.md
+++ b/docs/adr/002-task-state-machine.md
@@ -66,13 +66,13 @@ True terminal state (no outgoing transitions): `COMPLETED`. All other non-succes
 | `QUEUED` | `RUNNING` | Pool goroutine starts execution |
 | `QUEUED` | `CANCELLED` | `POST /api/tasks/{id}/cancel` |
 | `RUNNING` | `READY` | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has no subtasks |
-| `RUNNING` | `BLOCKED` | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has subtasks |
+| `RUNNING` | `BLOCKED` (subtasks) | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has subtasks |
+| `RUNNING` | `BLOCKED` (question) | Runner exits 0 but left a `question.json` file in log dir (any task type) |
 | `RUNNING` | `COMPLETED` | Runner exits 0, no question file, subtask (`parent_task_id != ""`) |
 | `RUNNING` | `FAILED` | Runner exits non-zero or stream signals `is_error: true` |
 | `RUNNING` | `TIMED_OUT` | Context deadline exceeded (`context.DeadlineExceeded`) |
 | `RUNNING` | `CANCELLED` | Context cancelled (`context.Canceled`) |
 | `RUNNING` | `BUDGET_EXCEEDED` | `--max-budget-usd` exceeded (signalled by runner) |
-| `RUNNING` | `BLOCKED` | Runner exits 0 but left a `question.json` file in log dir |
 | `READY` | `COMPLETED` | `POST /api/tasks/{id}/accept` |
 | `READY` | `PENDING` | `POST /api/tasks/{id}/reject` (with optional comment) |
 | `FAILED` | `QUEUED` | Retry (manual re-run via `POST /api/tasks/{id}/run`) |
@@ -85,7 +85,7 @@ True terminal state (no outgoing transitions): `COMPLETED`. All other non-succes
 ## Implementation
 
 **Validation:** `task.ValidTransition(from, to State) bool`
-(`internal/task/task.go:93`) — called by API handlers before every state change.
+(`internal/task/task.go:123`) — called by API handlers before every state change.
 
 **State writes:** `storage.DB.UpdateTaskState(id, state)` — single source of
 write; called by both API handlers and the executor pool.
@@ -158,9 +158,9 @@ Task lifecycle changes produce WebSocket broadcasts to all connected clients:
 
 ## Known Limitations and Edge Cases
 
-- **`BUDGET_EXCEEDED` transition.** `BUDGET_EXCEEDED` appears in `terminalFailureStates`
-  (used by `waitForDependencies`) but has no outgoing transitions in `ValidTransition`,
-  making it permanently terminal. There is no `/resume` endpoint for it.
+- **`BUDGET_EXCEEDED` retry.** `BUDGET_EXCEEDED → QUEUED` is a valid transition (retry via
+  `POST /run`), matching `FAILED` and `CANCELLED` behaviour. However, there is no dedicated
+  `/resume` endpoint for it — callers must use the standard `/run` restart path.
 
 - **Retry enforcement.** `RetryConfig.MaxAttempts` is stored but not enforced by
   the pool. The API allows unlimited manual retries via `POST /run` from `FAILED`.
@@ -178,9 +178,9 @@ Task lifecycle changes produce WebSocket broadcasts to all connected clients:
 
 | Concern | File | Lines |
 |---|---|---|
-| State constants | `internal/task/task.go` | 7–18 |
-| `ValidTransition` | `internal/task/task.go` | 93–109 |
-| State machine tests | `internal/task/task_test.go` | 8–72 |
+| State constants | `internal/task/task.go` | 9–20 |
+| `ValidTransition` | `internal/task/task.go` | 107–130 |
+| State machine tests | `internal/task/task_test.go` | 8–75 |
 | Pool execute | `internal/executor/executor.go` | 194–303 |
 | Pool executeResume | `internal/executor/executor.go` | 116–185 |
 | Dependency wait | `internal/executor/executor.go` | 305–340 |
diff --git a/docs/packages_old/task.md b/docs/packages_old/task.md
new file mode 100644
index 0000000..923cd56
--- /dev/null
+++ b/docs/packages_old/task.md
@@ -0,0 +1,262 @@
+# Package: task
+
+`internal/task` — Task definition, parsing, validation, and state machine.
+
+---
+
+## 1. Overview
+
+A **Task** is the central unit of work in Claudomator. It describes what an agent should do (`agent.instructions`), how it should be run (timeout, retry, priority), and how it relates to other tasks (`depends_on`, `parent_task_id`). Tasks are defined in YAML files, parsed into `Task` structs, persisted in SQLite, and driven through a state machine from `PENDING` to a terminal state.
+
+---
+
+## 2. Task Struct
+
+```go
+type Task struct {
+    ID               string      // UUID; auto-generated if omitted in YAML
+    ParentTaskID     string      // ID of parent task (subtask linkage); empty for root tasks
+    Name             string      // Human-readable label; required
+    Description      string      // Optional longer description
+    Agent            AgentConfig // How to invoke the agent
+    Timeout          Duration    // Maximum wall-clock run time (e.g. "30m"); 0 = no limit
+    Retry            RetryConfig // Retry policy
+    Priority         Priority    // "high" | "normal" | "low"; default "normal"
+    Tags             []string    // Arbitrary labels for filtering
+    DependsOn        []string    // Task IDs that must reach COMPLETED before this queues
+    State            State       // Current lifecycle state; not read from YAML (yaml:"-")
+    RejectionComment string      // Set by RejectTask; not read from YAML (yaml:"-")
+    QuestionJSON     string      // Pending question from a READY agent; not read from YAML (yaml:"-")
+    CreatedAt        time.Time   // Set on parse; not read from YAML (yaml:"-")
+    UpdatedAt        time.Time   // Updated on every state change; not read from YAML (yaml:"-")
+}
+```
+
+Fields tagged `yaml:"-"` are runtime-only and are never parsed from task YAML files.
+
+---
+
+## 3. AgentConfig Struct
+
+```go
+type AgentConfig struct {
+    Type               string   // Agent implementation: "claude", "gemini", etc.
+    Model              string   // Model identifier passed to the agent binary
+    ContextFiles       []string // Files injected into agent context at start
+    Instructions       string   // Prompt / task description sent to the agent; required
+    ProjectDir         string   // Working directory for the agent process
+    MaxBudgetUSD       float64  // Spending cap in USD; 0 = unlimited; must be >= 0
+    PermissionMode     string   // One of: default | acceptEdits | bypassPermissions | plan | dontAsk | delegate
+    AllowedTools       []string // Whitelist of tool names the agent may use
+    DisallowedTools    []string // Blacklist of tool names the agent may not use
+    SystemPromptAppend string   // Text appended to the agent's system prompt
+    AdditionalArgs     []string // Extra CLI arguments forwarded to the agent binary
+    SkipPlanning       bool     // When true, bypass the planning phase
+}
+```
+
+---
+
+## 4. RetryConfig Struct
+
+```go
+type RetryConfig struct {
+    MaxAttempts int    // Total attempts including the first; default 1 (no retry)
+    Backoff     string // "linear" or "exponential"; default "exponential"
+}
+```
+
+---
+
+## 5. YAML Task File Format — Single Task
+
+```yaml
+# Unique identifier. Optional: auto-generated UUID if omitted.
+id: "fix-login-bug"
+
+# Human-readable label. Required.
+name: "Fix login redirect bug"
+
+# Optional description shown in the UI.
+description: "Users are redirected to /home instead of /dashboard after login."
+
+# Agent configuration.
+agent:
+  # Agent type: "claude", "gemini", etc. Can be omitted for auto-classification.
+  type: "claude"
+
+  # Model to use. Empty = agent default.
+  model: "claude-opus-4-6"
+
+  # Files loaded into the agent's context before execution.
+  context_files:
+    - "src/auth/login.go"
+    - "docs/design/auth.md"
+
+  # Prompt sent to the agent. Required.
+  instructions: |
+    Fix the post-login redirect in src/auth/login.go so that users are
+    sent to /dashboard instead of /home. Add a regression test.
+
+  # Working directory for the agent process. Empty = server working directory.
+  project_dir: "/workspace/myapp"
+
+  # USD spending cap. 0 = no limit.
+  max_budget_usd: 1.00
+
+  # Permission mode. Valid values: default | acceptEdits | bypassPermissions | plan | dontAsk | delegate
+  permission_mode: "acceptEdits"
+
+  # Tool whitelist. Empty = all tools allowed.
+  allowed_tools:
+    - "Edit"
+    - "Read"
+    - "Bash"
+
+  # Tool blacklist.
+  disallowed_tools:
+    - "WebFetch"
+
+  # Appended to the agent's system prompt.
+  system_prompt_append: "Always write tests before implementation."
+
+  # Extra arguments forwarded verbatim to the agent binary.
+  additional_args:
+    - "--verbose"
+
+  # Skip the planning phase.
+  skip_planning: false
+
+# Maximum run time. Accepts Go duration strings: "30m", "1h30m", "45s".
+# 0 or omitted = no limit.
+timeout: "30m"
+
+# Retry policy.
+retry:
+  # Total attempts (including first). Must be >= 1.
+  max_attempts: 3
+  # "linear" or "exponential".
+  backoff: "exponential"
+
+# Scheduling priority: "high", "normal" (default), or "low".
+priority: "normal"
+
+# Arbitrary string labels for filtering.
+tags:
+  - "bug"
+  - "auth"
+
+# Task IDs that must be COMPLETED before this task is queued.
+depends_on:
+  - "setup-test-db"
+```
+
+---
+
+## 6. Batch File Format
+
+A batch file wraps multiple tasks under a `tasks` key. Each entry is a full task definition (same fields as above). All tasks are parsed and initialized together.
+
+```yaml
+tasks:
+  - name: "Step 1 — scaffold"
+    agent:
+      instructions: "Create the initial project structure."
+    priority: "high"
+
+  - name: "Step 2 — implement"
+    agent:
+      instructions: "Implement the feature described in docs/feature.md."
+    depends_on:
+      - "step-1-id"
+
+  - name: "Step 3 — test"
+    agent:
+      instructions: "Write and run integration tests."
+    depends_on:
+      - "step-2-id"
+    retry:
+      max_attempts: 2
+      backoff: "linear"
+```
+
+`ParseFile` tries the batch format first; if no `tasks` key is present it falls back to single-task parsing.
+
+---
+
+## 7. State Constants
+
+| Constant          | Value            | Meaning                                                                 |
+|-------------------|------------------|-------------------------------------------------------------------------|
+| `StatePending`    | `PENDING`        | Newly created; awaiting classification or human approval.               |
+| `StateQueued`     | `QUEUED`         | Accepted and waiting for an available agent slot.                       |
+| `StateRunning`    | `RUNNING`        | Agent process is actively executing.                                    |
+| `StateReady`      | `READY`          | Agent has paused and is awaiting human input (question / approval).     |
+| `StateCompleted`  | `COMPLETED`      | Agent finished successfully. Terminal — no further transitions allowed. |
+| `StateFailed`     | `FAILED`         | Agent exited with a non-zero code or internal error.                    |
+| `StateTimedOut`   | `TIMED_OUT`      | Execution exceeded the configured `timeout`.                            |
+| `StateCancelled`  | `CANCELLED`      | Explicitly cancelled by the user or scheduler.                          |
+| `StateBudgetExceeded` | `BUDGET_EXCEEDED` | Agent hit the `max_budget_usd` cap before finishing.              |
+| `StateBlocked`    | `BLOCKED`        | Waiting on a dependency task that has not yet completed.                |
+
+---
+
+## 8. State Machine — Valid Transitions
+
+| From               | To                 | Condition / trigger                                          |
+|--------------------|--------------------|--------------------------------------------------------------|
+| `PENDING`          | `QUEUED`           | Task approved and eligible for scheduling.                   |
+| `PENDING`          | `CANCELLED`        | Cancelled before being queued.                               |
+| `QUEUED`           | `RUNNING`          | Agent slot becomes available; execution starts.              |
+| `QUEUED`           | `CANCELLED`        | Cancelled while waiting in the queue.                        |
+| `RUNNING`          | `READY`            | Agent pauses and emits a question for human input.           |
+| `RUNNING`          | `COMPLETED`        | Agent exits successfully.                                    |
+| `RUNNING`          | `FAILED`           | Agent exits with an error.                                   |
+| `RUNNING`          | `TIMED_OUT`        | Wall-clock time exceeds `timeout`.                           |
+| `RUNNING`          | `CANCELLED`        | Cancelled mid-execution.                                     |
+| `RUNNING`          | `BUDGET_EXCEEDED`  | Cumulative cost exceeds `max_budget_usd`.                    |
+| `RUNNING`          | `BLOCKED`          | Runtime dependency check fails.                              |
+| `READY`            | `COMPLETED`        | Human answer accepted; task finishes.                        |
+| `READY`            | `PENDING`          | Answer rejected; task returns to pending for re-approval.    |
+| `FAILED`           | `QUEUED`           | Retry requested (re-enqueue).                                |
+| `TIMED_OUT`        | `QUEUED`           | Retry or resume requested.                                   |
+| `CANCELLED`        | `QUEUED`           | Restart requested.                                           |
+| `BUDGET_EXCEEDED`  | `QUEUED`           | Retry with higher or no budget.                              |
+| `BLOCKED`          | `QUEUED`           | Dependency became satisfied; task re-queued.                 |
+| `BLOCKED`          | `READY`            | Dependency resolved but human review required.               |
+| `COMPLETED`        | *(none)*           | Terminal state.                                              |
+
+---
+
+## 9. Key Functions
+
+### `ParseFile(path string) ([]Task, error)`
+Reads a YAML file at `path` and returns one or more tasks. Tries batch format (`tasks:` key) first; falls back to single-task format. Auto-assigns UUIDs, default priority, default retry config, and sets `State = PENDING` on all returned tasks.
+
+### `Parse(data []byte) ([]Task, error)`
+Same as `ParseFile` but operates on raw bytes instead of a file path.
+
+### `ValidTransition(from, to State) bool`
+Returns `true` if the transition from `from` to `to` is permitted by the state machine. Used by `storage.DB.UpdateTaskState` to enforce transitions atomically inside a transaction.
+
+### `Validate(t *Task) error`
+Validates a task's fields. Returns `*ValidationError` (implementing `error`) with all failures collected, or `nil` if valid.
+
+---
+
+## 10. Validation Rules
+
+The `Validate` function enforces the following rules:
+
+| Rule | Details |
+|------|---------|
+| `name` required | Must be non-empty. |
+| `agent.instructions` required | Must be non-empty. |
+| `agent.max_budget_usd` non-negative | Must be `>= 0`. |
+| `timeout` non-negative | Must be `>= 0` (zero means no limit). |
+| `retry.max_attempts >= 1` | Must be at least 1. |
+| `retry.backoff` valid values | Must be empty, `"linear"`, or `"exponential"`. |
+| `priority` valid values | Must be empty, `"high"`, `"normal"`, or `"low"`. |
+| `agent.permission_mode` valid values | Must be empty or one of: `default`, `acceptEdits`, `bypassPermissions`, `plan`, `dontAsk`, `delegate`. |
+
+Multiple failures are collected and returned together in a single `ValidationError`.
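
The transition table in section 8 of `docs/packages_old/task.md` can be read as a plain lookup from a source state to the set of states it may move to. The sketch below is only an illustration of that reading: the map name `validTransitions` and its layout are assumptions, and the actual `ValidTransition` in `internal/task/task.go` may be written differently or permit a different set of transitions (the ADR-002 table in this same commit describes `READY` and `BLOCKED` differently from the package doc).

```go
package main

import "fmt"

// State mirrors the string values listed in the "State Constants" table.
type State string

const (
	StatePending        State = "PENDING"
	StateQueued         State = "QUEUED"
	StateRunning        State = "RUNNING"
	StateReady          State = "READY"
	StateCompleted      State = "COMPLETED"
	StateFailed         State = "FAILED"
	StateTimedOut       State = "TIMED_OUT"
	StateCancelled      State = "CANCELLED"
	StateBudgetExceeded State = "BUDGET_EXCEEDED"
	StateBlocked        State = "BLOCKED"
)

// validTransitions is a hypothetical encoding of the section 8 table.
// COMPLETED has no entry because it has no outgoing transitions.
var validTransitions = map[State]map[State]bool{
	StatePending: {StateQueued: true, StateCancelled: true},
	StateQueued:  {StateRunning: true, StateCancelled: true},
	StateRunning: {
		StateReady: true, StateCompleted: true, StateFailed: true,
		StateTimedOut: true, StateCancelled: true,
		StateBudgetExceeded: true, StateBlocked: true,
	},
	StateReady:          {StateCompleted: true, StatePending: true},
	StateFailed:         {StateQueued: true},
	StateTimedOut:       {StateQueued: true},
	StateCancelled:      {StateQueued: true},
	StateBudgetExceeded: {StateQueued: true},
	StateBlocked:        {StateQueued: true, StateReady: true},
}

// ValidTransition reports whether the table allows moving from one state to
// another. Looking up a missing source state yields a nil inner map, and
// indexing a nil map returns false, so unknown states are rejected.
func ValidTransition(from, to State) bool {
	return validTransitions[from][to]
}

func main() {
	fmt.Println(ValidTransition(StateBudgetExceeded, StateQueued)) // true
	fmt.Println(ValidTransition(StateCompleted, StateQueued))      // false
}
```

Keeping the table as data rather than a `switch` makes it straightforward to compare against the documented rows when the state machine changes, which is the kind of drift this commit corrects for `BUDGET_EXCEEDED`.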