| author | Peter Stone <thepeterstone@gmail.com> | 2026-03-13 03:14:40 +0000 |
|---|---|---|
| committer | Peter Stone <thepeterstone@gmail.com> | 2026-03-13 03:14:40 +0000 |
| commit | 5303a68d67e435da863353cdce09fa2e3a8c2ccd (patch) | |
| tree | 2e16b9c17c11cbb3b7c9395e1b3fb119b73ef2ca /docs | |
| parent | f28c22352aa1a8ede7552ee0277f7d60552d9094 (diff) | |
feat: resume support, summary extraction, and task state improvements
- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/adr/002-task-state-machine.md | 18 |
| -rw-r--r-- | docs/packages_old/task.md | 262 |
2 files changed, 271 insertions, 9 deletions
diff --git a/docs/adr/002-task-state-machine.md b/docs/adr/002-task-state-machine.md
index 310c337..6910f6a 100644
--- a/docs/adr/002-task-state-machine.md
+++ b/docs/adr/002-task-state-machine.md
@@ -66,13 +66,13 @@ True terminal state (no outgoing transitions): `COMPLETED`. All other non-succes
 | `QUEUED` | `RUNNING` | Pool goroutine starts execution |
 | `QUEUED` | `CANCELLED` | `POST /api/tasks/{id}/cancel` |
 | `RUNNING` | `READY` | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has no subtasks |
-| `RUNNING` | `BLOCKED` | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has subtasks |
+| `RUNNING` | `BLOCKED` (subtasks) | Runner exits 0, no question file, top-level task (`parent_task_id == ""`), and task has subtasks |
+| `RUNNING` | `BLOCKED` (question) | Runner exits 0 but left a `question.json` file in log dir (any task type) |
 | `RUNNING` | `COMPLETED` | Runner exits 0, no question file, subtask (`parent_task_id != ""`) |
 | `RUNNING` | `FAILED` | Runner exits non-zero or stream signals `is_error: true` |
 | `RUNNING` | `TIMED_OUT` | Context deadline exceeded (`context.DeadlineExceeded`) |
 | `RUNNING` | `CANCELLED` | Context cancelled (`context.Canceled`) |
 | `RUNNING` | `BUDGET_EXCEEDED` | `--max-budget-usd` exceeded (signalled by runner) |
-| `RUNNING` | `BLOCKED` | Runner exits 0 but left a `question.json` file in log dir |
 | `READY` | `COMPLETED` | `POST /api/tasks/{id}/accept` |
 | `READY` | `PENDING` | `POST /api/tasks/{id}/reject` (with optional comment) |
 | `FAILED` | `QUEUED` | Retry (manual re-run via `POST /api/tasks/{id}/run`) |
@@ -85,7 +85,7 @@ True terminal state (no outgoing transitions): `COMPLETED`. All other non-succes
 ## Implementation
 
 **Validation:** `task.ValidTransition(from, to State) bool`
-(`internal/task/task.go:93`) — called by API handlers before every state change.
+(`internal/task/task.go:123`) — called by API handlers before every state change.
 
 **State writes:** `storage.DB.UpdateTaskState(id, state)` — single source of write;
 called by both API handlers and the executor pool.
@@ -158,9 +158,9 @@ Task lifecycle changes produce WebSocket broadcasts to all connected clients:
 
 ## Known Limitations and Edge Cases
 
-- **`BUDGET_EXCEEDED` transition.** `BUDGET_EXCEEDED` appears in `terminalFailureStates`
-  (used by `waitForDependencies`) but has no outgoing transitions in `ValidTransition`,
-  making it permanently terminal. There is no `/resume` endpoint for it.
+- **`BUDGET_EXCEEDED` retry.** `BUDGET_EXCEEDED → QUEUED` is a valid transition (retry via
+  `POST /run`), matching `FAILED` and `CANCELLED` behaviour. However, there is no dedicated
+  `/resume` endpoint for it — callers must use the standard `/run` restart path.
 
 - **Retry enforcement.** `RetryConfig.MaxAttempts` is stored but not enforced by the pool.
   The API allows unlimited manual retries via `POST /run` from `FAILED`.
@@ -178,9 +178,9 @@ Task lifecycle changes produce WebSocket broadcasts to all connected clients:
 
 | Concern | File | Lines |
 |---|---|---|
-| State constants | `internal/task/task.go` | 7–18 |
-| `ValidTransition` | `internal/task/task.go` | 93–109 |
-| State machine tests | `internal/task/task_test.go` | 8–72 |
+| State constants | `internal/task/task.go` | 9–20 |
+| `ValidTransition` | `internal/task/task.go` | 107–130 |
+| State machine tests | `internal/task/task_test.go` | 8–75 |
 | Pool execute | `internal/executor/executor.go` | 194–303 |
 | Pool executeResume | `internal/executor/executor.go` | 116–185 |
 | Dependency wait | `internal/executor/executor.go` | 305–340 |
diff --git a/docs/packages_old/task.md b/docs/packages_old/task.md
new file mode 100644
index 0000000..923cd56
--- /dev/null
+++ b/docs/packages_old/task.md
@@ -0,0 +1,262 @@
+# Package: task
+
+`internal/task` — Task definition, parsing, validation, and state machine.
+
+---
+
+## 1. Overview
+
+A **Task** is the central unit of work in Claudomator. It describes what an agent should do (`agent.instructions`), how it should be run (timeout, retry, priority), and how it relates to other tasks (`depends_on`, `parent_task_id`). Tasks are defined in YAML files, parsed into `Task` structs, persisted in SQLite, and driven through a state machine from `PENDING` to a terminal state.
+
+---
+
+## 2. Task Struct
+
+```go
+type Task struct {
+    ID               string      // UUID; auto-generated if omitted in YAML
+    ParentTaskID     string      // ID of parent task (subtask linkage); empty for root tasks
+    Name             string      // Human-readable label; required
+    Description      string      // Optional longer description
+    Agent            AgentConfig // How to invoke the agent
+    Timeout          Duration    // Maximum wall-clock run time (e.g. "30m"); 0 = no limit
+    Retry            RetryConfig // Retry policy
+    Priority         Priority    // "high" | "normal" | "low"; default "normal"
+    Tags             []string    // Arbitrary labels for filtering
+    DependsOn        []string    // Task IDs that must reach COMPLETED before this queues
+    State            State       // Current lifecycle state; not read from YAML (yaml:"-")
+    RejectionComment string      // Set by RejectTask; not read from YAML (yaml:"-")
+    QuestionJSON     string      // Pending question from a READY agent; not read from YAML (yaml:"-")
+    CreatedAt        time.Time   // Set on parse; not read from YAML (yaml:"-")
+    UpdatedAt        time.Time   // Updated on every state change; not read from YAML (yaml:"-")
+}
+```
+
+Fields tagged `yaml:"-"` are runtime-only and are never parsed from task YAML files.
+
+---
+
+## 3. AgentConfig Struct
+
+```go
+type AgentConfig struct {
+    Type               string   // Agent implementation: "claude", "gemini", etc.
+    Model              string   // Model identifier passed to the agent binary
+    ContextFiles       []string // Files injected into agent context at start
+    Instructions       string   // Prompt / task description sent to the agent; required
+    ProjectDir         string   // Working directory for the agent process
+    MaxBudgetUSD       float64  // Spending cap in USD; 0 = unlimited; must be >= 0
+    PermissionMode     string   // One of: default | acceptEdits | bypassPermissions | plan | dontAsk | delegate
+    AllowedTools       []string // Whitelist of tool names the agent may use
+    DisallowedTools    []string // Blacklist of tool names the agent may not use
+    SystemPromptAppend string   // Text appended to the agent's system prompt
+    AdditionalArgs     []string // Extra CLI arguments forwarded to the agent binary
+    SkipPlanning       bool     // When true, bypass the planning phase
+}
+```
+
+---
+
+## 4. RetryConfig Struct
+
+```go
+type RetryConfig struct {
+    MaxAttempts int    // Total attempts including the first; default 1 (no retry)
+    Backoff     string // "linear" or "exponential"; default "exponential"
+}
+```
+
+---
+
+## 5. YAML Task File Format — Single Task
+
+```yaml
+# Unique identifier. Optional: auto-generated UUID if omitted.
+id: "fix-login-bug"
+
+# Human-readable label. Required.
+name: "Fix login redirect bug"
+
+# Optional description shown in the UI.
+description: "Users are redirected to /home instead of /dashboard after login."
+
+# Agent configuration.
+agent:
+  # Agent type: "claude", "gemini", etc. Can be omitted for auto-classification.
+  type: "claude"
+
+  # Model to use. Empty = agent default.
+  model: "claude-opus-4-6"
+
+  # Files loaded into the agent's context before execution.
+  context_files:
+    - "src/auth/login.go"
+    - "docs/design/auth.md"
+
+  # Prompt sent to the agent. Required.
+  instructions: |
+    Fix the post-login redirect in src/auth/login.go so that users are
+    sent to /dashboard instead of /home. Add a regression test.
+
+  # Working directory for the agent process. Empty = server working directory.
+  project_dir: "/workspace/myapp"
+
+  # USD spending cap. 0 = no limit.
+  max_budget_usd: 1.00
+
+  # Permission mode. Valid values: default | acceptEdits | bypassPermissions | plan | dontAsk | delegate
+  permission_mode: "acceptEdits"
+
+  # Tool whitelist. Empty = all tools allowed.
+  allowed_tools:
+    - "Edit"
+    - "Read"
+    - "Bash"
+
+  # Tool blacklist.
+  disallowed_tools:
+    - "WebFetch"
+
+  # Appended to the agent's system prompt.
+  system_prompt_append: "Always write tests before implementation."
+
+  # Extra arguments forwarded verbatim to the agent binary.
+  additional_args:
+    - "--verbose"
+
+  # Skip the planning phase.
+  skip_planning: false
+
+# Maximum run time. Accepts Go duration strings: "30m", "1h30m", "45s".
+# 0 or omitted = no limit.
+timeout: "30m"
+
+# Retry policy.
+retry:
+  # Total attempts (including first). Must be >= 1.
+  max_attempts: 3
+  # "linear" or "exponential".
+  backoff: "exponential"
+
+# Scheduling priority: "high", "normal" (default), or "low".
+priority: "normal"
+
+# Arbitrary string labels for filtering.
+tags:
+  - "bug"
+  - "auth"
+
+# Task IDs that must be COMPLETED before this task is queued.
+depends_on:
+  - "setup-test-db"
+```
+
+---
+
+## 6. Batch File Format
+
+A batch file wraps multiple tasks under a `tasks` key. Each entry is a full task definition (same fields as above). All tasks are parsed and initialized together.
+
+```yaml
+tasks:
+  - name: "Step 1 — scaffold"
+    agent:
+      instructions: "Create the initial project structure."
+    priority: "high"
+
+  - name: "Step 2 — implement"
+    agent:
+      instructions: "Implement the feature described in docs/feature.md."
+    depends_on:
+      - "step-1-id"
+
+  - name: "Step 3 — test"
+    agent:
+      instructions: "Write and run integration tests."
+    depends_on:
+      - "step-2-id"
+    retry:
+      max_attempts: 2
+      backoff: "linear"
+```
+
+`ParseFile` tries the batch format first; if no `tasks` key is present it falls back to single-task parsing.
+
+---
+
+## 7. State Constants
+
+| Constant | Value | Meaning |
+|-------------------|------------------|-------------------------------------------------------------------------|
+| `StatePending` | `PENDING` | Newly created; awaiting classification or human approval. |
+| `StateQueued` | `QUEUED` | Accepted and waiting for an available agent slot. |
+| `StateRunning` | `RUNNING` | Agent process is actively executing. |
+| `StateReady` | `READY` | Agent has paused and is awaiting human input (question / approval). |
+| `StateCompleted` | `COMPLETED` | Agent finished successfully. Terminal — no further transitions allowed. |
+| `StateFailed` | `FAILED` | Agent exited with a non-zero code or internal error. |
+| `StateTimedOut` | `TIMED_OUT` | Execution exceeded the configured `timeout`. |
+| `StateCancelled` | `CANCELLED` | Explicitly cancelled by the user or scheduler. |
+| `StateBudgetExceeded` | `BUDGET_EXCEEDED` | Agent hit the `max_budget_usd` cap before finishing. |
+| `StateBlocked` | `BLOCKED` | Waiting on a dependency task that has not yet completed. |
+
+---
+
+## 8. State Machine — Valid Transitions
+
+| From | To | Condition / trigger |
+|--------------------|--------------------|--------------------------------------------------------------|
+| `PENDING` | `QUEUED` | Task approved and eligible for scheduling. |
+| `PENDING` | `CANCELLED` | Cancelled before being queued. |
+| `QUEUED` | `RUNNING` | Agent slot becomes available; execution starts. |
+| `QUEUED` | `CANCELLED` | Cancelled while waiting in the queue. |
+| `RUNNING` | `READY` | Agent pauses and emits a question for human input. |
+| `RUNNING` | `COMPLETED` | Agent exits successfully. |
+| `RUNNING` | `FAILED` | Agent exits with an error. |
+| `RUNNING` | `TIMED_OUT` | Wall-clock time exceeds `timeout`. |
+| `RUNNING` | `CANCELLED` | Cancelled mid-execution. |
+| `RUNNING` | `BUDGET_EXCEEDED` | Cumulative cost exceeds `max_budget_usd`. |
+| `RUNNING` | `BLOCKED` | Runtime dependency check fails. |
+| `READY` | `COMPLETED` | Human answer accepted; task finishes. |
+| `READY` | `PENDING` | Answer rejected; task returns to pending for re-approval. |
+| `FAILED` | `QUEUED` | Retry requested (re-enqueue). |
+| `TIMED_OUT` | `QUEUED` | Retry or resume requested. |
+| `CANCELLED` | `QUEUED` | Restart requested. |
+| `BUDGET_EXCEEDED` | `QUEUED` | Retry with higher or no budget. |
+| `BLOCKED` | `QUEUED` | Dependency became satisfied; task re-queued. |
+| `BLOCKED` | `READY` | Dependency resolved but human review required. |
+| `COMPLETED` | *(none)* | Terminal state. |
+
+---
+
+## 9. Key Functions
+
+### `ParseFile(path string) ([]Task, error)`
+Reads a YAML file at `path` and returns one or more tasks. Tries batch format (`tasks:` key) first; falls back to single-task format. Auto-assigns UUIDs, default priority, default retry config, and sets `State = PENDING` on all returned tasks.
+
+### `Parse(data []byte) ([]Task, error)`
+Same as `ParseFile` but operates on raw bytes instead of a file path.
+
+### `ValidTransition(from, to State) bool`
+Returns `true` if the transition from `from` to `to` is permitted by the state machine. Used by `storage.DB.UpdateTaskState` to enforce transitions atomically inside a transaction.
+
+### `Validate(t *Task) error`
+Validates a task's fields. Returns `*ValidationError` (implementing `error`) with all failures collected, or `nil` if valid.
+
+---
+
+## 10. Validation Rules
+
+The `Validate` function enforces the following rules:
+
+| Rule | Details |
+|------|---------|
+| `name` required | Must be non-empty. |
+| `agent.instructions` required | Must be non-empty. |
+| `agent.max_budget_usd` non-negative | Must be `>= 0`. |
+| `timeout` non-negative | Must be `>= 0` (zero means no limit). |
+| `retry.max_attempts >= 1` | Must be at least 1. |
+| `retry.backoff` valid values | Must be empty, `"linear"`, or `"exponential"`. |
+| `priority` valid values | Must be empty, `"high"`, `"normal"`, or `"low"`. |
+| `agent.permission_mode` valid values | Must be empty or one of: `default`, `acceptEdits`, `bypassPermissions`, `plan`, `dontAsk`, `delegate`. |
+
+Multiple failures are collected and returned together in a single `ValidationError`.

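
The transition table in section 8 of the added `task.md` maps naturally onto a lookup table. The sketch below shows one way `ValidTransition(from, to State) bool` could be written from that table. The map layout and the name `validTransitions` are assumptions, and the real `internal/task/task.go` may be structured differently. It follows the `task.md` table only; the ADR-002 hunks in this commit word the `BLOCKED` triggers differently.

```go
package main

import "fmt"

// State mirrors the string-valued state constants listed in task.md section 7.
type State string

const (
	StatePending        State = "PENDING"
	StateQueued         State = "QUEUED"
	StateRunning        State = "RUNNING"
	StateReady          State = "READY"
	StateCompleted      State = "COMPLETED"
	StateFailed         State = "FAILED"
	StateTimedOut       State = "TIMED_OUT"
	StateCancelled      State = "CANCELLED"
	StateBudgetExceeded State = "BUDGET_EXCEEDED"
	StateBlocked        State = "BLOCKED"
)

// validTransitions is a hypothetical encoding of the section 8 table:
// outer key = from state, inner key = permitted to state.
var validTransitions = map[State]map[State]bool{
	StatePending: {StateQueued: true, StateCancelled: true},
	StateQueued:  {StateRunning: true, StateCancelled: true},
	StateRunning: {
		StateReady: true, StateCompleted: true, StateFailed: true,
		StateTimedOut: true, StateCancelled: true,
		StateBudgetExceeded: true, StateBlocked: true,
	},
	StateReady:          {StateCompleted: true, StatePending: true},
	StateFailed:         {StateQueued: true},
	StateTimedOut:       {StateQueued: true},
	StateCancelled:      {StateQueued: true},
	StateBudgetExceeded: {StateQueued: true},
	StateBlocked:        {StateQueued: true, StateReady: true},
	StateCompleted:      {}, // terminal: no outgoing transitions
}

// ValidTransition reports whether from -> to appears in the table.
// Lookups on a missing inner map safely yield false.
func ValidTransition(from, to State) bool {
	return validTransitions[from][to]
}

func main() {
	fmt.Println(ValidTransition(StateBudgetExceeded, StateQueued))
	fmt.Println(ValidTransition(StateCompleted, StateQueued))
}
```

A nested map keeps the function a single lookup and lets a test iterate over all state pairs to compare against the documented table.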