| -rw-r--r-- | docs/superpowers/plans/2026-04-04-task-checker-story-ship.md | 1226 |
1 files changed, 1226 insertions, 0 deletions
diff --git a/docs/superpowers/plans/2026-04-04-task-checker-story-ship.md b/docs/superpowers/plans/2026-04-04-task-checker-story-ship.md new file mode 100644 index 0000000..021405f --- /dev/null +++ b/docs/superpowers/plans/2026-04-04-task-checker-story-ship.md @@ -0,0 +1,1226 @@ +# Task Checker Agent and Story Ship Gate — Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add an async per-task checker agent that auto-accepts passing tasks, and replace the auto-deploy story trigger with an explicit human "Ship" action. + +**Architecture:** Checker tasks are regular pool tasks with a new `checker_for_task_id` field; when they complete successfully the pool auto-accepts the linked task. `checkStoryCompletion` still transitions stories to SHIPPABLE but no longer fires the deploy — a new `POST /api/stories/{id}/ship` endpoint and "Ship" button do that instead. Story elaboration is extended to produce `acceptance_criteria` per task. 
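The checker linkage described in the Architecture paragraph can be sketched in isolation. This is a minimal in-memory illustration, not the Claudomator code: `memStore`, `onTaskSuccess`, and the trimmed `Task` type are hypothetical names introduced here, and subtask handling is left out. Only the `checker_for_task_id` link and the READY/COMPLETED state names come from the plan.

```go
package main

import "fmt"

// State mirrors the two task states this sketch needs (names taken from the plan).
type State string

const (
	StateReady     State = "READY"
	StateCompleted State = "COMPLETED"
)

// Task is a trimmed, hypothetical stand-in for task.Task.
type Task struct {
	ID               string
	CheckerForTaskID string // non-empty only on checker tasks
	State            State
}

// memStore is a hypothetical in-memory replacement for the SQLite store.
type memStore map[string]*Task

// onTaskSuccess applies the rule from the Architecture paragraph: a checker
// that succeeds completes itself and auto-accepts the task it points at;
// any other task becomes READY and waits for its checker (or a human).
func onTaskSuccess(s memStore, t *Task) {
	if t.CheckerForTaskID == "" {
		t.State = StateReady
		return
	}
	t.State = StateCompleted
	if checked, ok := s[t.CheckerForTaskID]; ok {
		checked.State = StateCompleted
	}
}

func main() {
	s := memStore{}
	work := &Task{ID: "work-1"}
	s[work.ID] = work
	onTaskSuccess(s, work)
	fmt.Println(work.ID, work.State) // work-1 READY

	chk := &Task{ID: "chk-1", CheckerForTaskID: work.ID}
	s[chk.ID] = chk
	onTaskSuccess(s, chk)
	fmt.Println(work.ID, work.State) // work-1 COMPLETED
}
```

Keeping the link on the checker side means the checked task needs no extra state of its own: a failed checker simply leaves it in READY for a human to review.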
+ +**Tech Stack:** Go 1.25, SQLite (database/sql + go-sqlite3), vanilla JS (no framework) + +--- + +## File Map + +| File | Change | +|---|---| +| `internal/task/task.go` | Add `AcceptanceCriteria`, `CheckerForTaskID`, `CheckerReport` fields to `Task` | +| `internal/storage/db.go` | 3 migrations; extend `CreateTask`, `scanTask`, all SELECT queries; add `UpdateTaskCheckerReport`, `GetCheckerTask` | +| `internal/executor/executor.go` | Add 2 methods to `Store` interface; add `spawnCheckerTask`; modify `handleRunResult`; guard `checkStoryCompletion`; remove auto-deploy; add `ShipStory` | +| `internal/api/server.go` | Register `POST /api/stories/{id}/ship` | +| `internal/api/stories.go` | Add `handleShipStory`; pass `AcceptanceCriteria` in `handleApproveStory` | +| `internal/api/elaborate.go` | Add `AcceptanceCriteria` to `elaboratedStoryTask`; update `buildStoryElaboratePrompt` | +| `web/app.js` | Ship button on SHIPPABLE story cards; checker report on READY task cards | + +--- + +## Task 1: Task struct — three new fields + +**Files:** +- Modify: `internal/task/task.go` + +- [ ] **Step 1: Add fields to Task struct** + +In `internal/task/task.go`, add three fields after `StoryID`: + +```go +type Task struct { + ID string `yaml:"id" json:"id"` + ParentTaskID string `yaml:"parent_task_id" json:"parent_task_id"` + Name string `yaml:"name" json:"name"` + Description string `yaml:"description" json:"description"` + Project string `yaml:"project" json:"project"` + RepositoryURL string `yaml:"repository_url" json:"repository_url"` + Agent AgentConfig `yaml:"agent" json:"agent"` + Timeout Duration `yaml:"timeout" json:"timeout"` + Retry RetryConfig `yaml:"retry" json:"retry"` + Priority Priority `yaml:"priority" json:"priority"` + Tags []string `yaml:"tags" json:"tags"` + DependsOn []string `yaml:"depends_on" json:"depends_on"` + StoryID string `yaml:"-" json:"story_id,omitempty"` + BranchName string `yaml:"-" json:"branch_name,omitempty"` + AcceptanceCriteria string `yaml:"-" 
json:"acceptance_criteria,omitempty"` + CheckerForTaskID string `yaml:"-" json:"checker_for_task_id,omitempty"` + CheckerReport string `yaml:"-" json:"checker_report,omitempty"` + State State `yaml:"-" json:"state"` + RejectionComment string `yaml:"-" json:"rejection_comment,omitempty"` + QuestionJSON string `yaml:"-" json:"question,omitempty"` + ElaborationInput string `yaml:"-" json:"elaboration_input,omitempty"` + Summary string `yaml:"-" json:"summary,omitempty"` + Interactions []Interaction `yaml:"-" json:"interactions,omitempty"` + CreatedAt time.Time `yaml:"-" json:"created_at"` + UpdatedAt time.Time `yaml:"-" json:"updated_at"` +} +``` + +- [ ] **Step 2: Build to verify no compilation errors** + +```bash +cd /workspace/claudomator && go build ./... +``` + +Expected: no output (success). + +- [ ] **Step 3: Commit** + +```bash +git add internal/task/task.go +git commit -m "feat: add AcceptanceCriteria, CheckerForTaskID, CheckerReport to Task struct" +``` + +--- + +## Task 2: Storage — migrations, queries, two new methods + +**Files:** +- Modify: `internal/storage/db.go` +- Test: `internal/storage/db_test.go` + +- [ ] **Step 1: Write failing tests for the two new storage methods** + +Find the existing test file and add at the end: + +```go +func TestUpdateTaskCheckerReport(t *testing.T) { + db := openTestDB(t) + tk := &task.Task{ + ID: "cr-1", Name: "orig", RepositoryURL: "https://github.com/x/y", + Agent: task.AgentConfig{Type: "claude", Instructions: "x"}, + Priority: task.PriorityNormal, + Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, + Tags: []string{}, DependsOn: []string{}, + State: task.StatePending, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), + } + if err := db.CreateTask(tk); err != nil { + t.Fatalf("CreateTask: %v", err) + } + if err := db.UpdateTaskCheckerReport("cr-1", "Tests failed: missing endpoint"); err != nil { + t.Fatalf("UpdateTaskCheckerReport: %v", err) + } + got, err := db.GetTask("cr-1") + if err != nil { + 
t.Fatalf("GetTask: %v", err) + } + if got.CheckerReport != "Tests failed: missing endpoint" { + t.Errorf("expected checker report, got %q", got.CheckerReport) + } +} + +func TestGetCheckerTask(t *testing.T) { + db := openTestDB(t) + checked := &task.Task{ + ID: "chk-orig", Name: "orig", RepositoryURL: "https://github.com/x/y", + Agent: task.AgentConfig{Type: "claude", Instructions: "x"}, + Priority: task.PriorityNormal, + Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, + Tags: []string{}, DependsOn: []string{}, + State: task.StatePending, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), + } + if err := db.CreateTask(checked); err != nil { + t.Fatalf("CreateTask checked: %v", err) + } + checker := &task.Task{ + ID: "chk-checker", Name: "Check: orig", CheckerForTaskID: "chk-orig", + RepositoryURL: "https://github.com/x/y", + Agent: task.AgentConfig{Type: "claude", Instructions: "validate"}, + Priority: task.PriorityNormal, + Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, + Tags: []string{}, DependsOn: []string{}, + State: task.StatePending, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), + } + if err := db.CreateTask(checker); err != nil { + t.Fatalf("CreateTask checker: %v", err) + } + + // Should find the checker task. + got, err := db.GetCheckerTask("chk-orig") + if err != nil { + t.Fatalf("GetCheckerTask: %v", err) + } + if got == nil || got.ID != "chk-checker" { + t.Errorf("expected checker task ID chk-checker, got %v", got) + } + + // Should return nil when no checker exists. + none, err := db.GetCheckerTask("nonexistent") + if err != nil { + t.Fatalf("GetCheckerTask nonexistent: %v", err) + } + if none != nil { + t.Errorf("expected nil for task with no checker, got %v", none) + } +} +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +cd /workspace/claudomator && go test ./internal/storage/... 
-run "TestUpdateTaskCheckerReport|TestGetCheckerTask" -v +``` + +Expected: FAIL — `db.UpdateTaskCheckerReport undefined`, `db.GetCheckerTask undefined`. + +- [ ] **Step 3: Add three migrations to `db.go`** + +In the `migrations` slice in `migrate()`, append after the `ALTER TABLE tasks ADD COLUMN story_id TEXT` entry: + +```go +`ALTER TABLE tasks ADD COLUMN acceptance_criteria TEXT NOT NULL DEFAULT ''`, +`ALTER TABLE tasks ADD COLUMN checker_for_task_id TEXT NOT NULL DEFAULT ''`, +`ALTER TABLE tasks ADD COLUMN checker_report TEXT NOT NULL DEFAULT ''`, +``` + +- [ ] **Step 4: Update `CreateTask` INSERT to include the three new columns** + +Replace the `INSERT INTO tasks` statement in `CreateTask`: + +```go +_, err = s.db.Exec(` + INSERT INTO tasks (id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, story_id, acceptance_criteria, checker_for_task_id, checker_report) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, + t.ID, t.Name, t.Description, t.ElaborationInput, t.Project, t.RepositoryURL, string(configJSON), string(t.Priority), + t.Timeout.Duration.Nanoseconds(), string(retryJSON), string(tagsJSON), string(depsJSON), + t.ParentTaskID, string(t.State), t.CreatedAt.UTC(), t.UpdatedAt.UTC(), t.StoryID, + t.AcceptanceCriteria, t.CheckerForTaskID, t.CheckerReport, + ) +``` + +- [ ] **Step 5: Update `scanTask` to scan the three new columns** + +`scanTask` currently declares local vars and calls `row.Scan(...)` with 21 positional arguments. Add three new vars and extend the scan. 
The new `var` block: + +```go +func scanTask(row scanner) (*task.Task, error) { + var ( + t task.Task + configJSON string + retryJSON string + tagsJSON string + depsJSON string + state string + priority string + timeoutNS int64 + parentTaskID sql.NullString + elaborationInput sql.NullString + project sql.NullString + repositoryURL sql.NullString + rejectionComment sql.NullString + questionJSON sql.NullString + summary sql.NullString + interactionsJSON sql.NullString + storyID sql.NullString + acceptanceCriteria sql.NullString + checkerForTaskID sql.NullString + checkerReport sql.NullString + ) + err := row.Scan( + &t.ID, &t.Name, &t.Description, &elaborationInput, &project, &repositoryURL, + &configJSON, &priority, &timeoutNS, &retryJSON, &tagsJSON, &depsJSON, + &parentTaskID, &state, &t.CreatedAt, &t.UpdatedAt, + &rejectionComment, &questionJSON, &summary, &interactionsJSON, &storyID, + &acceptanceCriteria, &checkerForTaskID, &checkerReport, + ) + t.ParentTaskID = parentTaskID.String + t.ElaborationInput = elaborationInput.String + t.Project = project.String + t.RepositoryURL = repositoryURL.String + t.RejectionComment = rejectionComment.String + t.QuestionJSON = questionJSON.String + t.Summary = summary.String + t.StoryID = storyID.String + t.AcceptanceCriteria = acceptanceCriteria.String + t.CheckerForTaskID = checkerForTaskID.String + t.CheckerReport = checkerReport.String + // ... rest of function unchanged +``` + +- [ ] **Step 6: Update all SELECT queries to include the three new columns** + +There are five SELECT statements that need `acceptance_criteria, checker_for_task_id, checker_report` appended to the column list. The pattern to find: every query with `story_id FROM tasks`. 
Update each one: + +In `GetTask` (line ~185): +```go +row := s.db.QueryRow(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE id = ?`, id) +``` + +In `ListTasks` (line ~191): +```go +query := `SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE 1=1` +``` + +In `ListSubtasks` (line ~227): +```go +rows, err := s.db.Query(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE parent_task_id = ? 
ORDER BY created_at ASC`, parentID) +``` + +In `ResetTaskForRetry` (line ~280): +```go +t, err := scanTask(tx.QueryRow(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE id = ?`, id)) +``` + +In `ListTasksByStory` (line ~1202): +```go +`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE story_id = ? ORDER BY created_at ASC`, +``` + +- [ ] **Step 7: Add `UpdateTaskCheckerReport`** + +Add after `UpdateTaskSummary`: + +```go +// UpdateTaskCheckerReport sets the checker_report field on a task. +func (s *DB) UpdateTaskCheckerReport(id, report string) error { + now := time.Now().UTC() + _, err := s.db.Exec(`UPDATE tasks SET checker_report = ?, updated_at = ? WHERE id = ?`, report, now, id) + return err +} +``` + +- [ ] **Step 8: Add `GetCheckerTask`** + +Add after `UpdateTaskCheckerReport`: + +```go +// GetCheckerTask returns the checker task for the given checked task ID, +// or nil if no checker task exists. +func (s *DB) GetCheckerTask(checkedTaskID string) (*task.Task, error) { + row := s.db.QueryRow(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE checker_for_task_id = ? 
LIMIT 1`, checkedTaskID) + t, err := scanTask(row) + if err == sql.ErrNoRows { + return nil, nil + } + return t, err +} +``` + +- [ ] **Step 9: Run the failing tests to verify they pass** + +```bash +cd /workspace/claudomator && go test ./internal/storage/... -run "TestUpdateTaskCheckerReport|TestGetCheckerTask" -v +``` + +Expected: PASS. + +- [ ] **Step 10: Run full storage tests** + +```bash +cd /workspace/claudomator && go test ./internal/storage/... -v +``` + +Expected: all PASS. + +- [ ] **Step 11: Commit** + +```bash +git add internal/storage/db.go internal/storage/db_test.go +git commit -m "feat: add checker task columns, UpdateTaskCheckerReport, GetCheckerTask" +``` + +--- + +## Task 3: Executor — checker task spawn and completion handling + +**Files:** +- Modify: `internal/executor/executor.go` +- Modify: `internal/executor/executor_test.go` + +- [ ] **Step 1: Add two methods to executor's `Store` interface** + +In `executor.go`, the `Store` interface (around line 22). Add after `CreateTask`: + +```go +UpdateTaskCheckerReport(id, report string) error +GetCheckerTask(checkedTaskID string) (*task.Task, error) +``` + +- [ ] **Step 2: Write failing tests** + +In `executor_test.go`, add: + +```go +func TestPool_CheckerSpawned_OnReady(t *testing.T) { + store := testStore(t) + runner := &mockRunner{} // succeeds instantly + pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) + + tk := makeTask("checker-spawn-1") + tk.RepositoryURL = "https://github.com/x/y" + store.CreateTask(tk) + pool.Submit(context.Background(), tk) + <-pool.Results() // wait for original task to finish + + // Give the async spawnCheckerTask goroutine a moment to run. 
+ time.Sleep(200 * time.Millisecond) + + checker, err := store.GetCheckerTask("checker-spawn-1") + if err != nil { + t.Fatalf("GetCheckerTask: %v", err) + } + if checker == nil { + t.Fatal("expected a checker task to be created, got nil") + } + if checker.CheckerForTaskID != "checker-spawn-1" { + t.Errorf("expected CheckerForTaskID=checker-spawn-1, got %q", checker.CheckerForTaskID) + } +} + +func TestPool_CheckerNotSpawned_ForSubtask(t *testing.T) { + store := testStore(t) + runner := &mockRunner{} + pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) + + parent := makeTask("no-checker-parent") + parent.RepositoryURL = "https://github.com/x/y" + store.CreateTask(parent) + + sub := makeTask("no-checker-sub") + sub.ParentTaskID = "no-checker-parent" + sub.RepositoryURL = "https://github.com/x/y" + store.CreateTask(sub) + + pool.Submit(context.Background(), sub) + <-pool.Results() + + time.Sleep(100 * time.Millisecond) + + checker, err := store.GetCheckerTask("no-checker-sub") + if err != nil { + t.Fatalf("GetCheckerTask: %v", err) + } + if checker != nil { + t.Error("expected no checker for subtask, but one was created") + } +} + +func TestPool_CheckerPass_AutoAcceptsTask(t *testing.T) { + store := testStore(t) + // Two-phase: first runner succeeds (original task), second also succeeds (checker). + callCount := 0 + runner := &mockRunner{ + onRun: func(t *task.Task, e *storage.Execution) error { + callCount++ + return nil // both original and checker succeed + }, + } + pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) + + tk := makeTask("autoaccept-1") + tk.RepositoryURL = "https://github.com/x/y" + store.CreateTask(tk) + pool.Submit(context.Background(), tk) + <-pool.Results() // original finishes → READY + checker spawned + + // Wait for checker to run and complete. 
+ deadline := time.Now().Add(5 * time.Second) + for time.Now().Before(deadline) { + got, _ := store.GetTask("autoaccept-1") + if got != nil && got.State == task.StateCompleted { + break + } + <-pool.Results() + } + + got, err := store.GetTask("autoaccept-1") + if err != nil { + t.Fatalf("GetTask: %v", err) + } + if got.State != task.StateCompleted { + t.Errorf("expected COMPLETED after checker pass, got %s", got.State) + } +} + +func TestPool_CheckerFail_AttachesReport(t *testing.T) { + store := testStore(t) + callCount := 0 + runner := &mockRunner{ + onRun: func(t *task.Task, e *storage.Execution) error { + callCount++ + if t.CheckerForTaskID != "" { + return fmt.Errorf("test suite failed: 3 failures") + } + return nil // original task succeeds + }, + } + pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) + + tk := makeTask("fail-checker-1") + tk.RepositoryURL = "https://github.com/x/y" + store.CreateTask(tk) + pool.Submit(context.Background(), tk) + <-pool.Results() // original → READY + + // Wait for checker to fail. + deadline := time.Now().Add(5 * time.Second) + for time.Now().Before(deadline) { + got, _ := store.GetTask("fail-checker-1") + if got != nil && got.CheckerReport != "" { + break + } + select { + case <-pool.Results(): + case <-time.After(100 * time.Millisecond): + } + } + + got, err := store.GetTask("fail-checker-1") + if err != nil { + t.Fatalf("GetTask: %v", err) + } + if got.State != task.StateReady { + t.Errorf("expected task to stay READY after checker fail, got %s", got.State) + } + if got.CheckerReport == "" { + t.Error("expected checker_report to be set after checker failure") + } +} +``` + +- [ ] **Step 3: Run tests to verify they fail** + +```bash +cd /workspace/claudomator && go test ./internal/executor/... 
-run "TestPool_Checker" -v 2>&1 | head -30 +``` + +Expected: FAIL — `store.UpdateTaskCheckerReport undefined`, `store.GetCheckerTask undefined`, `spawnCheckerTask undefined`. + +- [ ] **Step 4: Add `spawnCheckerTask` to `executor.go`** + +Add this function after `checkStoryCompletion`: + +```go +// spawnCheckerTask creates and submits a checker task for the given completed task. +// Guards: not called for subtasks, checker tasks, or tasks that already have a checker. +func (p *Pool) spawnCheckerTask(ctx context.Context, checked *task.Task) { + // Never spawn a checker for subtasks or checker tasks themselves. + if checked.ParentTaskID != "" || checked.CheckerForTaskID != "" { + return + } + // Idempotent: don't create a second checker if one already exists. + existing, err := p.store.GetCheckerTask(checked.ID) + if err != nil { + p.logger.Error("spawnCheckerTask: GetCheckerTask failed", "taskID", checked.ID, "error", err) + return + } + if existing != nil { + return + } + + criteria := checked.AcceptanceCriteria + if criteria == "" { + criteria = checked.Agent.Instructions + } + + instructions := fmt.Sprintf(`You are validating a completed task. Do not make any changes to the code or repository. + +Task: %s +Instructions given to the implementor: +%s + +Acceptance criteria: +%s + +Steps: +1. Clone the repository and review the changes made. +2. Verify each acceptance criterion is met. Run tests or make HTTP requests as needed. +3. If all criteria are satisfied, exit normally (success). +4. 
If any criterion is not met, use the Bash tool to exit with a non-zero code: + bash -c "exit 1" + Before exiting, write a brief summary of what failed.`, checked.Name, checked.Agent.Instructions, criteria) + + now := time.Now().UTC() + checker := &task.Task{ + ID: uuid.New().String(), + Name: "Check: " + checked.Name, + CheckerForTaskID: checked.ID, + RepositoryURL: checked.RepositoryURL, + Agent: task.AgentConfig{ + Type: "claude", + Instructions: instructions, + MaxBudgetUSD: 0.50, + AllowedTools: []string{"Bash", "Read", "Glob", "Grep"}, + }, + Timeout: task.Duration{Duration: 10 * time.Minute}, + Priority: task.PriorityNormal, + Tags: []string{}, + DependsOn: []string{}, + Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, + State: task.StatePending, + CreatedAt: now, + UpdatedAt: now, + } + + if err := p.store.CreateTask(checker); err != nil { + p.logger.Error("spawnCheckerTask: CreateTask failed", "error", err) + return + } + checker.State = task.StateQueued + if err := p.store.UpdateTaskState(checker.ID, task.StateQueued); err != nil { + p.logger.Error("spawnCheckerTask: UpdateTaskState failed", "error", err) + return + } + if err := p.Submit(ctx, checker); err != nil { + p.logger.Error("spawnCheckerTask: Submit failed", "error", err) + } +} +``` + +- [ ] **Step 5: Modify `handleRunResult` — success path** + +Find the success branch in `handleRunResult` (the `} else {` block after all the error handling). Currently it looks like: + +```go +} else { + p.mu.Lock() + p.consecutiveFailures[agentType] = 0 + p.mu.Unlock() + if t.ParentTaskID == "" { + subtasks, subErr := p.store.ListSubtasks(t.ID) + // ... + if subErr == nil && len(subtasks) > 0 { + exec.Status = "BLOCKED" + if err := p.store.UpdateTaskState(t.ID, task.StateBlocked); err != nil { ... } + } else { + exec.Status = "READY" + if err := p.store.UpdateTaskState(t.ID, task.StateReady); err != nil { ... 
} + } + } else { + exec.Status = "COMPLETED" + if err := p.store.UpdateTaskState(t.ID, task.StateCompleted); err != nil { ... } + p.maybeUnblockParent(t.ParentTaskID) + } + if t.StoryID != "" { + // ...checkStoryCompletion / checkValidationResult + } +} +``` + +Replace it with: + +```go +} else { + p.mu.Lock() + p.consecutiveFailures[agentType] = 0 + p.mu.Unlock() + if t.CheckerForTaskID != "" { + // Checker task succeeded — auto-accept the checked task. + exec.Status = "COMPLETED" + if err := p.store.UpdateTaskState(t.ID, task.StateCompleted); err != nil { + p.logger.Error("handleRunResult: failed to complete checker task", "taskID", t.ID, "error", err) + } + checkedTask, getErr := p.store.GetTask(t.CheckerForTaskID) + if getErr == nil { + if acceptErr := p.store.UpdateTaskState(t.CheckerForTaskID, task.StateCompleted); acceptErr != nil { + p.logger.Error("handleRunResult: failed to auto-accept checked task", "taskID", t.CheckerForTaskID, "error", acceptErr) + } else if checkedTask.StoryID != "" { + go p.checkStoryCompletion(ctx, checkedTask.StoryID) + } + } else { + p.logger.Error("handleRunResult: failed to get checked task", "taskID", t.CheckerForTaskID, "error", getErr) + } + } else if t.ParentTaskID == "" { + subtasks, subErr := p.store.ListSubtasks(t.ID) + if subErr != nil { + p.logger.Error("failed to list subtasks", "taskID", t.ID, "error", subErr) + } + if subErr == nil && len(subtasks) > 0 { + exec.Status = "BLOCKED" + if err := p.store.UpdateTaskState(t.ID, task.StateBlocked); err != nil { + p.logger.Error("failed to update task state", "taskID", t.ID, "state", task.StateBlocked, "error", err) + } + } else { + exec.Status = "READY" + if err := p.store.UpdateTaskState(t.ID, task.StateReady); err != nil { + p.logger.Error("failed to update task state", "taskID", t.ID, "state", task.StateReady, "error", err) + } + go p.spawnCheckerTask(ctx, t) + } + } else { + exec.Status = "COMPLETED" + if err := p.store.UpdateTaskState(t.ID, task.StateCompleted); err != 
nil { + p.logger.Error("failed to update task state", "taskID", t.ID, "state", task.StateCompleted, "error", err) + } + p.maybeUnblockParent(t.ParentTaskID) + } + if t.StoryID != "" { + storyID := t.StoryID + go func() { + story, getErr := p.store.GetStory(storyID) + if getErr != nil { + p.logger.Error("handleRunResult: failed to get story", "storyID", storyID, "error", getErr) + return + } + if story.Status == task.StoryValidating { + p.checkValidationResult(ctx, storyID, task.StateCompleted, "") + } else { + p.checkStoryCompletion(ctx, storyID) + } + }() + } +} +``` + +- [ ] **Step 6: Modify `handleRunResult` — failure path, attach checker report** + +In the generic failure `else` case (where `exec.Status = "FAILED"` is set and `consecutiveFailures` is incremented), add after the failures increment: + +```go +p.mu.Lock() +p.consecutiveFailures[agentType]++ +p.mu.Unlock() +// If this is a checker task, attach the failure report to the checked task. +if t.CheckerForTaskID != "" { + report := exec.ErrorMsg + if reportErr := p.store.UpdateTaskCheckerReport(t.CheckerForTaskID, report); reportErr != nil { + p.logger.Error("handleRunResult: failed to set checker report", "taskID", t.CheckerForTaskID, "error", reportErr) + } +} +``` + +Also update the checker report after summary extraction (around the `summary := exec.Summary` block), to prefer summary over error message when available. After the summary is resolved, add: + +```go +if t.CheckerForTaskID != "" && exec.Status == "FAILED" && summary != "" { + // Overwrite the initial error-message report with the richer summary. + if reportErr := p.store.UpdateTaskCheckerReport(t.CheckerForTaskID, summary); reportErr != nil { + p.logger.Error("handleRunResult: failed to update checker report with summary", "taskID", t.CheckerForTaskID, "error", reportErr) + } +} +``` + +- [ ] **Step 7: Run the checker tests** + +```bash +cd /workspace/claudomator && go test ./internal/executor/... 
-run "TestPool_Checker" -v -timeout 30s +``` + +Expected: all PASS. + +- [ ] **Step 8: Run full executor tests** + +```bash +cd /workspace/claudomator && go test ./internal/executor/... -race -timeout 120s +``` + +Expected: all PASS. + +- [ ] **Step 9: Commit** + +```bash +git add internal/executor/executor.go internal/executor/executor_test.go +git commit -m "feat: spawn checker task on READY; auto-accept on pass; attach report on fail" +``` + +--- + +## Task 4: Story ship gate — remove auto-deploy, add explicit ship endpoint + +**Files:** +- Modify: `internal/executor/executor.go` +- Modify: `internal/api/server.go` +- Modify: `internal/api/stories.go` +- Test: `internal/api/server_test.go` + +- [ ] **Step 1: Write failing test for ship endpoint** + +In `internal/api/server_test.go`, add: + +```go +func TestShipStory_ShippableStory_Returns202(t *testing.T) { + srv, store := testServer(t) + + // Create a project with a deploy script (empty path — deploy will fail but that's OK for this test). 
+ proj := &task.Project{ + ID: "ship-proj-1", Name: "test", RemoteURL: "https://github.com/x/y", + Type: "web", DeployScript: "", + CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), + } + if err := store.CreateProject(proj); err != nil { + t.Fatalf("CreateProject: %v", err) + } + + story := &task.Story{ + ID: "ship-story-1", Name: "Ship Test", ProjectID: "ship-proj-1", + Status: task.StoryShippable, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), + } + if err := store.CreateStory(story); err != nil { + t.Fatalf("CreateStory: %v", err) + } + + req := httptest.NewRequest("POST", "/api/stories/ship-story-1/ship", nil) + w := httptest.NewRecorder() + srv.Handler().ServeHTTP(w, req) + + if w.Code != http.StatusAccepted { + t.Errorf("expected 202, got %d: %s", w.Code, w.Body.String()) + } +} + +func TestShipStory_NonShippable_Returns409(t *testing.T) { + srv, store := testServer(t) + + story := &task.Story{ + ID: "nonship-1", Name: "Not Ready", ProjectID: "", + Status: task.StoryInProgress, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), + } + if err := store.CreateStory(story); err != nil { + t.Fatalf("CreateStory: %v", err) + } + + req := httptest.NewRequest("POST", "/api/stories/nonship-1/ship", nil) + w := httptest.NewRecorder() + srv.Handler().ServeHTTP(w, req) + + if w.Code != http.StatusConflict { + t.Errorf("expected 409, got %d", w.Code) + } +} +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +cd /workspace/claudomator && go test ./internal/api/... -run "TestShipStory" -v +``` + +Expected: FAIL — `404 page not found` (route doesn't exist yet). 
+ +- [ ] **Step 3: Remove auto-deploy from `checkStoryCompletion` and add status guard** + +In `executor.go`, replace `checkStoryCompletion`: + +```go +func (p *Pool) checkStoryCompletion(ctx context.Context, storyID string) { + story, err := p.store.GetStory(storyID) + if err != nil { + p.logger.Error("checkStoryCompletion: failed to get story", "storyID", storyID, "error", err) + return + } + if story.Status != task.StoryInProgress { + return // already SHIPPABLE or beyond — nothing to do + } + tasks, err := p.store.ListTasksByStory(storyID) + if err != nil { + p.logger.Error("checkStoryCompletion: failed to list tasks", "storyID", storyID, "error", err) + return + } + if len(tasks) == 0 { + return + } + topLevelCount := 0 + for _, t := range tasks { + if t.ParentTaskID != "" { + continue // subtasks are covered by their parent + } + topLevelCount++ + if t.State != task.StateCompleted && t.State != task.StateReady { + return // not all top-level tasks done + } + } + if topLevelCount == 0 { + return + } + if err := p.store.UpdateStoryStatus(storyID, task.StoryShippable); err != nil { + p.logger.Error("checkStoryCompletion: failed to update story status", "storyID", storyID, "error", err) + return + } + p.logger.Info("story transitioned to SHIPPABLE", "storyID", storyID) + // Deploy is now triggered explicitly by the human via POST /api/stories/{id}/ship. +} +``` + +- [ ] **Step 4: Add `ShipStory` to Pool** + +Add after `checkStoryCompletion`: + +```go +// ShipStory merges the story branch and runs the deploy script. +// Returns an error if the story is not in SHIPPABLE state. 
+func (p *Pool) ShipStory(ctx context.Context, storyID string) error { + story, err := p.store.GetStory(storyID) + if err != nil { + return fmt.Errorf("story not found: %w", err) + } + if story.Status != task.StoryShippable { + return fmt.Errorf("story is not SHIPPABLE (current status: %s)", story.Status) + } + go p.triggerStoryDeploy(ctx, storyID) + return nil +} +``` + +- [ ] **Step 5: Register the route in `server.go`** + +In the `routes()` method, after the existing story routes, add: + +```go +s.mux.HandleFunc("POST /api/stories/{id}/ship", s.handleShipStory) +``` + +- [ ] **Step 6: Add `handleShipStory` to `stories.go`** + +Add at the end of `stories.go`: + +```go +// handleShipStory triggers the merge + deploy for a SHIPPABLE story. +// POST /api/stories/{id}/ship +func (s *Server) handleShipStory(w http.ResponseWriter, r *http.Request) { + id := r.PathValue("id") + if err := s.pool.ShipStory(r.Context(), id); err != nil { + writeJSON(w, http.StatusConflict, map[string]string{"error": err.Error()}) + return + } + writeJSON(w, http.StatusAccepted, map[string]string{"message": "story shipping initiated", "story_id": id}) +} +``` + +- [ ] **Step 7: Run the ship tests** + +```bash +cd /workspace/claudomator && go test ./internal/api/... -run "TestShipStory" -v +``` + +Expected: both PASS. + +- [ ] **Step 8: Run full test suite** + +```bash +cd /workspace/claudomator && go test ./... -race -timeout 120s +``` + +Expected: all PASS. 
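One point that may be worth checking in the ship path: `handleShipStory` passes `r.Context()` to `ShipStory`, which starts `triggerStoryDeploy` in a goroutine. `net/http` cancels the request context once the handler returns, so if `triggerStoryDeploy` honours cancellation (its body is not shown in this plan), the deploy could be cut short. The sketch below reproduces the effect with stand-in names: `deploy` and `shipAsync` are hypothetical, and `context.WithoutCancel` (Go 1.21+) is shown as one possible way to detach, not as the plan's chosen fix.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// deploy is a hypothetical stand-in for triggerStoryDeploy: it does a little
// "work" and reports whether its context was cancelled first.
func deploy(ctx context.Context) string {
	select {
	case <-ctx.Done():
		return "cancelled"
	case <-time.After(50 * time.Millisecond):
		return "deployed"
	}
}

// shipAsync mimics a handler that starts the deploy in a goroutine and then
// returns. Cancelling reqCtx afterwards models net/http cancelling the
// request context once ServeHTTP returns.
func shipAsync(detach bool) string {
	reqCtx, cancel := context.WithCancel(context.Background())
	ctx := reqCtx
	if detach {
		ctx = context.WithoutCancel(reqCtx) // keeps values, drops cancellation
	}
	result := make(chan string, 1)
	go func() { result <- deploy(ctx) }()
	cancel() // the "handler" has returned
	return <-result
}

func main() {
	fmt.Println(shipAsync(false)) // cancelled
	fmt.Println(shipAsync(true))  // deployed
}
```

If the deploy should instead stop on server shutdown, a long-lived context owned by the pool would be another option; which one fits depends on how `triggerStoryDeploy` uses its context.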

- [ ] **Step 9: Commit**

```bash
git add internal/executor/executor.go internal/api/server.go internal/api/stories.go internal/api/server_test.go
git commit -m "feat: story ship gate - explicit POST /api/stories/{id}/ship; remove auto-deploy"
```

---

## Task 5: Elaborator — acceptance criteria per story task

**Files:**
- Modify: `internal/api/elaborate.go`
- Modify: `internal/api/stories.go`
- Test: `internal/api/stories_test.go`

- [ ] **Step 1: Write failing test**

In `internal/api/stories_test.go`, find (or add) a test for story approval and verify that acceptance criteria flow through to the stored task:

```go
func TestApproveStory_AcceptanceCriteriaStored(t *testing.T) {
	srv, store := testServer(t)

	proj := &task.Project{
		ID: "ac-proj", Name: "test", RemoteURL: "https://github.com/x/y",
		Type: "web", DeployScript: "",
		CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(),
	}
	store.CreateProject(proj)

	body := `{
		"name": "AC Story",
		"branch_name": "story/ac-test",
		"project_id": "ac-proj",
		"tasks": [
			{
				"name": "Add feature",
				"instructions": "implement the thing",
				"acceptance_criteria": "run go test ./... and verify all pass",
				"subtasks": []
			}
		],
		"validation": {"type": "test", "steps": [], "success_criteria": "tests pass"}
	}`
	req := httptest.NewRequest("POST", "/api/stories/approve", strings.NewReader(body))
	req.Header.Set("Content-Type", "application/json")
	w := httptest.NewRecorder()
	srv.Handler().ServeHTTP(w, req)

	if w.Code != http.StatusCreated {
		t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
	}

	var resp struct {
		TaskIDs []string `json:"task_ids"`
	}
	// Fail loudly on a malformed response instead of reporting "no task_ids".
	if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
		t.Fatalf("decode response: %v", err)
	}
	if len(resp.TaskIDs) == 0 {
		t.Fatal("expected task_ids in response")
	}

	tk, err := store.GetTask(resp.TaskIDs[0])
	if err != nil {
		t.Fatalf("GetTask: %v", err)
	}
	if tk.AcceptanceCriteria != "run go test ./... and verify all pass" {
		t.Errorf("expected acceptance criteria stored on task, got %q", tk.AcceptanceCriteria)
	}
}
```

- [ ] **Step 2: Run test to verify it fails**

```bash
cd /workspace/claudomator && go test ./internal/api/... -run "TestApproveStory_AcceptanceCriteriaStored" -v
```

Expected: FAIL, because the acceptance criteria are empty on the created task.

- [ ] **Step 3: Add `AcceptanceCriteria` to `elaboratedStoryTask` in `elaborate.go`**

```go
type elaboratedStoryTask struct {
	Name               string                   `json:"name"`
	Instructions       string                   `json:"instructions"`
	AcceptanceCriteria string                   `json:"acceptance_criteria"`
	Subtasks           []elaboratedStorySubtask `json:"subtasks"`
}
```

- [ ] **Step 4: Update `buildStoryElaboratePrompt` to request acceptance criteria**

In `buildStoryElaboratePrompt()`, update the JSON schema in the returned string. Replace the tasks section:

```go
func buildStoryElaboratePrompt() string {
	return `You are a software architect. Given a goal, analyze the codebase at /workspace and produce a structured implementation plan as JSON.

Output ONLY valid JSON matching this schema:
{
  "name": "story name",
  "branch_name": "story/kebab-case-name",
  "tasks": [
    {
      "name": "task name",
      "instructions": "detailed instructions including file paths and what to change",
      "acceptance_criteria": "specific, verifiable conditions a separate reviewer can check, e.g. 'run go test ./... and verify all pass; confirm GET /api/foo returns 200 with expected JSON shape'",
      "subtasks": [
        { "name": "subtask name", "instructions": "..." }
      ]
    }
  ],
  "validation": {
    "type": "build|test|smoke",
    "steps": ["step1", "step2"],
    "success_criteria": "what success looks like"
  }
}

Rules:
- Tasks must be independently buildable (each can be deployed alone)
- Subtasks within a task are order-dependent and run sequentially
- Instructions must include specific file paths, function names, and exact changes
- Instructions must end with: git add -A && git commit -m "..." && git push origin <branch>
- acceptance_criteria must be concrete and verifiable by a separate agent; no vague assertions like "code looks good"
- Validation should match the scope: small change = build check; new feature = smoke test`
}
```

- [ ] **Step 5: Pass `AcceptanceCriteria` through in `handleApproveStory`**

In `stories.go`, inside `handleApproveStory`, find the task creation block (the `for _, tp := range input.Tasks` loop). Add `AcceptanceCriteria` to the `task.Task` literal:

```go
t := &task.Task{
	ID:                 uuid.New().String(),
	Name:               tp.Name,
	Project:            input.ProjectID,
	RepositoryURL:      repoURL,
	StoryID:            story.ID,
	AcceptanceCriteria: tp.AcceptanceCriteria,
	Agent:              task.AgentConfig{Type: "claude", Instructions: tp.Instructions},
	Priority:           task.PriorityNormal,
	Tags:               []string{},
	DependsOn:          []string{},
	Retry:              task.RetryConfig{MaxAttempts: 1, Backoff: "exponential"},
	State:              task.StatePending,
	CreatedAt:          time.Now().UTC(),
	UpdatedAt:          time.Now().UTC(),
}
```

- [ ] **Step 6: Run the test**

```bash
cd /workspace/claudomator && go test ./internal/api/... -run "TestApproveStory_AcceptanceCriteriaStored" -v
```

Expected: PASS.

- [ ] **Step 7: Run full API tests**

```bash
cd /workspace/claudomator && go test ./internal/api/... -race -timeout 120s
```

Expected: all PASS.
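
The flow in this task (elaborator JSON → `acceptance_criteria` on the task, with the checker falling back to the task instructions when the field is empty) can be sketched in isolation. This is a hedged illustration, not the plan's implementation: `storyTask`, `parseStoryTask`, and `checkerCriteria` are hypothetical names that only mirror the relevant fields of `elaboratedStoryTask`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// storyTask is a hypothetical stand-in mirroring the elaborator fields used here.
type storyTask struct {
	Name               string `json:"name"`
	Instructions       string `json:"instructions"`
	AcceptanceCriteria string `json:"acceptance_criteria"`
}

// parseStoryTask decodes one task object from the elaborator's JSON output.
func parseStoryTask(raw string) (storyTask, error) {
	var st storyTask
	err := json.Unmarshal([]byte(raw), &st)
	return st, err
}

// checkerCriteria returns the acceptance criteria, or the task instructions
// when no criteria were provided (the fallback described in the spec).
func checkerCriteria(st storyTask) string {
	if c := strings.TrimSpace(st.AcceptanceCriteria); c != "" {
		return c
	}
	return st.Instructions
}

func main() {
	withAC, err := parseStoryTask(`{"name":"Add feature","instructions":"implement the thing","acceptance_criteria":"run go test ./... and verify all pass"}`)
	if err != nil {
		panic(err)
	}
	withoutAC, err := parseStoryTask(`{"name":"Old task","instructions":"implement the thing"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(checkerCriteria(withAC))
	fmt.Println(checkerCriteria(withoutAC))
}
```

Because a missing JSON key decodes to the empty string, tasks elaborated before this change take the fallback path without any special handling.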

- [ ] **Step 8: Commit**

```bash
git add internal/api/elaborate.go internal/api/stories.go internal/api/stories_test.go
git commit -m "feat: acceptance_criteria per story task in elaboration and approval"
```

---

## Task 6: UI — Ship button and checker report

**Files:**
- Modify: `web/app.js`
- Modify: `web/style.css`

- [ ] **Step 1: Add "Ship" button to SHIPPABLE story cards**

In `renderStoryCard`, after the `meta` element is appended to `card`, add:

```js
export function renderStoryCard(story, doc = document) {
  // ... existing code building header, badge, meta ...

  card.appendChild(header);
  if (meta.children.length) card.appendChild(meta);

  // Ship button for SHIPPABLE stories.
  if (story.status === 'SHIPPABLE') {
    const shipBtn = doc.createElement('button');
    shipBtn.className = 'btn-primary story-ship-btn';
    shipBtn.textContent = 'Ship';
    shipBtn.addEventListener('click', async (e) => {
      e.stopPropagation();
      shipBtn.disabled = true;
      shipBtn.textContent = 'Shipping…';
      try {
        const res = await fetch(`${API_BASE}/api/stories/${story.id}/ship`, { method: 'POST' });
        if (!res.ok) {
          const body = await res.json().catch(() => ({}));
          alert(body.error || `Ship failed (${res.status})`);
          shipBtn.disabled = false;
          shipBtn.textContent = 'Ship';
        } else {
          renderStoriesPanel();
        }
      } catch (err) {
        // Network failure: report it instead of silently resetting the button.
        alert(`Ship failed: ${err.message}`);
        shipBtn.disabled = false;
        shipBtn.textContent = 'Ship';
      }
    });
    card.appendChild(shipBtn);
  }

  return card;
}
```

> Note: `API_BASE` is a module-level constant already defined in `app.js`. Verify it's accessible in this scope; if not, use `BASE_PATH` (also defined at module level) instead.

- [ ] **Step 2: Add checker report to READY task cards**

In `createTaskCard`, after the `// Error message for failed tasks` block, add:

```js
  // Checker report for READY tasks where the checker flagged a problem.
  if (task.state === 'READY' && task.checker_report) {
    const reportEl = document.createElement('div');
    reportEl.className = 'task-checker-report';
    const label = document.createElement('span');
    label.className = 'task-checker-report-label';
    label.textContent = '⚠ Checker flagged:';
    const text = document.createElement('span');
    text.textContent = task.checker_report;
    reportEl.appendChild(label);
    reportEl.appendChild(text);
    card.appendChild(reportEl);
  }
```

- [ ] **Step 3: Add CSS for checker report**

In `web/style.css`, add after the `.ready-completed-label` block:

```css
.task-checker-report {
  margin: 0.5rem 0;
  padding: 0.5rem 0.75rem;
  background: var(--warning-bg, rgba(255, 180, 0, 0.12));
  border-left: 3px solid var(--warning, #f0a500);
  border-radius: 4px;
  font-size: 0.8rem;
  color: var(--text);
}

.task-checker-report-label {
  font-weight: 600;
  margin-right: 0.4rem;
}
```

- [ ] **Step 4: Build and verify**

```bash
cd /workspace/claudomator && go build ./...
```

Expected: no errors.

- [ ] **Step 5: Commit**

```bash
git add web/app.js web/style.css
git commit -m "feat: Ship button on SHIPPABLE stories; checker report on READY task cards"
```

---

## Task 7: Full test run and deploy

**Files:** none

- [ ] **Step 1: Run full test suite with race detector**

```bash
cd /workspace/claudomator && go test ./... -race -timeout 120s
```

Expected: all PASS.

- [ ] **Step 2: Push and deploy**

```bash
git push && sudo scripts/deploy
```

Expected: build passes, tests pass, binary installs, service restarts.

---

## Self-Review

**Spec coverage:**
- ✅ Checker spawned after task → READY (Task 3)
- ✅ Checker uses acceptance_criteria or falls back to task instructions (Task 3)
- ✅ Pass → auto-accept (READY → COMPLETED) (Task 3)
- ✅ Fail → task stays READY + checker_report attached (Task 3)
- ✅ No checker for subtasks or checker tasks (Task 3, guards in spawnCheckerTask)
- ✅ Story elaborator generates acceptance_criteria per task (Task 5)
- ✅ `checkStoryCompletion` no longer auto-deploys (Task 4)
- ✅ `POST /api/stories/{id}/ship` endpoint (Task 4)
- ✅ Ship button in UI (Task 6)
- ✅ Checker report shown on READY task cards (Task 6)
- ✅ New DB columns + migrations (Task 2)

**Placeholder scan:** none found.

**Type consistency:** `UpdateTaskCheckerReport(id, report string)`, `GetCheckerTask(checkedTaskID string) (*task.Task, error)`, and `ShipStory(ctx context.Context, storyID string) error` are consistent across Tasks 2, 3, and 4.
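
The story gate listed under spec coverage (a story becomes SHIPPABLE only when every top-level task is READY or COMPLETED, and nothing deploys until a human ships it) can be sketched as a pure function. This is an illustrative reduction of the rule in `checkStoryCompletion`, not the real code: `miniTask`, `state`, `storyShippable`, and the `RUNNING` value are hypothetical stand-ins.

```go
package main

import "fmt"

// state and miniTask are hypothetical stand-ins for the plan's task types.
type state string

const (
	stateReady     state = "READY"
	stateCompleted state = "COMPLETED"
	stateRunning   state = "RUNNING" // assumed name for any unfinished state
)

type miniTask struct {
	ParentTaskID string
	State        state
}

// storyShippable reports whether every top-level task is READY or COMPLETED.
// Subtasks are skipped because their parent covers them, and a story with no
// top-level tasks is never shippable.
func storyShippable(tasks []miniTask) bool {
	topLevel := 0
	for _, mt := range tasks {
		if mt.ParentTaskID != "" {
			continue
		}
		topLevel++
		if mt.State != stateCompleted && mt.State != stateReady {
			return false
		}
	}
	return topLevel > 0
}

func main() {
	tasks := []miniTask{
		{State: stateCompleted},
		{State: stateReady},
		{ParentTaskID: "parent-1", State: stateRunning}, // subtask: ignored
	}
	fmt.Println(storyShippable(tasks)) // true
}
```

Keeping the rule free of store and logger calls makes it easy to table-test separately from the status update and the explicit ship endpoint.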
