# Task Checker Agent and Story Ship Gate: Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add an async per-task checker agent that auto-accepts passing tasks, and replace the auto-deploy story trigger with an explicit human "Ship" action.

**Architecture:** Checker tasks are regular pool tasks with a new `checker_for_task_id` field; when they complete successfully the pool auto-accepts the linked task. `checkStoryCompletion` still transitions stories to SHIPPABLE but no longer fires the deploy; a new `POST /api/stories/{id}/ship` endpoint and "Ship" button do that instead. Story elaboration is extended to produce `acceptance_criteria` per task.

**Tech Stack:** Go 1.25, SQLite (database/sql + go-sqlite3), vanilla JS (no framework)

---

## File Map

| File | Change |
|---|---|
| `internal/task/task.go` | Add `AcceptanceCriteria`, `CheckerForTaskID`, `CheckerReport` fields to `Task` |
| `internal/storage/db.go` | 3 migrations; extend `CreateTask`, `scanTask`, all SELECT queries; add `UpdateTaskCheckerReport`, `GetCheckerTask` |
| `internal/executor/executor.go` | Add 2 methods to `Store` interface; add `spawnCheckerTask`; modify `handleRunResult`; guard `checkStoryCompletion`; remove auto-deploy; add `ShipStory` |
| `internal/api/server.go` | Register `POST /api/stories/{id}/ship` |
| `internal/api/stories.go` | Add `handleShipStory`; pass `AcceptanceCriteria` in `handleApproveStory` |
| `internal/api/elaborate.go` | Add `AcceptanceCriteria` to `elaboratedStoryTask`; update `buildStoryElaboratePrompt` |
| `web/app.js` | Ship button on SHIPPABLE story cards; checker report on READY task cards |

---

## Task 1: Task struct (three new fields)

**Files:**
- Modify: `internal/task/task.go`

- [ ] **Step 1: Add fields to Task struct**

In `internal/task/task.go`, add three fields after `StoryID` and `BranchName`:

```go
type Task struct {
	ID                 string        `yaml:"id" json:"id"`
	ParentTaskID       string        `yaml:"parent_task_id" json:"parent_task_id"`
	Name               string        `yaml:"name" json:"name"`
	Description        string        `yaml:"description" json:"description"`
	Project            string        `yaml:"project" json:"project"`
	RepositoryURL      string        `yaml:"repository_url" json:"repository_url"`
	Agent              AgentConfig   `yaml:"agent" json:"agent"`
	Timeout            Duration      `yaml:"timeout" json:"timeout"`
	Retry              RetryConfig   `yaml:"retry" json:"retry"`
	Priority           Priority      `yaml:"priority" json:"priority"`
	Tags               []string      `yaml:"tags" json:"tags"`
	DependsOn          []string      `yaml:"depends_on" json:"depends_on"`
	StoryID            string        `yaml:"-" json:"story_id,omitempty"`
	BranchName         string        `yaml:"-" json:"branch_name,omitempty"`
	AcceptanceCriteria string        `yaml:"-" json:"acceptance_criteria,omitempty"`
	CheckerForTaskID   string        `yaml:"-" json:"checker_for_task_id,omitempty"`
	CheckerReport      string        `yaml:"-" json:"checker_report,omitempty"`
	State              State         `yaml:"-" json:"state"`
	RejectionComment   string        `yaml:"-" json:"rejection_comment,omitempty"`
	QuestionJSON       string        `yaml:"-" json:"question,omitempty"`
	ElaborationInput   string        `yaml:"-" json:"elaboration_input,omitempty"`
	Summary            string        `yaml:"-" json:"summary,omitempty"`
	Interactions       []Interaction `yaml:"-" json:"interactions,omitempty"`
	CreatedAt          time.Time     `yaml:"-" json:"created_at"`
	UpdatedAt          time.Time     `yaml:"-" json:"updated_at"`
}
```

- [ ] **Step 2: Build to verify no compilation errors**

```bash
cd /workspace/claudomator && go build ./...
```

Expected: no output (success).
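As a quick check of what the new tags do on the wire, the sketch below marshals a trimmed stand-in for `Task`. The `taskView` type and `encode` helper are hypothetical and exist only for this illustration; only the three field names and their JSON tags are taken from Step 1.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskView is a hypothetical, trimmed stand-in for task.Task. It keeps the ID
// plus the three new fields with the JSON tags from Step 1.
type taskView struct {
	ID                 string `json:"id"`
	AcceptanceCriteria string `json:"acceptance_criteria,omitempty"`
	CheckerForTaskID   string `json:"checker_for_task_id,omitempty"`
	CheckerReport      string `json:"checker_report,omitempty"`
}

// encode marshals a taskView and returns the JSON text.
func encode(t taskView) string {
	b, err := json.Marshal(t)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// An ordinary task: the empty checker fields are omitted entirely.
	fmt.Println(encode(taskView{ID: "t-1"}))
	// A checker task: only the populated link field appears.
	fmt.Println(encode(taskView{ID: "c-1", CheckerForTaskID: "t-1"}))
}
```

Because of `omitempty`, existing API consumers should see no new keys until a field is actually populated. The `yaml:"-"` tags mean none of the three fields can be set from a YAML task file.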
- [ ] **Step 3: Commit** ```bash git add internal/task/task.go git commit -m "feat: add AcceptanceCriteria, CheckerForTaskID, CheckerReport to Task struct" ``` --- ## Task 2: Storage — migrations, queries, two new methods **Files:** - Modify: `internal/storage/db.go` - Test: `internal/storage/db_test.go` - [ ] **Step 1: Write failing tests for the two new storage methods** Find the existing test file and add at the end: ```go func TestUpdateTaskCheckerReport(t *testing.T) { db := openTestDB(t) tk := &task.Task{ ID: "cr-1", Name: "orig", RepositoryURL: "https://github.com/x/y", Agent: task.AgentConfig{Type: "claude", Instructions: "x"}, Priority: task.PriorityNormal, Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, Tags: []string{}, DependsOn: []string{}, State: task.StatePending, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), } if err := db.CreateTask(tk); err != nil { t.Fatalf("CreateTask: %v", err) } if err := db.UpdateTaskCheckerReport("cr-1", "Tests failed: missing endpoint"); err != nil { t.Fatalf("UpdateTaskCheckerReport: %v", err) } got, err := db.GetTask("cr-1") if err != nil { t.Fatalf("GetTask: %v", err) } if got.CheckerReport != "Tests failed: missing endpoint" { t.Errorf("expected checker report, got %q", got.CheckerReport) } } func TestGetCheckerTask(t *testing.T) { db := openTestDB(t) checked := &task.Task{ ID: "chk-orig", Name: "orig", RepositoryURL: "https://github.com/x/y", Agent: task.AgentConfig{Type: "claude", Instructions: "x"}, Priority: task.PriorityNormal, Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, Tags: []string{}, DependsOn: []string{}, State: task.StatePending, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), } if err := db.CreateTask(checked); err != nil { t.Fatalf("CreateTask checked: %v", err) } checker := &task.Task{ ID: "chk-checker", Name: "Check: orig", CheckerForTaskID: "chk-orig", RepositoryURL: "https://github.com/x/y", Agent: task.AgentConfig{Type: "claude", Instructions: 
"validate"}, Priority: task.PriorityNormal, Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, Tags: []string{}, DependsOn: []string{}, State: task.StatePending, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), } if err := db.CreateTask(checker); err != nil { t.Fatalf("CreateTask checker: %v", err) } // Should find the checker task. got, err := db.GetCheckerTask("chk-orig") if err != nil { t.Fatalf("GetCheckerTask: %v", err) } if got == nil || got.ID != "chk-checker" { t.Errorf("expected checker task ID chk-checker, got %v", got) } // Should return nil when no checker exists. none, err := db.GetCheckerTask("nonexistent") if err != nil { t.Fatalf("GetCheckerTask nonexistent: %v", err) } if none != nil { t.Errorf("expected nil for task with no checker, got %v", none) } } ``` - [ ] **Step 2: Run tests to verify they fail** ```bash cd /workspace/claudomator && go test ./internal/storage/... -run "TestUpdateTaskCheckerReport|TestGetCheckerTask" -v ``` Expected: FAIL — `db.UpdateTaskCheckerReport undefined`, `db.GetCheckerTask undefined`. 
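The second test pins down a contract worth stating explicitly: a missing checker is reported as `nil, nil`, not as an error. A minimal sketch of that contract follows, assuming a hypothetical `rowScanner` interface and `fakeRow` type so it runs without a database. It uses `errors.Is`, which also matches a wrapped `sql.ErrNoRows`; a plain `==` comparison behaves the same as long as the error is never wrapped.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// rowScanner is a hypothetical stand-in for the one method of *sql.Row that a
// lookup needs, so the sketch runs without a real database.
type rowScanner interface {
	Scan(dest ...any) error
}

// fakeRow yields a fixed ID, or err when err is non-nil.
type fakeRow struct {
	id  string
	err error
}

func (r fakeRow) Scan(dest ...any) error {
	if r.err != nil {
		return r.err
	}
	if len(dest) != 1 {
		return errors.New("fakeRow: expected exactly one destination")
	}
	p, ok := dest[0].(*string)
	if !ok {
		return errors.New("fakeRow: destination must be *string")
	}
	*p = r.id
	return nil
}

// Example errors used by main: a wrapped "no rows" and an unrelated failure.
var (
	wrappedNoRows = fmt.Errorf("scan checker: %w", sql.ErrNoRows)
	errDisk       = errors.New("disk I/O error")
)

// lookupCheckerID returns the checker ID, nil when no row exists, and an
// error for every other failure.
func lookupCheckerID(row rowScanner) (*string, error) {
	var id string
	if err := row.Scan(&id); err != nil {
		if errors.Is(err, sql.ErrNoRows) {
			return nil, nil // no checker yet: not an error
		}
		return nil, err
	}
	return &id, nil
}

func main() {
	for _, row := range []fakeRow{{id: "chk-checker"}, {err: wrappedNoRows}, {err: errDisk}} {
		id, err := lookupCheckerID(row)
		fmt.Println(id != nil, err)
	}
}
```

Callers then need two checks, `err != nil` and `result == nil`, which is the shape `spawnCheckerTask` relies on for its idempotency guard.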
- [ ] **Step 3: Add three migrations to `db.go`**

In the `migrations` slice in `migrate()`, append after the `ALTER TABLE tasks ADD COLUMN story_id TEXT` entry:

```go
`ALTER TABLE tasks ADD COLUMN acceptance_criteria TEXT NOT NULL DEFAULT ''`,
`ALTER TABLE tasks ADD COLUMN checker_for_task_id TEXT NOT NULL DEFAULT ''`,
`ALTER TABLE tasks ADD COLUMN checker_report TEXT NOT NULL DEFAULT ''`,
```

- [ ] **Step 4: Update `CreateTask` INSERT to include the three new columns**

Replace the `INSERT INTO tasks` statement in `CreateTask`:

```go
_, err = s.db.Exec(`
	INSERT INTO tasks (id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, story_id, acceptance_criteria, checker_for_task_id, checker_report)
	VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
	t.ID, t.Name, t.Description, t.ElaborationInput, t.Project, t.RepositoryURL,
	string(configJSON), string(t.Priority), t.Timeout.Duration.Nanoseconds(),
	string(retryJSON), string(tagsJSON), string(depsJSON),
	t.ParentTaskID, string(t.State), t.CreatedAt.UTC(), t.UpdatedAt.UTC(), t.StoryID,
	t.AcceptanceCriteria, t.CheckerForTaskID, t.CheckerReport,
)
```

- [ ] **Step 5: Update `scanTask` to scan the three new columns**

`scanTask` currently declares local vars and calls `row.Scan(...)` with 21 positional arguments. Add three new vars and extend the scan to 24 arguments, in the same order as the SELECT column list.

The new `var` block and `Scan` call:

```go
func scanTask(row scanner) (*task.Task, error) {
	var (
		t                  task.Task
		configJSON         string
		retryJSON          string
		tagsJSON           string
		depsJSON           string
		state              string
		priority           string
		timeoutNS          int64
		parentTaskID       sql.NullString
		elaborationInput   sql.NullString
		project            sql.NullString
		repositoryURL      sql.NullString
		rejectionComment   sql.NullString
		questionJSON       sql.NullString
		summary            sql.NullString
		interactionsJSON   sql.NullString
		storyID            sql.NullString
		acceptanceCriteria sql.NullString
		checkerForTaskID   sql.NullString
		checkerReport      sql.NullString
	)
	err := row.Scan(
		&t.ID, &t.Name, &t.Description, &elaborationInput, &project, &repositoryURL,
		&configJSON, &priority, &timeoutNS, &retryJSON, &tagsJSON, &depsJSON,
		&parentTaskID, &state, &t.CreatedAt, &t.UpdatedAt,
		&rejectionComment, &questionJSON, &summary, &interactionsJSON, &storyID,
		&acceptanceCriteria, &checkerForTaskID, &checkerReport,
	)
	t.ParentTaskID = parentTaskID.String
	t.ElaborationInput = elaborationInput.String
	t.Project = project.String
	t.RepositoryURL = repositoryURL.String
	t.RejectionComment = rejectionComment.String
	t.QuestionJSON = questionJSON.String
	t.Summary = summary.String
	t.StoryID = storyID.String
	t.AcceptanceCriteria = acceptanceCriteria.String
	t.CheckerForTaskID = checkerForTaskID.String
	t.CheckerReport = checkerReport.String
	// ... rest of function unchanged
```

- [ ] **Step 6: Update all SELECT queries to include the three new columns**

There are five SELECT statements that need `acceptance_criteria, checker_for_task_id, checker_report` appended to the column list. The pattern to find: every query with `story_id FROM tasks`.
Update each one: In `GetTask` (line ~185): ```go row := s.db.QueryRow(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE id = ?`, id) ``` In `ListTasks` (line ~191): ```go query := `SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE 1=1` ``` In `ListSubtasks` (line ~227): ```go rows, err := s.db.Query(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE parent_task_id = ? 
ORDER BY created_at ASC`, parentID) ``` In `ResetTaskForRetry` (line ~280): ```go t, err := scanTask(tx.QueryRow(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE id = ?`, id)) ``` In `ListTasksByStory` (line ~1202): ```go `SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE story_id = ? ORDER BY created_at ASC`, ``` - [ ] **Step 7: Add `UpdateTaskCheckerReport`** Add after `UpdateTaskSummary`: ```go // UpdateTaskCheckerReport sets the checker_report field on a task. func (s *DB) UpdateTaskCheckerReport(id, report string) error { now := time.Now().UTC() _, err := s.db.Exec(`UPDATE tasks SET checker_report = ?, updated_at = ? WHERE id = ?`, report, now, id) return err } ``` - [ ] **Step 8: Add `GetCheckerTask`** Add after `UpdateTaskCheckerReport`: ```go // GetCheckerTask returns the checker task for the given checked task ID, // or nil if no checker task exists. func (s *DB) GetCheckerTask(checkedTaskID string) (*task.Task, error) { row := s.db.QueryRow(`SELECT id, name, description, elaboration_input, project, repository_url, config_json, priority, timeout_ns, retry_json, tags_json, depends_on_json, parent_task_id, state, created_at, updated_at, rejection_comment, question_json, summary, interactions_json, story_id, acceptance_criteria, checker_for_task_id, checker_report FROM tasks WHERE checker_for_task_id = ? 
LIMIT 1`, checkedTaskID) t, err := scanTask(row) if err == sql.ErrNoRows { return nil, nil } return t, err } ``` - [ ] **Step 9: Run the failing tests to verify they pass** ```bash cd /workspace/claudomator && go test ./internal/storage/... -run "TestUpdateTaskCheckerReport|TestGetCheckerTask" -v ``` Expected: PASS. - [ ] **Step 10: Run full storage tests** ```bash cd /workspace/claudomator && go test ./internal/storage/... -v ``` Expected: all PASS. - [ ] **Step 11: Commit** ```bash git add internal/storage/db.go internal/storage/db_test.go git commit -m "feat: add checker task columns, UpdateTaskCheckerReport, GetCheckerTask" ``` --- ## Task 3: Executor — checker task spawn and completion handling **Files:** - Modify: `internal/executor/executor.go` - Modify: `internal/executor/executor_test.go` - [ ] **Step 1: Add two methods to executor's `Store` interface** In `executor.go`, the `Store` interface (around line 22). Add after `CreateTask`: ```go UpdateTaskCheckerReport(id, report string) error GetCheckerTask(checkedTaskID string) (*task.Task, error) ``` - [ ] **Step 2: Write failing tests** In `executor_test.go`, add: ```go func TestPool_CheckerSpawned_OnReady(t *testing.T) { store := testStore(t) runner := &mockRunner{} // succeeds instantly pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) tk := makeTask("checker-spawn-1") tk.RepositoryURL = "https://github.com/x/y" store.CreateTask(tk) pool.Submit(context.Background(), tk) <-pool.Results() // wait for original task to finish // Give the async spawnCheckerTask goroutine a moment to run. 
time.Sleep(200 * time.Millisecond) checker, err := store.GetCheckerTask("checker-spawn-1") if err != nil { t.Fatalf("GetCheckerTask: %v", err) } if checker == nil { t.Fatal("expected a checker task to be created, got nil") } if checker.CheckerForTaskID != "checker-spawn-1" { t.Errorf("expected CheckerForTaskID=checker-spawn-1, got %q", checker.CheckerForTaskID) } } func TestPool_CheckerNotSpawned_ForSubtask(t *testing.T) { store := testStore(t) runner := &mockRunner{} pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) parent := makeTask("no-checker-parent") parent.RepositoryURL = "https://github.com/x/y" store.CreateTask(parent) sub := makeTask("no-checker-sub") sub.ParentTaskID = "no-checker-parent" sub.RepositoryURL = "https://github.com/x/y" store.CreateTask(sub) pool.Submit(context.Background(), sub) <-pool.Results() time.Sleep(100 * time.Millisecond) checker, err := store.GetCheckerTask("no-checker-sub") if err != nil { t.Fatalf("GetCheckerTask: %v", err) } if checker != nil { t.Error("expected no checker for subtask, but one was created") } } func TestPool_CheckerPass_AutoAcceptsTask(t *testing.T) { store := testStore(t) // Two-phase: first runner succeeds (original task), second also succeeds (checker). callCount := 0 runner := &mockRunner{ onRun: func(t *task.Task, e *storage.Execution) error { callCount++ return nil // both original and checker succeed }, } pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError}))) tk := makeTask("autoaccept-1") tk.RepositoryURL = "https://github.com/x/y" store.CreateTask(tk) pool.Submit(context.Background(), tk) <-pool.Results() // original finishes → READY + checker spawned // Wait for checker to run and complete. 
	deadline := time.Now().Add(5 * time.Second)
	for time.Now().Before(deadline) {
		got, _ := store.GetTask("autoaccept-1")
		if got != nil && got.State == task.StateCompleted {
			break
		}
		// Bounded receive so the deadline is honored even if no further result arrives.
		select {
		case <-pool.Results():
		case <-time.After(100 * time.Millisecond):
		}
	}
	got, err := store.GetTask("autoaccept-1")
	if err != nil {
		t.Fatalf("GetTask: %v", err)
	}
	if got.State != task.StateCompleted {
		t.Errorf("expected COMPLETED after checker pass, got %s", got.State)
	}
}

func TestPool_CheckerFail_AttachesReport(t *testing.T) {
	store := testStore(t)
	callCount := 0
	runner := &mockRunner{
		onRun: func(t *task.Task, e *storage.Execution) error {
			callCount++
			if t.CheckerForTaskID != "" {
				return fmt.Errorf("test suite failed: 3 failures")
			}
			return nil // original task succeeds
		},
	}
	pool := NewPool(2, map[string]Runner{"claude": runner}, store, slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelError})))
	tk := makeTask("fail-checker-1")
	tk.RepositoryURL = "https://github.com/x/y"
	store.CreateTask(tk)
	pool.Submit(context.Background(), tk)
	<-pool.Results() // original → READY

	// Wait for checker to fail.
	deadline := time.Now().Add(5 * time.Second)
	for time.Now().Before(deadline) {
		got, _ := store.GetTask("fail-checker-1")
		if got != nil && got.CheckerReport != "" {
			break
		}
		select {
		case <-pool.Results():
		case <-time.After(100 * time.Millisecond):
		}
	}
	got, err := store.GetTask("fail-checker-1")
	if err != nil {
		t.Fatalf("GetTask: %v", err)
	}
	if got.State != task.StateReady {
		t.Errorf("expected task to stay READY after checker fail, got %s", got.State)
	}
	if got.CheckerReport == "" {
		t.Error("expected checker_report to be set after checker failure")
	}
}
```

- [ ] **Step 3: Run tests to verify they fail**

```bash
cd /workspace/claudomator && go test ./internal/executor/... -run "TestPool_Checker" -v 2>&1 | head -30
```

Expected: FAIL. The tests call only `store.GetCheckerTask` and `store.GetTask`, so with Task 2 merged they should compile and fail on assertions such as `expected a checker task to be created, got nil`. `TestPool_CheckerNotSpawned_ForSubtask` asserts an absence, so it already passes at this point. A compile error here most likely means a test double does not yet implement the two new `Store` methods.
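The tests above wait for asynchronous work with fixed sleeps and hand-rolled deadline loops. One possible alternative is a small polling helper; the sketch below assumes a hypothetical `waitFor` function that is not part of the existing test suite.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// waitFor polls cond every interval until it returns true or timeout elapses.
// It reports whether cond became true in time. cond is always checked at
// least once, even with a zero timeout.
func waitFor(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulates the async checker spawn: the flag flips after a short delay.
	var created atomic.Bool
	go func() {
		time.Sleep(20 * time.Millisecond)
		created.Store(true)
	}()

	fmt.Println("spawned in time:", waitFor(time.Second, 5*time.Millisecond, created.Load))
	fmt.Println("never true:", waitFor(30*time.Millisecond, 5*time.Millisecond, func() bool { return false }))
}
```

In a test, the condition would wrap a store lookup, for example checking that `store.GetCheckerTask("checker-spawn-1")` returns a non-nil task. Compared with a fixed `time.Sleep`, this returns as soon as the condition holds and fails only after the full timeout.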
- [ ] **Step 4: Add `spawnCheckerTask` to `executor.go`** Add this function after `checkStoryCompletion`: ```go // spawnCheckerTask creates and submits a checker task for the given completed task. // Guards: not called for subtasks, checker tasks, or tasks that already have a checker. func (p *Pool) spawnCheckerTask(ctx context.Context, checked *task.Task) { // Never spawn a checker for subtasks or checker tasks themselves. if checked.ParentTaskID != "" || checked.CheckerForTaskID != "" { return } // Idempotent: don't create a second checker if one already exists. existing, err := p.store.GetCheckerTask(checked.ID) if err != nil { p.logger.Error("spawnCheckerTask: GetCheckerTask failed", "taskID", checked.ID, "error", err) return } if existing != nil { return } criteria := checked.AcceptanceCriteria if criteria == "" { criteria = checked.Agent.Instructions } instructions := fmt.Sprintf(`You are validating a completed task. Do not make any changes to the code or repository. Task: %s Instructions given to the implementor: %s Acceptance criteria: %s Steps: 1. Clone the repository and review the changes made. 2. Verify each acceptance criterion is met. Run tests or make HTTP requests as needed. 3. If all criteria are satisfied, exit normally (success). 4. 
If any criterion is not met, use the Bash tool to exit with a non-zero code: bash -c "exit 1" Before exiting, write a brief summary of what failed.`, checked.Name, checked.Agent.Instructions, criteria) now := time.Now().UTC() checker := &task.Task{ ID: uuid.New().String(), Name: "Check: " + checked.Name, CheckerForTaskID: checked.ID, RepositoryURL: checked.RepositoryURL, Agent: task.AgentConfig{ Type: "claude", Instructions: instructions, MaxBudgetUSD: 0.50, AllowedTools: []string{"Bash", "Read", "Glob", "Grep"}, }, Timeout: task.Duration{Duration: 10 * time.Minute}, Priority: task.PriorityNormal, Tags: []string{}, DependsOn: []string{}, Retry: task.RetryConfig{MaxAttempts: 1, Backoff: "linear"}, State: task.StatePending, CreatedAt: now, UpdatedAt: now, } if err := p.store.CreateTask(checker); err != nil { p.logger.Error("spawnCheckerTask: CreateTask failed", "error", err) return } checker.State = task.StateQueued if err := p.store.UpdateTaskState(checker.ID, task.StateQueued); err != nil { p.logger.Error("spawnCheckerTask: UpdateTaskState failed", "error", err) return } if err := p.Submit(ctx, checker); err != nil { p.logger.Error("spawnCheckerTask: Submit failed", "error", err) } } ``` - [ ] **Step 5: Modify `handleRunResult` — success path** Find the success branch in `handleRunResult` (the `} else {` block after all the error handling). Currently it looks like: ```go } else { p.mu.Lock() p.consecutiveFailures[agentType] = 0 p.mu.Unlock() if t.ParentTaskID == "" { subtasks, subErr := p.store.ListSubtasks(t.ID) // ... if subErr == nil && len(subtasks) > 0 { exec.Status = "BLOCKED" if err := p.store.UpdateTaskState(t.ID, task.StateBlocked); err != nil { ... } } else { exec.Status = "READY" if err := p.store.UpdateTaskState(t.ID, task.StateReady); err != nil { ... } } } else { exec.Status = "COMPLETED" if err := p.store.UpdateTaskState(t.ID, task.StateCompleted); err != nil { ... 
} p.maybeUnblockParent(t.ParentTaskID) } if t.StoryID != "" { // ...checkStoryCompletion / checkValidationResult } } ``` Replace it with: ```go } else { p.mu.Lock() p.consecutiveFailures[agentType] = 0 p.mu.Unlock() if t.CheckerForTaskID != "" { // Checker task succeeded — auto-accept the checked task. exec.Status = "COMPLETED" if err := p.store.UpdateTaskState(t.ID, task.StateCompleted); err != nil { p.logger.Error("handleRunResult: failed to complete checker task", "taskID", t.ID, "error", err) } checkedTask, getErr := p.store.GetTask(t.CheckerForTaskID) if getErr == nil { if acceptErr := p.store.UpdateTaskState(t.CheckerForTaskID, task.StateCompleted); acceptErr != nil { p.logger.Error("handleRunResult: failed to auto-accept checked task", "taskID", t.CheckerForTaskID, "error", acceptErr) } else if checkedTask.StoryID != "" { go p.checkStoryCompletion(ctx, checkedTask.StoryID) } } else { p.logger.Error("handleRunResult: failed to get checked task", "taskID", t.CheckerForTaskID, "error", getErr) } } else if t.ParentTaskID == "" { subtasks, subErr := p.store.ListSubtasks(t.ID) if subErr != nil { p.logger.Error("failed to list subtasks", "taskID", t.ID, "error", subErr) } if subErr == nil && len(subtasks) > 0 { exec.Status = "BLOCKED" if err := p.store.UpdateTaskState(t.ID, task.StateBlocked); err != nil { p.logger.Error("failed to update task state", "taskID", t.ID, "state", task.StateBlocked, "error", err) } } else { exec.Status = "READY" if err := p.store.UpdateTaskState(t.ID, task.StateReady); err != nil { p.logger.Error("failed to update task state", "taskID", t.ID, "state", task.StateReady, "error", err) } go p.spawnCheckerTask(ctx, t) } } else { exec.Status = "COMPLETED" if err := p.store.UpdateTaskState(t.ID, task.StateCompleted); err != nil { p.logger.Error("failed to update task state", "taskID", t.ID, "state", task.StateCompleted, "error", err) } p.maybeUnblockParent(t.ParentTaskID) } if t.StoryID != "" { storyID := t.StoryID go func() { story, getErr := 
p.store.GetStory(storyID) if getErr != nil { p.logger.Error("handleRunResult: failed to get story", "storyID", storyID, "error", getErr) return } if story.Status == task.StoryValidating { p.checkValidationResult(ctx, storyID, task.StateCompleted, "") } else { p.checkStoryCompletion(ctx, storyID) } }() } } ``` - [ ] **Step 6: Modify `handleRunResult` — failure path, attach checker report** In the generic failure `else` case (where `exec.Status = "FAILED"` is set and `consecutiveFailures` is incremented), add after the failures increment: ```go p.mu.Lock() p.consecutiveFailures[agentType]++ p.mu.Unlock() // If this is a checker task, attach the failure report to the checked task. if t.CheckerForTaskID != "" { report := exec.ErrorMsg if reportErr := p.store.UpdateTaskCheckerReport(t.CheckerForTaskID, report); reportErr != nil { p.logger.Error("handleRunResult: failed to set checker report", "taskID", t.CheckerForTaskID, "error", reportErr) } } ``` Also update the checker report after summary extraction (around the `summary := exec.Summary` block), to prefer summary over error message when available. After the summary is resolved, add: ```go if t.CheckerForTaskID != "" && exec.Status == "FAILED" && summary != "" { // Overwrite the initial error-message report with the richer summary. if reportErr := p.store.UpdateTaskCheckerReport(t.CheckerForTaskID, summary); reportErr != nil { p.logger.Error("handleRunResult: failed to update checker report with summary", "taskID", t.CheckerForTaskID, "error", reportErr) } } ``` - [ ] **Step 7: Run the checker tests** ```bash cd /workspace/claudomator && go test ./internal/executor/... -run "TestPool_Checker" -v -timeout 30s ``` Expected: all PASS. - [ ] **Step 8: Run full executor tests** ```bash cd /workspace/claudomator && go test ./internal/executor/... -race -timeout 120s ``` Expected: all PASS. 
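The branching added in Steps 5 and 6 can be summarized as two pure functions, which may help when reviewing the order of the checks. This is an illustrative sketch only: `outcome`, `onSuccess`, and `checkerReport` are hypothetical names, and the real code performs store updates and goroutine launches instead of returning values.

```go
package main

import "fmt"

// outcome is an illustrative summary of what the success path does with a
// task that just finished. It is not a type from the executor package.
type outcome struct {
	State        string // state written for the task that ran
	SpawnChecker bool   // true when a checker task should be queued
	AutoAccept   string // ID of the checked task to mark COMPLETED, if any
}

// onSuccess follows the branch order from Step 5: checker tasks first, then
// top-level tasks (BLOCKED if they produced subtasks, otherwise READY), and
// finally subtasks.
func onSuccess(parentID, checkerFor string, hasSubtasks bool) outcome {
	switch {
	case checkerFor != "":
		return outcome{State: "COMPLETED", AutoAccept: checkerFor}
	case parentID == "" && hasSubtasks:
		return outcome{State: "BLOCKED"}
	case parentID == "":
		return outcome{State: "READY", SpawnChecker: true}
	default:
		return outcome{State: "COMPLETED"}
	}
}

// checkerReport follows Step 6: the agent's summary replaces the raw error
// message whenever a summary is available.
func checkerReport(errMsg, summary string) string {
	if summary != "" {
		return summary
	}
	return errMsg
}

func main() {
	fmt.Printf("top-level task:  %+v\n", onSuccess("", "", false))
	fmt.Printf("with subtasks:   %+v\n", onSuccess("", "", true))
	fmt.Printf("subtask:         %+v\n", onSuccess("parent-1", "", false))
	fmt.Printf("checker task:    %+v\n", onSuccess("", "task-1", false))
	fmt.Println(checkerReport("exit status 1", "2 of 5 criteria failed"))
}
```

Only a top-level task that finishes without subtasks reaches READY and triggers a checker; a checker task never spawns another checker because its branch is evaluated first.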
- [ ] **Step 9: Commit** ```bash git add internal/executor/executor.go internal/executor/executor_test.go git commit -m "feat: spawn checker task on READY; auto-accept on pass; attach report on fail" ``` --- ## Task 4: Story ship gate — remove auto-deploy, add explicit ship endpoint **Files:** - Modify: `internal/executor/executor.go` - Modify: `internal/api/server.go` - Modify: `internal/api/stories.go` - Test: `internal/api/server_test.go` - [ ] **Step 1: Write failing test for ship endpoint** In `internal/api/server_test.go`, add: ```go func TestShipStory_ShippableStory_Returns202(t *testing.T) { srv, store := testServer(t) // Create a project with a deploy script (empty path — deploy will fail but that's OK for this test). proj := &task.Project{ ID: "ship-proj-1", Name: "test", RemoteURL: "https://github.com/x/y", Type: "web", DeployScript: "", CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), } if err := store.CreateProject(proj); err != nil { t.Fatalf("CreateProject: %v", err) } story := &task.Story{ ID: "ship-story-1", Name: "Ship Test", ProjectID: "ship-proj-1", Status: task.StoryShippable, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), } if err := store.CreateStory(story); err != nil { t.Fatalf("CreateStory: %v", err) } req := httptest.NewRequest("POST", "/api/stories/ship-story-1/ship", nil) w := httptest.NewRecorder() srv.Handler().ServeHTTP(w, req) if w.Code != http.StatusAccepted { t.Errorf("expected 202, got %d: %s", w.Code, w.Body.String()) } } func TestShipStory_NonShippable_Returns409(t *testing.T) { srv, store := testServer(t) story := &task.Story{ ID: "nonship-1", Name: "Not Ready", ProjectID: "", Status: task.StoryInProgress, CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(), } if err := store.CreateStory(story); err != nil { t.Fatalf("CreateStory: %v", err) } req := httptest.NewRequest("POST", "/api/stories/nonship-1/ship", nil) w := httptest.NewRecorder() srv.Handler().ServeHTTP(w, req) if w.Code != 
http.StatusConflict { t.Errorf("expected 409, got %d", w.Code) } } ``` - [ ] **Step 2: Run tests to verify they fail** ```bash cd /workspace/claudomator && go test ./internal/api/... -run "TestShipStory" -v ``` Expected: FAIL — `404 page not found` (route doesn't exist yet). - [ ] **Step 3: Remove auto-deploy from `checkStoryCompletion` and add status guard** In `executor.go`, replace `checkStoryCompletion`: ```go func (p *Pool) checkStoryCompletion(ctx context.Context, storyID string) { story, err := p.store.GetStory(storyID) if err != nil { p.logger.Error("checkStoryCompletion: failed to get story", "storyID", storyID, "error", err) return } if story.Status != task.StoryInProgress { return // already SHIPPABLE or beyond — nothing to do } tasks, err := p.store.ListTasksByStory(storyID) if err != nil { p.logger.Error("checkStoryCompletion: failed to list tasks", "storyID", storyID, "error", err) return } if len(tasks) == 0 { return } topLevelCount := 0 for _, t := range tasks { if t.ParentTaskID != "" { continue // subtasks are covered by their parent } topLevelCount++ if t.State != task.StateCompleted && t.State != task.StateReady { return // not all top-level tasks done } } if topLevelCount == 0 { return } if err := p.store.UpdateStoryStatus(storyID, task.StoryShippable); err != nil { p.logger.Error("checkStoryCompletion: failed to update story status", "storyID", storyID, "error", err) return } p.logger.Info("story transitioned to SHIPPABLE", "storyID", storyID) // Deploy is now triggered explicitly by the human via POST /api/stories/{id}/ship. } ``` - [ ] **Step 4: Add `ShipStory` to Pool** Add after `checkStoryCompletion`: ```go // ShipStory merges the story branch and runs the deploy script. // Returns an error if the story is not in SHIPPABLE state. 
func (p *Pool) ShipStory(ctx context.Context, storyID string) error {
	story, err := p.store.GetStory(storyID)
	if err != nil {
		return fmt.Errorf("story not found: %w", err)
	}
	if story.Status != task.StoryShippable {
		return fmt.Errorf("story is not SHIPPABLE (current status: %s)", story.Status)
	}
	// The HTTP handler passes r.Context(), which is cancelled as soon as the
	// handler returns. Detach the deploy from that cancellation while keeping
	// the context's values.
	go p.triggerStoryDeploy(context.WithoutCancel(ctx), storyID)
	return nil
}
```

- [ ] **Step 5: Register the route in `server.go`**

In the `routes()` method, after the existing story routes, add:

```go
s.mux.HandleFunc("POST /api/stories/{id}/ship", s.handleShipStory)
```

- [ ] **Step 6: Add `handleShipStory` to `stories.go`**

Add at the end of `stories.go`:

```go
// handleShipStory triggers the merge + deploy for a SHIPPABLE story.
// POST /api/stories/{id}/ship
func (s *Server) handleShipStory(w http.ResponseWriter, r *http.Request) {
	id := r.PathValue("id")
	if err := s.pool.ShipStory(r.Context(), id); err != nil {
		writeJSON(w, http.StatusConflict, map[string]string{"error": err.Error()})
		return
	}
	writeJSON(w, http.StatusAccepted, map[string]string{"message": "story shipping initiated", "story_id": id})
}
```

- [ ] **Step 7: Run the ship tests**

```bash
cd /workspace/claudomator && go test ./internal/api/... -run "TestShipStory" -v
```

Expected: both PASS.

- [ ] **Step 8: Run full test suite**

```bash
cd /workspace/claudomator && go test ./... -race -timeout 120s
```

Expected: all PASS.
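One detail of the ship flow deserves care: `net/http` cancels a request's context once the handler returns, and `ShipStory` starts the deploy in a goroutine that outlives the handler. If `triggerStoryDeploy` honors its context (an assumption, since its body is not shown in this plan), a deploy started with `r.Context()` could be cancelled right after the 202 response is written. The sketch below shows the effect of `context.WithoutCancel` (Go 1.21+) using hypothetical `startDetached` and `simulate` helpers.

```go
package main

import (
	"context"
	"fmt"
)

// startDetached runs fn in a goroutine with a context that keeps ctx's values
// but is not cancelled when ctx is. fn's result arrives on the returned channel.
func startDetached(ctx context.Context, fn func(context.Context) error) <-chan error {
	done := make(chan error, 1)
	bg := context.WithoutCancel(ctx)
	go func() { done <- fn(bg) }()
	return done
}

// simulate mimics a handler that starts background work and then returns,
// which cancels the request context. It reports the error state of the
// request context and of the context seen by the background work.
func simulate() (requestErr, deployErr error) {
	reqCtx, cancel := context.WithCancel(context.Background())
	release := make(chan struct{})

	done := startDetached(reqCtx, func(ctx context.Context) error {
		<-release // run only after the "handler" has returned
		return ctx.Err()
	})

	cancel()       // stands in for ServeHTTP returning
	close(release) // let the background work inspect its context
	return reqCtx.Err(), <-done
}

func main() {
	requestErr, deployErr := simulate()
	fmt.Println("request context:", requestErr)
	fmt.Println("deploy context: ", deployErr)
}
```

A detached context has no deadline of its own, so a long-running deploy would need its own timeout, for example by wrapping the detached context with `context.WithTimeout`.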

- [ ] **Step 9: Commit**

```bash
git add internal/executor/executor.go internal/api/server.go internal/api/stories.go internal/api/server_test.go
git commit -m "feat: story ship gate - explicit POST /api/stories/{id}/ship; remove auto-deploy"
```

---

## Task 5: Elaborator — acceptance criteria per story task

**Files:**
- Modify: `internal/api/elaborate.go`
- Modify: `internal/api/stories.go`
- Test: `internal/api/stories_test.go`

- [ ] **Step 1: Write failing test**

In `internal/api/stories_test.go`, find (or add) a test for story approval and verify that the acceptance criteria flow through to the stored task:

```go
func TestApproveStory_AcceptanceCriteriaStored(t *testing.T) {
	srv, store := testServer(t)

	proj := &task.Project{
		ID: "ac-proj", Name: "test", RemoteURL: "https://github.com/x/y",
		Type: "web", DeployScript: "",
		CreatedAt: time.Now().UTC(), UpdatedAt: time.Now().UTC(),
	}
	store.CreateProject(proj)

	body := `{
		"name": "AC Story",
		"branch_name": "story/ac-test",
		"project_id": "ac-proj",
		"tasks": [
			{
				"name": "Add feature",
				"instructions": "implement the thing",
				"acceptance_criteria": "run go test ./... and verify all pass",
				"subtasks": []
			}
		],
		"validation": {"type": "test", "steps": [], "success_criteria": "tests pass"}
	}`
	req := httptest.NewRequest("POST", "/api/stories/approve", strings.NewReader(body))
	req.Header.Set("Content-Type", "application/json")
	w := httptest.NewRecorder()
	srv.Handler().ServeHTTP(w, req)

	if w.Code != http.StatusCreated {
		t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
	}

	var resp struct {
		TaskIDs []string `json:"task_ids"`
	}
	// Fail loudly on a malformed response instead of reporting "no task_ids".
	if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
		t.Fatalf("decode response: %v", err)
	}
	if len(resp.TaskIDs) == 0 {
		t.Fatal("expected task_ids in response")
	}

	tk, err := store.GetTask(resp.TaskIDs[0])
	if err != nil {
		t.Fatalf("GetTask: %v", err)
	}
	if tk.AcceptanceCriteria != "run go test ./... and verify all pass" {
		t.Errorf("expected acceptance criteria stored on task, got %q", tk.AcceptanceCriteria)
	}
}
```

- [ ] **Step 2: Run test to verify it fails**

```bash
cd /workspace/claudomator && go test ./internal/api/... -run "TestApproveStory_AcceptanceCriteriaStored" -v
```

Expected: FAIL, because the acceptance criteria are empty on the created task.

- [ ] **Step 3: Add `AcceptanceCriteria` to `elaboratedStoryTask` in `elaborate.go`**

```go
type elaboratedStoryTask struct {
	Name               string                   `json:"name"`
	Instructions       string                   `json:"instructions"`
	AcceptanceCriteria string                   `json:"acceptance_criteria"`
	Subtasks           []elaboratedStorySubtask `json:"subtasks"`
}
```

- [ ] **Step 4: Update `buildStoryElaboratePrompt` to request acceptance criteria**

In `buildStoryElaboratePrompt()`, update the JSON schema in the returned string. Replace the tasks section:

```go
func buildStoryElaboratePrompt() string {
	return `You are a software architect. Given a goal, analyze the codebase at /workspace and produce a structured implementation plan as JSON.

Output ONLY valid JSON matching this schema:
{
  "name": "story name",
  "branch_name": "story/kebab-case-name",
  "tasks": [
    {
      "name": "task name",
      "instructions": "detailed instructions including file paths and what to change",
      "acceptance_criteria": "specific, verifiable conditions a separate reviewer can check, e.g. 'run go test ./... and verify all pass; confirm GET /api/foo returns 200 with expected JSON shape'",
      "subtasks": [
        { "name": "subtask name", "instructions": "..." }
      ]
    }
  ],
  "validation": {
    "type": "build|test|smoke",
    "steps": ["step1", "step2"],
    "success_criteria": "what success looks like"
  }
}

Rules:
- Tasks must be independently buildable (each can be deployed alone)
- Subtasks within a task are order-dependent and run sequentially
- Instructions must include specific file paths, function names, and exact changes
- Instructions must end with: git add -A && git commit -m "..." && git push origin
- acceptance_criteria must be concrete and verifiable by a separate agent; no vague assertions like "code looks good"
- Validation should match the scope: small change = build check; new feature = smoke test`
}
```

- [ ] **Step 5: Pass `AcceptanceCriteria` through in `handleApproveStory`**

In `stories.go`, inside `handleApproveStory`, find the task creation block (the `for _, tp := range input.Tasks` loop). Add `AcceptanceCriteria` to the `task.Task` literal:

```go
t := &task.Task{
	ID:                 uuid.New().String(),
	Name:               tp.Name,
	Project:            input.ProjectID,
	RepositoryURL:      repoURL,
	StoryID:            story.ID,
	AcceptanceCriteria: tp.AcceptanceCriteria,
	Agent:              task.AgentConfig{Type: "claude", Instructions: tp.Instructions},
	Priority:           task.PriorityNormal,
	Tags:               []string{},
	DependsOn:          []string{},
	Retry:              task.RetryConfig{MaxAttempts: 1, Backoff: "exponential"},
	State:              task.StatePending,
	CreatedAt:          time.Now().UTC(),
	UpdatedAt:          time.Now().UTC(),
}
```

- [ ] **Step 6: Run the test**

```bash
cd /workspace/claudomator && go test ./internal/api/... -run "TestApproveStory_AcceptanceCriteriaStored" -v
```

Expected: PASS.

- [ ] **Step 7: Run full API tests**

```bash
cd /workspace/claudomator && go test ./internal/api/... -race -timeout 120s
```

Expected: all PASS.

- [ ] **Step 8: Commit**

```bash
git add internal/api/elaborate.go internal/api/stories.go internal/api/stories_test.go
git commit -m "feat: acceptance_criteria per story task in elaboration and approval"
```

---

## Task 6: UI — Ship button and checker report

**Files:**
- Modify: `web/app.js`
- Modify: `web/style.css`

- [ ] **Step 1: Add "Ship" button to SHIPPABLE story cards**

In `renderStoryCard`, after the `meta` element is appended to `card`, add:

```js
export function renderStoryCard(story, doc = document) {
  // ... existing code building header, badge, meta ...

  card.appendChild(header);
  if (meta.children.length) card.appendChild(meta);

  // Ship button for SHIPPABLE stories.
  if (story.status === 'SHIPPABLE') {
    const shipBtn = doc.createElement('button');
    shipBtn.className = 'btn-primary story-ship-btn';
    shipBtn.textContent = 'Ship';
    shipBtn.addEventListener('click', async (e) => {
      e.stopPropagation();
      shipBtn.disabled = true;
      shipBtn.textContent = 'Shipping…';
      try {
        const res = await fetch(`${API_BASE}/api/stories/${story.id}/ship`, { method: 'POST' });
        if (!res.ok) {
          const body = await res.json().catch(() => ({}));
          alert(body.error || `Ship failed (${res.status})`);
          shipBtn.disabled = false;
          shipBtn.textContent = 'Ship';
        } else {
          renderStoriesPanel();
        }
      } catch {
        shipBtn.disabled = false;
        shipBtn.textContent = 'Ship';
      }
    });
    card.appendChild(shipBtn);
  }

  return card;
}
```

> Note: `API_BASE` is a module-level constant already defined in `app.js`. Verify it's accessible in this scope; if not, use `BASE_PATH` (also defined at module level) instead.

- [ ] **Step 2: Add checker report to READY task cards**

In `createTaskCard`, after the `// Error message for failed tasks` block, add:

```js
// Checker report for READY tasks where the checker flagged a problem.
if (task.state === 'READY' && task.checker_report) {
  const reportEl = document.createElement('div');
  reportEl.className = 'task-checker-report';
  const label = document.createElement('span');
  label.className = 'task-checker-report-label';
  label.textContent = '⚠ Checker flagged:';
  const text = document.createElement('span');
  text.textContent = task.checker_report;
  reportEl.appendChild(label);
  reportEl.appendChild(text);
  card.appendChild(reportEl);
}
```

- [ ] **Step 3: Add CSS for checker report**

In `web/style.css`, add after the `.ready-completed-label` block:

```css
.task-checker-report {
  margin: 0.5rem 0;
  padding: 0.5rem 0.75rem;
  background: var(--warning-bg, rgba(255, 180, 0, 0.12));
  border-left: 3px solid var(--warning, #f0a500);
  border-radius: 4px;
  font-size: 0.8rem;
  color: var(--text);
}

.task-checker-report-label {
  font-weight: 600;
  margin-right: 0.4rem;
}
```

- [ ] **Step 4: Build and verify**

```bash
cd /workspace/claudomator && go build ./...
```

Expected: no errors.

- [ ] **Step 5: Commit**

```bash
git add web/app.js web/style.css
git commit -m "feat: Ship button on SHIPPABLE stories; checker report on READY task cards"
```

---

## Task 7: Full test run and deploy

**Files:** none

- [ ] **Step 1: Run full test suite with race detector**

```bash
cd /workspace/claudomator && go test ./... -race -timeout 120s
```

Expected: all PASS.

- [ ] **Step 2: Push and deploy**

```bash
git push && sudo scripts/deploy
```

Expected: build passes, tests pass, binary installs, service restarts.

---

## Self-Review

**Spec coverage:**
- ✅ Checker spawned after task → READY (Task 3)
- ✅ Checker uses acceptance_criteria or falls back to task instructions (Task 3)
- ✅ Pass → auto-accept (READY → COMPLETED) (Task 3)
- ✅ Fail → task stays READY + checker_report attached (Task 3)
- ✅ No checker for subtasks or checker tasks (Task 3, guards in spawnCheckerTask)
- ✅ Story elaborator generates acceptance_criteria per task (Task 5)
- ✅ `checkStoryCompletion` no longer auto-deploys (Task 4)
- ✅ `POST /api/stories/{id}/ship` endpoint (Task 4)
- ✅ Ship button in UI (Task 6)
- ✅ Checker report shown on READY task cards (Task 6)
- ✅ New DB columns + migrations (Task 2)

**Placeholder scan:** none found.

**Type consistency:** `UpdateTaskCheckerReport(id, report string)`, `GetCheckerTask(checkedTaskID string) (*task.Task, error)`, `ShipStory(ctx context.Context, storyID string) error` are all consistent across Tasks 2, 3, 4.
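
**Checker rules at a glance:** the checker items under Spec coverage can be restated as three small pure functions. This is only an illustrative sketch, not the Task 3 implementation: the flattened `Task` type, the helper names (`needsChecker`, `checkerCriteria`, `applyCheckerResult`), the use of `ParentTaskID` to identify subtasks, and the guard for non-READY tasks are all assumptions made for the example.

```go
package main

import "fmt"

// State and Task are hypothetical, simplified stand-ins for the real types
// in internal/task; only the fields the checker rules touch are modelled.
type State string

const (
	StateReady     State = "READY"
	StateCompleted State = "COMPLETED"
)

type Task struct {
	ID                 string
	ParentTaskID       string // assumed non-empty for subtasks
	CheckerForTaskID   string // non-empty when this task is itself a checker
	Instructions       string
	AcceptanceCriteria string
	CheckerReport      string
	State              State
}

// needsChecker encodes "no checker for subtasks or checker tasks".
func needsChecker(t Task) bool {
	return t.ParentTaskID == "" && t.CheckerForTaskID == ""
}

// checkerCriteria prefers acceptance criteria and falls back to the
// task instructions when none were provided.
func checkerCriteria(t Task) string {
	if t.AcceptanceCriteria != "" {
		return t.AcceptanceCriteria
	}
	return t.Instructions
}

// applyCheckerResult: pass moves READY to COMPLETED; fail leaves the task
// READY and attaches the report. Tasks in any other state are returned
// unchanged (an assumption of this sketch).
func applyCheckerResult(t Task, passed bool, report string) Task {
	if t.State != StateReady {
		return t
	}
	if passed {
		t.State = StateCompleted
		return t
	}
	t.CheckerReport = report
	return t
}

func main() {
	t := Task{ID: "t1", Instructions: "implement the thing", State: StateReady}
	fmt.Println(needsChecker(t), checkerCriteria(t))
	fmt.Println(applyCheckerResult(t, false, "tests fail").CheckerReport)
	fmt.Println(applyCheckerResult(t, true, "").State)
}
```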