# ADR-007: Planning Layer, Task Hierarchy, and Story-Gated Deployment

**Status:** Draft
**Date:** 2026-03-19
**Context:** Design discussion exploring the integration of Claudomator with Doot and a richer task hierarchy model.

---

## Context

Claudomator currently operates as a flat queue of tasks, each with optional subtasks (`parent_task_id`). There is no concept of grouping tasks into shippable units, no deploy automation, and no integration with personal planning tools. Separately, Doot is a personal dashboard that aggregates tasks, meals, calendar events, and bugs from third-party services (Todoist, Trello, PlanToEat, Google Calendar) into a unified `Atom` model.

The goal of this ADR is to capture a design direction that:

1. Integrates Claudomator into Doot as a first-class data source
2. Introduces a four-level task hierarchy (Epic → Story → Task → Subtask)
3. Defines a branching and execution model for stories
4. Establishes stories as the unit that gates deployment

---

## Decision

### 1. Claudomator as an Atom Source in Doot

Doot already normalizes heterogeneous data sources into a unified `Atom` model (see `internal/models/atom.go`). Claudomator tasks are a natural peer to Todoist and Trello — they are their own source of truth (SQLite, full execution history) and should be surfaced in Doot's aggregation views without duplication elsewhere.

**Design:**
- Add `SourceClaudomator AtomSource = "claudomator"` to Doot's atom model
- Implement a Claudomator API client in `internal/api/claudomator.go` (analogous to `todoist.go`, `trello.go`)
- Map Claudomator tasks to `Atom` with appropriate priority, status, and source icon
- Individual subtasks are **not** surfaced in the Doot timeline — they are execution-level details, not planning-level items

**Rationale:** Claudomator is a peer to other task sources, not subordinate to them. Users should not need a Todoist card to track agent work — Claudomator is the source of truth for that domain.
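
As a rough illustration of the mapping described above, the sketch below converts top-level Claudomator tasks into atoms and skips subtasks. The `Task` and `Atom` field names and the priority scale are assumptions made for the example; the real `Atom` definition lives in `internal/models/atom.go` and may differ.

```python
from dataclasses import dataclass
from typing import Optional

SOURCE_CLAUDOMATOR = "claudomator"  # mirrors the proposed SourceClaudomator value


@dataclass
class Task:
    # Hypothetical subset of a Claudomator task as returned by its API.
    id: str
    name: str
    state: str
    priority: str = "normal"
    parent_task_id: Optional[str] = None


@dataclass
class Atom:
    # Hypothetical subset of Doot's Atom fields (assumed, not taken from atom.go).
    id: str
    title: str
    source: str
    status: str
    priority: int


# Assumed priority scale; this ADR does not define one.
PRIORITY_RANK = {"high": 1, "normal": 2, "low": 3}


def tasks_to_atoms(tasks: list) -> list:
    """Map top-level tasks to atoms; anything with a parent_task_id is a subtask and is skipped."""
    return [
        Atom(
            id=f"{SOURCE_CLAUDOMATOR}:{t.id}",
            title=t.name,
            source=SOURCE_CLAUDOMATOR,
            status=t.state,
            priority=PRIORITY_RANK.get(t.priority, 2),
        )
        for t in tasks
        if t.parent_task_id is None
    ]
```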

---

### 2. Four-Level Task Hierarchy

The current flat model (task + optional subtask) is insufficient for feature-scale work. The following hierarchy is adopted:

| Level | Name | Description |
|---|---|---|
| 4 | **Epic** | Large design initiative requiring back-and-forth, resulting in a set of stories. Lives primarily in the planning layer (Doot). Not an execution unit. |
| 3 | **Story** | A shippable slice of work. Independent and deployable on its own. Groups tasks that together constitute a releasable change. The unit that gates deployment. |
| 2 | **Task** | A feature- or bug-level unit of work. Individually buildable, but may not make sense to ship alone. Belongs to a story. |
| 1 | **Subtask** | A discrete, ordered agent action. The actual Claudomator execution unit. Belongs to a task. Performed in sequence. |

**Key properties:**
- Stories are independently shippable — deployment is gated at this level
- Tasks are individually buildable but do not gate deployment alone
- Subtasks are the agent execution primitive — what `ContainerRunner` actually runs
- Epics are planning artifacts; they live in Doot or a future planning layer, not in Claudomator's execution model
- Scheduling prefers picking up subtasks from **already-started stories** before beginning new ones (WIP limiting)
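
The WIP-limiting preference in the last bullet could be expressed as a sort key. This is a minimal sketch under assumed names (`ReadySubtask`, `queued_at`, `pick_next`); it presumes the caller has already filtered to subtasks whose dependencies are satisfied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadySubtask:
    # Hypothetical view of a runnable subtask.
    id: str
    story_id: str
    queued_at: int  # any monotonically increasing queue position


def pick_next(ready: list, started_stories: set) -> Optional[ReadySubtask]:
    """Prefer subtasks from already-started stories, then fall back to queue order."""
    if not ready:
        return None
    # False sorts before True, so subtasks of started stories win ties on age.
    return min(ready, key=lambda s: (s.story_id not in started_stories, s.queued_at))
```

With this key, a new story is only started when no started story has a runnable subtask.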

**Claudomator data model changes required:**
- Add `stories` table with deploy configuration and status
- Add `story_id` to tasks (foreign key to stories)
- `repository_url` moves from individual tasks to stories (all tasks in a story operate on the same repo)
- Story status is derived: all tasks completed → story is shippable
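
To make the derived-status rule concrete, here is a small SQLite sketch. The column set is deliberately reduced and the helper names (`open_db`, `story_is_shippable`) are hypothetical. Treating a story with zero tasks as not shippable is an assumption, not something this ADR states.

```python
import sqlite3

# Reduced, illustrative schema; the real tables carry more columns.
SCHEMA = """
CREATE TABLE stories (
    id     TEXT PRIMARY KEY,
    name   TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'PENDING'
);
CREATE TABLE tasks (
    id       TEXT PRIMARY KEY,
    story_id TEXT REFERENCES stories(id),
    state    TEXT NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    db.executescript(SCHEMA)
    return db


def story_is_shippable(db: sqlite3.Connection, story_id: str) -> bool:
    """A story is shippable when it has at least one task and all of them are COMPLETED."""
    total, done = db.execute(
        "SELECT COUNT(*), COALESCE(SUM(state = 'COMPLETED'), 0) "
        "FROM tasks WHERE story_id = ?",
        (story_id,),
    ).fetchone()
    return total > 0 and total == done
```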

---

### 3. Story-Level Branching Model

Each story has a dedicated Git branch. Subtasks execute sequentially, each cloning the repository at the story branch's current HEAD, making commits, and pushing back before the next subtask begins.

**Model:** One branch per story. Fresh clone + container per subtask. Subtasks commit to the story branch in sequence.

**Properties:**

- **Each subtask sees all prior subtask work** — it clones the story branch at HEAD, which includes all previous subtask commits
- **Clean environment per subtask** — no filesystem state leaks between subtasks; the container is ephemeral
- **Ordered execution enforced** — subtasks run strictly in order; each depends on the previous commit
- **Reviewable history** — the story branch accumulates one commit per subtask, giving a clean, auditable record before merge
- **Clear recovery points** — if subtask N fails, roll back to subtask N-1's commit, fix the subtask definition, rerun
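
The recovery-point property can be illustrated with a small helper that picks the commit to return to when a subtask fails. It assumes the runner records the story branch HEAD after each successful subtask; `rollback_target` and its parameters are hypothetical names.

```python
def rollback_target(base_sha: str, subtask_commits: list, failed_index: int) -> str:
    """Return the commit the story branch should be reset to when subtask `failed_index` fails.

    base_sha is the commit the story branch was created from.
    subtask_commits[i] is the branch HEAD recorded after subtask i succeeded.
    """
    if not 0 <= failed_index <= len(subtask_commits):
        raise ValueError("failed_index must refer to the next unrecorded subtask or earlier")
    if failed_index == 0:
        return base_sha  # nothing has landed yet, so fall back to the branch point
    return subtask_commits[failed_index - 1]
```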

**Tradeoffs accepted:**
- Clone and container creation cost is paid per subtask (not amortized across the story). Acceptable at current usage scale.
- No parallelism within a story — subtasks are strictly sequential by design
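- Mutual exclusion is required at the story level so that two subtasks never run simultaneously (e.g., on retry races); this is provided by the `depends_on` chain described under Sequential Subtask Execution rather than by a dedicated lock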

**Rejected alternatives:**

*Isolated commit model (fresh clone per subtask, independent branches):* Clean but subtasks cannot build on each other's work. Requires careful branch ordering and merging to assemble a story.

*Persistent workspace per story (one container, one clone for the life of the story):* More efficient, natural continuity, but a bad subtask can corrupt the workspace for subsequent subtasks. Recovery is harder. Loses the discipline of enforced commit points.

#### Sequential Subtask Execution

Subtasks within a story execute sequentially. This is enforced via `depends_on` links set automatically at task creation time — each subtask added to a story gets `depends_on: [previous_subtask_id]`, forming a linear chain. The existing pool dependency mechanism handles the rest.
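
A minimal sketch of the auto-wiring, assuming subtasks are identified by string IDs and supplied in execution order. `wire_linear_chain` and `runnable` are illustrative names; `runnable` only approximates what the existing pool dependency mechanism would decide.

```python
def wire_linear_chain(subtask_ids: list) -> dict:
    """Return {subtask_id: depends_on} so that each subtask depends on the one before it."""
    chain = {}
    previous = None
    for sid in subtask_ids:
        chain[sid] = [previous] if previous is not None else []
        previous = sid
    return chain


def runnable(chain: dict, completed: set) -> list:
    """Subtasks that are not done yet and whose dependencies have all completed."""
    return [
        sid
        for sid, deps in chain.items()
        if sid not in completed and all(d in completed for d in deps)
    ]
```

Because the chain is linear, at most one subtask per story is ever runnable, which is what provides the ordering without a lock.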

**Rejected alternative — pool-level story concurrency lock:** Would require the executor to become story-aware, lock state would be in-memory (fragile across restarts), and the ordering would be invisible in the data model. The `depends_on` approach is durable, inspectable, and reuses existing infrastructure. The 5-second polling delay between subtasks is an accepted tradeoff.

---

### 4. Story-Gated Deployment and Agent Validation

Deployment is triggered at the story level, not the task or subtask level.

#### State Machine

```
PENDING → IN_PROGRESS → SHIPPABLE → DEPLOYED → VALIDATING → REVIEW_READY
                                                           ↘ NEEDS_FIX → IN_PROGRESS (retry)
```

- **SHIPPABLE:** All tasks completed. Ready to merge and deploy.
- **DEPLOYED:** Merged to main, deploy triggered.
- **VALIDATING:** Validation agent is running.
- **REVIEW_READY:** Validation passed. Awaiting human sign-off.
- **NEEDS_FIX:** Validation failed. Story returns to `IN_PROGRESS` with the validation report attached.
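
The diagram above can be encoded as a transition table. The sketch below is one possible encoding; `StoryStatus`, `can_transition`, and `advance` are illustrative names rather than existing Claudomator identifiers.

```python
from enum import Enum


class StoryStatus(Enum):
    PENDING = "PENDING"
    IN_PROGRESS = "IN_PROGRESS"
    SHIPPABLE = "SHIPPABLE"
    DEPLOYED = "DEPLOYED"
    VALIDATING = "VALIDATING"
    REVIEW_READY = "REVIEW_READY"
    NEEDS_FIX = "NEEDS_FIX"


S = StoryStatus

# Allowed next states, read directly off the state machine diagram.
TRANSITIONS = {
    S.PENDING: {S.IN_PROGRESS},
    S.IN_PROGRESS: {S.SHIPPABLE},
    S.SHIPPABLE: {S.DEPLOYED},
    S.DEPLOYED: {S.VALIDATING},
    S.VALIDATING: {S.REVIEW_READY, S.NEEDS_FIX},
    S.NEEDS_FIX: {S.IN_PROGRESS},  # retry path
    S.REVIEW_READY: set(),         # terminal until human sign-off
}


def can_transition(current: StoryStatus, target: StoryStatus) -> bool:
    return target in TRANSITIONS[current]


def advance(current: StoryStatus, target: StoryStatus) -> StoryStatus:
    """Return the new status, or raise if the diagram does not allow the move."""
    if not can_transition(current, target):
        raise ValueError(f"illegal story transition: {current.name} -> {target.name}")
    return target
```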

#### Merge Strategy

Merge to main first, then validate against the live deployment. No branch review phase — tests are the confidence mechanism. If test coverage is insufficient for a given story, the implementor is responsible for adding tests before marking it shippable. Branch review may be introduced later if needed.

#### Deploy Configuration

Stored on the story. Two project types are handled:

| Project | Deploy trigger | What "deployed" means |
|---|---|---|
| claudomator | `git push` to local bare repo → systemd pulls and restarts | Live at `doot.terst.org` |
| nav (Android) | `git push` to GitHub → CI build action fires | APK distributed to testers via Play Store testing track |

For nav, Claudomator does not interact with GitHub CI directly: it pushes the commits, and the push triggers the CI action on GitHub's side. "Deployed" is declared once the push succeeds; the CI result is not polled.

#### Agent Validation

After a story is deployed, a validation subtask is automatically created. The elaborator is responsible for specifying how validation should be performed — it has full context of what changed and can prescribe the appropriate check level.

**Validation spec** (produced by elaborator, stored on the story):

```yaml
validation:
  type: curl         # curl | tests | playwright | gradle
  steps:
    - "GET /api/stats — expect 200, body contains throughput[]"
    - "GET /api/agents/status — expect agents array non-empty"
  success_criteria: "All steps return expected responses with correct structure"
```

**Validation types:**

| Type | When to use | What the agent does |
|---|---|---|
| `curl` | API changes, data model additions, simple UI text | HTTP requests, check status codes and response shape |
| `tests` | Logic changes with existing test coverage | Runs the project test suite against the live deployment or codebase |
| `playwright` | Subtle UI changes, interactive flows, visual correctness | Browser automation against the deployed URL |
| `gradle` | nav (Android) — any change | `./gradlew test`, `./gradlew lint`; optionally `./gradlew assembleDebug` |

The elaborator selects `type` based on change scope. `curl` is the default for small, targeted changes; `playwright` is reserved for changes where visual or interactive correctness cannot be inferred from API responses alone.

**Validation agent inputs:**
- The validation spec (type, steps, success_criteria)
- Deployed URL or project path
- Summary of what changed (story name + task list)

**Validation agent outputs:**
- Structured pass/fail per step
- Evidence (response bodies, test output excerpts, screenshots for playwright)
- Overall verdict: pass → story moves to `REVIEW_READY`; fail → story moves to `NEEDS_FIX` with report attached
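
One way to picture the validation output is a list of per-step results reduced to a single verdict. `StepResult` and `verdict` are hypothetical names, and treating an empty result list as a failure is an assumption made here so that a validation run that checked nothing cannot pass.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    # Hypothetical per-step record in the validation report.
    step: str
    passed: bool
    evidence: str = ""  # e.g. response body excerpt or test output


def verdict(results: list) -> str:
    """Map per-step results to the story's next status.

    Passes only if at least one step ran and every step passed (assumption).
    """
    passed = bool(results) and all(r.passed for r in results)
    return "REVIEW_READY" if passed else "NEEDS_FIX"
```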

#### Failure Recovery

If a subtask fails mid-story, the story is paused and human review is required before it resumes. The options at that point are:
- Roll back to the previous subtask's commit and retry
- Amend the subtask definition and requeue

Policy beyond this is deferred until failure patterns are observed in practice.

---

## Consequences

**Claudomator changes:**
- New `stories` table: `id, name, branch_name, project_id, deploy_config, validation_json, status`
- New `projects` table: `id, name, remote_url, local_path, type, deploy_script`
- `tasks.story_id` FK; `repository_url` removed from tasks (inherited from story → project)
- Sequential subtask ordering via auto-wired `depends_on` at task creation time
- Post-task-completion check: all story tasks COMPLETED → story transitions to SHIPPABLE → merge + deploy trigger
- Post-deploy: auto-create validation subtask from story's `validation_json` spec
- Validation subtask completes → story transitions to REVIEW_READY or NEEDS_FIX
- Story state machine: PENDING → IN_PROGRESS → SHIPPABLE → DEPLOYED → VALIDATING → REVIEW_READY | NEEDS_FIX
- `ContainerRunner`: clone at story branch HEAD; push back to story branch after each subtask
- Deployment status check moves from task level to story level
- Elaborator output extended: `validation` block (type, steps, success_criteria) stored as `validation_json` on story
- Remove `Agent.RepositoryURL`, `Agent.ProjectDir` legacy fields, `skip_planning`, `fallbackGitInit()`
- Remove duplicate changestats extraction (keep pool-side, remove API server-side)

**Doot changes:**
- New `SourceClaudomator` atom source
- Claudomator API client (`internal/api/claudomator.go`)
- Story → Atom mapper (title = story name, description = task progress e.g. "3/5 tasks done", priority from story config, deploy status)
- Task → Atom mapper (optional, feature-level visibility)
- Individual subtasks explicitly excluded from all views
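
The Story → Atom mapping above, including the "3/5 tasks done" style description, might look like the following. `StoryAtom` and its fields are placeholders for whatever Doot's `Atom` actually exposes, and the `COMPLETED` state name is taken from the post-task-completion check listed earlier.

```python
from dataclasses import dataclass


@dataclass
class StoryAtom:
    # Hypothetical stand-in for the fields of Doot's Atom used by this mapper.
    id: str
    title: str
    description: str
    source: str
    status: str
    priority: int


def story_progress(task_states: list) -> str:
    """Render task progress, e.g. '3/5 tasks done'."""
    done = sum(1 for state in task_states if state == "COMPLETED")
    return f"{done}/{len(task_states)} tasks done"


def story_to_atom(story_id: str, name: str, status: str,
                  task_states: list, priority: int) -> StoryAtom:
    """Build an atom for a story; status carries the story's deploy state."""
    return StoryAtom(
        id=f"claudomator:story:{story_id}",  # assumed ID scheme
        title=name,
        description=story_progress(task_states),
        source="claudomator",
        status=status,
        priority=priority,
    )
```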

**Doot removals (dead code / superseded):**
- `bugs` table, `BugToAtom`, `SourceBug`, `TypeBug` — bug reporting becomes a thin UI shim that submits to Claudomator; nothing stored in Doot's data model
- `notes` table and all Obsidian skeleton code — never wired up
- `AddMealToPlanner()` stub — never called
- `UpdateCard()` stub — never called
- All bug handlers, templates, and store methods

**Planning layer (future):**
- Epics live here, not in Claudomator
- Story creation via elaboration + validation flow (see below)
- WIP-limiting scheduler that prefers subtasks from started stories

### Story Creation: Elaboration and Validation Flow

Story creation is driven by an extended version of Claudomator's existing elaboration and validation pipeline, not a YAML file or form.

**Flow:**
1. User describes the story goal (rough, high-level) in the UI
2. Elaboration agent runs against a **local working copy** of the project (read-only mount, no clone) — reads the codebase, understands current state, produces story + task + subtask breakdown with `depends_on` chain wired
3. Validation agent checks the structure: tasks are independently buildable, subtasks properly scoped, story has a clear shippable definition, no dependency cycles
4. User reviews and approves in the UI
5. On approval: story branch created (`git checkout -b story/xxx origin/main`, pushed to remote); subtasks queued

**Responsiveness:**
- Elaboration uses a local working copy — no clone cost, near-instant container start
- A `git fetch` (not pull) at elaboration start updates remote refs without touching the working tree
- Branch creation is deferred to approval — elaboration agent is purely read-only
- Execution clones use `git clone --reference /local/path <remote>` — reuses local object store, fetches only the delta; significantly faster than cold clone
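
The two Git invocations mentioned above can be sketched as argument lists. The helper names and the `--branch` argument (to land directly on the story branch) are assumptions; `--reference`, `--branch`, and `-C` are standard Git options.

```python
def fetch_argv(local_path: str) -> list:
    """Update remote refs in the local working copy without touching its working tree."""
    return ["git", "-C", local_path, "fetch"]


def clone_argv(remote_url: str, local_path: str, branch: str, dest: str) -> list:
    """Clone the story branch, borrowing objects from the local working copy."""
    return [
        "git", "clone",
        "--reference", local_path,  # reuse the local object store
        "--branch", branch,         # check out the story branch (assumed usage)
        remote_url,
        dest,
    ]
```

The lists can be passed to `subprocess.run(argv, check=True)`; building them separately keeps the command construction testable without a repository.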

### Project Registry

The local working copy model requires a formal project concept. A `projects` table replaces the current ad-hoc `repository_url` + `working_dir` fields:

| Field | Purpose |
|---|---|
| `id` | UUID |
| `name` | Human-readable label |
| `remote_url` | Git remote (clone target for execution) |
| `local_path` | Local working copy path (read cache for elaboration, object store for `--reference` clones) |
| `type` | `web` \| `android` — controls available validation types and deploy semantics |
| `deploy_script` | Optional path to project-specific deploy script |

`repository_url` on stories becomes a FK to `projects`. The existing `project` string field on tasks (currently just a label) is replaced by `project_id`. `Agent.RepositoryURL`, `Agent.ProjectDir`, and `Task.RepositoryURL` are all removed — project is the single source of truth for repo location.
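
Since `type` controls which validation types are available, a check along these lines could run when the elaborator's validation spec is stored. The exact mapping from project type to validation types is an assumption inferred from the validation table, and `check_validation_spec` is a hypothetical helper.

```python
# Assumed mapping; this ADR only says that `type` controls the available validation types.
ALLOWED_VALIDATION_TYPES = {
    "web": {"curl", "tests", "playwright"},
    "android": {"gradle"},
}


def check_validation_spec(project_type: str, spec: dict) -> list:
    """Return a list of problems with a validation spec; an empty list means it is acceptable."""
    allowed = ALLOWED_VALIDATION_TYPES.get(project_type)
    if allowed is None:
        return [f"unknown project type: {project_type}"]
    problems = []
    if spec.get("type") not in allowed:
        problems.append(
            f"validation type {spec.get('type')!r} not allowed for {project_type} projects"
        )
    if not spec.get("steps"):
        problems.append("steps must be non-empty")
    if not spec.get("success_criteria"):
        problems.append("success_criteria is required")
    return problems
```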

**Initial registered projects:**

| Name | Local path | Remote | Type |
|---|---|---|---|
| claudomator | `/workspace/claudomator` | local bare repo | web |
| nav | `/workspace/nav` | GitHub | android |

---

## Out of Scope (for now)

- Voice interface (noted as a future exploration, not an architectural requirement)
- Epic management tooling
- Parallelism within stories
- Branch review before merge — deferred; merge-first is the current strategy. May be revisited if confidence requires it.
- Polling GitHub CI result for nav deploys — Claudomator declares "deployed" on push success; CI outcome is out of band
- ADB / emulator-based UI validation for nav — `gradle` type covers unit and integration tests; device UI testing deferred