 ARCHITECT_ROLE.md   | 22
 CLAUDE.md           |  2
 DESIGN.md           | 11
 IMPLEMENTOR_ROLE.md | 22
 REVIEWER_ROLE.md    | 22
 5 files changed, 70 insertions(+), 9 deletions(-)
diff --git a/ARCHITECT_ROLE.md b/ARCHITECT_ROLE.md
index 5697f69..5e96c4c 100644
--- a/ARCHITECT_ROLE.md
+++ b/ARCHITECT_ROLE.md
@@ -56,6 +56,22 @@
**Tool Usage Protocol:**
* **Execution:** When you state you are creating or updating a file (e.g., `instructions.md`, `SESSION_STATE.md`), you **MUST** execute the `write_file` tool. Do not just describe the content; write it to the disk.
-**Self-Improvement:**
-* **Meta-Review:** Periodically (e.g., after completing a major phase or encountering friction), suggest refinements to this Role Definition (`ARCHITECT_ROLE.md`) to better align with the user's needs and project workflow.
-* **Reflection:** Ask yourself: "Did my instructions lead to clean code, or did they force the Implementor into a hacky solution?"
+**Self-Improvement Cycle:**
+
+After completing each task (when it reaches `[APPROVED]`), perform this cycle:
+
+1. **Reflect (mandatory):** Answer these questions honestly:
+ * Did my instructions lead to clean code, or did they force the Implementor into a hacky solution?
+ * Did the Implementor need to ask for clarification, or were the instructions unambiguous?
+ * Did the Reviewer find architectural issues I should have caught during planning?
+ * Were there repeated `[NEEDS_FIX]` cycles that better instructions could have prevented?
+
+2. **Improve (1-3 actions):** Based on reflection, perform at least one concrete improvement:
+ * **Instructions template:** If the Implementor struggled, refine the instruction format in this file (e.g., add a required "Affected Tests" section, add file path specificity requirements).
+ * **ADR gaps:** If an architectural decision was made implicitly during implementation, capture it now as an ADR in `docs/adr/`.
+ * **Bug patterns:** If a bug revealed a systemic issue (e.g., missing CSRF, env var dependency), add a "Known Pitfall" to `DESIGN.md` so future instructions proactively address it.
+ * **Role definition:** If workflow friction occurred (e.g., state handoff confusion, unclear ownership), update this file or the other role files to prevent recurrence.
+ * **Tooling:** If a manual step was error-prone, propose or create a script in `scripts/` to automate it.
+ * **SESSION_STATE format:** If state tracking was unclear, refine the template or status tag definitions.
+
+3. **Record:** Note what was improved and why in `SESSION_STATE.md` under a "Process Improvements" section so the team can track what changed.
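The "Process Improvements" section of `SESSION_STATE.md` that step 3 refers to is never given a concrete shape in this commit; a hypothetical layout, with entry fields that are illustrative rather than prescribed, might look like:

```markdown
## Process Improvements

- **Architect / task N:** Added an "Affected Tests" section to the instruction
  template after repeated `[NEEDS_FIX]` cycles caused by untracked test breakage.
- **Implementor / task N:** Extracted a `newTestHandler` helper to remove
  duplicated handler setup in `handlers_test.go`.
- **Reviewer / task N:** Added HTMX target verification to the review checklist.
```

Keeping one dated, role-tagged bullet per improvement is one way to give the cross-session visibility the three role files ask for.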
diff --git a/CLAUDE.md b/CLAUDE.md
index b62f21e..b773025 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -16,6 +16,8 @@ A unified web dashboard aggregating Trello, Todoist, PlanToEat, Google Calendar,
Role definitions exist for a three-agent pipeline (`ARCHITECT_ROLE.md`, `IMPLEMENTOR_ROLE.md`, `REVIEWER_ROLE.md`) but the **primary workflow is single-agent** — reading CLAUDE.md + DESIGN.md directly. The role files are reference material, not required reading for every session.
+**Self-Improvement Cycle:** After completing each task, every role must reflect, perform 1-3 concrete improvements (test helpers, scripts, checklists, role definitions, docs), and record changes in `SESSION_STATE.md` → "Process Improvements". See each role file for role-specific details.
+
## Efficiency & Token Management
- **Context Minimization:** Use Grep/Glob with offset/limit rather than reading entire files when a targeted search suffices.
- **Surgical Edits:** Perform small, targeted file edits. Avoid rewriting entire files for single changes.
diff --git a/DESIGN.md b/DESIGN.md
index a62ec47..e139ab7 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -71,6 +71,17 @@ Tasks flow through these states:
5. **Reviewer** reads code, runs tests, writes `review_feedback.md`
6. **Reviewer** marks `[APPROVED]` or `[NEEDS_FIX]` with specific feedback
7. If `[NEEDS_FIX]`, **Implementor** addresses feedback, returns to step 4
+8. **All roles** run self-improvement cycle: reflect, perform 1-3 improvements, record in `SESSION_STATE.md`
+
+### Self-Improvement Cycle
+
+After completing each task, every role performs a structured improvement cycle:
+
+1. **Reflect** — Answer role-specific questions about what went well and what caused friction
+2. **Improve (1-3 actions)** — Make concrete changes: extract test helpers, add scripts, update checklists, fix role definitions, document bug patterns, create ADRs for implicit decisions
+3. **Record** — Log improvements in `SESSION_STATE.md` → "Process Improvements" for cross-session visibility
+
+See each role file (`*_ROLE.md`) for the full role-specific checklist of reflection questions and improvement actions.
### Principles
diff --git a/IMPLEMENTOR_ROLE.md b/IMPLEMENTOR_ROLE.md
index be12a9d..330e5e2 100644
--- a/IMPLEMENTOR_ROLE.md
+++ b/IMPLEMENTOR_ROLE.md
@@ -46,6 +46,22 @@
* **Terminal:** Use Bash for `go test`, `go build`, `go mod tidy`, etc.
* **Editing:** Prefer Edit for targeted edits, Write for new files only.
-**Self-Improvement:**
-* **Reflection:** After completing a task, ask: "Did I follow TDD? Is the code clean enough that the Reviewer won't find major issues?"
-* **Optimization:** Look for ways to make your edits more surgical and less prone to breaking surrounding code.
+**Self-Improvement Cycle:**
+
+After completing each task (when marking `[REVIEW_READY]`), perform this cycle:
+
+1. **Reflect (mandatory):** Answer these questions honestly:
+ * Did I write the test first, or did I skip to production code? If I skipped, why?
+ * Did I break any existing tests? If so, was it because I didn't check the full suite early enough?
+ * Did I make surgical edits, or did I rewrite more than necessary?
+ * Were there patterns I repeated manually that could be extracted into a helper?
+
+2. **Improve (1-3 actions):** Based on reflection, perform at least one concrete improvement:
+ * **Test helpers:** If you wrote repetitive test setup or assertions, extract a reusable helper into `handlers_test.go` (e.g., `assertTemplateContains`, `newTestHandler`, mock builders).
+ * **Scripts:** If a manual verification step was needed (checking logs, restarting, deploying), propose or update a script in `scripts/`.
+ * **Bug pattern docs:** If you hit a subtle bug (HTMX targeting, CSRF, nil pointers), add it to the "Common Bug Patterns" section of `MEMORY.md` so it's caught proactively next time.
+ * **Acceptance tests:** If your change broke `test/acceptance_test.go` because of a signature change, consider whether the acceptance test setup could be more resilient (e.g., builder pattern for `handlers.New()`).
+ * **Role definition:** If the workflow instructions didn't match reality (e.g., missing a step, wrong tool name), update this file.
+ * **Code patterns:** If you discovered a cleaner way to handle a common operation (e.g., HTMX partial responses, template data structs), document it in `DESIGN.md` → Development Guide.
+
+3. **Record:** Note what was improved and why in `SESSION_STATE.md` under a "Process Improvements" section so improvements are visible across sessions.
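The "Acceptance tests" bullet above suggests a builder pattern so that adding a dependency to `handlers.New()` does not break every call site. A minimal functional-options sketch, in which the `Deps` fields and option names are assumptions (the diff only names `handlers.New()`):

```go
package main

import "fmt"

// Deps holds handler dependencies; the fields are illustrative.
type Deps struct {
	BaseURL string
	Debug   bool
}

// Option mutates Deps before construction.
type Option func(*Deps)

// WithBaseURL overrides the default base URL.
func WithBaseURL(u string) Option { return func(d *Deps) { d.BaseURL = u } }

// WithDebug enables verbose behavior in tests.
func WithDebug() Option { return func(d *Deps) { d.Debug = true } }

// New builds Deps from defaults plus options. Adding a new Option later
// leaves existing call sites — including acceptance tests — untouched.
func New(opts ...Option) Deps {
	d := Deps{BaseURL: "http://localhost:8080"}
	for _, o := range opts {
		o(&d)
	}
	return d
}

func main() {
	d := New(WithBaseURL("http://test"), WithDebug())
	fmt.Println(d.BaseURL, d.Debug) // http://test true
}
```

Because each new dependency arrives as an option with a default, `test/acceptance_test.go` only has to change when it actually cares about the new dependency.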
diff --git a/REVIEWER_ROLE.md b/REVIEWER_ROLE.md
index ca3baa4..c4574bb 100644
--- a/REVIEWER_ROLE.md
+++ b/REVIEWER_ROLE.md
@@ -73,6 +73,22 @@
* **Execution:** Use Bash to run tests (`go test ./...`, `go test -cover ./...`).
* **Reporting:** Use Write to publish `review_feedback.md` and Edit to update `SESSION_STATE.md`.
-**Self-Improvement:**
-* **Reflection:** After a review cycle, ask: "Did my feedback help the Implementor improve the code, or did it just create busy work?"
-* **Calibration:** Periodically check `ARCHITECT_ROLE.md` to ensure your quality standards align with the architectural vision.
+**Self-Improvement Cycle:**
+
+After completing each review cycle (when marking `[APPROVED]` or `[NEEDS_FIX]`), perform this cycle:
+
+1. **Reflect (mandatory):** Answer these questions honestly:
+ * Did my feedback help the Implementor improve the code, or did it just create busy work?
+ * Did I catch real issues, or did I nitpick style preferences that don't affect correctness?
+ * Were there bugs or quality issues I missed that surfaced later in production?
+ * Did the `[NEEDS_FIX]` → `[REVIEW_READY]` cycle resolve quickly, or did it ping-pong?
+
+2. **Improve (1-3 actions):** Based on reflection, perform at least one concrete improvement:
+ * **Review checklist:** If you missed an issue category (e.g., HTMX targeting, CSRF in new pages, nil pointer risks), add it to the Test Quality Analysis or Critique sections of this file.
+ * **Feedback template:** If your feedback was unclear and caused a bad fix, refine the `review_feedback.md` structure (e.g., add a "Reproduction Steps" field for bugs, add "Suggested Fix" for critical issues).
+ * **Quality standards:** If the Architect's vision evolved (new patterns, deprecated approaches), update the Critique checklist to match. Re-read `ARCHITECT_ROLE.md` and recent ADRs.
+ * **Coverage gaps:** If you found that a whole category of code lacks tests (e.g., API clients, middleware), flag it in `SESSION_STATE.md` under "Known Gaps" for future work.
+ * **False positives:** If you raised issues that were intentional design choices, note them as "Accepted Patterns" in this file to avoid re-flagging them.
+ * **Tooling:** If manual review steps could be automated (e.g., checking for missing CSRF tokens, verifying HTMX targets), propose a linter rule or test helper.
+
+3. **Record:** Note what was improved and why in `SESSION_STATE.md` under a "Process Improvements" section so the team can track review quality over time.
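The Reviewer's "Tooling" bullet proposes automating the missing-CSRF-token check. One possible sketch of such a scanner over template source; the `csrf_token` field name and the POST-form convention are assumptions, not something this commit specifies:

```go
package main

import (
	"fmt"
	"regexp"
)

// formRe matches each POST form in template source, across lines.
var formRe = regexp.MustCompile(`(?is)<form\b[^>]*method\s*=\s*"post".*?</form>`)

// tokenRe matches the assumed CSRF field name inside a form.
var tokenRe = regexp.MustCompile(`(?i)csrf_token`)

// missingCSRF returns every POST form in src that lacks a csrf_token field,
// so a test or linter can fail the build before review.
func missingCSRF(src string) []string {
	var bad []string
	for _, f := range formRe.FindAllString(src, -1) {
		if !tokenRe.MatchString(f) {
			bad = append(bad, f)
		}
	}
	return bad
}

func main() {
	src := `<form method="post"><input name="q"></form>`
	fmt.Println(len(missingCSRF(src))) // 1
}
```

Wrapping `missingCSRF` in a `go test` that walks the templates directory would turn a manual review step into an automatic gate, which is the kind of concrete improvement the cycle asks for.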