| Age | Commit message (Collapse) | Author |
|
All coding tasks now follow the same flow regardless of runner: when
project_dir is set, the agent runs in a temp clone, not in the user's
working tree. On success, edits are autocommitted and pushed back to
origin/master and the sandbox is removed. On failure or BLOCKED, the
sandbox is preserved and its path surfaces in the error / BlockedError
so the user can inspect partial work or resume in place.
Before this commit, GeminiRunner.Run set cmd.Dir to project_dir
directly, so an agent run could leave half-done edits in the user's
working tree with no rollback. ClaudeRunner has had the full sandbox
flow for a while; this commit closes the gap.
Reused the existing package-level helpers from claude.go verbatim:
setupSandbox, teardownSandbox, sandboxCloneSource, gitSafe, plus the
resume/stale-sandbox/blocked-error patterns. No new shared abstraction
needed — same package.
LocalRunner intentionally not changed. The OpenAI chat path has no
tool use, so the agent can't edit files; sandbox would be theater.
Tests (6 new):
- Run_ProjectDir_RunsInSandbox: cwd captured by fake binary is a
sandbox path, not project_dir.
- Run_BlockedError_IncludesSandboxDir: when question.json appears,
BlockedError.SandboxDir is set and the dir exists.
- Run_ExecError_PreservesSandbox: failing exit wraps error with
"(sandbox preserved at <path>)" and the path exists on disk.
- Run_ResumeUsesStoredSandboxDir: ResumeSessionID + SandboxDir →
runs in that dir without re-cloning.
- Run_StaleSandboxDir_ClonesAfresh: resume pointing at missing
dir falls back to a fresh clone from project_dir.
- Run_NoProjectDir_SkipsSandbox: tasks without project_dir don't
trigger sandbox setup.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Adds TestTeardownSandbox_CapturesExplicitCommits to cover the case
where the agent explicitly commits changes (no autocommit needed).
Previously only the autocommit path was tested; this confirms
teardownSandbox populates Commits for any commits ahead of origin.
https://claude.ai/code/session_01G4dT9JBWFFb8xGcSHenzRS
|
|
Close deferred work — real GeminiRunner subprocess, Local UI option
|
|
Closes the three items left on the deferred queue after the post-epic
cleanup.
GeminiRunner.execOnce now actually executes the gemini binary instead
of writing hardcoded stream data. Mirrors ClaudeRunner.execOnce:
- exec.CommandContext with the same env vars (CLAUDOMATOR_API_URL etc.)
- process group SIGKILL on context cancel
- stdout piped through parseGeminiStream → stdoutFile
- stderr to file
- exit codes captured, stderr tail surfaced on failure
Test infrastructure bug uncovered in passing: testServerWithGeminiMockRunner's
mock script used double-quoted echo with literal triple-backticks, which
bash interpreted as command substitution. The script always produced
empty output. The bug was invisible until now because GeminiRunner
ignored the script entirely. Switched to a single-quoted heredoc.
Frontend: index.html dropdown gains a "Local" option. No JS branching
needed — the value flows through to agent.type verbatim and downstream
display reads the type string as-is.
storage/db.go: removed stale debug-comment scaffolding (the "TODO:
Replace with proper logger" block) that was tracking a dead
`fmt.Printf` call. The path it commented on is fine without logging —
unmarshal errors are returned wrapped.
Test status: `go test -race ./...` green across every package, zero
skips, zero excluded tests.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Post-epic cleanup — green test suite, no skips
|
|
Addresses the cleanup queue captured in docs/plans/local-oss-runner.md
after the local-OSS-models epic landed. After this commit
`go test -race ./...` is green across every package with zero `t.Skip`
calls and no excluded tests.
Real bugs fixed:
- claude.go setupSandbox callsites used `sandboxDir, err := ...` which
shadowed the outer variable, so BlockedError.SandboxDir was always
empty. Resume-after-block was broken for both new and stale-sandbox
paths. TestBlockedError_IncludesSandboxDir now exercises the right
invariant.
- TestPool_ActivePerAgent_DeletesZeroEntries flake under -race: the
cleanup defer in execute()/executeResume() runs AFTER
handleRunResult sends on resultCh, so consumers observing a result
could see a still-counted activePerAgent entry. Extracted
decActiveAgent(agentType, *cleaned) helper; called explicitly before
every resultCh send, defer becomes a no-op via the cleaned flag.
Verified clean over `go test -race -count=10`.
Test infrastructure made hermetic:
- gitSafe now also passes -c commit.gpgsign=false / -c tag.gpgsign=false
so sandbox tests pass on hosts whose global config requires signing.
- Bare repos in tests initialized with `-b main` (HEAD symbolic ref
matched to the branch we push) so `git log` after push works.
- TestSandboxCloneSource_FallsBackToOrigin uses a local-FS origin URL,
matching sandboxCloneSource's intentional filter against network URLs.
- TestGeminiLogs_ParsedCorrectly URL fixed to the actual log route
(/api/executions/{id}/log).
GeminiRunner gap closed (partial):
- parseGeminiStream now walks lines for `result` events, surfacing
is_error as an error and total_cost_usd as the float return value.
- GeminiRunner.Run propagates parsed cost to Execution.CostUSD.
- TestParseGeminiStream_ParsesStructuredOutput unskipped.
Notes:
- GeminiRunner is still simulated end-to-end (Run writes hardcoded
stream data instead of execing the binary). The result/cost parser
now exists; finishing the runner is a smaller, contained follow-up.
Kept on the deferred queue.
- Frontend "Local" agent option and a minor storage.db.go logger TODO
remain on the deferred queue, both intentionally — neither blocks
anything in flight.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Local OSS models as a third runner (epic)
|
|
Phase 4 of "local OSS models as agents" plan. Closes the epic.
When an execution finishes and the agent did NOT write a "## Summary"
heading in its stdout (so the existing extractSummary path returns
empty), and the Pool has a local LLM configured, we now synthesize a
2-4 sentence summary from the assistant text content of the log tail.
Behavior:
- Primary path unchanged: if the agent wrote "## Summary", that wins
byte-for-byte (TestPool_HandleRunResult_ExtractSummaryWins guards).
- Fallback path: empty extractSummary + Pool.LLM != nil → synthesize.
- All-empty path: when no LLM is configured, summary stays empty —
identical to pre-Phase-4 behavior.
Implementation:
- Pool gains an LLM *llm.Client field, wired in serve.go and run.go
alongside Classifier.LLM (same localClient used everywhere).
- New synthesizeSummary in internal/executor/summary.go:
* 6s timeout so a slow local model can't stall finalization
* 16 KB tail cap on the stdout log
* readAssistantTextTail seeks to the last 16 KB and skips the
first (likely partial) line, parses each line as a stream-json
event, joins assistant `text` blocks (skips system/result/etc).
* Returns "" on any error so the caller's behavior never regresses.
- handleRunResult: 3-tier summary resolution — exec.Summary set by
runner → extractSummary → synthesizeSummary → empty.
- minimalMockStore now records UpdateTaskSummary calls (additive;
existing tests unaffected) so integration tests can assert.
Tests (9 new):
- synthesizeSummary nil client / empty path / missing file all
return "" without HTTP calls.
- empty assistant content short-circuits without LLM call.
- success path returns trimmed body, with both assistant texts in
the user prompt.
- LLM 500 returns "" (caller handles same as no-summary).
- readAssistantTextTail seeks past early content in a large file.
- Pool integration: ## Summary present → LLM not called, agent text
used. ## Summary absent + LLM set → LLM called, synthesized summary
recorded against the right task ID.
Plan: docs/plans/local-oss-runner.md.
Epic complete. Post-epic deep cleanup queue captured in the same plan
file for follow-up.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Phase 3 of "local OSS models as agents" plan. When the webhook handler
creates a task for a failed CI run AND a local LLM is configured on
the server, the hardcoded 4-step investigation template is replaced
with a project-aware investigation plan generated by the LLM.
Scope adjustment from the original sketch: the original plan said
"summarize fetched workflow logs", but fetching logs requires GitHub
API auth that isn't wired. Narrowed to project-context triage —
recent git log + CLAUDE.md content + webhook metadata, fed to the
LLM with a system prompt asking for 6-12 lines of concrete next
steps. Deferred GitHub log fetching to post-epic cleanup.
Implementation:
- New internal/api/webhook_llm.go holds enrichCIInstructions and its
helpers (readRecentCommits via `git log`, readProjectDoc).
- enrichCIInstructions is truly additive: any failure mode (no client,
HTTP error, empty body, 10s timeout) returns the original fallback
template unchanged. Existing webhook tests pass byte-for-byte.
- Always preserves a metadata header (repo/branch/SHA/check/URL)
ahead of the LLM body so investigators don't lose context if the
LLM is terse.
- Reuses s.llm (set via Server.SetLLM in Phase 2) — no new config
knob, no per-feature gating. Asymmetric opt-out (yes-elaborate,
no-CI-triage) deferred until there's actual demand.
Tests:
- enrichCIInstructions: nil client, LLM 500, empty body all return
fallback unchanged.
- enrichCIInstructions: success path produces enriched body with
metadata header preserved; user prompt contains repo/branch/SHA.
- enrichCIInstructions: real git repo (init + 2 commits) → recent
commits appear in user prompt.
- Webhook handler regression guard: no-LLM path produces the exact
legacy template substrings.
- Webhook handler with LLM stubbed: task instructions contain LLM
body + metadata header.
Plan: docs/plans/local-oss-runner.md.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Phase 2 of "local OSS models as agents" plan. Adds a third elaboration
path that calls the local OpenAI-compatible LLM via the internal/llm
client, and reorders dispatch so the cheap path is tried first:
local → claude → gemini, with each next attempt only on hard failure
of the prior.
Wiring is opt-out, not opt-in: when [local_model].endpoint is set,
elaboration prefers local by default. Users with a slow or low-quality
local model can disable just elaboration via:
[local_model]
endpoint = "..."
prefer_for_elaborate = false
without giving up the runner or the classifier path.
Implementation:
- Server gains an optional *llm.Client field via SetLLM (matches the
existing SetNotifier/SetWorkspaceRoot setter pattern, no NewServer
signature break).
- elaborateWithLocal() reuses buildElaboratePrompt verbatim and asks
for response_format=json_object so we skip markdown-fence cleanup.
- handleElaborateTask reorders try chain; existing Claude-first
behavior is preserved exactly when SetLLM is not called.
- LocalModel.UseForElaborate() encapsulates the default-true gating
with a *bool so explicit-false survives TOML parse.
Tests:
- elaborateWithLocal: parses valid response, errors on nil client,
errors on bad JSON.
- handler: local preferred when wired; falls back to claude when
local fails; unchanged behavior when no LLM is configured.
- config: UseForElaborate gating across empty/default/explicit-true/
explicit-false cases.
Pre-existing test failures noted in docs/plans/local-oss-runner.md
(post-epic cleanup): TestGeminiLogs_ParsedCorrectly returns 404 for
gemini execution log fetch — predates this change.
Plan: docs/plans/local-oss-runner.md.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Phase 1 of "local OSS models as agents" plan. Adds a third Runner
backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio,
llama.cpp), and migrates the Gemini-CLI classifier to route through
the same client when configured.
Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool,
no DB) used directly by the classifier and any future internal helper
that needs cheap reasoning. internal/executor.LocalRunner is a thin
adapter implementing Runner for user-facing tasks. This avoids
Pool reentrancy/deadlock when sub-second internal calls fire from
inside Pool.execute().
Highlights:
- internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter
into a shared package reused by executor and llm.
- internal/llm: Chat (non-streaming) and ChatStream (SSE) over
/chat/completions with optional bearer auth, json_object response
format, retry on 429/503, Retry-After parsing.
- internal/executor/LocalRunner: streams deltas into stdout.log in the
same stream-json envelope ClaudeRunner emits, then writes one
consolidated assistant block plus a result terminator so existing
parsers (extractSummary, ParseChangestatFromOutput) work unchanged.
- internal/executor/Classifier: gains optional LLM field; uses
json_object response format (no markdown-fence cleanup needed).
Falls back to Gemini-CLI subprocess when LLM is nil.
- Pool.skipClassification: now skips only when the requested agent
type is registered, so unknown types still reach the load balancer.
- Storage: additive tokens_in/tokens_out ALTERs on executions; CLI
runners record cost_usd as before, LocalRunner records 0 + tokens.
- Config: [local_model] section (endpoint, model, timeout_seconds,
default_temperature, api_key). Empty endpoint = no LocalRunner
registered, classifier falls back to Gemini.
Pre-existing test issues fixed in passing:
- claude_test.go setupSandbox callsites updated to current signature.
- gemini_test.go TestParseGeminiStream skipped (asserts unimplemented
GeminiRunner stream-error parsing; tracked separately).
Plan: docs/plans/local-oss-runner.md.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Adds POST /api/webhooks/github that receives check_run and workflow_run
events and creates a Claudomator task to investigate and fix the failure.
- Config: new webhook_secret and [[projects]] fields in config.toml
- HMAC-SHA256 validation when webhook_secret is configured
- Ignores non-failure events (success, skipped, etc.) with 204
- Matches repo name to configured project dirs (case-insensitive)
- Falls back to single project when no name match found
- 11 new tests covering all acceptance criteria
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Adds GET /api/tasks/{id}/deployment-status which checks whether the
currently-deployed server binary includes the fix commits from the
task's latest execution. Uses git merge-base --is-ancestor to compare
commit hashes against the running version.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- next-task script: exclude rejected tasks from fallback selection; only
pick PENDING tasks with no rejection comment and no prior executions,
or QUEUED tasks (e.g. BUDGET_EXCEEDED retries)
- web/app.js: prompt for optional rejection comment when rejecting a task,
passing it through to the API instead of always sending an empty string
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
elaboration fallback
|
|
- Add ElaborationInput field to Task struct (task.go)
- Add DB migration and update CREATE/SELECT/scan in storage/db.go
- Update handleCreateTask to accept elaboration_input from API
- Update renderSubtaskRollup in app.js to prefer elaboration_input over description
- Capture elaborate prompt in createTask() form submission
- Update subtask-placeholder tests to cover elaboration_input priority
- Fix missing io import in gemini.go
When a task card is waiting for subtasks, it now shows:
1. The raw user prompt from elaboration (if stored)
2. The task description truncated at word boundary (~120 chars)
3. The task name as fallback
4. 'Waiting for subtasks…' only when all fields are empty
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
When a BLOCKED/READY task has no subtasks yet, show the task description
(truncated to ~120 chars at a word boundary) instead of the generic
'Waiting for subtasks…' text. Falls back to task.name if no description,
and finally to the original generic text if neither is present.
- Add truncateToWordBoundary(text, maxLen=120) helper
- Update renderSubtaskRollup(task, footer) to use task object instead of taskId
- Update both READY and BLOCKED call sites
- Add web/test/subtask-placeholder.test.mjs with 11 tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- poll() now calls renderActiveTab(cache) on early-return so switching
tabs always renders immediately instead of leaving the panel blank
- renderRunningView unchanged check now requires running.length > 0,
fixing the empty-state message never appearing when no tasks run
- Extract renderActiveTab() to avoid duplicating the tab switch logic
- Throttle execution history fetch to once per 60s (was every poll)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
When the server restarts after all subtasks complete, the parent task
was left stuck in BLOCKED state because maybeUnblockParent only fires
during a live executor run. RecoverStaleBlocked() scans all BLOCKED
tasks on startup and re-evaluates them using the existing logic.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
updates
|
|
|
|
READY tasks now call renderSubtaskRollup identical to BLOCKED tasks
(without a question). The rollup appears above Accept/Reject buttons.
New test: web/test/ready-subtasks.test.mjs (10 assertions, all pass).
|
|
|
|
- Fix ephemeral sandbox deletion issue by passing $CLAUDOMATOR_PROJECT_DIR to agents and using it for subtask project_dir.
- Implement sandbox autocommit in teardown to prevent task failures from uncommitted work.
- Track git commits created during executions and persist them in the DB.
- Display git commits and changestats badges in the Web UI execution history.
- Add badge counts to Web UI tabs for Interrupted, Ready, and Running states.
- Improve scripts/next-task to handle QUEUED tasks and configurable DB path.
|
|
- Add computeTabBadgeCounts(tasks) exported pure function
- Add updateTabBadges(tasks) that updates badge spans in tab buttons
- Call updateTabBadges on every poll regardless of active tab
- Add .tab-count-badge spans to interrupted/ready/running tab buttons in HTML
- Add CSS for .tab-count-badge pill styling (hidden when count is zero)
- Add 11 tests in web/test/tab-badges.test.mjs covering all states
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Files changed: CLAUDE.md, internal/api/changestats.go,
internal/executor/executor.go, internal/executor/executor_test.go,
internal/task/changestats.go (new)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Move git diff stat parser to internal/task/changestats.go (shared)
- Add UpdateExecutionChangestats to executor.Store interface
- Extract changestats in Pool.handleRunResult after every execution
- Add three TDD tests: ExtractAndStore, NoChangestats, MalformedChangestats
- Update CLAUDE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add parseChangestatFromOutput/File helpers in internal/api/changestats.go
to parse git diff --stat summary lines from execution stdout logs
- Wire parser in processResult: after each execution completes, scan the
stdout log for git diff stats and persist via UpdateExecutionChangestats
- Tests: TestGetTask_IncludesChangestats (verifies processResult wiring),
TestListExecutions_IncludesChangestats (verifies storage round-trip)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add task.Changestats{FilesChanged, LinesAdded, LinesRemoved}
- Add changestats_json column to executions via additive migration
- Add Changestats field to storage.Execution struct
- Add UpdateExecutionChangestats(execID, *task.Changestats) method
- Update all SELECT/INSERT/scan paths for executions
- Test: TestExecution_StoreAndRetrieveChangestats (was red, now green)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
sandboxes
#1 - Diagnostics: tailFile() reads last 20 lines of subprocess stderr and
appends to error message when claude/gemini exits non-zero. Previously all
exit-1 failures were opaque; now the error_msg carries the actual subprocess
output.
#4 - Restart recovery: RecoverStaleRunning() now re-queues tasks after
marking them FAILED, so tasks killed by a server restart automatically
retry on the next boot rather than staying permanently FAILED.
#2 - Stale sandbox: If a resume execution's preserved SandboxDir no longer
exists (e.g. /tmp purge after reboot), clone a fresh sandbox instead of
failing immediately with "no such file or directory".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Sandbox setup runs git commands against project_dir which may be owned
by a different OS user, triggering git's 'dubious ownership' error.
Fix by passing -c safe.directory=* on all git commands that touch
project directories. Also add wildcard to global config for immediate
effect on the running server.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add UpdateTaskAgent to Store interface and DB implementation
- Call UpdateTaskAgent in Pool.execute to persist assigned agent/model
to database before the runner starts
- Update runTask in app.js to pass selected agent as query param
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
|
|
- Added an agent selector (Auto, Claude, Gemini) to the Start Next Task button.
- Updated the backend to pass query parameters as environment variables to scripts.
- Modified the executor pool to skip classification when a specific agent is requested.
- Added --agent flag to claudomator start command.
- Updated tests to cover the new functionality.
|
|
When a task is BLOCKED due to spawned subtasks (no question), the card
footer now fetches and renders a list of subtask names with their state
emoji instead of showing the question/answer input UI. The Cancel button
remains in both cases.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Two fixes for BLOCKED task issues:
1. Allow BLOCKED → CANCELLED state transition so users can cancel tasks
stuck waiting for input. Adds Cancel button to BLOCKED task cards in
the UI alongside the question/answer controls.
2. Detect when agents write completion reports to $CLAUDOMATOR_QUESTION_FILE
instead of real questions. If the question JSON has no options and no "?"
in the text, treat it as a summary (stored on the execution) and fall
through to normal completion + sandbox teardown rather than blocking.
Also tightened the preamble to make the distinction explicit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
The summary+interactions feature was already fully implemented but
lacked storage-layer tests. Added tests covering round-trip persistence
of task summaries, accumulation of Q&A interactions, and error handling
for nonexistent tasks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
All, Stats, Settings)
- Replace Tasks/Active tabs with Queue (QUEUED+PENDING), Interrupted, Ready top-level tabs
- Add All tab (COMPLETED, TIMED_OUT, BUDGET_EXCEEDED within last 24h) and Settings placeholder
- Export filterQueueTasks, filterReadyTasks, filterAllDoneTasks from app.js
- Refactor poll() to dispatch to active tab's render function instead of always rendering all panels
- Add renderQueuePanel, renderInterruptedPanel, renderReadyPanel, renderAllPanel helpers
- Add tests in web/test/tab-filters.test.mjs covering all new filter functions (16 tests)
- All 165 JS tests and all Go tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Previously appendRawNarrative was called with the server's default workDir
(os.Getwd()) when no project_dir was in the request, causing test runs and
any elaboration without a project to pollute the repo's own RAW_NARRATIVE.md.
The narrative is per-project human input — only write it when the caller
explicitly specifies which project they're working in.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
|
|
GeminiRunner.buildArgs was missing --yolo (auto-approve all tools)
so the gemini CLI only registered 3 tools (read_file, write_todos,
cli_help) and write_file was not available. Agents that needed to
create files silently failed (exit 0, no files written).
Also switch instructions from bare positional arg to -p flag, which
is required for non-interactive headless mode.
Update preamble tests to match file-based summary approach
(CLAUDOMATOR_SUMMARY_FILE) kept from the merge conflict resolution.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
|
|
Removed the switchTab() logic that hid btn-new-task on non-tasks tabs.
The button lives in the global header so no structural changes were needed.
Added new-task-button.test.mjs to contract-test the always-visible behavior.
|
|
Keep file-based summary approach (CLAUDOMATOR_SUMMARY_FILE) from HEAD.
Combine Q&A History and Stats tab CSS from both branches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Add Pool.RecoverStaleQueued() that lists all QUEUED tasks from the DB on
startup and re-submits them to the in-memory pool. Previously, tasks that
were QUEUED when the server restarted would remain stuck indefinitely since
only RUNNING tasks were recovered (and marked FAILED).
Called in serve.go immediately after RecoverStaleRunning().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
practices
Add sanitizeElaboratedTask() called after every elaboration response:
- Infers missing allowed_tools from instruction keywords (Write/Edit/Read/Bash/Grep/Glob)
- Auto-adds Read when Edit is present
- Appends Acceptance Criteria section if none present
- Appends TDD reminder for coding tasks without test mention
Also tighten buildElaboratePrompt to require acceptance criteria and
list concrete tool examples, reducing how often the model omits tools.
Fixes class of failures where agents couldn't create files because
the elaborator omitted Write from allowed_tools.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Extend Resume to CANCELLED, FAILED, and BUDGET_EXCEEDED tasks
- Add summary extraction from agent stdout stream-json output
- Fix storage: persist stdout/stderr/artifact_dir paths in UpdateExecution
- Clear question_json on ResetTaskForRetry
- Resume BLOCKED tasks in preserved sandbox so Claude finds its session
- Add planning preamble: CLAUDOMATOR_SUMMARY_FILE env var + summary step
- Update ADR-002 with new state transitions
- UI style improvements
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Interrupted tasks (CANCELLED, FAILED, BUDGET_EXCEEDED) now support session
resume in addition to restart. Both buttons are shown on the task card.
- executor: extend resumablePoolStates to include CANCELLED, FAILED, BUDGET_EXCEEDED
- api: extend handleResumeTimedOutTask to accept all resumable states with
state-specific resume messages; replace hard-coded TIMED_OUT check with a
resumableStates map
- web: add RESUME_STATES set; render Resume + Restart buttons for interrupted
states; TIMED_OUT keeps Resume only
- tests: 5 new Go tests (TestResumeInterrupted_*); updated task-actions.test.mjs
with 17 tests covering dual-button behaviour
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|