| Age | Commit message | Author |
|
Merges 12 commits from github/main (formerly master) that were developed
independently. Key additions:
- LocalRunner: OpenAI-compatible local LLM execution (Ollama, LM Studio)
- Real GeminiRunner with full sandbox parity to ClaudeRunner
- llm.Client for enriching CI failures and elaboration via local model
- retry.ParseRetryAfter moved to shared package
- tokens_in/tokens_out columns in executions table
Conflict resolutions:
- Kept local main's VAPID/push, stories, projects, agent events schema
- Merged both sets of Config fields (local + LocalModel from github/main)
- Unified activePerAgent accounting (decActiveAgent helper)
- Removed duplicate helpers from claude.go (now in helpers.go)
- Fixed double-decrement bug in handleRunResult vs decActiveAgent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
All coding tasks now follow the same flow regardless of runner: when
project_dir is set, the agent runs in a temp clone, not in the user's
working tree. On success, edits are autocommitted and pushed back to
origin/master and the sandbox is removed. On failure or BLOCKED, the
sandbox is preserved and its path surfaces in the error / BlockedError
so the user can inspect partial work or resume in place.
Before this commit, GeminiRunner.Run set cmd.Dir to project_dir
directly, so an agent run could leave half-done edits in the user's
working tree with no rollback. ClaudeRunner has had the full sandbox
flow for a while; this commit closes the gap.
Reused the existing package-level helpers from claude.go verbatim:
setupSandbox, teardownSandbox, sandboxCloneSource, gitSafe, plus the
resume/stale-sandbox/blocked-error patterns. No new shared abstraction
needed — same package.
LocalRunner intentionally not changed. The OpenAI chat path has no
tool use, so the agent can't edit files; sandbox would be theater.
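A minimal sketch of the sandbox lifecycle described above, assuming a plain temp directory in place of the real git clone. runInSandbox, errAgent and dirExists are hypothetical names for illustration; the actual helpers are setupSandbox and teardownSandbox, whose signatures are not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errAgent stands in for a failing agent run (illustrative only).
var errAgent = errors.New("agent exited 1")

// dirExists reports whether path is an existing directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

// runInSandbox is a hypothetical helper. It creates a temp dir, runs the
// agent callback inside it, removes the dir on success, and preserves it on
// failure so partial work can be inspected or resumed.
func runInSandbox(run func(dir string) error) (string, error) {
	dir, err := os.MkdirTemp("", "sandbox-*")
	if err != nil {
		return "", fmt.Errorf("create sandbox: %w", err)
	}
	if err := run(dir); err != nil {
		// Failure: keep the directory and surface its path in the error.
		return dir, fmt.Errorf("%w (sandbox preserved at %s)", err, dir)
	}
	// Success: the real flow would autocommit and push before this point.
	if err := os.RemoveAll(dir); err != nil {
		return dir, fmt.Errorf("remove sandbox: %w", err)
	}
	return "", nil
}

func main() {
	_, err := runInSandbox(func(dir string) error { return nil })
	fmt.Println("success:", err)

	kept, err := runInSandbox(func(dir string) error { return errAgent })
	fmt.Println("failure:", err)
	fmt.Println("preserved dir still exists:", dirExists(kept))
	_ = os.RemoveAll(kept) // tidy up after the demo
}
```

Returning the preserved path alongside the wrapped error lets a caller store it for a later resume without parsing the message.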
Tests (6 new):
- Run_ProjectDir_RunsInSandbox: cwd captured by fake binary is a
sandbox path, not project_dir.
- Run_BlockedError_IncludesSandboxDir: when question.json appears,
BlockedError.SandboxDir is set and the dir exists.
- Run_ExecError_PreservesSandbox: failing exit wraps error with
"(sandbox preserved at <path>)" and the path exists on disk.
- Run_ResumeUsesStoredSandboxDir: ResumeSessionID + SandboxDir →
runs in that dir without re-cloning.
- Run_StaleSandboxDir_ClonesAfresh: resume pointing at missing
dir falls back to a fresh clone from project_dir.
- Run_NoProjectDir_SkipsSandbox: tasks without project_dir don't
trigger sandbox setup.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Adds TestTeardownSandbox_CapturesExplicitCommits to cover the case
where the agent explicitly commits changes (no autocommit needed).
Previously only the autocommit path was tested; this confirms
teardownSandbox populates Commits for any commits ahead of origin.
https://claude.ai/code/session_01G4dT9JBWFFb8xGcSHenzRS
|
|
Add CreateExecutionAndSetRunning to storage.DB and Store interface,
replacing the two sequential CreateExecution/UpdateTaskState calls in
executor.go. Eliminates the crash window where a task stays PENDING
with an orphaned RUNNING execution record.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Three related bugs fixed:
1. maybeUnblockParent: guard against promoting QUEUED leaf tasks (no
subtasks) to READY. With zero subtasks the 'all subtasks done' check is
vacuously true, so tasks that had stalled in QUEUED after a SQLite lock
error were promoted to READY by RecoverStaleBlocked on server restart,
even though they had only failed executions and no commits.
2. checkStoryCompletion: require COMPLETED (not just READY) for all
top-level tasks before advancing a story to SHIPPABLE. READY means
the checker agent is still pending or the task awaits human review;
a story with READY tasks is not ready to ship.
3. handleAcceptTask: call CheckStoryCompletion after a task is accepted,
so a story whose parent task is manually accepted after all of its
subtasks are done can auto-advance to SHIPPABLE.
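The vacuous-truth problem behind fix 1 can be sketched as follows. The Task type and both function names are hypothetical simplifications, not the project's real definitions.

```go
package main

import "fmt"

// Task is a simplified stand-in for the real task type.
type Task struct {
	State    string
	Subtasks []Task
}

// allSubtasksCompleted is true when every subtask is COMPLETED.
// With zero subtasks the loop never runs, so it is vacuously true.
func allSubtasksCompleted(t Task) bool {
	for _, s := range t.Subtasks {
		if s.State != "COMPLETED" {
			return false
		}
	}
	return true
}

// shouldPromoteQueuedParent adds the guard: a QUEUED task with no subtasks
// is a leaf and must not be promoted on the strength of the empty check.
func shouldPromoteQueuedParent(t Task) bool {
	if t.State != "QUEUED" || len(t.Subtasks) == 0 {
		return false
	}
	return allSubtasksCompleted(t)
}

func main() {
	leaf := Task{State: "QUEUED"}
	parent := Task{State: "QUEUED", Subtasks: []Task{{State: "COMPLETED"}}}

	fmt.Println("unguarded check on leaf:", allSubtasksCompleted(leaf))
	fmt.Println("guarded check on leaf:", shouldPromoteQueuedParent(leaf))
	fmt.Println("guarded check on parent:", shouldPromoteQueuedParent(parent))
}
```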
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Closes the three items left on the deferred queue after the post-epic
cleanup.
GeminiRunner.execOnce now actually executes the gemini binary instead
of writing hardcoded stream data. Mirrors ClaudeRunner.execOnce:
- exec.CommandContext with the same env vars (CLAUDOMATOR_API_URL etc.)
- process group SIGKILL on context cancel
- stdout piped through parseGeminiStream → stdoutFile
- stderr to file
- exit codes captured, stderr tail surfaced on failure
Test infrastructure bug uncovered in passing: testServerWithGeminiMockRunner's
mock script used double-quoted echo with literal triple-backticks, which
bash interpreted as command substitution. The script always produced
empty output. The bug was invisible until now because GeminiRunner
ignored the script entirely. Switched to a single-quoted heredoc.
Frontend: index.html dropdown gains a "Local" option. No JS branching
needed — the value flows through to agent.type verbatim and downstream
display reads the type string as-is.
storage/db.go: removed stale debug-comment scaffolding (the "TODO:
Replace with proper logger" block) that was tracking a dead
`fmt.Printf` call. The path it commented on is fine without logging —
unmarshal errors are returned wrapped.
Test status: `go test -race ./...` green across every package, zero
skips, zero excluded tests.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Addresses the cleanup queue captured in docs/plans/local-oss-runner.md
after the local-OSS-models epic landed. After this commit
`go test -race ./...` is green across every package with zero `t.Skip`
calls and no excluded tests.
Real bugs fixed:
- claude.go setupSandbox callsites used `sandboxDir, err := ...` which
shadowed the outer variable, so BlockedError.SandboxDir was always
empty. Resume-after-block was broken for both new and stale-sandbox
paths. TestBlockedError_IncludesSandboxDir now exercises the right
invariant.
- TestPool_ActivePerAgent_DeletesZeroEntries flake under -race: the
cleanup defer in execute()/executeResume() runs AFTER
handleRunResult sends on resultCh, so consumers observing a result
could see a still-counted activePerAgent entry. Extracted
decActiveAgent(agentType, *cleaned) helper; called explicitly before
every resultCh send, defer becomes a no-op via the cleaned flag.
Verified clean over `go test -race -count=10`.
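A sketch of the release-before-send pattern behind the flake fix. Only the decActiveAgent(agentType, *cleaned) shape comes from this log; the pool type, its fields and the execute body are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a reduced stand-in for the real Pool.
type pool struct {
	mu             sync.Mutex
	activePerAgent map[string]int
}

func newPool() *pool { return &pool{activePerAgent: map[string]int{}} }

// decActiveAgent releases the per-agent slot at most once per execution.
// The cleaned flag turns every later call (such as the defer) into a no-op.
func (p *pool) decActiveAgent(agentType string, cleaned *bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if *cleaned {
		return
	}
	*cleaned = true
	p.activePerAgent[agentType]--
	if p.activePerAgent[agentType] <= 0 {
		delete(p.activePerAgent, agentType) // drop zero entries from the map
	}
}

// active returns the current in-flight count for an agent type.
func (p *pool) active(agentType string) int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.activePerAgent[agentType]
}

func (p *pool) execute(agentType string, resultCh chan<- string) {
	p.mu.Lock()
	p.activePerAgent[agentType]++
	p.mu.Unlock()

	cleaned := false
	defer p.decActiveAgent(agentType, &cleaned) // safety net for early returns

	// ... the agent would run here ...

	p.decActiveAgent(agentType, &cleaned) // release BEFORE publishing the result
	resultCh <- agentType + ": done"
}

func main() {
	p := newPool()
	resultCh := make(chan string)
	go p.execute("claude", resultCh)
	fmt.Println(<-resultCh)
	fmt.Println("active after result:", p.active("claude"))
}
```

Because the decrement happens before the channel send, any consumer that has received the result is guaranteed to observe the cleaned-up map.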
Test infrastructure made hermetic:
- gitSafe now also passes -c commit.gpgsign=false / -c tag.gpgsign=false
so sandbox tests pass on hosts whose global config requires signing.
- Bare repos in tests initialized with `-b main` (HEAD symbolic ref
matched to the branch we push) so `git log` after push works.
- TestSandboxCloneSource_FallsBackToOrigin uses a local-FS origin URL,
matching sandboxCloneSource's intentional filter against network URLs.
- TestGeminiLogs_ParsedCorrectly URL fixed to the actual log route
(/api/executions/{id}/log).
GeminiRunner gap closed (partial):
- parseGeminiStream now walks lines for `result` events, surfacing
is_error as an error and total_cost_usd as the float return value.
- GeminiRunner.Run propagates parsed cost to Execution.CostUSD.
- TestParseGeminiStream_ParsesStructuredOutput unskipped.
Notes:
- GeminiRunner is still simulated end-to-end (Run writes hardcoded
stream data instead of execing the binary). The result/cost parser
now exists; finishing the runner is a smaller, contained follow-up.
Kept on the deferred queue.
- Frontend "Local" agent option and a minor storage/db.go logger TODO
remain on the deferred queue, both intentionally — neither blocks
anything in flight.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Phase 4 of "local OSS models as agents" plan. Closes the epic.
When an execution finishes and the agent did NOT write a "## Summary"
heading in its stdout (so the existing extractSummary path returns
empty), and the Pool has a local LLM configured, we now synthesize a
2-4 sentence summary from the assistant text content of the log tail.
Behavior:
- Primary path unchanged: if the agent wrote "## Summary", that wins
byte-for-byte (TestPool_HandleRunResult_ExtractSummaryWins guards).
- Fallback path: empty extractSummary + Pool.LLM != nil → synthesize.
- All-empty path: when no LLM is configured, summary stays empty —
identical to pre-Phase-4 behavior.
Implementation:
- Pool gains an LLM *llm.Client field, wired in serve.go and run.go
alongside Classifier.LLM (same localClient used everywhere).
- New synthesizeSummary in internal/executor/summary.go:
* 6s timeout so a slow local model can't stall finalization
* 16 KB tail cap on the stdout log
* readAssistantTextTail seeks to the last 16 KB and skips the
first (likely partial) line, parses each line as a stream-json
event, joins assistant `text` blocks (skips system/result/etc).
* Returns "" on any error so the caller's behavior never regresses.
- handleRunResult: 3-tier summary resolution — exec.Summary set by
runner → extractSummary → synthesizeSummary → empty.
- minimalMockStore now records UpdateTaskSummary calls (additive;
existing tests unaffected) so integration tests can assert.
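A rough sketch of the tail-reading step described for readAssistantTextTail. The event struct is an assumed subset of the stream-json envelope, and assistantTextTail and tailFromString are hypothetical names operating on an io.ReadSeeker instead of a file path.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// event models only the fields this sketch needs; the shape is an assumption.
type event struct {
	Type    string `json:"type"`
	Message struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	} `json:"message"`
}

// assistantTextTail reads at most the last limit bytes of r, drops the first
// (likely partial) line when it had to seek, and joins assistant text blocks.
func assistantTextTail(r io.ReadSeeker, size, limit int64) (string, error) {
	skipFirst := size > limit
	if skipFirst {
		if _, err := r.Seek(-limit, io.SeekEnd); err != nil {
			return "", err
		}
	}
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	var parts []string
	for sc.Scan() {
		if skipFirst {
			skipFirst = false
			continue
		}
		var ev event
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil || ev.Type != "assistant" {
			continue // skip unparsable lines and system/result/other events
		}
		for _, c := range ev.Message.Content {
			if c.Type == "text" && c.Text != "" {
				parts = append(parts, c.Text)
			}
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return strings.Join(parts, "\n"), nil
}

// tailFromString is a convenience wrapper for in-memory logs.
func tailFromString(log string, limit int64) (string, error) {
	return assistantTextTail(strings.NewReader(log), int64(len(log)), limit)
}

func main() {
	log := `{"type":"system","subtype":"init"}` + "\n" +
		`{"type":"assistant","message":{"content":[{"type":"text","text":"Refactored the parser."}]}}` + "\n" +
		`{"type":"result","is_error":false}` + "\n"
	text, err := tailFromString(log, 16*1024)
	fmt.Println(text, err)
}
```

One consequence of always skipping the first line after a seek is that a complete line is lost when the cut happens to fall exactly on a line boundary; for a summary prompt that trade-off seems acceptable.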
Tests (9 new):
- synthesizeSummary nil client / empty path / missing file all
return "" without HTTP calls.
- empty assistant content short-circuits without LLM call.
- success path returns trimmed body, with both assistant texts in
the user prompt.
- LLM 500 returns "" (caller handles same as no-summary).
- readAssistantTextTail seeks past early content in a large file.
- Pool integration: ## Summary present → LLM not called, agent text
used. ## Summary absent + LLM set → LLM called, synthesized summary
recorded against the right task ID.
Plan: docs/plans/local-oss-runner.md.
Epic complete. Post-epic deep cleanup queue captured in the same plan
file for follow-up.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Phase 1 of "local OSS models as agents" plan. Adds a third Runner
backed by any OpenAI-compatible HTTP server (Ollama, vLLM, LM Studio,
llama.cpp), and migrates the Gemini-CLI classifier to route through
the same client when configured.
Two-layer split: internal/llm.Client is the workhorse (HTTP, no Pool,
no DB) used directly by the classifier and any future internal helper
that needs cheap reasoning. internal/executor.LocalRunner is a thin
adapter implementing Runner for user-facing tasks. This avoids
Pool reentrancy/deadlock when sub-second internal calls fire from
inside Pool.execute().
Highlights:
- internal/retry: relocated runWithBackoff/IsRateLimitError/ParseRetryAfter
into a shared package reused by executor and llm.
- internal/llm: Chat (non-streaming) and ChatStream (SSE) over
/chat/completions with optional bearer auth, json_object response
format, retry on 429/503, Retry-After parsing.
- internal/executor/LocalRunner: streams deltas into stdout.log in the
same stream-json envelope ClaudeRunner emits, then writes one
consolidated assistant block plus a result terminator so existing
parsers (extractSummary, ParseChangestatFromOutput) work unchanged.
- internal/executor/Classifier: gains optional LLM field; uses
json_object response format (no markdown-fence cleanup needed).
Falls back to Gemini-CLI subprocess when LLM is nil.
- Pool.skipClassification: now skips only when the requested agent
type is registered, so unknown types still reach the load balancer.
- Storage: additive tokens_in/tokens_out ALTERs on executions; CLI
runners record cost_usd as before, LocalRunner records 0 + tokens.
- Config: [local_model] section (endpoint, model, timeout_seconds,
default_temperature, api_key). Empty endpoint = no LocalRunner
registered, classifier falls back to Gemini.
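The Retry-After handling mentioned under internal/llm can be sketched like this. The header carries either delay-seconds or an HTTP-date; parseRetryAfter and fixedNow here are hypothetical, and the real retry.ParseRetryAfter signature is not shown in this log.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// fixedNow returns a fixed reference time so the example is deterministic.
func fixedNow() time.Time {
	return time.Date(2015, time.October, 21, 7, 28, 0, 0, time.UTC)
}

// parseRetryAfter interprets a Retry-After header value relative to now.
// It accepts delay-seconds ("120") or an HTTP-date and reports ok=false
// when the value is empty, negative, or unparsable.
func parseRetryAfter(value string, now time.Time) (time.Duration, bool) {
	value = strings.TrimSpace(value)
	if value == "" {
		return 0, false
	}
	if secs, err := strconv.Atoi(value); err == nil {
		if secs < 0 {
			return 0, false
		}
		return time.Duration(secs) * time.Second, true
	}
	if when, err := http.ParseTime(value); err == nil {
		if d := when.Sub(now); d > 0 {
			return d, true
		}
		return 0, true // date already passed: retry immediately
	}
	return 0, false
}

func main() {
	now := fixedNow()
	for _, v := range []string{"120", "Wed, 21 Oct 2015 07:28:30 GMT", "soon"} {
		d, ok := parseRetryAfter(v, now)
		fmt.Printf("%q -> %v (ok=%v)\n", v, d, ok)
	}
}
```

Passing now as a parameter keeps the function testable; a caller in a retry loop would pass time.Now().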
Pre-existing test issues fixed in passing:
- claude_test.go setupSandbox callsites updated to current signature.
- gemini_test.go TestParseGeminiStream skipped (asserts unimplemented
GeminiRunner stream-error parsing; tracked separately).
Plan: docs/plans/local-oss-runner.md.
https://claude.ai/code/session_017Edeq947TpSm1vQTxMhi1J
|
|
Fix 1: Remove QuestionRegistry and related types (QuestionHandler, PendingQuestion)
from question.go -- nothing reads Pool.Questions or uses the registry. Remove
NewQuestionRegistry() call from NewPool and the Questions field from Pool.
Remove the now-superfluous registry tests; keep stream/parse helpers which are
still used by the claude runner.
Fix 2: Check scanner.Err() after the parseStream loop so I/O errors from the
scanner are not silently swallowed when streamErr is still nil.
Fix 3: Delete internal/api/changestats.go -- the parseChangestatFromFile and
parseChangestatFromOutput wrappers were only needed to support processResult(),
which no longer calls them; they are unreachable dead code.
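Fix 2 relies on a bufio.Scanner property: Scan returns false both at EOF and on a read error, and only Err distinguishes the two. A minimal sketch, where newStream, errBroken and countLines are illustrative names and not part of the project:

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errBroken simulates a pipe that fails mid-stream.
var errBroken = errors.New("pipe broke")

// failingReader yields its data, then returns err instead of io.EOF
// (when err is non-nil).
type failingReader struct {
	data io.Reader
	err  error
}

func (r *failingReader) Read(p []byte) (int, error) {
	n, err := r.data.Read(p)
	if err == io.EOF && r.err != nil {
		return n, r.err
	}
	return n, err
}

// newStream builds a reader over s that ends with err, or cleanly if err is nil.
func newStream(s string, err error) io.Reader {
	return &failingReader{data: strings.NewReader(s), err: err}
}

// countLines scans r line by line and checks sc.Err() after the loop, so an
// I/O failure is reported instead of looking like a normal end of stream.
func countLines(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	n := 0
	for sc.Scan() {
		n++
	}
	if err := sc.Err(); err != nil {
		return n, fmt.Errorf("scan stream: %w", err)
	}
	return n, nil
}

func main() {
	n, err := countLines(newStream("a\nb\n", nil))
	fmt.Println(n, err)
	n, err = countLines(newStream("a\nb\n", errBroken))
	fmt.Println(n, err)
}
```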
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
cancellation killing deploy
|
|
auto-deploy
- checkStoryCompletion now guards against re-running on already-SHIPPABLE stories
and no longer auto-triggers triggerStoryDeploy on completion
- New Pool.ShipStory method validates SHIPPABLE state then fires triggerStoryDeploy
- POST /api/stories/{id}/ship route registered and handleShipStory handler added
- Two new tests: 202 for SHIPPABLE story, 409 for non-SHIPPABLE story
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
failure states
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
consecutive failures
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
When a story is approved with pre-created subtasks, parent tasks are
QUEUED but never run. Their subtasks complete, but:
- maybeUnblockParent only handled BLOCKED parents, not QUEUED ones
- checkStoryCompletion required ALL tasks (incl. subtasks) to be done
Fixes:
- maybeUnblockParent now also promotes QUEUED parents to READY when all
subtasks are COMPLETED
- checkStoryCompletion only checks top-level tasks (parent_task_id="")
- RecoverStaleBlocked now also scans QUEUED parents on startup and
triggers checkStoryCompletion if it promotes them
- Add QUEUED→READY to valid state transitions (subtask delegation path)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add workerWg to Pool; Shutdown() closes workCh and waits for all
in-flight execute/executeResume goroutines to finish
- Signal handler now shuts down HTTP first, then drains the pool
- ShutdownTimeout config field (toml: shutdown_timeout); default 3m
- Tests: WaitsForWorkers and TimesOut
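A sketch of the drain-with-timeout shape this commit describes. Only workerWg, workCh and the Shutdown name come from the log; the job type, NewPool, the error value and the two duration constants are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Illustrative timeouts; the real default is the configured shutdown_timeout.
const (
	quick = 10 * time.Millisecond
	slow  = 2 * time.Second
)

// ErrShutdownTimeout is a hypothetical sentinel for the timed-out case.
var ErrShutdownTimeout = errors.New("shutdown timed out with workers still running")

// Pool is a reduced stand-in: workers pull jobs from workCh until it closes.
type Pool struct {
	workCh   chan func()
	workerWg sync.WaitGroup
}

func NewPool(workers int) *Pool {
	p := &Pool{workCh: make(chan func())}
	for i := 0; i < workers; i++ {
		p.workerWg.Add(1)
		go func() {
			defer p.workerWg.Done()
			for job := range p.workCh {
				job()
			}
		}()
	}
	return p
}

// Shutdown stops accepting work and waits for in-flight jobs, giving up
// after timeout. It must be called at most once.
func (p *Pool) Shutdown(timeout time.Duration) error {
	close(p.workCh)
	done := make(chan struct{})
	go func() {
		p.workerWg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return nil
	case <-time.After(timeout):
		return ErrShutdownTimeout
	}
}

func main() {
	p := NewPool(2)
	p.workCh <- func() { time.Sleep(20 * time.Millisecond) }
	fmt.Println("shutdown:", p.Shutdown(slow))
}
```

Waiting on the WaitGroup in a helper goroutine is what makes the timeout possible, since WaitGroup.Wait itself cannot be cancelled.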
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Add ensureStoryBranch() that runs git ls-remote to check, then clones
into a temp dir to create and push the branch if missing. Called before
the task's own clone so checkout is guaranteed to succeed.
Removes the post-checkout fallback hack added in the previous commit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
If git checkout of the story branch fails (branch never pushed to bare
repo), create it from HEAD and push to origin instead of hard-failing
the task.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
QUEUED→FAILED is not a valid state transition. When a dependency enters a
terminal failure state, cancel the waiting task instead.
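A hedged sketch of the transition rule. The table below is hypothetical and lists only what this log implies: QUEUED→READY is valid, QUEUED→FAILED is not, and a waiting task is cancelled instead. The CANCELLED state name and all identifiers are assumptions.

```go
package main

import "fmt"

type State string

const (
	Queued    State = "QUEUED"
	Ready     State = "READY"
	Failed    State = "FAILED"
	Cancelled State = "CANCELLED" // assumed name for the cancel target
)

// validTransitions is a deliberately partial, illustrative table.
var validTransitions = map[State][]State{
	Queued: {Ready, Cancelled},
}

// canTransition reports whether from -> to appears in the table.
func canTransition(from, to State) bool {
	for _, s := range validTransitions[from] {
		if s == to {
			return true
		}
	}
	return false
}

// onDependencyFailed picks the state for a task whose dependency failed:
// it cancels the task rather than attempting the invalid move to FAILED.
func onDependencyFailed(current State) (State, error) {
	if !canTransition(current, Cancelled) {
		return current, fmt.Errorf("invalid transition %s -> %s", current, Cancelled)
	}
	return Cancelled, nil
}

func main() {
	fmt.Println("QUEUED -> FAILED allowed:", canTransition(Queued, Failed))
	next, err := onDependencyFailed(Queued)
	fmt.Println("dependency failed, next state:", next, err)
}
```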
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
createStoryBranch was pushing to 'origin' which doesn't exist — branches
never landed in the bare repo so agents couldn't clone them. Now uses
the project's RemoteURL (bare repo path) directly for fetch and push.
Raise drain threshold from 2 to 3 consecutive failures to reduce false
positives from transient errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
With maxPerAgent=1, tasks with DependsOn were entering waitForDependencies
while holding the per-agent slot, preventing the dependency from ever running.
Fix: check deps before taking the slot. If not ready, requeue without holding
activePerAgent. Also accept StateReady (leaf tasks) as a satisfied dependency,
not just StateCompleted.
Add startedCh to pool and broadcast task_started WebSocket event when a task
transitions to RUNNING, so the UI immediately shows the running state during
the clone phase instead of waiting for completion.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
AgentStatusInfo was missing the drained field, so the UI couldn't show the
drain lock. AgentEvent had no JSON tags, so ev.agent/event/timestamp were
undefined in the stats timeline. The UI now shows a "Drain locked" card
state with an undrain CTA.
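Why missing JSON tags broke the timeline: encoding/json uses the Go field name as the key when no tag is present, so the frontend received Agent, not agent. In this sketch the field types and sample values are assumptions; only the three key names come from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// untaggedEvent mirrors a struct with no JSON tags.
type untaggedEvent struct {
	Agent     string
	Event     string
	Timestamp string
}

// taggedEvent adds tags so keys match what the UI reads (ev.agent etc.).
type taggedEvent struct {
	Agent     string `json:"agent"`
	Event     string `json:"event"`
	Timestamp string `json:"timestamp"`
}

// marshal returns the JSON encoding of v, or "" on error.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Sample values are made up for the demonstration.
	fmt.Println(marshal(untaggedEvent{"claude", "rate_limited", "t0"}))
	fmt.Println(marshal(taggedEvent{"claude", "rate_limited", "t0"}))
}
```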
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- executor.go: merge story branch to main before deploy
- container.go: error messages reference git push origin main
- api/stories.go: create story branch from origin/main (drop master fallback)
- executor_test.go: test setup uses main branch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- triggerStoryDeploy: fetch/checkout/merge --no-ff/push before running deploy script (ADR-007)
- executor_test: TestPool_StoryDeploy_MergesStoryBranch proves merge happens
- seed.go: add doot project with deploy script; wire claudomator deploy script
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
|
|
Add checkValidationResult which inspects the final task.State of a
completed validation task and updates the story to REVIEW_READY (pass)
or NEEDS_FIX (fail). Wire into handleRunResult so stories in
VALIDATING state are dispatched to checkValidationResult instead of
checkStoryCompletion, covering both success and FAILED terminal paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
|
|
Add triggerStoryDeploy to Pool: fetches story's project, runs its
DeployScript via exec.CommandContext, and advances story to DEPLOYED on
success. Wire into checkStoryCompletion with go p.triggerStoryDeploy
after the SHIPPABLE transition. Covered by TestPool_StoryDeploy_RunsDeployScript.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Agent added: Store on ContainerRunner (direct story/project lookup), --reference
clone for speed, explicit story branch push, checkStoryCompletion → SHIPPABLE.
My additions: BranchName on Task as fallback when Store is nil, tests updated
to match checkout-after-clone approach.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add BranchName field to task.Task (populated from story at execution time)
- Add GetStory to executor Store interface; resolve BranchName from story in both
execute() and executeResume() parallel to RepositoryURL resolution
- Pass --branch <name> to git clone when BranchName is set; default clone otherwise
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
check, deployment status
- ContainerRunner: add Store field; clone with --reference when story has a
local project path; checkout story branch after clone; push to story branch
instead of HEAD
- executor.Store interface: add GetStory, ListTasksByStory, UpdateStoryStatus
- Pool.handleRunResult: trigger checkStoryCompletion when a story task succeeds
- Pool.checkStoryCompletion: transitions story to SHIPPABLE when all tasks done
- serve.go: wire Store into each ContainerRunner
- stories.go: update createStoryBranch to fetch+checkout from origin/master base;
add GET /api/stories/{id}/deployment-status endpoint
- server.go: register deployment-status route
- Tests: TestPool_CheckStoryCompletion_AllComplete/PartialComplete,
TestHandleStoryDeploymentStatus
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- Add GetProject to Store interface used by executor
- Resolve RepositoryURL from project registry when task.RepositoryURL is empty
- Call SeedProjects at server startup so the project registry is populated
- Add GetProject stub to minimalMockStore in executor tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
TestPool_MaxPerAgent_BlocksSecondTask
|
|
creds, auth recovery
- maxPerAgent=1: only 1 in-flight execution per agent type at a time; excess tasks are requeued after 30s
- Drain gate: after 2 consecutive failures the agent is drained and a question is set on the task; reset on first success; POST /api/pool/agents/{agent}/undrain to acknowledge
- Pre-flight credential check: verify .credentials.json and .claude.json exist in agentHome before spinning up a container
- Auth error auto-recovery: detect auth errors (Not logged in, OAuth token has expired, etc.) and retry once after running sync-credentials and re-copying fresh credentials
- Extracted runContainer() helper from ContainerRunner.Run() to support the retry flow
- Wire CredentialSyncCmd in serve.go for all three ContainerRunner instances
- Tests: TestPool_MaxPerAgent_*, TestPool_ConsecutiveFailures_*, TestPool_Undrain_*, TestContainerRunner_Missing{Credentials,Settings}_FailsFast, TestIsAuthError_*, TestContainerRunner_AuthError_SyncsAndRetries
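The drain gate can be sketched as a per-agent counter of consecutive failures. drainGate and its methods are hypothetical names. The sketch also assumes that a success resets only the counter, while the drained flag is cleared only by an explicit undrain; the log does not spell that out.

```go
package main

import (
	"fmt"
	"sync"
)

// drainGate tracks consecutive failures per agent type.
type drainGate struct {
	mu        sync.Mutex
	threshold int
	failures  map[string]int
	drained   map[string]bool
}

func newDrainGate(threshold int) *drainGate {
	return &drainGate{
		threshold: threshold,
		failures:  map[string]int{},
		drained:   map[string]bool{},
	}
}

// record notes one run outcome and reports whether the agent is drained
// afterwards. A success resets the consecutive-failure counter.
func (g *drainGate) record(agent string, success bool) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if success {
		g.failures[agent] = 0
		return g.drained[agent]
	}
	g.failures[agent]++
	if g.failures[agent] >= g.threshold {
		g.drained[agent] = true
	}
	return g.drained[agent]
}

// undrain acknowledges the drain and starts counting from zero again.
func (g *drainGate) undrain(agent string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.drained, agent)
	g.failures[agent] = 0
}

func main() {
	g := newDrainGate(2)
	fmt.Println(g.record("claude", false)) // first failure
	fmt.Println(g.record("claude", false)) // second consecutive failure
	g.undrain("claude")
	fmt.Println(g.record("claude", false)) // counting restarts after undrain
}
```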
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
ClaudeConfigDir moved from /root/.claude to credentials/claude/, but
container.go was still deriving .claude.json from filepath.Dir which
no longer pointed anywhere useful. Claude CLI needs .claude.json for
OAuth account info or it says "Not logged in".
Also update sync-credentials to copy /root/.claude.json into the
credentials dir so it stays fresh alongside the token.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
After a successful run with no commits pushed, detectUncommittedChanges
checks for modified tracked files and untracked source files. If any
exist the task fails with an explicit error rather than silently
succeeding while the work evaporates when the sandbox is deleted.
Scaffold files written by the harness (.claudomator-env,
.claudomator-instructions.txt, .agent-home/) are excluded from the check.
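A sketch of the scaffold exclusion, assuming the check is fed `git status --porcelain` output (two status characters, a space, then the path). That input format, the function names, and treating every non-scaffold path as relevant are assumptions; the real detectUncommittedChanges also narrows untracked files to source files, which is not modelled here.

```go
package main

import (
	"fmt"
	"strings"
)

// scaffold lists harness-written paths named in the commit message.
// Entries ending in "/" match a directory and everything beneath it.
var scaffold = []string{".claudomator-env", ".claudomator-instructions.txt", ".agent-home/"}

// isScaffold reports whether path belongs to the harness scaffolding.
func isScaffold(path string) bool {
	for _, s := range scaffold {
		if strings.HasSuffix(s, "/") {
			if strings.HasPrefix(path, s) {
				return true
			}
		} else if path == s {
			return true
		}
	}
	return false
}

// uncommittedPaths extracts non-scaffold paths from porcelain output.
// It does not handle quoted paths or rename entries ("old -> new").
func uncommittedPaths(porcelain string) []string {
	var paths []string
	for _, line := range strings.Split(porcelain, "\n") {
		if len(line) < 4 {
			continue // blank or malformed line
		}
		if path := line[3:]; !isScaffold(path) {
			paths = append(paths, path)
		}
	}
	return paths
}

func main() {
	status := " M main.go\n?? .claudomator-env\n?? .agent-home/\n?? notes.go\n"
	if left := uncommittedPaths(status); len(left) > 0 {
		fmt.Println("task would fail; uncommitted work:", left)
	}
}
```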
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- task.Project type + storage CRUD + UpsertProject + SeedProjects
- Remove AgentConfig.ProjectDir, RepositoryURL, SkipPlanning
- Remove ContainerRunner fallback git init logic
- Project API endpoints: GET/POST /api/projects, GET/PUT /api/projects/{id}
- processResult no longer extracts changestats (pool-side only)
- claude_config_dir config field; default to credentials/claude/
- New scripts: sync-credentials, fix-permissions, check-token
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
detection
- Detect Gemini TerminalQuotaError (daily quota) as BUDGET_EXCEEDED, not generic FAILED
- Surface container stderr tail in error so quota/rate-limit classifiers can match it
- Add agent_events table to persist rate-limit start/recovery events across restarts
- Add GET /api/agents/status endpoint returning live agent state + 24h event history
- Stats dashboard: agent status cards, 24h availability timeline, per-run execution table
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
- ContainerRunner replaces ClaudeRunner/GeminiRunner; all agent types run
in Docker containers via claudomator-agent:latest
- Writable agentHome staging dir (/home/agent) satisfies home-dir
requirements for both claude and gemini CLIs without exposing host creds
- Copy .credentials.json and .claude.json into staging dir at run time;
GEMINI_API_KEY passed via env file
- Fix git clone: remove MkdirTemp-created dir before cloning (git rejects
pre-existing dirs even when empty)
- Replace localhost with host.docker.internal in APIURL so container can
reach host API; add --add-host=host.docker.internal:host-gateway
- Run container as --user=$(uid):$(gid) so host-owned workspace files are
readable; chmod workspace 0755 and instructions file 0644 after clone
- Pre-create .gemini/ in staging dir to avoid atomic-rename ENOENT on first
gemini-cli run
- Add ct CLI tool to container image: pre-built Bash wrapper for
Claudomator API (ct task submit/create/run/wait/status/list)
- Document ct tool in CLAUDE.md agent instructions section
- Add drain-failed-tasks script: retries failed tasks on a 5-minute interval
- Update Dockerfile: Node 22 via NodeSource, Go 1.24, gemini-cli,
git safe.directory=*, default ~/.claude.json
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
|
|
- Fix push failure swallowing and ensure workspace preservation on push error
- Fix wrong session ID in --resume flag and BlockedError
- Implement safer shell quoting for instructions in buildInnerCmd
- Capture and propagate actual Claude session ID from stream init message
- Clean up redundant image resolution and stale TODOs
- Mark ADR-005 as Superseded
- Consolidate RepositoryURL to Task level (removed from AgentConfig)
- Add unit test for session ID extraction in parseStream
|
|
- Fix host/container path confusion for --env-file
- Fix --resume flag to only be used during resumptions
- Fix instruction passing to Claude CLI via shell-wrapped cat
- Restore streamErr return logic to detect task-level failures
- Improve success flag logic for workspace preservation
- Remove duplicate RepositoryURL from AgentConfig
- Fix app.js indentation and reformat DOMContentLoaded block
- Restore behavioral test coverage in container_test.go
|
|
- Fix Critical Bug 1: Only remove workspace on success, preserve on failure/BLOCKED.
- Fix Critical Bug 2: Use correct Claude flag (--resume) and pass instructions via file.
- Fix Critical Bug 3: Actually mount and use the instructions file in the container.
- Address Design Issue 4: Implement Resume/BLOCKED detection and host-side workspace re-use.
- Address Design Issue 5: Consolidate RepositoryURL to Task level and fix API fallback.
- Address Design Issue 6: Make agent images configurable per runner type via CLI flags.
- Address Design Issue 7: Secure API keys via .claudomator-env file and --env-file flag.
- Address Code Quality 8: Add unit tests for ContainerRunner arg construction.
- Address Code Quality 9: Fix indentation regression in app.js.
- Address Code Quality 10: Clean up orphaned Claude/Gemini runner files and move helpers.
- Fix tests: Update server_test.go and executor_test.go to work with new model.
|
|
This commit implements the architectural shift from local directory-based
sandboxing to containerized execution using canonical repository URLs.
Key changes:
- Data Model: Added RepositoryURL and ContainerImage to task/agent configs.
- Storage: Updated SQLite schema and queries to handle new fields.
- Executor: Implemented ContainerRunner using Docker/Podman for isolation.
- API/UI: Overhauled task creation to use Repository URLs and Image selection.
- Webhook: Updated GitHub webhook to derive Repository URLs automatically.
- Docs: Updated ADR-005 with risk feedback and added ADR-006 to document the
new containerized model.
- Defaults: Updated serve command to use ContainerRunner for all agents.
This fixes systemic task failures caused by build dependency and permission
issues on the host system.
|
|
- Deployment badge now returns null (hidden) when includes_fix is false instead of showing "Not deployed" noise
- Badge also suppressed when fix_commits is empty (no tracked commits to check)
- Notification button label trimmed to just the bell emoji
- Preamble: warn agents not to use absolute paths in git commands (sandbox bypass)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Web Push:
- WebPushNotifier with VAPID auth; urgency mapped to event type
(BLOCKED=urgent, FAILED=high, COMPLETED=low)
- Auto-generates VAPID keys on first serve, persists to config file
- push_subscriptions table in SQLite (upsert by endpoint)
- GET /api/push/vapid-key, POST/DELETE /api/push/subscribe endpoints
- Service worker (sw.js) handles push events and notification clicks
- Notification bell button in web UI; subscribes on click
File Drop:
- GET /api/drops, GET /api/drops/{filename}, POST /api/drops
- Persistent ~/.claudomator/drops/ directory
- CLAUDOMATOR_DROP_DIR env var passed to agent subprocesses
- Drops tab (📁) in web UI with file listing and download links
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Move activePerAgent decrement/deletion out of execute() and
executeResume() defers and into the code paths immediately before each
resultCh send (handleRunResult and early-return paths). This guarantees
that when a result consumer reads from the channel the map is already
clean, eliminating a race between defer and result receipt.
Remove the polling loop from TestPool_ActivePerAgent_DeletesZeroEntries
and check the map state immediately after reading the result instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
The deferred activePerAgent cleanup in execute() runs after resultCh is
sent, so a consumer reading Results() could observe the map entry before
it was removed. Poll briefly (100ms max) instead of checking immediately.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|