The big feature-and-integration cut. Verification intelligence (LLM
diagnosis + fix suggestions), full GitHub workflow (tab, merge
button, PR create, diagnosis-to-PR-comment), Slack alerts via
Composio, the embedded AI Chat agent, and E2E / status / visual
verification modes. Under the hood, the app also got its "symbiosis"
hook architecture — Claude Code's PostToolUse / UserPromptSubmit
hooks now piggyback verification results back into the agent's
context mid-conversation.
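As a rough illustration of the piggyback idea, a hook script can print a JSON object whose additionalContext field carries the pending results. This is a minimal sketch, not the app's actual hook: the hookSpecificOutput envelope reflects our understanding of Claude Code's hook output format, and build_hook_output plus the project / task / status fields are hypothetical names.

```python
import json


def build_hook_output(event_name: str, results: list) -> str:
    """Render pending verification results as a hook response (hypothetical helper)."""
    if not results:
        # Nothing to piggyback: emit nothing so the agent's context is untouched.
        return ""
    lines = [f"[{r['project']}] {r['task']}: {r['status']}" for r in results]
    payload = {
        "hookSpecificOutput": {
            # Assumed envelope; check the Claude Code hooks reference before relying on it.
            "hookEventName": event_name,
            "additionalContext": "Verification results:\n" + "\n".join(lines),
        }
    }
    return json.dumps(payload)
```

A hook command would print this string to stdout and exit immediately, which is what keeps the agent from blocking on verification.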
AI verification intelligence
- Fireworks AI LLM cascade powering failure diagnosis
(gpt-oss-120b, ~1.4 s) and fix suggestions (kimi-k2p5), with
OpenRouter as a secondary cascade when Fireworks is unavailable.
- AI diagnosis — every non-healthy check gets a one-sentence
correlation between the failure and recent code changes, with
RAG few-shot context from this project's past failures.
- Suggest Fix — concrete remediation generator using past
failure patterns from verification history.
- AI-generated E2E test definitions — describe what to test in
English, the LLM emits the structured test case.
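The provider cascade described above can be sketched as an ordered list of callables tried in turn. This is an assumption about the general shape only: run_cascade, AllProvidersFailed, and the provider callables are hypothetical, and no real Fireworks or OpenRouter client is used here.

```python
from typing import Callable, Sequence, Tuple

# A provider is a (name, callable) pair; the callable takes a prompt and returns text.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the cascade errored."""


def run_cascade(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the response makes it easy to log which tier actually answered.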
GitHub tab + PR workflows
- Dedicated GitHub tab alongside Activity and History, showing
every open PR across every project with live CI status, review
state, and merge readiness.
- Inline Merge button performs squash merge and polls the merge
commit's required checks live; fires a deployment-live toast once
the merged deployment is up.
- Create PR from inside the app with fork detection,
base-branch auto-pick (development → main), and structured
commit-log body.
- Diagnosis-to-PR-comment — posts the AI-generated diagnosis
to the failing PR (falls back to Create PR if no open PR exists).
- File Issue on GitHub with fork detection + existing-issue
dedup, labels inferred from task type.
- Auto-refresh on git push via the hook receiver.
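One plausible reading of the base-branch auto-pick (development → main) is "prefer development, otherwise main, and never target the branch being merged". The sketch below encodes that reading; pick_base_branch and its parameters are hypothetical, and the app's real rule may differ.

```python
from typing import Optional, Sequence, Set


def pick_base_branch(
    head: str,
    remote_branches: Set[str],
    preferred: Sequence[str] = ("development", "main"),
) -> Optional[str]:
    """Return the first preferred branch that exists and is not the head branch."""
    for candidate in preferred:
        if candidate in remote_branches and candidate != head:
            return candidate
    return None  # no sensible base; the caller should ask the user
```

Under this rule a feature branch targets development when it exists, while a PR opened from development itself targets main.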
Slack alerts
- Per-project alert config — each project picks its own Slack
channel and its own event types (deploy / OAuth / PR CI / E2E /
status / content). Global master toggle in Settings gates all
dispatch.
- Rich alert formatting varies by task type: PR alerts include
CI counts + review state; deployment alerts include the verified
URL; E2E alerts include failing-step observations; visual alerts
list affected page paths.
- Channel picker with live Slack-channel listing via OAuth.
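The gating order described above (global master toggle first, then per-project channel and event types) could look roughly like this. The names ProjectAlertConfig and resolve_alert_channel, and the event-type strings, are illustrative assumptions rather than the app's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ProjectAlertConfig:
    channel: str                                   # Slack channel chosen for this project
    events: Set[str] = field(default_factory=set)  # enabled event types, e.g. {"deploy", "e2e"}


def resolve_alert_channel(
    global_enabled: bool,
    configs: Dict[str, ProjectAlertConfig],
    project: str,
    event: str,
) -> Optional[str]:
    """Return the channel to post to, or None when the alert should be dropped."""
    if not global_enabled:  # the master toggle gates all dispatch
        return None
    cfg = configs.get(project)
    if cfg is None or event not in cfg.events:
        return None
    return cfg.channel
```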
Composio integration
- GitHub + Slack OAuth flows handled via Composio's
auth-config system — no secrets stored client-side, connection IDs
persisted per service.
- Powers every GitHub interaction (GITHUB_LIST_PULL_REQUESTS,
GITHUB_LIST_CHECK_RUNS_FOR_A_GIT_REFERENCE,
GITHUB_LIST_REVIEWS_FOR_A_PULL_REQUEST, issue creation, fork
detection) and every Slack dispatch.
Embedded AI Chat agent
- Every session card gets a built-in chat panel that spawns the
Claude Code CLI (claude-haiku-4-5) inline, streaming NDJSON back
through Tauri events for real-time tool-use + permission approval
in the UI.
- Context pre-loaded from verification history, project config,
E2E definitions, and latest results so the agent starts informed.
- Allow / Deny prompt UI for every Edit / Write / Bash tool call.
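Streaming NDJSON means each event is one JSON object per line, but chunks read from the CLI's stdout can end mid-line. A minimal sketch of the buffering this requires is shown below; iter_ndjson is a hypothetical helper and the event shapes in the usage note are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per complete line, buffering partial lines across chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():  # skip blank keep-alive lines
                yield json.loads(line)
    if buffer.strip():  # trailing object without a final newline
        yield json.loads(buffer)
```

For example, the chunks '{"type":"tool_use","na' and 'me":"Edit"}\n' together yield a single event, which a UI layer could then forward as one event per object.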
New verification modes
- E2E flow testing — multi-step scripted browser flows via
TinyFish, executed like a real user.
- Status checks — batch page rendering via TinyFish Fetch API
for external dependency / status-page monitoring.
- Visual / content regression — baseline capture per page,
diffed on each deploy via OpenRouter-powered analysis.
- Cross-session verification history persisted locally with FIFO
eviction, queried for diagnosis RAG context.
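FIFO eviction plus a "recent failures for this project" query can be illustrated with a bounded deque. This sketch keeps everything in memory and omits the local persistence the app performs; VerificationHistory, its capacity, and the entry fields are assumptions.

```python
from collections import deque
from typing import List


class VerificationHistory:
    """Bounded history: once full, the oldest entry is evicted first (FIFO)."""

    def __init__(self, capacity: int = 500) -> None:
        self._entries = deque(maxlen=capacity)

    def record(self, entry: dict) -> None:
        self._entries.append(entry)  # deque drops the oldest item when at capacity

    def recent_failures(self, project: str, limit: int = 3) -> List[dict]:
        """Newest-first non-healthy entries for one project, e.g. as few-shot context."""
        matches = [
            e for e in reversed(self._entries)
            if e["project"] == project and e["status"] != "healthy"
        ]
        return matches[:limit]

    def __len__(self) -> int:
        return len(self._entries)
```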
Dev loop + UX
- Piggyback hook architecture — Claude Code's PostToolUse /
UserPromptSubmit hooks return additionalContext so verification
results are injected directly into the agent's conversation
mid-workflow, without blocking its work.
- Tabbed dashboard — GitHub / Activity / History replace the
prior single-pane layout.
- Auto-tunnel localhost via tinyfi.sh so http://localhost:3000
URLs can be verified without manual exposure.
- Auto-detect real dev-server ports via lsof.
- Multi-agent debounce (60 s per project × task type) so a burst
of pushes doesn't fire ten verifications.
- Smart result interpretation + project-scoped delivery —
results route only to the agent session that triggered them.
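The 60 s per project × task type debounce can be sketched by keying a timestamp on the (project, task type) pair. This version runs the first trigger and drops later ones inside the window, which is an assumption about the intended behavior; Debouncer and should_run are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class Debouncer:
    """Allow at most one run per (project, task_type) key within the window."""

    def __init__(self, window_s: float = 60.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._window = window_s
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._last_run: Dict[Tuple[str, str], float] = {}

    def should_run(self, project: str, task_type: str) -> bool:
        key = (project, task_type)
        now = self._clock()
        last = self._last_run.get(key)
        if last is not None and now - last < self._window:
            return False  # still inside the window: drop this trigger
        self._last_run[key] = now
        return True
```

Because the key includes the task type, a deploy check and an E2E check for the same project are limited independently.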
In-house audit tooling
- Lighthouse integration — run local site audits, get actionable
performance / accessibility output.
- npm audit integration — dependency vulnerability scanning.
- Pa11y accessibility audit — WCAG compliance checking.
Settings + onboarding
- Settings dialog with masked API-key inputs and 0600 file
permissions for stored secrets.
- Built-in TinyFish + Fireworks API keys (obfuscated in binary)
so first-run users can try the app without signing up; banner
prompts them to add their own when they're ready.
- Integrations tab with GitHub + Slack OAuth flow.
- Claude Code CLI prerequisite note on the empty-state onboarding.
- GitHub Actions workflow for automated macOS releases.
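Storing secrets with 0600 permissions means only the owning user can read or write the file. A minimal POSIX-only sketch follows; write_secrets, the JSON layout, and the path are hypothetical and not the app's actual storage format.

```python
import json
import os


def write_secrets(path: str, secrets: dict) -> None:
    """Write secrets as JSON to a file readable and writable by the owner only."""
    # Create with 0600 up front so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        # The mode passed to os.open only applies on creation (and is subject to umask),
        # so tighten an already-existing file explicitly.
        os.fchmod(fh.fileno(), 0o600)
        json.dump(secrets, fh)
```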
Changed
- Activity Feed replaces the slide-out Web Agent panel; checks
grouped per project, streaming fixed.
- Deployment check redesigned — extraction-based goal, nav-link
dropdown, responsive grid.
- OAuth goal rewritten — provider-agnostic, numbered steps.
- GitHub PR goal rewritten — linear steps, no re-checking.
- External links now open in the real browser via the Tauri opener
plugin instead of inside the webview.
Removed
- AeroSpace workspace integration (added complexity for limited
benefit; focus narrowed to verification).
- MCP integration explored + discarded — blocked the agent for
3-4 minutes and prevented parallel work. Hooks + piggyback
replaced it.
Fixed
- Fallback to built-in key on user-key auth failure.
- Allow clearing personal API key to fall back to shared key.
- Compact card layout with inline grouped buttons restored.
- Shell PATH resolution for the desktop app.
- picomatch high-severity npm vulnerability.