AgentDesk
TinyFish Accelerator/Verification layer

Agents ship.
AgentDesk verifies.

Every AI coding agent can ship code. None of them verifies it works. AgentDesk does. A real browser verifies every push — diagnoses failures, drafts the fix, comments on the PR, ships the merge.

Requires macOS 13+ · Apple Silicon · Claude Code CLI

AgentDesk orchestrates Claude Code sessions — without the CLI installed, there’s nothing to verify.

Powered by a best-in-class stack
TinyFish
Fireworks AI
OpenRouter
Composio
Claude Code
AgentDesk.app — 3 sessions live
AgentDesk dashboard: active Claude Code sessions with verification status
dashboard.view · native macOS · not a browser extension · 2026.04
Section 02 · Why the extra step

Unsigned by design. For now.

AgentDesk is indie-built. No Apple Developer certificate yet — $99/year, on the roadmap. macOS quarantines the app on first run. One terminal command strips the flag. The details below explain why that's safe.

Why is this safe?

The released .dmg is built by a pinned GitHub Actions pipeline — not hand-crafted, not uploaded from a local machine. Every binary is traceable to a git tag in the source tree it was built from. No telemetry phones home from the app. No bundled analytics. No auto-update pinging.

The quarantine flag is macOS’s “this came from the internet” marker, not a malware detection. It is set by default on every download, signed or not. xattr is a built-in macOS utility (man xattr); run as xattr -rd com.apple.quarantine against the app bundle, it recursively deletes that one extended attribute. It doesn’t run any code, change any system setting, or touch anything else on your machine.
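As a rough illustration of what that one command amounts to, here is a minimal Python sketch that builds the same xattr invocation. The install path /Applications/AgentDesk.app is an assumption, not something this page states, and the helper names are illustrative only.

```python
import shlex
import subprocess
import sys

# Assumed install location; adjust to wherever the app bundle actually lives.
APP_PATH = "/Applications/AgentDesk.app"


def quarantine_removal_command(app_path: str) -> list[str]:
    """Build the argv that removes the quarantine attribute from an app bundle.

    -r recurses into the bundle, -d deletes the named extended attribute.
    """
    return ["xattr", "-rd", "com.apple.quarantine", app_path]


def remove_quarantine(app_path: str) -> None:
    """Run the command. Only meaningful on macOS, where xattr ships with the OS."""
    if sys.platform != "darwin":
        raise RuntimeError("quarantine removal only applies on macOS")
    subprocess.run(quarantine_removal_command(app_path), check=True)


if __name__ == "__main__":
    # Print the equivalent shell command rather than running it.
    print(shlex.join(quarantine_removal_command(APP_PATH)))
```

The script only prints the command by default, so you can read exactly what would run before running it yourself.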

Source is private for now — this is an indie project, not an open-source product, and keeping the source closed while we’re pre-revenue is a deliberate call. Reach out if you need to audit it for enterprise use.

Once an Apple Developer certificate is in the budget, the app will be signed and notarized, and this whole section gets deleted.

Section 03 · First run

Configure once. Never again.

Shared API keys are pre-wired so the app works on first launch. You just point each session card at its live URL and its repo.

API keys · optional

Shared keys, ready to go.

AgentDesk Settings modal showing TinyFish, Fireworks AI, OpenRouter key fields and the Composio GitHub + Slack integration row

Shared TinyFish, Fireworks AI, and OpenRouter keys are wired on install — AgentDesk works out of the box. Paste your own in Settings only if you want isolation from the shared quotas.

Connect GitHub in the same panel to unlock the PR flow — Slack is optional, only if you want verification-failure alerts in a channel.

Per project · required

One card. Three fields. Ship.

AgentDesk Project configuration modal for the devauth project showing APP URL, GITHUB REPO, and OAUTH TEST URL fields

Every session card wires to one live URL, one GitHub repo, and — if you want OAuth verification — one OAuth test URL. One click per project, saved forever.

E2E scenarios, content regex, status-URL probing, and Slack alerts each live in their own tab on the same modal. Empty tabs mean no checks of that type — AgentDesk only runs what you configure.
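One way to picture "only runs what you configure" is the sketch below. The field names, the check labels, and the idea that a deploy check always runs against the app URL are all assumptions made for illustration; they are not AgentDesk's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectConfig:
    # The three fields on the session card (names are illustrative).
    app_url: str
    github_repo: str
    oauth_test_url: str = ""
    # One list per tab; an empty list means that check type is not configured.
    e2e_scenarios: list[str] = field(default_factory=list)
    content_regexes: list[str] = field(default_factory=list)
    status_urls: list[str] = field(default_factory=list)


def enabled_checks(cfg: ProjectConfig) -> list[str]:
    """Return the check types that would run for this project."""
    checks = ["deploy"]  # Assumption: the app URL alone is enough for a deploy check.
    if cfg.oauth_test_url:
        checks.append("oauth")
    if cfg.e2e_scenarios:
        checks.append("e2e")
    if cfg.content_regexes:
        checks.append("content")
    if cfg.status_urls:
        checks.append("status")
    return checks
```

A project with only the required fields would yield a single check here; filling in a tab adds exactly one more entry.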

Section 04 · Walkthrough

How a push becomes a verified merge.

Six steps, end to end, lifted from the app as it runs today. No reconstruction, no marketing mock-ups — these are the screens you land on five minutes after install.

  1. STEP 01 · sessions · detected

    Claude Code runs. AgentDesk watches.

    Scans ~/.claude/projects every few seconds. Each session card carries the project name, branch, last commit, deploy URL, and a live pulse when the agent is still working.

    • Hook receiver on localhost:9876 fires on SessionStart / Stop
    • Groups sessions by real cwd — cd into another project mid-session and it lands on the right card
    • Idle sessions tuck into a Past Sessions folder so the dashboard stays focused on what's running
    Dashboard with three active Claude Code sessions and verification status dots
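The grouping described in this step could look roughly like the sketch below. The Session record and the 15-minute idle threshold are assumptions; this page does not say how AgentDesk stores sessions or when it considers one idle.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed threshold; the real cutoff is not documented on this page.
IDLE_AFTER_SECONDS = 15 * 60


@dataclass(frozen=True)
class Session:
    session_id: str
    cwd: str              # Real working directory the agent is operating in
    last_activity: float  # Unix timestamp of the last observed event


def group_sessions(sessions: list[Session], now: float):
    """Split sessions into active cards keyed by cwd and a list of past sessions."""
    active: dict[str, list[Session]] = defaultdict(list)
    past: list[Session] = []
    for s in sessions:
        if now - s.last_activity > IDLE_AFTER_SECONDS:
            past.append(s)            # Tucked into Past Sessions
        else:
            active[s.cwd].append(s)   # Lands on the card for its project directory
    return dict(active), past
```

Keying on cwd rather than on the session id is what lets a session that changes directory land on the right card.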
  2. STEP 02 · push · detected

    Agent pushes. A toast fires.

    The moment a git push lands on a branch tracked by AgentDesk, a verification job is queued. Claude Code's Stop hook fires when the agent finishes a turn — AgentDesk reads the session transcript, spots the push command, and kicks off the checks. No polling, no manual trigger.

    • Pre-flight push check catches branches never sent to origin
    • One toast per push — deduplicated across every session card
    • Native macOS notification when the dashboard isn't focused
    Claude Code session with a verification toast firing after a push
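A minimal sketch of the two ideas in this step, spotting a push in a list of commands and deduplicating the toast. It assumes the transcript has already been reduced to a list of shell command strings and that a push is identified by repo, branch, and commit SHA; both are illustrative choices, and the regex is deliberately simple.

```python
import re

# Simple heuristic: the words "git push" anywhere in a command string.
GIT_PUSH = re.compile(r"\bgit\s+push\b")


def find_pushes(commands: list[str]) -> list[str]:
    """Return the commands that look like a git push."""
    return [c for c in commands if GIT_PUSH.search(c)]


class ToastDeduper:
    """Allow one toast per (repo, branch, sha), however many cards report it."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str, str]] = set()

    def should_toast(self, repo: str, branch: str, sha: str) -> bool:
        key = (repo, branch, sha)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A single deduper instance shared across session cards is what would keep two cards watching the same repo from firing twice.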
  3. STEP 03 · browser · live

    TinyFish drives your URL.

    TinyFish navigates the actual production site and streams the run back over SSE. You watch the page render in a live iframe inside AgentDesk while structured JSON streams in alongside: which selectors fired, which buttons were clicked, what each network response was.

    • Deploy / OAuth / E2E / Visual / Status — five check modes
    • Goals in natural language. No Playwright boilerplate.
    • Live streaming_url iframe = you see exactly what the bot sees
    https://parksmart-lime.vercel.app/checkout · live
    tinyfish · run-sse · step 4 / 7
    click button[data-test="signup-start"]
    fill input[name="email"] afshal@agentdesk.dev
    submit form#onboarding
    await confirmation "thanks! check your email"
    nav · click · fill · assert
    real chromium · not headless · not a screenshot
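For readers unfamiliar with SSE, the sketch below parses a generic server-sent-event stream whose data lines carry JSON. It follows the standard SSE framing only; the payload fields used in the usage note ("action", "target") are hypothetical and are not TinyFish's documented event schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event.

    An event is a run of "data:" lines ended by a blank line. Lines starting
    with ":" are comments (often keep-alives). An event that is never closed
    by a blank line is dropped, as the SSE format specifies.
    """
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue
        name, _, value = line.partition(":")
        if name == "data":
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
```

Feeding it the lines `data: {"action": "click", "target": "button"}` followed by an empty line would produce one dictionary per step, which is the kind of structured record a step log could be rendered from.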
  4. STEP 04 · failure · explained

    Fireworks AI writes the one-sentence why.

    When a check fails, Fireworks AI’s gpt-oss-120b diagnoses it in about 1.4 seconds and correlates it to the commit that caused the break. Click “Suggest Fix” and Kimi K2.5 does few-shot RAG over past failures on this project and drafts the patch.

    • Diagnosis decoupled from the result emit — never blocks the toast
    • OpenRouter cascade falls back automatically when Fireworks AI is rate-limited
    • In-app AI Chat picks up the same context and keeps investigating
    AI Chat panel with Fireworks diagnosis of a failing OAuth redirect
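The fallback behaviour in this step can be sketched as an ordered cascade. Everything here is a stand-in: the RateLimited exception and the two stub providers are hypothetical and do not call any real Fireworks AI or OpenRouter API.

```python
from typing import Callable, Sequence


class RateLimited(Exception):
    """Raised by a provider call when it is rate-limited (hypothetical)."""


Provider = tuple[str, Callable[[str], str]]


def diagnose_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; move on only when one is rate-limited."""
    last_error: Exception | None = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimited as exc:
            last_error = exc
    raise RuntimeError("all providers were rate-limited") from last_error


# Stub providers for illustration only.
def fireworks_stub(prompt: str) -> str:
    raise RateLimited("429")


def openrouter_stub(prompt: str) -> str:
    return "diagnosis for: " + prompt
```

Calling `diagnose_with_fallback("redirect loop", [("fireworks", fireworks_stub), ("openrouter", openrouter_stub)])` skips the rate-limited stub and returns the second provider's answer. Any other exception type propagates instead of being retried.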
  5. STEP 05 · pr · shipped

    Composio creates, polishes, merges. One tab.

    GitHub lives in a dedicated tab, wired through Composio. Click Create PR and an editable draft opens with an AI-polished title and body written from your actual commit log. Click Merge PR and the squash commit is already staged — green-check watch runs live.

    • Fork-aware. Works on forks you don't own.
    • Post the diagnosis straight to the failing PR as a comment
    • Deployment-live toast fires the moment required checks turn green
    GitHub tab listing open PRs across projects with inline Create PR and Merge PR buttons
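The "required checks turn green" condition might reduce to something like the sketch below. The dictionary shape (name, required, conclusion) is a hypothetical simplification, not the structure returned by Composio or GitHub, and treating "no required checks" as not green is an assumption.

```python
def required_checks_green(checks: list[dict]) -> bool:
    """True once every required check has concluded successfully."""
    required = [c for c in checks if c.get("required")]
    if not required:
        return False  # Assumption: nothing to wait for means no toast.
    return all(c.get("conclusion") == "success" for c in required)
```

Optional checks are ignored on purpose, so a flaky non-required job would not hold back the toast in this model.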
  6. STEP 06 · history · searchable

    Every run remembered.

    Every verification run — checks, duration, commit SHA, diagnosis, Suggest Fix output — persists locally. Filter by project, date range, or failure class. Export as JSON. Re-run any check straight from the row.

    • Plain JSON on disk at ~/.config/agentdesk — zero cloud dependency
    • Diagnosis + Suggest Fix saved alongside each run
    • Powers the few-shot RAG that makes Kimi's fixes better over time
    Verification history with filter, date-range, and per-run diagnosis
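Because the history is plain JSON on disk, the filters above are easy to picture. The sketch below assumes each run is a dictionary with project, date (ISO format), and failure_class keys; that record shape is hypothetical, and the file layout under ~/.config/agentdesk is not documented here, so the loader takes a path from the caller.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def load_runs(path: Path) -> list[dict]:
    """Read a JSON array of run records from a file chosen by the caller."""
    return json.loads(path.read_text(encoding="utf-8"))


def filter_runs(
    runs: list[dict],
    project: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
    failure_class: Optional[str] = None,
) -> list[dict]:
    """Keep runs matching every filter that was supplied (dates inclusive)."""
    kept = []
    for run in runs:
        day = date.fromisoformat(run["date"])
        if project is not None and run["project"] != project:
            continue
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        if failure_class is not None and run.get("failure_class") != failure_class:
            continue
        kept.append(run)
    return kept
```

Exporting the result is then a single `json.dumps(kept, indent=2)`.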
Section 05 · The loop

Push to merged. Hands-off.

One agent push. No manual clicking through the app to see if it still works. Five events, a real browser, one sentence of diagnosis, one click to ship.

  1. push
    beat 01

    Agent pushes

    Claude Code finishes a feature branch. It commits and pushes to origin like any developer would.

  2. instant
    beat 02

    AgentDesk fires

    The toast lights up. A verification job is queued against the deployed URL the project is mapped to.

  3. live
    beat 03

    TinyFish verifies

    A real Chromium session navigates the production URL, runs the goal end-to-end, and streams every step into the app. You watch it happen.

  4. moments later
    beat 04

    Fireworks AI diagnoses

    If it fails, gpt-oss-120b writes the one-sentence cause and maps it to the offending commit SHA.

  5. one click
    beat 05

    You merge via Composio

    Open the PR in the GitHub tab. Click Merge. AgentDesk watches the required checks go green. Deployment-live toast fires.

the loop closes

No human ever opened a browser to find out if production still worked.

Section 06 · Traction

What people who don't work for us said about it.

No paid distribution. No asks. Every quote below is from a public post made by the named person. Screenshots of the originals are available on request.

Jason Zhou
founder · superdesign · his own post
linkedin
Great to see awesome projects coming out of this. My fav ones: AgentDesk by Afshal Gulam gives AI coding agents the ability to self-test and verify apps/features they've built. Vibe code + vibe verify in parallel.
angel investor · listed AgentDesk first of three
LinkedIn · build-in-public post

One post. No paid distribution. Organic reach only.

  • impressions: 13,935
  • video views: 5,447
  • reactions: 162
  • comments: 27
read the post →
@Tiny_Fish
quote-tweet · mar 6 2026
x / twitter
Look at this 🤩 ai coding agents paired with web agent to verify what each of them created 🔥 all powers users of these agents would love it!
tinyfish · official account · @Tiny_Fish
More public voices
  • Trevor I. Lasn · CEO 0xinsider
    “This is exactly where I’m at right now — running multiple Claude Code sessions in parallel, and the actual bottleneck is me clicking through the app to make sure everything still works.”
  • Paolo Perrone · ML engineer
    “Three agents shipping simultaneously. Verification became the human bottleneck. Full circle.”
  • Yangshun Tay · GreatFrontEnd founder, ex-Meta · TinyFish advisor
    “Very cool! I didn’t know this was by an NUS student. Good stuff and all the best.”
Section 07 · Pricing

Free while we’re pre-pricing.

Shared keys cover everyone today. Below is the ladder we’re building toward — live once paid tiers turn on.

Today every tier is free. You run on our pooled TinyFish / Fireworks AI / Composio keys at our cost while we figure the mix out. No credit card, no sign-up — you just install and use it.

Free
$0 · bring your own keys

Paste your TinyFish, Fireworks AI, and OpenRouter keys in Settings. Unlimited checks at your own cost.

  • Unlimited verification runs
  • All check types — deploy, OAuth, E2E, visual, status
  • Full GitHub + Slack integrations
  • Local history, diagnosis, Suggest Fix
Pro · recommended
$19 per user / month

Our pooled API keys. No key management. Fixed monthly check quota, zero per-run cost anxiety.

  • Shared pooled TinyFish + Fireworks AI quotas
  • No key management, no provider signups
  • Every Free-tier feature included
  • Priority support via email
Team
$39 per seat / month

Multi-user projects. Shared Slack alerts. Higher quotas. For 2–20-dev shops running AI agents in parallel.

  • Everything in Pro
  • Multi-user project access
  • Shared Slack alert routing
  • Higher verification quota per seat
Enterprise
$149 – $199 per seat / month

For orgs whose legal and IT teams need SSO, audit logs, and an SLA before anything can ship.

  • Single sign-on (Okta, Google Workspace)
  • Admin roles, audit logs
  • Enterprise SLA
  • Dedicated onboarding + Slack channel
Section 08 · Releases

Shipping weekly.

Full changelog →
v2.2.1
Apr 20, 2026

Patch release. One bug fix, no feature changes.

Fixed

  • Toast spam on failing verifications.
read the full notes →
v2.2.0
Apr 19, 2026

A serious polish pass on AI Chat, GitHub workflows, and the first-run experience. The app is now much more honest about what's configured, much more forgiving when things go wrong, and much harder to misuse.

Highlights

  • Install Claude Code right from the app — no terminal dance.
  • AI Chat got edit / rewind / delete on past messages, GitHub-flavored Markdown rendering, and a new Focus mode that tucks the chat beside the dashboard so you can reference checks while chatting.
read the full notes →
Ready · macOS 13+ · Apple Silicon

Every agent ships.
None of them verifies.
You deserve both.

Free to download. No signup. No email. No telemetry. If it works for you, tell a friend.

v2.2.1 · aarch64-apple-darwin