feat: ai-chat reference project + MCP agent-chat tooling (4/4)#3546

Open
ericallam wants to merge 1 commit into feature/agent-view-dashboard from feature/ai-chat-reference-and-cli

@ericallam ericallam commented May 10, 2026

Summary

A complete Next.js reference project that exercises chat.agent end-to-end, plus the CLI MCP tools that let Claude Code, Cursor, and similar IDE agents drive a deployed chat.agent task from the editor. Builds on #3545.

Design

references/ai-chat is a full Next.js app: Prisma-backed persistence, multi-chat sidebar, per-chat model picker, debug panel, tool examples (getCurrentTime, searchHackerNews, createGithubIssue, PR review helpers, code sandbox), and smoke tests. It's intended both as a copy-paste starting point and as a place to regression-test SDK changes.

The CLI gains MCP tools (start_agent_chat, send_agent_message, close_agent_chat, list_agents) so an IDE agent can converse with a deployed chat.agent task. The dev runtime adds a one-shot OOM kill on the run controller, and the build pipeline adds skills bundling.

Test plan

  • cd references/ai-chat && pnpm install && pnpm trigger:dev
  • Open the Next.js app, create a chat, exchange messages, verify persistence across reloads
  • Trigger a tool requiring HITL approval, approve from the UI, verify resume
  • From an IDE agent, call start_agent_chat against the running dev task, send a message, verify the response streams back

Stack

Part of a 4-PR stack. Merge bottom-up.

  1. feat: Sessions dashboard, task_kind, and chat-ready hardening (1/4) #3542 → main — Sessions dashboard + chat-ready hardening
  2. feat(sdk): chat.agent — runtime + browser transport (2/4) #3543 → feat: Sessions dashboard, task_kind, and chat-ready hardening (1/4) #3542 — chat.agent runtime + browser transport
  3. feat(webapp): agent-view dashboard for chat.agent runs (3/4) #3545 → feat(sdk): chat.agent — runtime + browser transport (2/4) #3543 — agent-view dashboard
  4. This PR (feat: ai-chat reference project + MCP agent-chat tooling (4/4) #3546) → feat(webapp): agent-view dashboard for chat.agent runs (3/4) #3545

This is part 2 of 5 in a stack made with GitButler:

changeset-bot Bot commented May 10, 2026

🦋 Changeset detected

Latest commit: 2cd4c8f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 32 packages

| Name | Type |
| --- | --- |
| @trigger.dev/sdk | Patch |
| trigger.dev | Patch |
| @trigger.dev/python | Patch |
| @internal/sdk-compat-tests | Patch |
| references-ai-chat | Patch |
| d3-chat | Patch |
| references-d3-openai-agents | Patch |
| references-nextjs-realtime | Patch |
| references-realtime-hooks-test | Patch |
| references-realtime-streams | Patch |
| references-telemetry | Patch |
| @trigger.dev/build | Patch |
| @trigger.dev/core | Patch |
| @trigger.dev/plugins | Patch |
| @trigger.dev/react-hooks | Patch |
| @trigger.dev/redis-worker | Patch |
| @trigger.dev/rsc | Patch |
| @trigger.dev/schema-to-json | Patch |
| @trigger.dev/database | Patch |
| @trigger.dev/otlp-importer | Patch |
| @trigger.dev/rbac | Patch |
| @internal/cache | Patch |
| @internal/clickhouse | Patch |
| @internal/llm-model-catalog | Patch |
| @internal/redis | Patch |
| @internal/replication | Patch |
| @internal/run-engine | Patch |
| @internal/schedule-engine | Patch |
| @internal/testcontainers | Patch |
| @internal/tracing | Patch |
| @internal/tsql | Patch |
| @internal/zod-worker | Patch |

coderabbitai Bot commented May 10, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0b819165-181f-40b7-8624-8fd9b2d2ced5

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

},
});
session.runId = result.id;
session.lastEventId = undefined;

@devin-ai-integration devin-ai-integration Bot May 10, 2026

🔴 Resetting lastEventId to undefined on upgrade causes SSE replay of entire session history

In collectAgentResponse, when a trigger:upgrade-required chunk is received, session.lastEventId is set to undefined (line 390) before recursively calling collectAgentResponse. The recursive call creates a new SSEStreamSubscription using session.lastEventId (now undefined) as the lastEventId option (agentChat.ts:338), which means the new subscription replays the entire session .out stream from the very beginning.

Since session .out is a durable stream containing all historical chunks across runs, the replayed events will include trigger:turn-complete chunks from old turns. The first historical trigger:turn-complete hit (lines 366-368) immediately breaks the collection loop, causing the function to return empty or partial text from a previous turn instead of the new run's response.

Expected fix

Keep session.lastEventId as-is (pointing to the trigger:upgrade-required chunk's SSE id) so the recursive subscription resumes right after the upgrade marker, where the new run's output will appear.
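
The difference between keeping and resetting the cursor can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `OutChunk`, `ChatSession`, `subscribe`, and `collect` are hypothetical names, the array stands in for the durable .out stream, and the real `collectAgentResponse` and `SSEStreamSubscription` are not reproduced here.

```typescript
// Hypothetical in-memory model of a durable session ".out" stream. None of
// these names come from the PR; they only illustrate the resume cursor.
type OutChunk = { id: string; type: string; text?: string };
type ChatSession = { lastEventId?: string };

// Returns the chunks after `lastEventId`, or the full history when it is undefined.
function subscribe(stream: OutChunk[], lastEventId?: string): OutChunk[] {
  if (lastEventId === undefined) return stream;
  let start = 0;
  stream.forEach((chunk, index) => {
    if (chunk.id === lastEventId) start = index + 1;
  });
  return stream.slice(start);
}

// Collects text until the first turn-complete marker. On an upgrade marker it
// resubscribes, either from the marker (cursor kept) or from the start (cursor reset).
function collect(stream: OutChunk[], session: ChatSession, resetOnUpgrade: boolean): string {
  let text = "";
  for (const chunk of subscribe(stream, session.lastEventId)) {
    session.lastEventId = chunk.id;
    if (chunk.type === "text") text += chunk.text ?? "";
    if (chunk.type === "trigger:turn-complete") break;
    if (chunk.type === "trigger:upgrade-required") {
      if (resetOnUpgrade) session.lastEventId = undefined; // the behaviour flagged above
      return collect(stream, session, resetOnUpgrade);
    }
  }
  return text;
}

// One finished turn, then an upgrade marker, then the new run's output.
const outStream: OutChunk[] = [
  { id: "1", type: "text", text: "old answer" },
  { id: "2", type: "trigger:turn-complete" },
  { id: "3", type: "trigger:upgrade-required" },
  { id: "4", type: "text", text: "new answer" },
  { id: "5", type: "trigger:turn-complete" },
];

const keptCursor = collect(outStream, { lastEventId: "2" }, false);
const resetCursor = collect(outStream, { lastEventId: "2" }, true);
console.log(keptCursor); // prints: new answer
console.log(resetCursor); // prints: old answer
```

In this model the reset variant stops at the historical turn-complete chunk and returns the previous turn's text, which matches the failure described in the finding.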

Comment thread packages/cli-v3/src/entryPoints/managed-index-controller.ts
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from ecfac76 to 7ee523e Compare May 11, 2026 19:01
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 882544c to 5baac29 Compare May 11, 2026 19:01
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 7ee523e to fdc61c6 Compare May 12, 2026 08:23
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 5baac29 to e16efbb Compare May 12, 2026 08:23
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from fdc61c6 to 96700b1 Compare May 12, 2026 08:35
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from e16efbb to f4a7923 Compare May 12, 2026 08:35
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 96700b1 to 482d752 Compare May 12, 2026 08:40
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from f4a7923 to 04b98af Compare May 12, 2026 08:40
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 482d752 to 920e876 Compare May 12, 2026 08:46
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 04b98af to 417344a Compare May 12, 2026 08:46
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 920e876 to d96e2f7 Compare May 12, 2026 08:52
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 417344a to 43fde3f Compare May 12, 2026 08:52
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from d96e2f7 to 3256f42 Compare May 12, 2026 09:44
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 43fde3f to c27cef9 Compare May 12, 2026 09:44
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 3256f42 to 220b33c Compare May 12, 2026 09:46
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from c27cef9 to 48f030a Compare May 12, 2026 09:47
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 220b33c to 067109f Compare May 12, 2026 09:48
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 48f030a to 3c71eeb Compare May 12, 2026 09:49
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 067109f to c1f6db7 Compare May 12, 2026 09:52
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 3c71eeb to 866b175 Compare May 12, 2026 09:52
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from c1f6db7 to f75bcd8 Compare May 12, 2026 10:02
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 866b175 to 2d9bcdb Compare May 12, 2026 10:02
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from f75bcd8 to 40a3dff Compare May 12, 2026 10:07
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 2d9bcdb to d9fa1be Compare May 12, 2026 10:07
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 40a3dff to 1748445 Compare May 12, 2026 10:16
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from d9fa1be to b7ed332 Compare May 12, 2026 10:16
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 78a26a6 to 5e0526a Compare May 13, 2026 09:05
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 8615ad0 to f7197ef Compare May 13, 2026 14:21
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 5e0526a to 36ccf22 Compare May 13, 2026 14:21
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 36ccf22 to a90949b Compare May 13, 2026 15:46
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from f7197ef to 5c937df Compare May 14, 2026 09:10
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 75369f3 to cf2fa07 Compare May 14, 2026 09:11

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 3 new potential issues.

View 7 additional findings in Devin Review.

Comment on lines +107 to 108
prompts: workerManifest.prompts,
queues: workerManifest.queues,

🔴 Managed-index-controller omits skills from deployment registration body

Both dev-index-worker.ts:187 and managed-index-worker.ts:183 now include skills: resourceCatalog.listSkillManifests() in the worker manifest, and the managed-index-controller was updated to forward the new prompts field (prompts: workerManifest.prompts at line 107). However, the corresponding skills: workerManifest.skills line was not added to the backgroundWorkerBody.metadata object. This means skill manifests registered via skills.define() will be collected by the worker during managed (production) deployments but silently dropped when the controller sends the registration request to the server — skills won't be available server-side in deployed environments.

Suggested change
- prompts: workerManifest.prompts,
- queues: workerManifest.queues,
+ prompts: workerManifest.prompts,
+ skills: workerManifest.skills,
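
One way to make this kind of omission harder to miss is to let the type checker require every manifest field in the registration metadata. The following is a minimal sketch under assumptions: `WorkerManifestLike`, `WorkerMetadataLike`, and `buildMetadata` are hypothetical, trimmed-down stand-ins, not the real `workerManifest` or `CreateBackgroundWorkerRequestBody` shapes, which have more fields and may mark some of them optional.

```typescript
// Hypothetical, trimmed-down shapes; the real manifest and request body differ.
type WorkerManifestLike = {
  tasks: string[];
  queues: string[];
  prompts: string[];
  skills: string[];
};

// Declaring the metadata as "every manifest field" turns an omitted key into a
// compile-time error instead of a silently dropped value.
type WorkerMetadataLike = { [K in keyof WorkerManifestLike]: WorkerManifestLike[K] };

function buildMetadata(manifest: WorkerManifestLike): WorkerMetadataLike {
  return {
    tasks: manifest.tasks,
    queues: manifest.queues,
    prompts: manifest.prompts,
    skills: manifest.skills, // removing this line fails type-checking
  };
}

const metadata = buildMetadata({
  tasks: ["chat-agent"],
  queues: ["default"],
  prompts: ["system"],
  skills: ["pr-review"],
});
console.log(metadata.skills); // prints: [ 'pr-review' ]
```

This only helps if the target type lists the field as required; with an optional `skills` field the omission would still compile.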

Comment thread packages/cli-v3/src/mcp/tools/agentChat.ts
@@ -0,0 +1 @@
lib/generated/

🚩 Generated Prisma files committed in references/ai-chat

The references/ai-chat/lib/generated/prisma/ directory contains Prisma-generated client files checked into git. The .gitignore at references/ai-chat/.gitignore lists lib/generated/ which should exclude these, but the files are present in the diff. This might indicate the gitignore was added after the files were committed, or the files were force-added. These files add significant noise to the PR diff (~5000+ lines of generated code). Worth verifying the gitignore is working and removing tracked generated files if appropriate.

@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 5c937df to f5e3067 Compare May 14, 2026 09:32
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from cf2fa07 to fef74f4 Compare May 14, 2026 09:33

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 10 additional findings in Devin Review.

@@ -0,0 +1 @@
{"sessionId":"a0e063d3-034b-40fe-90d0-7a6aff597e26","pid":72012,"procStart":"Tue May 12 17:34:30 2026","acquiredAt":1778608778207}
\ No newline at end of file

🚩 Committed lock file with active session data

.claude/scheduled_tasks.lock contains a live session ID, PID, and timestamp from a specific development machine. This file appears to be machine-local state that shouldn't be committed to the repository. It could cause conflicts for other developers and doesn't serve a purpose in version control.

@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from f5e3067 to 6a1e9af Compare May 14, 2026 11:06
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from fef74f4 to 343c204 Compare May 14, 2026 11:06

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 12 additional findings in Devin Review.

Comment on lines +228 to +252
try {
  await session.apiClient.appendToSessionStream(
    session.sessionId,
    "in",
    serializeInputChunk({ kind: "message", payload: wirePayload })
  );
} catch (sendErr: any) {
  const result = await session.apiClient.triggerTask(session.agentId, {
    payload: {
      message: userMessage,
      chatId: session.chatId,
      sessionId: session.sessionId,
      trigger: "submit-message",
      metadata: session.clientData,
      continuation: true,
      previousRunId: session.runId,
    },
    options: {
      payloadType: "application/json",
      tags: [`chat:${session.chatId}`],
    },
  });
  session.runId = result.id;
  session.lastEventId = undefined;
}

🚩 MCP sendAgentMessage fallback triggers a new run on any appendToSessionStream failure

In sendAgentMessageTool, when session.runId is set, the code tries appendToSessionStream and, on any error, falls back to triggerTask to start a new run (lines 234-252). This means transient network errors (timeouts, 500s) also trigger a brand-new run rather than retrying the append. The comment describes this as intentional ("run ended, token expired, etc."), but it could lead to orphaned runs if the real cause was a transient failure and the original run is still alive. In practice, the session's run-manager server-side deduplication likely prevents harmful consequences, but this is worth being aware of.
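
A possible alternative is to retry the append on transient errors and start a new run only on errors that look permanent. The sketch below is not the PR's code: `isTransient`, `appendOrFallback`, the `status` field on errors, and the choice of which statuses count as transient are all assumptions made for illustration.

```typescript
// Assumption: errors carry an HTTP-like `status`; a missing status is treated
// as a network failure or timeout. 429 and 5xx count as transient.
function isTransient(err: unknown): boolean {
  const status = (err as { status?: number } | null | undefined)?.status;
  return status === undefined || status === 429 || status >= 500;
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Retries `append` with exponential backoff on transient errors. A permanent
// error (for example the run has ended) starts a new run instead. If every
// attempt fails transiently, the last error is rethrown and no run is started.
async function appendOrFallback(
  append: () => Promise<void>,
  startNewRun: () => Promise<void>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<"appended" | "new-run"> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await append();
      return "appended";
    } catch (err) {
      if (!isTransient(err)) {
        await startNewRun();
        return "new-run";
      }
      if (attempt === maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  throw new Error("maxAttempts must be at least 1");
}
```

Rethrowing after exhausted retries is a deliberate choice in this sketch: it surfaces the failure to the caller rather than creating a second run while the first may still be alive.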

@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 6a1e9af to 04e1747 Compare May 14, 2026 12:13
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 343c204 to 480c5ea Compare May 14, 2026 12:13
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 04e1747 to 2a7d030 Compare May 14, 2026 12:30
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch 2 times, most recently from 8954526 to 289ab08 Compare May 14, 2026 12:32

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 13 additional findings in Devin Review.

!completion.ok &&
(isOOMRunError(completion.error) || isManualOutOfMemoryError(completion.error))
) {
this.discardProcessOnReturn = true;

🟡 discardProcessOnReturn never reset between retry attempts, causing healthy processes to be force-killed

The discardProcessOnReturn flag is set to true when an OOM is detected (lines 554 and 689) but is never reset back to false. When the completion result triggers a RETRY_IMMEDIATELY at packages/cli-v3/src/entryPoints/dev-run-controller.ts:784, it calls startAndExecuteRunAttempt → executeRun, which obtains a new process from the pool (line 611). After the retry completes, even if it succeeds cleanly, the process is returned with forceKill: this.discardProcessOnReturn (line 696), which is still true from the previous OOM attempt. This means a perfectly healthy process is force-killed instead of being returned to the pool for reuse. The fix is to reset this.discardProcessOnReturn = false at the top of executeRun (near line 608 where isCompletingRun is reset).

Prompt for agents
The `discardProcessOnReturn` flag is set to `true` on OOM detection but never reset between retry attempts. When `handleCompletionResult` triggers RETRY_IMMEDIATELY, the subsequent call to `executeRun` gets a fresh process from the pool, but the stale flag causes that healthy process to be force-killed on return.

Fix: Reset `this.discardProcessOnReturn = false` at the top of `executeRun()` in `dev-run-controller.ts`, near line 608 where `this.isCompletingRun = false` is already reset. This ensures each attempt starts with a clean slate — only the attempt that actually OOMs will force-kill its process.
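
The effect of the stale flag can be reproduced with a heavily simplified stand-in for the controller. Everything here is hypothetical (`RunAttempt`, `ToyRunController`, the `forceKilled` log); only the flag name `discardProcessOnReturn` and the method name `executeRun` are borrowed from the finding, and the real pool and completion handling are left out.

```typescript
// Hypothetical, heavily simplified stand-in for the dev run controller.
type RunAttempt = { oom: boolean };

class ToyRunController {
  private discardProcessOnReturn = false;
  private readonly resetFlagPerAttempt: boolean;
  // Records, per attempt, whether the process was force-killed on return.
  readonly forceKilled: boolean[] = [];

  constructor(resetFlagPerAttempt: boolean) {
    this.resetFlagPerAttempt = resetFlagPerAttempt;
  }

  executeRun(attempt: RunAttempt): void {
    // The proposed fix: start every attempt with a clean flag.
    if (this.resetFlagPerAttempt) this.discardProcessOnReturn = false;

    if (attempt.oom) this.discardProcessOnReturn = true; // OOM detected
    this.returnProcess(this.discardProcessOnReturn);
  }

  private returnProcess(forceKill: boolean): void {
    this.forceKilled.push(forceKill);
  }
}

// The first attempt OOMs, the immediate retry succeeds.
const attempts: RunAttempt[] = [{ oom: true }, { oom: false }];

const staleFlag = new ToyRunController(false);
attempts.forEach((attempt) => staleFlag.executeRun(attempt));

const resetFlag = new ToyRunController(true);
attempts.forEach((attempt) => resetFlag.executeRun(attempt));

console.log(staleFlag.forceKilled); // prints: [ true, true ]
console.log(resetFlag.forceKilled); // prints: [ true, false ]
```

Without the reset, the healthy retry process is force-killed as well; with it, only the attempt that actually ran out of memory discards its process.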

ericallam added a commit that referenced this pull request May 14, 2026
…3542)

## Summary

A `/sessions` dashboard for inspecting durable Sessions, an `AGENT` /
`SCHEDULED` task-kind filter for the runs list, and the server-side
hardening (rate-limit exemption for packets, retry-with-backoff on
stream appends, typed too-large-chunk error) that the `chat.agent`
runtime in #3543 needs. Builds on the Sessions primitive shipped in
#3417.

## Design

The Sessions list + detail routes mirror the run inspector pattern.
`TaskTriggerSource` gains `AGENT` and `SCHEDULED` values, persisted on
`BackgroundWorker.taskKind` and `TaskRun.taskKind` (plus a matching
Clickhouse column), so the runs list can filter by kind.

New `@trigger.dev/core` modules — `sessionStreams`, `inputStreams`, a
`sessionStreamInstance` for realtime streams, and the
`realtime-streams-api` / `session-streams-api` surfaces — expose the
typed shapes that chat.agent will use to drive `session.out`.
`ChatChunkTooLargeError` lets the runtime drop oversized chunks with a
typed surface instead of failing the run. `s2Append` retries transient
failures with exponential backoff. `/api/v[12]/packets/*` is exempt from
customer rate limits so chat snapshot reads and writes don't get
throttled under load.

## Stack

Part of a 4-PR stack. Merge bottom-up.

1. **This PR** (#3542) → `main`
2. #3543 → #3542 — `chat.agent` runtime + browser transport
3. #3545 → #3543 — agent-view dashboard
4. #3546 → #3545 — ai-chat reference + MCP tooling

Replaces #3173 (closed).

<!-- GitButler Footer Boundary Top -->
---
This is **part 5 of 5 in a stack** made with GitButler:
- <kbd>&nbsp;5&nbsp;</kbd> #3612
- <kbd>&nbsp;4&nbsp;</kbd> #3546
- <kbd>&nbsp;3&nbsp;</kbd> #3545
- <kbd>&nbsp;2&nbsp;</kbd> #3543
- <kbd>&nbsp;1&nbsp;</kbd> #3542 👈 
<!-- GitButler Footer Boundary Bottom -->
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 2a7d030 to 5fa5d3d Compare May 14, 2026 12:44
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 289ab08 to e373556 Compare May 14, 2026 12:45
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from 5fa5d3d to f8d5198 Compare May 14, 2026 13:10
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from e373556 to 55d9543 Compare May 14, 2026 13:11
Top of the chat.agent stack: a full Next.js reference project that
exercises chat.agent end-to-end, plus the CLI MCP tools that drive
agent runs from Claude Code / Cursor / etc.

references/ai-chat:
- Full Next.js app with prisma persistence, multi-chat sidebar,
  per-chat model picker, debug panel, tool examples, smoke tests
- Reference tools: getCurrentTime, searchHackerNews, createGithubIssue,
  PR review helpers, code sandbox
- chat-client-test orchestrator for concurrent-send stress
- references/hello-world chatAgent + triggerAndSubscribe examples

CLI MCP tooling for chat.agent:
- mcp/tools/agentChat.ts (start_agent_chat, send_agent_message,
  close_agent_chat)
- mcp/tools/agents.ts + tasks.ts (list agents, agent run details)
- dev-run-controller OOM kill + taskRunProcessPool tweaks
- dev/managed entry-point hooks for skills bundling
- buildWorker + bundleSkills (agent skills support)

Includes ai-tool-helpers + mcp-agent-chat-sessions changesets, plus
the streamdown@2 patch and pnpm-lock reconciliation.

(Will be renamed to feature/ai-chat-reference-and-cli before push.)
@ericallam ericallam force-pushed the feature/agent-view-dashboard branch from f8d5198 to a08c3a4 Compare May 14, 2026 14:11
@ericallam ericallam force-pushed the feature/ai-chat-reference-and-cli branch from 55d9543 to 2cd4c8f Compare May 14, 2026 14:11

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 13 additional findings in Devin Review.

Comment on lines +107 to 108
prompts: workerManifest.prompts,
queues: workerManifest.queues,

🚩 managed-index-controller.ts forwards prompts but not skills to worker registration

This PR adds skills: resourceCatalog.listSkillManifests() to both dev-index-worker.ts:187 and managed-index-worker.ts:183, so the worker manifest now includes skills. Meanwhile managed-index-controller.ts:107 adds prompts: workerManifest.prompts to CreateBackgroundWorkerRequestBody but does NOT add skills: workerManifest.skills. If the server-side schema (CreateBackgroundWorkerRequestBody) accepts a skills field, this is an incomplete transformation — skills discovered during deployment indexing won't be registered with the webapp. If the schema doesn't have a skills field yet (planned for a later PR), this is fine. Worth confirming which is the case.
