feat: Sessions dashboard, task_kind, and chat-ready hardening (1/4) by ericallam · Pull Request #3542 · triggerdotdev/trigger.dev

ericallam · 2026-05-10T21:36:08Z

Summary

A /sessions dashboard for inspecting durable Sessions, an AGENT / SCHEDULED task-kind filter for the runs list, and the server-side hardening (rate-limit exemption for packets, retry-with-backoff on stream appends, typed too-large-chunk error) that the chat.agent runtime in #3543 needs. Builds on the Sessions primitive shipped in #3417.

Design

The Sessions list + detail routes mirror the run inspector pattern. TaskTriggerSource gains AGENT and SCHEDULED values, persisted on BackgroundWorker.taskKind and TaskRun.taskKind (plus a matching ClickHouse column), so the runs list can filter by kind.
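As a rough illustration of what filtering by kind means for the runs list, here is a minimal in-memory sketch. The `RunRow` shape and `filterRunsByKind` are hypothetical names for this example only, and treating `STANDARD` as the default kind is an assumption; the real filter runs as a database/ClickHouse query, not in application memory.

```typescript
// AGENT and SCHEDULED come from the PR description.
// STANDARD as the default kind is an assumption made for this sketch.
type TaskKind = "STANDARD" | "AGENT" | "SCHEDULED";

// Hypothetical, trimmed-down run row; the real TaskRun model has many more fields.
interface RunRow {
  friendlyId: string;
  taskKind?: TaskKind; // may be absent on rows written before the column existed
}

// Keep only runs whose kind is in the selected set.
// An empty selection is treated as "no filter applied".
function filterRunsByKind(runs: RunRow[], kinds: TaskKind[]): RunRow[] {
  if (kinds.length === 0) return runs;
  const wanted = new Set<TaskKind>(kinds);
  return runs.filter((run) => wanted.has(run.taskKind ?? "STANDARD"));
}

const sampleRuns: RunRow[] = [
  { friendlyId: "run_a", taskKind: "STANDARD" },
  { friendlyId: "run_b", taskKind: "AGENT" },
  { friendlyId: "run_c", taskKind: "SCHEDULED" },
  { friendlyId: "run_d" }, // no kind recorded, falls back to STANDARD
];

const agentRuns = filterRunsByKind(sampleRuns, ["AGENT"]);
console.log(agentRuns.map((run) => run.friendlyId)); // prints [ 'run_b' ]
```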

New @trigger.dev/core modules — sessionStreams, inputStreams, a sessionStreamInstance for realtime streams, and the realtime-streams-api / session-streams-api surfaces — expose the typed shapes that chat.agent will use to drive session.out. ChatChunkTooLargeError lets the runtime drop oversized chunks with a typed surface instead of failing the run. s2Append retries transient failures with exponential backoff. /api/v[12]/packets/* is exempt from customer rate limits so chat snapshot reads and writes don't get throttled under load.
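The "drop oversized chunks with a typed surface" idea can be sketched as follows. This is an illustration under assumptions, not the code in @trigger.dev/core: the fields on `ChatChunkTooLargeError`, the `encodeChunk` and `encodeAll` helpers, and the 32-byte limit are all hypothetical and chosen only to keep the example small.

```typescript
// Hypothetical shape; the real ChatChunkTooLargeError may carry different fields.
class ChatChunkTooLargeError extends Error {
  readonly sizeBytes: number;
  readonly maxBytes: number;

  constructor(sizeBytes: number, maxBytes: number) {
    super(`chunk is ${sizeBytes} bytes; limit is ${maxBytes} bytes`);
    this.name = "ChatChunkTooLargeError";
    this.sizeBytes = sizeBytes;
    this.maxBytes = maxBytes;
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, ChatChunkTooLargeError.prototype);
  }
}

type EncodeResult =
  | { ok: true; body: string }
  | { ok: false; error: ChatChunkTooLargeError };

// Pure helper: serialize a chunk and return a typed error instead of throwing
// when the UTF-8 encoded body exceeds the limit.
function encodeChunk(chunk: Record<string, unknown>, maxBytes: number): EncodeResult {
  const body = JSON.stringify(chunk);
  const sizeBytes = new TextEncoder().encode(body).length;
  if (sizeBytes > maxBytes) {
    return { ok: false, error: new ChatChunkTooLargeError(sizeBytes, maxBytes) };
  }
  return { ok: true, body };
}

// The caller keeps going after an oversized chunk rather than failing the whole run.
function encodeAll(chunks: Record<string, unknown>[], maxBytes: number) {
  const bodies: string[] = [];
  const dropped: ChatChunkTooLargeError[] = [];
  for (const chunk of chunks) {
    const result = encodeChunk(chunk, maxBytes);
    if (result.ok) bodies.push(result.body);
    else dropped.push(result.error);
  }
  return { bodies, dropped };
}

const outcome = encodeAll([{ text: "hi" }, { text: "x".repeat(100) }], 32);
console.log(outcome.bodies.length, outcome.dropped.length); // prints 1 1
```

Returning a result union rather than throwing keeps the size check a pure function, which makes it easy to test without a stream backend.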

Test plan

Open /sessions, run a session via the SDK, verify it appears in the list with record counts
Open a session detail view, confirm .in / .out records render
Close a session from the close action, verify the status flips
Filter runs by AGENT kind, verify only agent-shaped runs appear
Run prisma migrate dev against a fresh database — all three new migrations apply

Stack

Part of a 4-PR stack. Merge bottom-up.

This PR (#3542, 1/4) → main
#3543 (2/4) → #3542: chat.agent runtime + browser transport
#3545 (3/4) → #3543: agent-view dashboard for chat.agent runs
#3546 (4/4) → #3545: ai-chat reference project + MCP agent-chat tooling

Replaces #3173 (closed).

This is part 5 of 5 in a stack made with GitButler:

changeset-bot · 2026-05-10T21:36:13Z

🦋 Changeset detected

Latest commit: be1a6cf

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 31 packages

Name	Type
@trigger.dev/core	Minor
@trigger.dev/sdk	Minor
@trigger.dev/build	Minor
trigger.dev	Minor
@trigger.dev/plugins	Minor
@trigger.dev/python	Minor
@trigger.dev/redis-worker	Minor
@trigger.dev/schema-to-json	Minor
@internal/cache	Patch
@internal/clickhouse	Patch
@internal/llm-model-catalog	Patch
@trigger.dev/rbac	Minor
@internal/redis	Patch
@internal/replication	Patch
@internal/run-engine	Patch
@internal/schedule-engine	Patch
@internal/testcontainers	Patch
@internal/tracing	Patch
@internal/tsql	Patch
@internal/zod-worker	Patch
@internal/sdk-compat-tests	Patch
d3-chat	Patch
references-d3-openai-agents	Patch
references-nextjs-realtime	Patch
references-realtime-hooks-test	Patch
references-realtime-streams	Patch
references-telemetry	Patch
@trigger.dev/react-hooks	Minor
@trigger.dev/rsc	Minor
@trigger.dev/database	Minor
@trigger.dev/otlp-importer	Minor

coderabbitai · 2026-05-10T21:36:40Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Introduces sessions listing/detail UI and server presenters, realtime session-stream SSE/JSON endpoints, and a session stream manager in the core SDK. Adds run “source” filtering with taskKind propagation end-to-end, including icons and filters in the runs UI, repositories, and presenters. Extends API client with sessions lifecycle and stream operations. Updates ClickHouse and Prisma schemas for task kind and playground conversations. Adds input/session streams management APIs, improved SSE retry logic, and tests.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

…mpling (#3567)

## Summary

Follow-up to #3561. The drift-audit workflow timed out on PR #3542 (92 files, +5962 lines) by hitting `--max-turns 15` before reaching a verdict, leaving a red ❌ on that PR with no sticky comment.

## Changes

- `--max-turns` bumped from 15 to 30.
- Prompt now opens with an explicit "Strategy" section: read REVIEW.md once, scan the file-list only, open at most 5 files (3-5 on PRs >50 files), and bias toward finishing over exploring.
- Final rule: *"when in doubt between one more file read and finish now, finish now."*

The audit is allowed to miss things. It is not allowed to time out and leave a red X.

## Test plan

- [ ] Verify this PR's audit posts `✅ REVIEW.md looks current for this PR.` (small diff)
- [ ] After merge, retry the audit on #3542 or a similarly large PR and confirm it completes

Four fixes from the #3542 review pass.

webapp/runEngine/queues.server.ts: the non-locked-worker path of getTaskQueueInfo skipped the task lookup when the caller provided both a queue override and a per-trigger TTL, leaving `taskKind` undefined. AGENT/SCHEDULED runs hitting this path got stamped as STANDARD in ClickHouse and disappeared from the dashboard's Source filter. Mirrors the locked-worker fix above (always fetch triggerSource).

webapp/presenters/v3/SessionListPresenter.server.ts: the current-run lookup wasn't scoped to projectId + runtimeEnvironmentId. Session .currentRunId has no FK, so a stale or corrupted pointer could surface another tenant's run. The list query is env-scoped; this adds the same fence to the run lookup.

webapp/services/realtime/sessionRunManager.server.ts: after a lost claim race, the post-reload probe of fresh.currentRunId went through getRunStatusAndFriendlyId, which reads from $replica. The replica can lag behind the writer the winner just wrote to, miss the live run, and force another trigger+recurse up to ENSURE_RUN_FOR_SESSION_MAX_ATTEMPTS. Probe the writer for the same read-after-write reason the fresh reload already used.

trigger-sdk/v3/shared.ts: triggerAndSubscribe leaked the abort listener on normal completion. `{ once: true }` only auto-removes after firing; long-lived signals shared across many calls accumulated dead listeners pinning apiClient + response.id until GC. Wrap the subscribe loop in try/finally and removeEventListener on every exit path. Also switched the synchronous-pre-aborted throw to a DOMException with name AbortError so callers can detect the abort with the standard err.name === 'AbortError' check.
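The abort-listener fix described for triggerAndSubscribe follows a general pattern that can be sketched in isolation. The helper below, `runWithAbortCleanup`, is a hypothetical name, and it is synchronous to keep the example short; the real code wraps an async subscribe loop, where the same try/finally shape applies around the `await`.

```typescript
// Hypothetical helper illustrating the listener-cleanup pattern; not the SDK's code.
function runWithAbortCleanup<T>(
  signal: AbortSignal,
  onAbort: () => void,
  body: () => T
): T {
  // Already aborted: fail fast with the standard AbortError shape.
  if (signal.aborted) {
    throw new DOMException("The operation was aborted", "AbortError");
  }

  // { once: true } removes the listener only if the event actually fires...
  signal.addEventListener("abort", onAbort, { once: true });
  try {
    return body();
  } finally {
    // ...so remove it explicitly on every exit path, including normal completion.
    signal.removeEventListener("abort", onAbort);
  }
}

// A long-lived signal shared across calls no longer accumulates dead listeners.
const sharedController = new AbortController();
let abortCalls = 0;
const value = runWithAbortCleanup(sharedController.signal, () => { abortCalls++; }, () => 42);

sharedController.abort(); // fires after the work finished; the listener is already gone
console.log(value, abortCalls); // prints 42 0
```

Callers can then detect the pre-aborted case with the usual `err.name === "AbortError"` check.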

…ign-note divergence

Three more #3542 review fixes addressing the design-question bucket.

sessionStreams/manager.ts + inputStreams/manager.ts: both #runTail loops swallowed errors and the .finally reconnected immediately whenever hasHandlers || hasWaiters. A persistent backend failure (auth rejection, 5xx, DNS) would reconnect in a tight loop with no rate limiting. Both managers now exponentially back off: 1s base, doubling per attempt, capped at 30s, plus 0–1s jitter. A reconnectAttempts counter resets to 0 on every successful #dispatch (any record flowing through = healthy connection), so transient blips don't accumulate. Per-waiter timeouts still bound how long any once() waits regardless.

realtimeStreams/streamsWriterV2.ts + .test.ts: extracted the size-check + discriminant-extraction logic into encodeChunkOrError, a pure helper. Tests now exercise it directly, with no `vi.mock("@s2-dev/streamstore")` shim. The original vi.mock conflicted with the codebase rule of using testcontainers / not mocking; the new tests are framework-pure and faster.

trigger-sdk/v3/shared.ts: added an in-code comment in triggerAndSubscribe explaining the error-shape divergence from triggerAndWait. The SerializedError surfaced by subscribeToRun strips the TaskRunError type discriminator at the server boundary (createJsonErrorObject in errors.ts:274), so the SDK can't reconstruct the discriminator on the receive side. Callers needing exact error-type matching should use triggerAndWait.
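The reconnect schedule described above (1s base, doubling per attempt, 30s cap, 0–1s jitter) works out to roughly the following. This is a sketch: `reconnectDelayMs` is a hypothetical function name, and adding the jitter after the cap (so the maximum delay is just under 31s) is an assumption, since the commit message does not say which way it is applied.

```typescript
const BASE_DELAY_MS = 1_000;  // 1s base
const MAX_DELAY_MS = 30_000;  // 30s cap
const MAX_JITTER_MS = 1_000;  // 0-1s jitter

// attempt is 0 for the first reconnect after a healthy connection.
// The random source is injectable so the schedule can be checked deterministically.
function reconnectDelayMs(attempt: number, random: () => number = Math.random): number {
  const exponential = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  return exponential + random() * MAX_JITTER_MS;
}

// With jitter forced to zero, the base schedule is visible directly.
const schedule = [0, 1, 2, 3, 4, 5, 6].map((attempt) => reconnectDelayMs(attempt, () => 0));
console.log(schedule); // prints [ 1000, 2000, 4000, 8000, 16000, 30000, 30000 ]
```

Resetting the attempt counter to 0 whenever a record is dispatched, as the commit describes, means a healthy connection always restarts from the 1s delay.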

Four follow-up nits from the second-pass review on #3542.

.server-changes/sessions-dashboard-and-task-source-filter.md: adds the missing high-level entry for the webapp surface (Sessions page + task-source filter on Runs). The two existing changesets only cover @trigger.dev/sdk and @trigger.dev/core, so the dashboard work wouldn't have shown up in a future server changelog.

apps/webapp/app/routes/realtime.v1.sessions.$session.$io.records.ts: switched `const loader = ...; export { loader }` to `export const loader = ...` to match the sibling `api.v1.deployments.current.ts` and the rest of the route file convention. Functionally identical.

packages/core/src/v3/sessionStreams/manager.ts + packages/core/src/v3/inputStreams/manager.ts: two clarifications.

(1) Added a JSDoc to `disconnect()` documenting that it intentionally leaves handlers and waiters in place, so any registered listener will trigger an auto-reconnect with backoff. Distinguishes from `reset()` (full clean state, rejects waiters) and `disconnectStream` (single key, stays down until fresh `on()`/`once()`).

(2) `disconnectStream` now clears `reconnectAttempts` for the key: an explicit teardown is not evidence of a broken backend, and a future re-attach should start the backoff at attempt 0.

Adds Sessions, a durable, run-aware stream primitive that scopes session.in / session.out records to a session (not a single run). Records survive run boundaries; reconnect-from-last-event-id is built in.

Server foundation:
- New /realtime/v1/sessions/:session/:io/append + /records routes
- sessionRunManager + sessionsRepository + clickhouseSessionsRepository
- mintRunToken for short-lived per-session tokens
- s2Append retry-with-backoff + undici cause diagnostics
- /api/v[12]/packets/* exempt from customer rate limits
- BackgroundWorker schema gains taskKind enum (TASK, AGENT, SCHEDULED)
- TaskRun.taskKind column + clickhouse 029_add_task_kind_to_task_runs_v2

Core types:
- new sessionStreams, inputStreams, realtimeStreams packages in @trigger.dev/core
- session-streams-api / realtime-streams-api surface

Sessions dashboard UI (the primitive's own viewer):
- /sessions index + detail routes
- SessionsTable, SessionFilters, SessionStatus, CloseSessionDialog
- AGENT/SCHEDULED filter in RunFilters + TaskTriggerSource

Includes the sessions-primitive changeset.
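The `/api/v[12]/packets/*` exemption is written as a path pattern; one way such a check might look is a prefix regex applied to the request pathname before the rate limiter runs. This is only a sketch: `isRateLimitExempt` is a hypothetical name, and reading the trailing `*` as "anything under the prefix" is an assumption about how the webapp's matcher interprets it.

```typescript
// Matches /api/v1/packets/... and /api/v2/packets/... and nothing else.
// No "g" flag, so .test() stays stateless across calls.
const PACKETS_PATH = /^\/api\/v[12]\/packets\//;

// Hypothetical predicate a rate-limit middleware could consult to skip limiting.
function isRateLimitExempt(pathname: string): boolean {
  return PACKETS_PATH.test(pathname);
}

console.log(isRateLimitExempt("/api/v1/packets/snapshot-123")); // prints true
console.log(isRateLimitExempt("/api/v2/packets/a/b"));          // prints true
console.log(isRateLimitExempt("/api/v3/packets/snapshot-123")); // prints false
console.log(isRateLimitExempt("/api/v1/runs"));                 // prints false
```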

devin-ai-integration

Devin Review found 1 new potential issue.

View 14 additional findings in Devin Review.

ericallam mentioned this pull request May 10, 2026

feat(sdk): AI SDK custom useChat transport & chat.task harness #3173

Closed

ericallam force-pushed the feature/sessions-primitive branch from b84d537 to ed7bf97 Compare May 11, 2026 19:01

ericallam force-pushed the feature/sessions-primitive branch from ed7bf97 to 365e73b Compare May 12, 2026 08:23

ericallam changed the title ~~feat: Sessions primitive — durable run-aware streams + dashboard (1/5)~~ feat: Sessions dashboard, task_kind, and chat-ready hardening (1/5) May 12, 2026

ericallam force-pushed the feature/sessions-primitive branch from 365e73b to b4a0986 Compare May 12, 2026 08:35

ericallam force-pushed the feature/sessions-primitive branch 2 times, most recently from 1712b59 to 3721c34 Compare May 12, 2026 08:46

ericallam force-pushed the feature/sessions-primitive branch from 3721c34 to f240799 Compare May 12, 2026 08:52

ericallam changed the title ~~feat: Sessions dashboard, task_kind, and chat-ready hardening (1/5)~~ feat: Sessions dashboard, task_kind, and chat-ready hardening (1/4) May 12, 2026

ericallam force-pushed the feature/sessions-primitive branch from 9a09da4 to 84c717c Compare May 12, 2026 19:09

ericallam mentioned this pull request May 12, 2026

chore: raise REVIEW.md drift-audit turn budget and steer selective sampling #3567

Merged

2 tasks

ericallam force-pushed the feature/sessions-primitive branch from 2218110 to 5359eda Compare May 13, 2026 06:19

ericallam mentioned this pull request May 13, 2026

feat(webapp,core,cli): filter runs by region in dashboard, API, and MCP #3612

Open

6 tasks

ericallam force-pushed the feature/sessions-primitive branch from 37ea386 to 32b0e42 Compare May 14, 2026 09:32

0ski approved these changes May 14, 2026

ericallam force-pushed the feature/sessions-primitive branch from 219e550 to be1a6cf Compare May 14, 2026 12:13

devin-ai-integration Bot reviewed May 14, 2026

Comment thread packages/core/src/v3/apiClient/runStream.ts

ericallam merged commit 979655c into main May 14, 2026
48 of 49 checks passed

ericallam deleted the feature/sessions-primitive branch May 14, 2026 12:41

github-actions Bot mentioned this pull request May 14, 2026

chore: release v4.5.0 #3563

Open
