improvement(style-api): fix auth order, add size guard, extract pptx slide meta + skill fixes#4524
improvement(style-api): fix auth order, add size guard, extract pptx slide meta + skill fixes#4524waleedlatif1 wants to merge 16 commits into
Conversation
…ers, MIME guards, and 256 MB isolate limit
…nt origin placement
- DOCX addImage: upfront width/height validation (matches PDF/PPTX pattern) - PDF embedImage: remove dead Buffer ternary; drop redundant size guard already enforced in getFileBase64 - isolated-vm-worker: add friendly MemoryLimitError branch in both execute paths so OOM produces a clear message instead of a raw V8 error
…ror is reachable
Isolate OOM auto-disposes the isolate before throwing, meaning isDisposed
is true on the way into the catch block. The previous ordering caused OOM
to surface as AbortError ('Execution cancelled') instead of MemoryLimitError.
Move the message-based OOM check above the isDisposed check in both
executeCode and executeTask so the friendly message is actually shown.
PptxViewJS renders slides as canvas->JPEG which has known layout fidelity issues with complex overlapping shapes. The downloaded binary is always correct — add a note so users know to download for exact rendering.
…ts/BodyText, PDF support
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryMedium Risk Overview Hardens the style API/VFS path with earlier auth checks, clearer 422 errors, and a 100MB pre-download size limit to reduce OOM risk. Improves sandbox execution stability by increasing Reviewed by Cursor Bugbot for commit 3310321. Configure here. |
Greptile SummaryThis PR hardens the style extraction API (auth order, 100 MB size guard, PDF support) and upgrades the three sandbox document generators with geometry constants, validated image helpers, and better error messages. The isolated-VM memory limit is also doubled to 256 MB with an explicit OOM error path.
Confidence Score: 4/5Safe to merge with the VFS size guard added — all other changes are straightforward improvements The auth ordering fix, size guard, and PDF extraction in route.ts are all correct. The larger refactor in document-style.ts is well-structured. The main gap is that the 100 MB download guard was not applied to the WorkspaceVFS.read() path — the copilot-facing route — so a large file accessed there would still be fetched in full without any size check. apps/sim/lib/copilot/vfs/workspace-vfs.ts — needs the same size check before fetchWorkspaceFileBuffer that was added to route.ts Important Files Changed
|
| if (rawExt !== 'docx' && rawExt !== 'pptx' && rawExt !== 'pdf') return null | ||
| const ext: 'docx' | 'pptx' | 'pdf' = rawExt | ||
| const buffer = await fetchWorkspaceFileBuffer(record) | ||
| const summary = await extractDocumentStyle(buffer, ext) |
There was a problem hiding this comment.
Missing size guard before buffer download
The API route correctly checks fileRecord.size > MAX_STYLE_FILE_BYTES before downloading, but the VFS path calls fetchWorkspaceFileBuffer(record) unconditionally. This means a copilot-triggered style read on a large PDF (or DOCX/PPTX) still fetches the entire file into memory, defeating the purpose of the guard added in this PR.
| if (rawExt !== 'docx' && rawExt !== 'pptx' && rawExt !== 'pdf') return null | |
| const ext: 'docx' | 'pptx' | 'pdf' = rawExt | |
| const buffer = await fetchWorkspaceFileBuffer(record) | |
| const summary = await extractDocumentStyle(buffer, ext) | |
| const rawExt = record.name.split('.').pop()?.toLowerCase() | |
| if (rawExt !== 'docx' && rawExt !== 'pptx' && rawExt !== 'pdf') return null | |
| const ext: 'docx' | 'pptx' | 'pdf' = rawExt | |
| const MAX_STYLE_FILE_BYTES = 100 * 1024 * 1024 // 100 MB | |
| if (record.size > MAX_STYLE_FILE_BYTES) return null | |
| const buffer = await fetchWorkspaceFileBuffer(record) |
| ): { font?: string; themeFont?: string } { | ||
| const asciiLit = /\bw:ascii="([^"]+)"/.exec(fontAttrsXml) | ||
| // LibreOffice DOCX files may put a semicolon-separated fallback list in w:ascii — take the first | ||
| if (asciiLit) return { font: asciiLit[1].split(';')[0].trim() || asciiLit[1] } |
There was a problem hiding this comment.
Semicolon-only font name falls back to
";"
When w:ascii is ";" (malformed DOCX), split(';')[0].trim() returns "" which is falsy, so the expression falls back to asciiLit[1] which is ";". The resulting font field would propagate into the response as a font name. Using || undefined instead would silently drop the malformed value.
| if (asciiLit) return { font: asciiLit[1].split(';')[0].trim() || asciiLit[1] } | |
| if (asciiLit) return { font: asciiLit[1].split(';')[0].trim() || undefined } |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 3310321. Configure here.
| ...parent, | ||
| ...s, | ||
| fontSize: s.fontSize ?? parent.fontSize, | ||
| bold: s.bold ?? parent.bold, |
There was a problem hiding this comment.
Style inheritance loses explicit bold-off override
Medium Severity
The style parsing only ever sets bold: true or omits the property entirely. When a child style explicitly disables bold via <w:b w:val="0"/>, no bold property is stored. During resolveInheritance, s.bold ?? parent.bold then falls through to the parent's bold: true, making the child appear bold even though it explicitly opted out. The StyleRaw type allows boolean, but the parsing never sets bold: false, so the nullish coalescing cannot distinguish "not specified" from "explicitly off."
Reviewed by Cursor Bugbot for commit 3310321. Configure here.
| const slideCount = (sldIdLst.match(/<p:sldId\b/g) ?? []).length | ||
|
|
||
| // Slide size in EMU — 1 inch = 914400 EMU | ||
| const sldSzMatch = /<p:sldSz\b[^>]*\bcx="(\d+)"[^>]*\bcy="(\d+)"/.exec(xml) |
There was a problem hiding this comment.
Slide size regex assumes fixed attribute ordering
Low Severity
The regex /<p:sldSz\b[^>]*\bcx="(\d+)"[^>]*\bcy="(\d+)"/ requires cx to appear before cy in the XML attributes. XML attribute order is not guaranteed by the spec, so if a PPTX generator writes cy before cx, the regex won't match and aspectRatio silently defaults to 'custom' instead of the correct '16:9' or '4:3'.
Reviewed by Cursor Bugbot for commit 3310321. Configure here.
… font suffix stripping
|
Superseded by #4526, which includes all of these changes plus sandbox hardening, OOM error handling, docx __docxDocOptions support, PPTX finalize guard, bold detection fixes, and font suffix stripping — rebased cleanly onto latest staging. |


Summary
parseRequestper CLAUDE.md mandatepdfto format enum, makethemeoptional, adddefaults/pageSize/fonts/slideCount/aspectRatio/backgroundfieldsdrawCirclesize pitfall in PDF SKILL.md —sizeis the radius, not diameter (size: 40= 40pt radius / 80pt wide)ExternalHyperlinkpattern to DOCX SKILL.mdslide.addNotes()speaker notes pattern to PPTX SKILL.mdType of Change
Testing
Tested manually — verified style extraction on real PPTX, DOCX, and PDF files; confirmed auth order, size guard, and new PPTX fields in response
Checklist