fix(knowledge): infer MIME type from file extension in create/upsert tools#3651
waleedlatif1 merged 6 commits into staging
…tools Both create_document and upsert_document forced .txt extension and text/plain MIME type regardless of the document name. Now the tools infer the correct MIME type from the file extension (html, md, csv, json, yaml, xml) and only default to .txt when no extension is given. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…oads Replace duplicate EXTENSION_MIME_MAP and getMimeTypeFromExtension with the existing, more comprehensive version from lib/uploads/utils/file-utils. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Greptile Summary
This PR fixes a bug where both create_document and upsert_document forced a .txt extension and text/plain MIME type regardless of the document name.
Migration note: The upsert-by-filename path may silently create a duplicate document for any document that was previously stored under the old logic (e.g., …).
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A([documentName]) --> B[getFileExtension]
B --> C{ext is non-empty?}
C -- No --> G
C -- Yes --> D[getUploadMimeType ext]
D --> E{MIME in\nTEXT_COMPATIBLE_MIME_TYPES?}
E -- Yes --> F[Return original filename\n+ recognized MIME type]
E -- No --> G{ext non-empty AND\ndocumentName has a dot?}
G -- Yes --> H[base = name up to last dot\neg. report.v2 → report]
G -- No --> I[base = full documentName\neg. notes → notes]
H --> J{base is empty?\neg. .env}
I --> K[Return base + .txt\ntext/plain]
J -- Yes --> L[Return documentName + .txt\neg. .env → .env.txt]
J -- No --> K
Last reviewed commit: "fix(knowledge): hand..."
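The decision flow in the flowchart can be sketched in TypeScript roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the extension map and the allowlist contents are small stand-ins for what lives in lib/uploads/utils/file-utils, and the `{ filename, mimeType }` return shape is hypothetical.

```typescript
// Stand-in for the shared extension-to-MIME map (assumed contents, not the real map).
const EXTENSION_MIME_MAP: Record<string, string> = {
  html: 'text/html',
  md: 'text/markdown',
  csv: 'text/csv',
  json: 'application/json',
  jpg: 'image/jpeg',
}

// Assumed allowlist: only MIME types whose payload is plain text.
const TEXT_COMPATIBLE_MIME_TYPES = new Set<string>([
  'text/plain',
  'text/html',
  'text/markdown',
  'text/csv',
  'application/json',
])

// Stand-in helper: lower-cased text after the last dot, or '' when there is no dot.
function getFileExtension(name: string): string {
  const lastDot = name.lastIndexOf('.')
  return lastDot === -1 ? '' : name.slice(lastDot + 1).toLowerCase()
}

// Stand-in helper: unknown extensions map to a generic binary type.
function getUploadMimeType(ext: string): string {
  return EXTENSION_MIME_MAP[ext] ?? 'application/octet-stream'
}

// Hypothetical return shape for illustration.
interface DocumentFileInfo {
  filename: string
  mimeType: string
}

function inferDocumentFileInfo(documentName: string): DocumentFileInfo {
  const ext = getFileExtension(documentName)

  if (ext) {
    const mimeType = getUploadMimeType(ext)
    // Recognized text-compatible extension: keep the name as given.
    if (TEXT_COMPATIBLE_MIME_TYPES.has(mimeType)) {
      return { filename: documentName, mimeType }
    }
  }

  // Unrecognized or binary extension: strip it and normalize to .txt.
  const lastDot = documentName.lastIndexOf('.')
  const base = ext && lastDot !== -1 ? documentName.slice(0, lastDot) : documentName

  // Dotfiles such as .env have an empty base, so fall back to the full name.
  return { filename: `${base || documentName}.txt`, mimeType: 'text/plain' }
}
```

With these stand-ins, report.html keeps its name and resolves to text/html, while notes, report.v2, and .env become notes.txt, report.txt, and .env.txt, all with text/plain.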
…ate_document Same fixes as upsert_document: use loop-based String.fromCharCode instead of spread, consolidate duplicate TextEncoder calls, and check byte length instead of character length for 1MB limit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
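A minimal sketch of the three fixes named in this commit message, assuming the tools base64-encode the text content before upload. The function name `encodeTextContent`, the constant `MAX_CONTENT_BYTES`, the error message, and the use of `btoa` are illustrative assumptions; the commit message confirms only the loop-based `String.fromCharCode`, the single `TextEncoder` call, and the byte-length check.

```typescript
// Assumed limit: "1MB" is taken here to mean 1024 * 1024 bytes.
const MAX_CONTENT_BYTES = 1024 * 1024

// Hypothetical helper name, for illustration only.
function encodeTextContent(content: string): string {
  // Encode once and reuse the bytes for both the size check and the base64 step.
  const bytes = new TextEncoder().encode(content)

  // Compare the UTF-8 byte length, not content.length: multi-byte characters
  // make the encoded payload larger than the character count suggests.
  if (bytes.length > MAX_CONTENT_BYTES) {
    throw new Error('Document content exceeds the 1MB limit')
  }

  // Build the binary string in a loop. String.fromCharCode(...bytes) passes one
  // argument per byte and can exceed the engine's argument limit on large inputs.
  let binary = ''
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i])
  }
  return btoa(binary)
}
```

For example, `encodeTextContent('hello')` returns `'aGVsbG8='`, while a string of 600,000 `é` characters is rejected because it encodes to 1,200,000 bytes even though its character length is under the limit.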
…FileInfo Use an explicit allowlist instead of only checking for octet-stream, preventing binary MIME types (image/jpeg, audio/mpeg, etc.) from leaking through when a user names a document with a binary extension. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… extensions
- Remove application/pdf and application/rtf from TEXT_COMPATIBLE_MIME_TYPES since these tools pass plain text content, not binary
- Normalize unrecognized extensions (e.g. report.v2) to .txt instead of preserving the original extension with text/plain MIME type
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dotfiles like .env would produce an empty base, resulting in '.txt'. Now falls back to the original name so .env becomes .env.txt. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- create_document and upsert_document tools previously forced a .txt extension and text/plain MIME type regardless of the document name
- The tools now infer the MIME type from the file extension (html, md, csv, json, yaml, xml) and default to .txt only when no extension is given
- inferDocumentFileInfo helper in tools/knowledge/types.ts used by both tools

Test plan
- report.html: verify filename stays report.html and MIME is text/html
- data.csv: verify filename stays data.csv and MIME is text/csv
- notes (no extension): verify filename becomes notes.txt and MIME is text/plain
- config.json: verify correct MIME type propagation