feat: add automatic conversation summarization for agent mode #60
Conversation
Walkthrough

Adds end-to-end conversation summarization: schema and Convex ops for summaries, backend summary lifecycle and invalidation, a summarization utility and chat-handler integration (events + persistence), UI parts and components for summarization, token-limit refactors, and assorted UI/tool adjustments.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant Handler as Chat Handler
    participant Summarizer as MessageSummarizer
    participant Model as Language Model
    participant Convex as Convex Backend
    participant DB as DB Actions
    participant UI as UI Stream
    Client->>Handler: prepareStep (non-temp)
    activate Handler
    Handler->>Summarizer: checkAndSummarizeIfNeeded(messages, subscription, model)
    activate Summarizer
    Summarizer->>Summarizer: compute tokens, decide
    alt needsSummarization
        Summarizer->>Model: generate summary
        Model-->>Summarizer: summaryText
        Summarizer-->>Handler: summarizedMessages, cutoffId, summaryText
        Handler->>UI: emit data-summarization (started)
        Handler->>Convex: saveLatestSummary(chatId, summaryText, cutoffId)
        Convex->>DB: persist chat_summaries & update chats.latest_summary_id
        Handler->>UI: emit data-summarization (completed)
        Handler->>Handler: use summarizedMessages for model input
    else noSummarization
        Summarizer-->>Handler: needsSummarization = false
    end
    Handler->>Convex: save assistant message(s) (+ summarization parts)
    Convex-->>DB: persist messages
    Handler-->>UI: stream assistant response
    deactivate Handler
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ❌ Failed checks (1 warning), ✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (5)
app/components/tools/SummarizationHandler.tsx (2)
5-9: Consider adding a more specific type for the `part` prop.

The `part` prop is currently typed as `any`, which reduces type safety and could lead to runtime errors if the expected structure isn't present. Consider defining a specific type for summarization parts:

```diff
+interface SummarizationPart {
+  data: {
+    status: "started" | "completed";
+    message: string;
+  };
+  type: "data-summarization";
+}
+
 interface SummarizationHandlerProps {
   message: UIMessage;
-  part: any;
+  part: SummarizationPart;
   partIndex: number;
 }
```
22-28: Add optional chaining for safer property access.

Accessing `part.data.status` and `part.data.message` without validation could cause runtime errors if the structure is unexpected. Apply this diff to add defensive checks:

```diff
-      {part.data.status === "started" ? (
-        <ShimmerText className="text-sm">{part.data.message}...</ShimmerText>
+      {part.data?.status === "started" ? (
+        <ShimmerText className="text-sm">{part.data?.message ?? "Summarizing"}...</ShimmerText>
       ) : (
         <span className="text-sm text-muted-foreground">
-          {part.data.message}
+          {part.data?.message ?? "Summary completed"}
         </span>
       )}
```

convex/messages.ts (1)
19-60: Add proper type for the `ctx` parameter.

The `ctx` parameter is typed as `any`, reducing type safety. Since this is used within mutation handlers, it should be typed as `MutationCtx` or the appropriate Convex context type.

```diff
+import type { MutationCtx } from "./_generated/server";
+
-const checkAndInvalidateSummary = async (
-  ctx: any,
+const checkAndInvalidateSummary = async (
+  ctx: MutationCtx,
   chatId: string,
   deletedMessageIds: (string | undefined)[],
 ) => {
```

lib/db/actions.ts (1)
279-312: Consider mode verification before summary injection.

The comment indicates "agent mode only", but there's no runtime check to prevent summary injection when the current request is in "ask" mode. If a chat was previously used in agent mode (with a summary) and then accessed in ask mode, the summary would still be injected. Consider adding a mode parameter and guard condition if this behavior is unintended.
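No inline fix was suggested here; a minimal sketch of the guard the comment asks for might look like the following. The `ChatMode` type, the function name, and the parameter shape are all hypothetical, not the project's actual API:

```typescript
// Hypothetical guard: only inject the stored summary when the current
// request runs in agent mode, since summaries are produced in agent mode.
type ChatMode = "agent" | "ask";

const shouldInjectSummary = (
  mode: ChatMode,
  hasStoredSummary: boolean,
): boolean => mode === "agent" && hasStoredSummary;

console.log(shouldInjectSummary("agent", true)); // true
console.log(shouldInjectSummary("ask", true)); // false
```

The caller would then skip prepending the synthetic summary message whenever the guard returns false.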
lib/utils/message-summarization.ts (1)
47-50: Remove unnecessary type cast.

Line 48 casts `fileIds as any`, but `extractAllFileIdsFromMessages` returns `Array<Id<"files">>` and `getFileTokensByIds` expects `Id<"files">[]`, which are equivalent types. The cast is unnecessary and should be removed for type safety. Apply this diff:

```diff
-  const tokens = await getFileTokensByIds(fileIds as any);
+  const tokens = await getFileTokensByIds(fileIds);
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- `convex/_generated/api.d.ts` is excluded by `!**/_generated/**`
- `convex/_generated/api.js` is excluded by `!**/_generated/**`
- `convex/_generated/server.d.ts` is excluded by `!**/_generated/**`
- `convex/_generated/server.js` is excluded by `!**/_generated/**`
📒 Files selected for processing (13)
- `app/components/ChatInput.tsx` (2 hunks)
- `app/components/MessagePartHandler.tsx` (2 hunks)
- `app/components/tools/SummarizationHandler.tsx` (1 hunks)
- `convex/chats.ts` (5 hunks)
- `convex/messages.ts` (2 hunks)
- `convex/schema.ts` (2 hunks)
- `lib/ai/providers.ts` (1 hunks)
- `lib/api/chat-handler.ts` (6 hunks)
- `lib/db/actions.ts` (4 hunks)
- `lib/token-utils.ts` (2 hunks)
- `lib/utils/file-token-utils.ts` (1 hunks)
- `lib/utils/message-summarization.ts` (1 hunks)
- `lib/utils/terminal-executor.ts` (0 hunks)
💤 Files with no reviewable changes (1)
- lib/utils/terminal-executor.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/convex_rules.mdc)
**/*.{ts,tsx}: Use Id from ./_generated/dataModel for document IDs (e.g., Id<'users'>) instead of string
Ensure Record key/value types align with validators (e.g., v.record(v.id('users'), v.string()) -> Record<Id<'users'>, string>)
Use as const on string literals in discriminated unions
When using Array type, explicitly declare arrays with const array: Array = [...]
When using Record type, explicitly declare with const record: Record<KeyType, ValueType> = {...}
Files:
app/components/tools/SummarizationHandler.tsx, convex/schema.ts, lib/utils/message-summarization.ts, lib/utils/file-token-utils.ts, convex/messages.ts, lib/token-utils.ts, lib/db/actions.ts, lib/ai/providers.ts, convex/chats.ts, app/components/ChatInput.tsx, app/components/MessagePartHandler.tsx, lib/api/chat-handler.ts
convex/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/convex_rules.mdc)
convex/**/*.{ts,tsx,js,jsx}: ALWAYS use the new Convex function syntax (query/mutation/action object with args/returns/handler) when defining Convex functions
When a Convex function returns null, include returns: v.null() and return null
Use internalQuery/internalMutation/internalAction for private functions; do not expose sensitive logic with public query/mutation/action
Do NOT register functions via the api or internal objects
ALWAYS provide args and returns validators for query, mutation, action and their internal variants; use v.null() when no value is returned
From Convex functions, use ctx.runQuery/ctx.runMutation/ctx.runAction to call other functions (not direct function calls)
Pass a FunctionReference (from api/internal) to ctx.runQuery/ctx.runMutation/ctx.runAction; do NOT pass the callee function directly
Only call an action from another action when crossing runtimes (e.g., V8 to Node); otherwise refactor shared code into helpers
Minimize calls from actions to queries/mutations to reduce race conditions; prefer fewer transactional boundaries
Organize public query/mutation/action functions under convex/ to align with file-based routing
For pagination, validate args with paginationOptsValidator and call .paginate(args.paginationOpts) on queries
Use v.int64() instead of deprecated v.bigint() for signed 64-bit integers
Use v.record() for record-like structures; v.map() and v.set() are not supported
Do NOT use query().filter(...); define an index in schema and use withIndex instead
Convex queries do not support .delete(); collect or iterate results and call ctx.db.delete(_id) for each
Use .unique() to fetch exactly one document and fail on multiples
When async-iterating query results, prefer for await (const row of query) over .collect() or .take(n)
Use ctx.db.replace to fully replace an existing document; ctx.db.patch to shallow-merge updates
Files defining actions that use Node built-ins must start with "use node"
Never access ctx.db from within an action
D...
Files:
convex/schema.ts, convex/messages.ts, convex/chats.ts
convex/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/convex_rules.mdc)
When run-calling a function in the same file, add an explicit return type annotation on the await result to avoid TS circularity issues
Files:
convex/schema.ts, convex/messages.ts, convex/chats.ts
convex/schema.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/convex_rules.mdc)
convex/schema.{ts,tsx,js,jsx}: Define the database schema in convex/schema.ts
Import schema helpers (defineSchema, defineTable) from convex/server in the schema file
Include all index fields in the index name (e.g., by_field1_and_field2 for ["field1","field2"])
Files:
convex/schema.ts
🧬 Code graph analysis (7)
app/components/tools/SummarizationHandler.tsx (1)
app/components/ShimmerText.tsx (1)
ShimmerText(12-39)
lib/utils/message-summarization.ts (3)
types/chat.ts (1)
`SubscriptionTier` (7-7)
lib/utils/file-token-utils.ts (2)
`extractAllFileIdsFromMessages` (75-88), `getFileTokensByIds` (43-68)
lib/token-utils.ts (2)
`countMessagesTokens` (92-101), `getMaxTokensForSubscription` (13-20)
lib/db/actions.ts (1)
lib/errors.ts (1)
ChatSDKError(36-74)
convex/chats.ts (1)
convex/tempStreams.ts (1)
validateServiceKey(4-8)
app/components/ChatInput.tsx (1)
lib/token-utils.ts (1)
getMaxTokensForSubscription(13-20)
app/components/MessagePartHandler.tsx (1)
app/components/tools/SummarizationHandler.tsx (1)
SummarizationHandler(11-31)
lib/api/chat-handler.ts (3)
lib/utils/message-summarization.ts (1)
`checkAndSummarizeIfNeeded` (27-118)
lib/db/actions.ts (2)
`saveChatSummary` (606-634), `saveMessage` (56-91)
convex/messages.ts (1)
`saveMessage` (93-201)
🔇 Additional comments (21)
lib/utils/file-token-utils.ts (1)
90-96: LGTM: Documentation updates align with centralized token handling.The JSDoc updates accurately reflect the function's responsibilities and align with the new centralized token limit management approach introduced in this PR.
lib/ai/providers.ts (1)
16-16: LGTM: New summarization model provider added.The new provider entry follows the established pattern and integrates seamlessly with the existing tracing infrastructure.
app/components/MessagePartHandler.tsx (1)
10-10: LGTM: Summarization handler integrated correctly.

The new case follows the established pattern for handling custom part types, and the props passed match the component's interface.
Also applies to: 61-68
lib/token-utils.ts (2)
6-6: LGTM: Constant renamed to reflect team tier support.

The rename from `MAX_TOKENS_PRO` to `MAX_TOKENS_PRO_AND_TEAM` accurately reflects that both subscription tiers receive the same token limit.

Also applies to: 17-19
23-24: Token limit reduction is intentional but lacks test coverage verification.

The 50% reduction from 4096 to 2048 tokens is confirmed as intentional, implemented in commit `bf8a4aa` as part of the automatic conversation summarization feature (Oct 21, 2025). This aligns with the summarization feature goals.

However, verification reveals no test files validate these token limits for existing use cases. Without test results confirming behavior with the new limits, the impact on tool output truncation (file reads, terminal output, Python execution) cannot be definitively assessed.
To address the review concerns:
- Whether users will experience unexpected truncation with the new limits requires testing the affected code paths (Python tool execution, terminal output streaming, file read operations)
- Recommend adding test coverage for token truncation edge cases before merging
convex/messages.ts (2)
28-31: LGTM: Proper type guard for filtering undefined values.

The type guard correctly narrows the type from `(string | undefined)[]` to `string[]`.
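The narrowing technique this comment approves can be sketched in isolation; the sample IDs below are illustrative, not values from the codebase:

```typescript
// A user-defined type guard in filter() narrows (string | undefined)[]
// to string[], so the result can be assigned without a cast.
const deletedMessageIds: (string | undefined)[] = ["m1", undefined, "m2"];

const definedIds: string[] = deletedMessageIds.filter(
  (id): id is string => id !== undefined,
);

console.log(definedIds); // ["m1", "m2"]
```

Without the `id is string` predicate, `filter` alone would keep the element type as `string | undefined` and the assignment would fail to type-check.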
984-989: LGTM: Summary invalidation integrated into regeneration flow.

The invalidation check is correctly placed after message deletion and passes the appropriate message IDs.
app/components/ChatInput.tsx (1)
37-37: LGTM: Token limit calculation centralized.

The change correctly uses the new `getMaxTokensForSubscription` helper, improving maintainability by consolidating token limit logic in a single location.

Also applies to: 183-183
convex/schema.ts (1)
31-31: LGTM: Schema changes support the summarization feature.

Verification confirms the assessment: `summary_up_to_message_id` is never used in query filters. Summaries are fetched by `_id` and the field is checked in memory (e.g., convex/messages.ts:46). The index on `chat_id` sufficiently enables efficient summary lookups for a given chat. No additional index on `summary_up_to_message_id` is necessary.

lib/api/chat-handler.ts (4)
7-7: LGTM! Imports and initialization are correct.

The new imports and `summarizationParts` initialization properly support the summarization feature.

Also applies to: 34-34, 46-46, 171-172
245-251: LGTM! The `finalMessages` pattern enables clean summarization.

The mutable reference allows `prepareStep` to substitute summarized messages without side effects on `processedMessages`.
456-473: LGTM! Summarization parts are correctly persisted.

The logic properly prepends summarization events to assistant messages for persistence, ensuring the summarization context is retained in the database.
254-314: No issues found. The "summarization-model" identifier is properly defined.

The identifier is correctly configured in lib/ai/providers.ts:16 as `"summarization-model": xai("grok-4-fast-non-reasoning")` and is part of the valid provider configuration. The `xai` provider is properly imported from `@ai-sdk/xai`. The usage in chat-handler.ts:266 correctly references this definition.

Likely an incorrect or invalid review comment.
lib/db/actions.ts (3)
216-219: LGTM! Summary fetching is properly guarded.

The conditional check ensures `getLatestSummary` is only called when a summary exists.
606-634: LGTM! Error handling is comprehensive.

The function properly logs errors with context and throws a structured `ChatSDKError`.
636-647: LGTM! Graceful error handling for summary retrieval.

Returning `null` on error is appropriate for this getter function and maintains consistency with other similar functions in this file.

lib/utils/message-summarization.ts (2)
61-76: LGTM! Message splitting logic is correct.

The slice operations properly partition messages into those to summarize and those to keep unsummarized, with the cutoff ID correctly identified.
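The partitioning the comment describes can be sketched as follows; the `Msg` type, function name, and the keep-count constant are hypothetical stand-ins, not the project's actual implementation:

```typescript
// Hypothetical sketch: keep the last N messages unsummarized and mark
// everything before them for summarization, recording the cutoff ID.
interface Msg {
  id: string;
}

const KEEP_UNSUMMARIZED = 2; // assumed, mirroring MESSAGES_TO_KEEP_UNSUMMARIZED

const splitForSummarization = (messages: Msg[]) => {
  if (messages.length <= KEEP_UNSUMMARIZED) {
    // Too few messages: nothing to summarize.
    return { toSummarize: [] as Msg[], toKeep: messages, cutoffId: undefined };
  }
  const cutoffIndex = messages.length - KEEP_UNSUMMARIZED;
  const toSummarize = messages.slice(0, cutoffIndex);
  const toKeep = messages.slice(cutoffIndex);
  // The cutoff ID marks the last message covered by the summary.
  const cutoffId = toSummarize[toSummarize.length - 1].id;
  return { toSummarize, toKeep, cutoffId };
};

const result = splitForSummarization(
  ["m1", "m2", "m3", "m4", "m5"].map((id) => ({ id })),
);
console.log(result.cutoffId); // "m3"
```

The cutoff ID is what later lets the backend drop already-summarized messages when injecting the summary.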
78-117: LGTM! AI summary generation with proper error handling.

The fallback summary ensures the feature degrades gracefully if the AI generation fails, and the XML-wrapped user message correctly injects the summary context into the conversation.
convex/chats.ts (3)
46-46: LGTM! Return types properly extended.

The `latest_summary_id` field is correctly typed as optional with proper table reference.

Also applies to: 130-130
555-581: LGTM! Comprehensive cleanup with defensive programming.

The deletion logic handles both the referenced `latest_summary_id` and queries for any orphaned summaries, ensuring complete cleanup even if prior operations failed. The error handling allows deletion to proceed despite cleanup failures.
712-738: LGTM! Consistent cleanup logic.

The summary cleanup in `deleteAllChats` follows the same pattern as `deleteChat`, ensuring consistency.
Actionable comments posted: 0
♻️ Duplicate comments (2)
convex/chats.ts (2)
756-811: Use `internalMutation` instead of `mutation`.

This mutation is backend-only (validates service key) and should use `internalMutation` to properly restrict access. This issue was already raised in a previous review comment but hasn't been addressed yet.

As per coding guidelines: "Use internalQuery/internalMutation/internalAction for private functions; do not expose sensitive logic with public query/mutation/action."

Based on learnings.
817-859: Use `internalQuery` instead of `query`.

This query is backend-only (validates service key) and should use `internalQuery` to properly restrict access. This issue was already raised in a previous review comment but hasn't been addressed yet.

As per coding guidelines: "Use internalQuery/internalMutation/internalAction for private functions; do not expose sensitive logic with public query/mutation/action."

Based on learnings.
🧹 Nitpick comments (2)
convex/chats.ts (2)
555-581: Consider optimizing summary deletion to avoid potential double deletion.

The code first deletes `chat.latest_summary_id` (lines 556-566), then queries and deletes all summaries by `chat_id` (lines 569-581). If the latest summary is still in the database when queried, it will be deleted twice: the second attempt will fail gracefully but generates an unnecessary operation and error log.

Consider one of these approaches:

Option 1: Query all summaries once and delete them without special-casing `latest_summary_id`:

```diff
-  // Delete chat summaries
-  if (chat.latest_summary_id) {
-    try {
-      await ctx.db.delete(chat.latest_summary_id);
-    } catch (error) {
-      console.error(
-        `Failed to delete summary ${chat.latest_summary_id}:`,
-        error,
-      );
-      // Continue with deletion even if summary cleanup fails
-    }
-  }
-
   // Delete all historical summaries for this chat
   const summaries = await ctx.db
     .query("chat_summaries")
     .withIndex("by_chat_id", (q) => q.eq("chat_id", args.chatId))
     .collect();
   for (const summary of summaries) {
     try {
       await ctx.db.delete(summary._id);
     } catch (error) {
       console.error(`Failed to delete summary ${summary._id}:`, error);
       // Continue with deletion even if summary cleanup fails
     }
   }
```

Option 2: Skip the `latest_summary_id` when iterating:

```diff
   for (const summary of summaries) {
+    if (summary._id === chat.latest_summary_id) {
+      continue; // Already deleted above
+    }
     try {
       await ctx.db.delete(summary._id);
```
712-738: Consider optimizing summary deletion to avoid potential double deletion.

Same issue as in `deleteChat` (lines 555-581): the latest summary may be deleted twice, once via `latest_summary_id` and again when iterating all summaries by `chat_id`. While error handling prevents failure, it's inefficient.

Apply the same optimization as suggested for `deleteChat` above: either query and delete all summaries in one pass without special-casing `latest_summary_id`, or skip `latest_summary_id` when iterating the results.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `convex/chats.ts` (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T13:15:14.787Z
Learnt from: CR
PR: hackerai-tech/hackerai#0
File: .cursor/rules/convex_rules.mdc:0-0
Timestamp: 2025-10-03T13:15:14.787Z
Learning: Applies to convex/**/*.{ts,tsx,js,jsx} : Use internalQuery/internalMutation/internalAction for private functions; do not expose sensitive logic with public query/mutation/action
Applied to files:
convex/chats.ts
🧬 Code graph analysis (1)
convex/chats.ts (1)
convex/tempStreams.ts (1)
validateServiceKey(4-8)
🔇 Additional comments (1)
convex/chats.ts (1)
46-46: LGTM! Return types correctly extended.

The `latest_summary_id` field additions properly type the new optional reference to chat summaries using `v.optional(v.id("chat_summaries"))`, which aligns with the schema changes.

Also applies to: 130-130
- Remove file count display from PythonToolHandler UI
- Remove file paths from Python tool backend response
- Files are still saved to Convex but not reported in UI/output
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `app/components/MessageActions.tsx` (3 hunks)
🔇 Additional comments (4)
app/components/MessageActions.tsx (4)
60-60: LGTM! Good addition of regeneration guard state.

The state variable is well-named and properly initialized to prevent concurrent regenerations.
91-95: LGTM! Well-implemented regeneration guard.

The early return and state update effectively prevent concurrent regenerations.
220-220: LGTM! Correctly wires up the new handler.

The onClick handler properly invokes `handleRegenerate` to ensure the regeneration guard is enforced.
221-221: LGTM! Appropriate disabled state update.

Including `isRegenerating` in the disabled condition provides proper visual feedback and prevents concurrent regeneration attempts.
Actionable comments posted: 3
🧹 Nitpick comments (9)
lib/db/actions.ts (3)
299-308: Prefer "system" role for synthetic summary message

Using role "user" can skew downstream logic (e.g., user/assistant turn heuristics, tools). A "system" message better reflects metadata/context.

```diff
-  const summaryMessage: UIMessage = {
-    id: uuidv4(),
-    role: "user",
+  const summaryMessage: UIMessage = {
+    id: uuidv4(),
+    role: "system",
     parts: [
       {
         type: "text",
         text: `<conversation_summary>\n${latestSummary.summary_text}\n</conversation_summary>`,
       },
     ],
   };
```
288-297: Cutoff not found → duplicate context risk

If summaryUpToId isn't found, you prepend the summary and still include all messages, duplicating summarized content. Fall back to "no injection" when cutoffIndex < 0:

```diff
-  const messagesAfterCutoff =
-    cutoffIndex >= 0
-      ? truncatedFromLoop.slice(cutoffIndex + 1)
-      : truncatedFromLoop;
+  if (cutoffIndex < 0) {
+    // Can't safely align summary → skip injection
+    return { truncatedMessages: truncatedFromLoop, chat, isNewChat };
+  }
+  const messagesAfterCutoff = truncatedFromLoop.slice(cutoffIndex + 1);
```
606-639: saveChatSummary error handling OK; consider returning a boolean

You log and throw ChatSDKError, which is good. Returning void makes callers infer success by lack of throw; returning a boolean success can simplify branching and retries.

```diff
-export async function saveChatSummary({...}): Promise<void> {
+export async function saveChatSummary({...}): Promise<boolean> {
   try {
     await convex.mutation(...);
-    return;
+    return true;
   } catch (error) {
     ...
   }
 }
```

Callers can check truthiness to gate UI events.
lib/utils/message-summarization.ts (4)
141-151: Use "system" role for the injected summary

Same rationale as in actions.ts: treat the synthetic summary as context, not a user turn.

```diff
-  const summaryMessage: UIMessage = {
-    id: uuidv4(),
-    role: "user",
+  const summaryMessage: UIMessage = {
+    id: uuidv4(),
+    role: "system",
     parts: [
       {
         type: "text",
         text: `<conversation_summary>\n${summaryText}\n</conversation_summary>`,
       },
     ],
   };
```
27-57: Token counting heuristic may overcount

Adding countTokens(message.role) and JSON.stringify for non-text parts can materially overcount vs. provider tokenization. It's acceptable as a heuristic, but note it may trigger premature summarization.
- Consider aligning with your truncateMessagesToTokenLimit logic.
- Or use provider/tokenizer-aware estimation if available from the model adapter.
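The overcount the comment warns about can be illustrated with a common chars-divided-by-4 heuristic; the helper name and the sample part are hypothetical, not the project's actual code:

```typescript
// Illustration: applying a chars/4 estimate to JSON.stringify(part)
// counts keys, quotes, braces, and commas as if they were content,
// so structured parts inflate the estimate versus their payload alone.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const part = { type: "tool-result", output: "ok" };

const naive = estimateTokens(JSON.stringify(part)); // includes JSON scaffolding
const contentOnly = estimateTokens(part.output); // just the payload

console.log(naive > contentOnly); // true
```

Over many tool-result parts this gap compounds, which is why the heuristic can cross the summarization threshold earlier than a provider tokenizer would.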
13-13: Type-only import for SubscriptionTier

Prefer `import type` for TS-only entities in server modules.

```diff
-import { SubscriptionTier } from "@/types";
+import type { SubscriptionTier } from "@/types";
```

As per coding guidelines.
99-118: Expose MESSAGES_TO_KEEP_UNSUMMARIZED as a param (optional)

Hard-coding 2 works, but different models/contexts may need 3-4 turns. Consider making it a function param or config.
lib/api/chat-handler.ts (1)
275-314: Event sequencing and persistence coupling

Emitting "started" before persistence and "completed" after is correct. Consider emitting a "failed" event on error to keep UI consistent. Add a `writer.write({ type: "data-summarization", data: { status: "failed" }})` in the catch around saveChatSummary.
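The suggested fallback can be sketched with a stubbed writer standing in for the real UI stream; the event shape mirrors the data-summarization parts described in this PR, while the `"failed"` status and the stubbed `saveChatSummary` are the reviewer's suggestion and an assumption, respectively:

```typescript
// Stubbed event sink standing in for the chat handler's UI stream writer.
type SummarizationEvent = {
  type: "data-summarization";
  data: { status: "started" | "completed" | "failed"; message: string };
};

const events: SummarizationEvent[] = [];
const writer = { write: (e: SummarizationEvent) => events.push(e) };

// Stand-in for persistence that throws, to exercise the catch path.
const saveChatSummary = (): void => {
  throw new Error("persistence unavailable");
};

writer.write({
  type: "data-summarization",
  data: { status: "started", message: "Summarizing conversation" },
});
try {
  saveChatSummary();
  writer.write({
    type: "data-summarization",
    data: { status: "completed", message: "Summary saved" },
  });
} catch {
  // Without a terminal event, the UI would show "started" indefinitely.
  writer.write({
    type: "data-summarization",
    data: { status: "failed", message: "Summarization failed" },
  });
}

console.log(events.map((e) => e.data.status)); // ["started", "failed"]
```

The key property is that every "started" event is eventually paired with a terminal "completed" or "failed" event.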
app/components/MessageEditor.tsx (1)
22-23: Token‑based guard on edited messages — good; small UX nits
- Good: trim + token gating + plan‑aware toast.
- Nit: use import type for SubscriptionTier if you pass it here later; and consider i18n for toast text.
Also applies to: 38-55
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- `app/components/MessageEditor.tsx` (3 hunks)
- `lib/api/chat-handler.ts` (7 hunks)
- `lib/db/actions.ts` (7 hunks)
- `lib/token-utils.ts` (5 hunks)
- `lib/utils/file-token-utils.ts` (1 hunks)
- `lib/utils/message-summarization.ts` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
lib/utils/message-summarization.ts (2)
types/chat.ts (1)
`SubscriptionTier` (7-7)
lib/token-utils.ts (1)
getMaxTokensForSubscription(13-20)
lib/api/chat-handler.ts (3)
lib/utils/message-summarization.ts (1)
`checkAndSummarizeIfNeeded` (63-159)
lib/db/actions.ts (2)
`saveChatSummary` (611-639), `saveMessage` (56-91)
convex/messages.ts (1)
saveMessage(93-201)
lib/db/actions.ts (3)
types/chat.ts (1)
`SubscriptionTier` (7-7)
convex/_generated/api.js (2)
`api` (21-21), `api` (21-21)
lib/errors.ts (1)
ChatSDKError(36-74)
lib/utils/file-token-utils.ts (1)
types/chat.ts (1)
SubscriptionTier(7-7)
app/components/MessageEditor.tsx (2)
app/contexts/GlobalState.tsx (1)
`useGlobalState` (585-591)
lib/token-utils.ts (2)
`countInputTokens` (150-160), `getMaxTokensForSubscription` (13-20)
🔇 Additional comments (12)
lib/api/chat-handler.ts (2)
461-470: Prepending summarization parts for assistant messages: LGTM

Parts-only augmentation is localized to assistant messages and won't affect user messages; works with your Convex schema (parts: any[]).
103-111: Pass mode through to getMessagesByChatId: good

This enables skipFileTokens and, with the proposed fix in actions.ts, will gate summary injection by mode.
app/components/MessageEditor.tsx (1)
30-36: Auto-focus and select all: LGTM

Nice productivity touch; safe guard with optional chaining is present.
lib/db/actions.ts (2)
255-259: All call sites verified and correctly updated

Both `truncateMessagesWithFileTokens` invocations (lines 254 and 340 in lib/db/actions.ts) correctly pass `mode === "agent"` as the `skipFileTokens` parameter. No additional call sites exist, confirming consistent implementation across the codebase.
641-652: All validators correctly defined and return type matches backend shape

Verification confirms:

- The `chat_summaries` schema properly defines `summary_text` and `summary_up_to_message_id` as `v.string()`
- `getLatestSummaryForBackend` has explicit return type validation matching these fields
- The frontend function correctly returns the backend shape or `null` on error
- No type mismatches or validator issues
lib/utils/file-token-utils.ts (1)
90-115: All callers correctly pass skipFileTokens parameter

Both call sites in lib/db/actions.ts already properly pass `mode === "agent"` as the third argument to `truncateMessagesWithFileTokens`, aligning with the agent flow where files go to sandbox. No callers are missing this parameter.

lib/token-utils.ts (6)
5-20: LGTM! Clear naming and consistent subscription handling.

The renaming to `MAX_TOKENS_PRO_AND_TEAM` better reflects that both subscription tiers share the same limit, and the function logic correctly implements this equivalence.
23-24: Verify the impact of halving token limits.

Reducing `STREAM_MAX_TOKENS` and `TOOL_DEFAULT_MAX_TOKENS` from 4096 to 2048 is a significant change (a 50% reduction) that will affect streaming responses and tool output truncation. Ensure this change:
- Doesn't break existing functionality or degrade user experience
- Has been tested with typical streaming and tool usage scenarios
- Aligns with the conversation summarization feature (possibly freeing up tokens for summaries)
28-29: LGTM! Good separation of truncation messages.

The file-specific truncation message provides clearer user feedback when file content is truncated.
53-69: LGTM! Cleaner token counting logic.

The refactoring to use `countPartTokens` with `reduce` is more maintainable and handles all message part types consistently. The logic correctly filters out reasoning blocks and sums tokens across all relevant parts.
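The reduce-over-parts shape can be sketched as follows; the part shapes and the 4-chars-per-token estimate are simplifying assumptions, not the actual implementation:

```typescript
// Simplified message parts; the real code handles more part types.
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string };

// Crude token estimate, purely for this sketch.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

// Sum tokens across all parts, skipping reasoning blocks.
function countMessageTokens(parts: Part[]): number {
  return parts.reduce((sum, part) => {
    if (part.type === "reasoning") return sum;
    return sum + estimateTokens(part.text);
  }, 0);
}
```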
74-98: LGTM! Simplified and correct truncation logic.

The direct use of `maxTokens` in the comparison is cleaner than maintaining a separate variable. The truncation logic correctly processes messages from newest to oldest and maintains proper token accounting.
103-173: LGTM! Proper integration with refactored utilities.

The remaining functions correctly integrate with the refactored token counting logic, and `truncateOutput` appropriately uses the new file-specific truncation message.
Actionable comments posted: 0
♻️ Duplicate comments (1)
lib/token-utils.ts (1)
37-54: Improve type safety: `UIMessagePart<any, any>` still uses `any` generics.

While `UIMessagePart<any, any>` is better than plain `any`, the generic parameters still bypass TypeScript's type checking. The multiple type assertions on lines 42, 47, and 49 indicate this could be improved with proper typing.

Consider defining concrete types for the generic parameters or using a discriminated union. If the `ai` library exports specific part types, use those:

```diff
+// Define or import concrete types for message parts
+type TextPart = UIMessagePart<{ text: string }, {}>;
+type FilePart = UIMessagePart<{ fileId: Id<"files"> }, {}>;
+type OtherPart = UIMessagePart<any, any>;
+type SupportedMessagePart = TextPart | FilePart | OtherPart;
+
 const countPartTokens = (
-  part: UIMessagePart<any, any>,
+  part: SupportedMessagePart,
   fileTokens: Record<Id<"files">, number> = {},
 ): number => {
```

Alternatively, check if the `ai` library provides a more specific type you can import directly.

As per coding guidelines.
🧹 Nitpick comments (2)
lib/api/chat-handler.ts (1)
315-320: Consider adding context to error logs for better observability.

The generic "Error in prepareStep:" message doesn't indicate whether the error occurred during summarization or the system prompt refresh. Adding context could help with debugging.

Consider logging the error context:

```diff
 } catch (error) {
-  console.error("Error in prepareStep:", error);
+  console.error("Error in prepareStep:", {
+    hasSummarized,
+    error: error instanceof Error ? error.message : String(error),
+  });
   return currentSystemPrompt ? { system: currentSystemPrompt } : {};
 }
```

lib/utils/stream-writer-utils.ts (1)
5-7: Consider tighter typing for the internal StreamWriter interface.

The `write: (data: any) => void` signature is very permissive. While functional, you could define a union of expected message types for better type safety.

Example:

```ts
type StreamMessage =
  | { type: "data-upload-status"; data: { message: string; isUploading: boolean }; transient: true }
  | { type: "data-summarization"; id: string; data: { status: string; message: string } };

type StreamWriter = {
  write: (data: StreamMessage) => void;
};
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- lib/api/chat-handler.ts (8 hunks)
- lib/db/actions.ts (7 hunks)
- lib/token-utils.ts (5 hunks)
- lib/utils/file-token-utils.ts (4 hunks)
- lib/utils/stream-writer-utils.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/convex_rules.mdc)
**/*.{ts,tsx}: Use Id from ./_generated/dataModel for document IDs (e.g., Id<'users'>) instead of string
Ensure Record key/value types align with validators (e.g., v.record(v.id('users'), v.string()) -> Record<Id<'users'>, string>)
Use as const on string literals in discriminated unions
When using Array type, explicitly declare arrays with const array: Array = [...]
When using Record type, explicitly declare with const record: Record<KeyType, ValueType> = {...}
Files:
- lib/db/actions.ts
- lib/utils/stream-writer-utils.ts
- lib/api/chat-handler.ts
- lib/utils/file-token-utils.ts
- lib/token-utils.ts
🧬 Code graph analysis (4)
lib/db/actions.ts (3)
types/chat.ts (1)
SubscriptionTier (7-7)
convex/_generated/api.js (2)
api (21-21)
api (21-21)
lib/errors.ts (1)
ChatSDKError (36-74)
lib/api/chat-handler.ts (5)
lib/utils/stream-writer-utils.ts (5)
writeUploadStartStatus (10-19)
writeUploadCompleteStatus (21-30)
writeSummarizationStarted (33-42)
writeSummarizationCompleted (44-53)
createSummarizationCompletedPart (55-65)
lib/utils/sandbox-file-utils.ts (1)
uploadSandboxFiles (90-118)
lib/utils/message-summarization.ts (1)
checkAndSummarizeIfNeeded (63-159)
lib/db/actions.ts (2)
saveChatSummary (612-640)
saveMessage (56-91)
convex/messages.ts (1)
saveMessage (93-201)
lib/utils/file-token-utils.ts (2)
convex/_generated/dataModel.d.ts (1)
Id (48-49)
types/chat.ts (1)
SubscriptionTier (7-7)
lib/token-utils.ts (1)
convex/_generated/dataModel.d.ts (1)
Id (48-49)
🔇 Additional comments (20)
lib/utils/file-token-utils.ts (3)
43-68: LGTM! Type safety improvements align with coding guidelines.

The update to use `Record<Id<"files">, number>` instead of `Record<string, number>` correctly follows the coding guidelines and improves type safety across the file token handling utilities.
98-115: LGTM! Well-designed parameter for agent mode.

The addition of the `skipFileTokens` parameter with a sensible default (`false`) maintains backward compatibility while enabling the new summarization workflow. The conditional logic correctly handles both paths: extracting and fetching file tokens when needed, or skipping entirely for agent mode.
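A hedged sketch of the branch described above — the function name, signature, and token map here are illustrative stand-ins for the real `truncateMessagesWithFileTokens` plumbing:

```typescript
type FileTokenMap = Record<string, number>;

// In agent mode, files are uploaded to the sandbox instead of being inlined,
// so their tokens should not count against the conversation budget.
function resolveFileTokens(
  fileIds: string[],
  fetchTokens: (ids: string[]) => FileTokenMap,
  skipFileTokens = false, // default preserves pre-existing behavior
): FileTokenMap {
  if (skipFileTokens) return {};
  return fetchTokens(fileIds);
}
```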
120-136: LGTM! Type consistency maintained.

The update to the `precomputedFileTokens` parameter type correctly aligns with the return type of `getFileTokensByIds` and maintains consistency across all file token utilities.

lib/token-utils.ts (8)
1-4: LGTM! Type imports and constant updates align with guidelines.

The addition of the `UIMessagePart` and `Id` types, along with the renamed constant and updated subscription tier logic, is consistent with the PR objectives and coding guidelines.

Also applies to: 7-7, 18-19
24-25: LGTM! Token limit adjustments and new constant.

The reduced token limits (4096→2048) and the new `FILE_READ_TRUNCATION_MESSAGE` constant align with the PR's token management improvements.

Also applies to: 29-30
59-75: LGTM! Token counting logic correctly handles file tokens.

The function properly uses `Record<Id<"files">, number>` per coding guidelines and correctly counts tokens for each message part while filtering out reasoning blocks.
80-104: LGTM! Truncation logic correctly enforces token limits.

The function properly uses `Record<Id<"files">, number>` and correctly truncates messages by keeping the newest messages within the token limit.
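The keep-newest-within-budget pattern can be sketched like this; the shapes are hypothetical, and the real function also accounts for per-part and file tokens:

```typescript
type CountedMessage = { id: string; tokens: number };

// Walk from newest (end of array) to oldest, keeping messages until
// the token budget is spent.
function truncateToBudget(
  messages: CountedMessage[],
  maxTokens: number,
): CountedMessage[] {
  const kept: CountedMessage[] = [];
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (total + messages[i].tokens > maxTokens) break;
    total += messages[i].tokens;
    kept.unshift(messages[i]); // restore chronological order
  }
  return kept;
}
```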
109-118: LGTM! Token counting function is correct.

The function properly uses `Record<Id<"files">, number>` per coding guidelines and correctly sums tokens across all messages.
123-139: LGTM! Content truncation logic is correct.

The function correctly uses the updated `TOOL_DEFAULT_MAX_TOKENS` limit and properly handles edge cases where the suffix exceeds the token budget.
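The suffix-exceeds-budget edge case can be illustrated with a sketch; the 4-chars-per-token estimate and all names are assumptions for the example, not the repository's code:

```typescript
// Crude 4-chars-per-token estimate, purely for this sketch.
const approxTokens = (s: string): number => Math.ceil(s.length / 4);

function truncateWithSuffix(
  content: string,
  maxTokens: number,
  suffix: string,
): string {
  if (approxTokens(content) <= maxTokens) return content;
  const suffixTokens = approxTokens(suffix);
  // Edge case: the suffix alone blows the budget -- truncate the suffix
  // itself rather than emitting content plus an over-budget marker.
  if (suffixTokens >= maxTokens) return suffix.slice(0, maxTokens * 4);
  const budgetChars = (maxTokens - suffixTokens) * 4;
  return content.slice(0, budgetChars) + suffix;
}
```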
156-166: LGTM! Input token counting is correct.

The function correctly counts tokens from both text input and uploaded files.
171-179: LGTM! Legacy wrapper function is correct.

The function properly delegates to `truncateContent` with the appropriate suffix based on mode.

lib/db/actions.ts (5)
19-19: LGTM: UUID import for summary message IDs.

The uuid import is appropriately used at line 301 to generate unique IDs for injected summary messages.
195-203: LGTM: Mode parameter properly typed.

The mode parameter is correctly typed as an optional union of `"ask" | "agent"` and properly threaded through for mode-aware token handling.
218-221: LGTM: Efficient conditional summary fetch.

The optional chaining and conditional query appropriately avoid unnecessary database calls when no summary exists.
283-317: Past review concern addressed via design clarification.

The previous review flagged the missing agent-mode gate for summary injection. The updated comments (lines 283-284) now explicitly document that summaries apply "regardless of current mode" because "summaries are only created in agent mode but provide value in any mode." This appears to be an intentional design decision that has been clarified in the code.
612-653: LGTM: Well-structured summary persistence functions.

Both functions implement appropriate error handling:
- `saveChatSummary` throws `ChatSDKError` for critical persistence failures, with detailed logging
- `getLatestSummary` returns `null` for graceful degradation on query failures

The error-handling strategy aligns well with their respective roles.
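The two styles can be sketched side by side — synchronous stand-ins with hypothetical names; the real functions are async Convex calls:

```typescript
class ChatSDKError extends Error {}

// Critical write path: log and re-throw so the caller sees the failure.
function saveSummarySafe(write: () => void): void {
  try {
    write();
  } catch (err) {
    console.error("Failed to persist summary:", err);
    throw new ChatSDKError("summary persistence failed");
  }
}

// Read path: degrade gracefully -- chat continues without a summary.
function loadSummarySafe<T>(query: () => T): T | null {
  try {
    return query();
  } catch {
    return null;
  }
}
```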
lib/api/chat-handler.ts (3)
7-7: LGTM: Summarization imports properly integrated.

All imports are appropriately used throughout the summarization flow and helper functions.
Also applies to: 34-34, 46-53
246-285: Past review concern fully addressed with correct persistence ordering.

The previous review flagged setting `hasSummarized`/`finalMessages` before persistence. The updated code now:
- Persists the summary first (lines 266-271)
- Updates state only after a successful save (lines 273-275)
- Returns updated messages for the step

If `saveChatSummary` throws, the catch block at line 315 allows `hasSummarized` to remain `false`, enabling a retry on the next step. This is the correct sequence.
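The persist-before-flag ordering can be reduced to a small sketch; the names are hypothetical, and the real code persists via Convex inside `prepareStep`:

```typescript
// Flip the flag only after persistence succeeds, so a failed save
// leaves the state eligible for a retry on the next step.
function runSummarizationStep(
  persistSummary: () => void,
  state: { hasSummarized: boolean },
): void {
  try {
    persistSummary();
    state.hasSummarized = true;
  } catch (error) {
    // Swallow here: hasSummarized stays false, enabling a retry.
    console.error("Error in step:", error);
  }
}
```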
420-428: LGTM: Summarization parts correctly prepended to assistant messages.

The logic appropriately prepends accumulated `summarizationParts` to assistant messages before persistence, preserving the existing parts array.

lib/utils/stream-writer-utils.ts (1)
1-65: LGTM: Clean utility with proper discriminated union typing.

The utility functions are well-structured with consistent patterns. Line 59 correctly uses `as const` for the `type` property, in line with coding guidelines.
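Why `as const` matters here can be shown with a small sketch; the part shape is a simplified assumption, not the actual utility's types:

```typescript
type SummarizationPart = {
  type: "data-summarization";
  status: "started" | "completed";
};

// Without `as const`, the literals below would widen to `string` and the
// returned object would no longer satisfy SummarizationPart's literal types.
function createCompletedPart() {
  return {
    type: "data-summarization" as const,
    status: "completed" as const,
  };
}

// The result is assignable to the discriminated type:
const example: SummarizationPart = createCompletedPart();
```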