-
Notifications
You must be signed in to change notification settings - Fork 822
crop text in middle instead of decimate it #809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
} else { | ||
maxLength -= 100 + clamp(userConfig.maxResponseTokenLength, 1, maxLength - 1000) | ||
// if we don't have the models exact content limit use the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe the entire else
block should be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR updates the text cropping functionality to remove excess text from the middle rather than discarding every N sentence, while also adjusting token limit calculations and separator characters for a more natural output. Key changes include:
- Modifying cropText to split text by an expanded set of delimiters and crop from the middle with an appended disclaimer.
- Updating the token limit logic and default configurations for extended model contexts.
- Adjusting the YouTube site adapter to use newline as a separator and refining the summary prompt.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
src/utils/crop-text.mjs | Revised crop text function: new cropping algorithm, delimiter update, and token limit adjustments. |
src/content-script/site-adapters/youtube/index.mjs | Updated subtitle extraction and summary prompt adjustments. |
src/config/index.mjs | Increased default maxResponseTokenLength to accommodate longer texts. |
croppedText = splits.slice(0, middleIndex).join('\n') | ||
if (croppedTokens > 0) { | ||
croppedText += `\n\n**Important disclaimer**, this text is incomplete! ${croppedTokens} or ${ | ||
(croppedTokens / totalTokens).toFixed(2) * 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider explicitly converting the result of toFixed to a number (e.g., using Number(...)) to avoid any unexpected type coercion issues.
(croppedTokens / totalTokens).toFixed(2) * 100 | |
Number((croppedTokens / totalTokens).toFixed(2)) * 100 |
Copilot uses AI. Check for mistakes.
let firstHalfTokens = 0 | ||
let secondHalfTokens = 0 | ||
const halfTargetTokens = Math.floor(cropTargetLength / 2) | ||
let middleIndex = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the input text is short and the first pass never exceeds the halfTargetTokens, middleIndex will remain -1, which may lead to unexpected slicing. Consider adding a condition to bypass cropping when the text length is within limits.
Copilot uses AI. Check for mistakes.
/review |
Qodo Merge was enabled for this repository. To continue using it, please link your Git account with your Qodo account here. PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
/improve |
Qodo Merge was enabled for this repository. To continue using it, please link your Git account with your Qodo account here. PR Code Suggestions ✨
|
A few problems:
So I've
#800