⚡️ perf: improve performance on long context text #3754

Merged: 9 commits, Sep 3, 2024

arvinxx commented Sep 3, 2024

💻 Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 👷 build
  • ⚡️ perf
  • 📝 docs
  • 🔨 chore

🔀 Description of Change

📝 Additional Information

refs: #1029

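Tokenizer performance optimizations:

  1. Added debouncing: token counting is triggered 300 ms after the last change;
  2. Moved the tokenizer implementation into a Web Worker so that it no longer blocks the main thread;
  3. Below 50k tokens the tokenizer runs in the Web Worker; above 50k it goes through a server-side call, using /webapi/tokenizer to compute the token count. (See [RFC] 058 - 服务端接口架构梳理 (server-side API architecture review) #3755.)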
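
The three tokenizer changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `pickRoute`, `debounce`, and `createTokenCounter` are hypothetical names, the worker and server counters are injected as plain async functions, and using text length as a stand-in for the 50k token threshold (with exactly 50k staying on the client) is an assumption.

```typescript
// Sketch only: every name here is hypothetical, not taken from the lobe-chat source.
type Route = 'worker' | 'server';
type CountFn = (text: string) => Promise<number>;

const DEBOUNCE_MS = 300; // delay from the PR description
const CLIENT_LIMIT = 50_000; // threshold from the PR description

// Assumption: text length is a cheap proxy for the token count, because the
// real count is not known until the text has been tokenized.
const pickRoute = (text: string, limit: number = CLIENT_LIMIT): Route =>
  text.length <= limit ? 'worker' : 'server';

// Trailing-edge debounce: only the last call within `delayMs` actually runs.
const debounce = <A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number = DEBOUNCE_MS,
) => {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
};

// Wires the pieces together. In the app, `countInWorker` would post the text
// to a Web Worker and `countOnServer` would send it to /webapi/tokenizer;
// both are injected here so the sketch runs in any environment.
const createTokenCounter = (
  countInWorker: CountFn,
  countOnServer: CountFn,
  onCount: (tokens: number) => void,
) =>
  debounce((text: string) => {
    const count = pickRoute(text) === 'worker' ? countInWorker : countOnServer;
    void count(text).then(onCount);
  });
```

Keeping the two counters injectable means the routing and debounce logic can be unit tested without a real worker or server.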

Smoothing animation: the default output rate goes from 2 to 4, so output from gpt-4o-class models becomes noticeably faster.


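A few other issues found with long context (300k):

  1. The smoothing animation is currently implemented with setTimeout, which causes noticeable frame drops under long context. The plan is to try requestAnimationFrame and see whether that helps;
  2. React Markdown takes about 2 s to render a 300k-token text. During that time the page is frozen and completely non-interactive, so a solution needs to be worked out.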
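
For the first issue, one possible shape for a frame-driven smoothing queue is sketched below. It is an illustration under stated assumptions rather than the planned implementation: `createSmoother` and `Schedule` are hypothetical names, and reading the output rate (2, now 4) as characters emitted per frame is a guess.

```typescript
// Hypothetical sketch; not the lobe-chat implementation.
type Schedule = (cb: () => void) => void;

// Use requestAnimationFrame where it exists (browsers); otherwise fall back
// to a timer of roughly one 60 Hz frame so the sketch also runs in Node.
const defaultSchedule: Schedule = (cb) => {
  const g = globalThis as unknown as {
    requestAnimationFrame?: (cb: () => void) => number;
  };
  if (typeof g.requestAnimationFrame === 'function') g.requestAnimationFrame(cb);
  else setTimeout(cb, 16);
};

const createSmoother = (
  onText: (chunk: string) => void,
  speed: number = 4, // assumption: characters emitted per frame
  schedule: Schedule = defaultSchedule,
) => {
  let queue = '';
  let running = false;

  // Emit up to `speed` characters, then ask for the next frame.
  const tick = (): void => {
    if (queue.length === 0) {
      running = false; // nothing left; stop until more text is pushed
      return;
    }
    onText(queue.slice(0, speed));
    queue = queue.slice(speed);
    schedule(tick);
  };

  return {
    // Buffer newly streamed text and start the loop if it is idle.
    push(text: string): void {
      queue += text;
      if (!running) {
        running = true;
        schedule(tick);
      }
    },
  };
};
```

Because the scheduler is a parameter, the same queue can be driven by setTimeout and by requestAnimationFrame to compare frame drops on long outputs.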

vercel bot commented Sep 3, 2024

The latest updates on your projects.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| lobe-chat-database | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Sep 3, 2024 6:17pm |
| lobe-chat-preview | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Sep 3, 2024 6:17pm |

dosubot bot added the size:L label Sep 3, 2024
@lobehubbot
Member

👍 @arvinxx

Thank you for raising your pull request and contributing to our community.
Please make sure you have followed our contributing guidelines. We will review it as soon as possible.
If you encounter any problems, please feel free to contact us.

dosubot bot added the ⚡️ Performance label Sep 3, 2024
codecov bot commented Sep 3, 2024

Codecov Report

Attention: Patch coverage is 64.28571% with 20 lines in your changes missing coverage. Please review.

Project coverage is 91.75%. Comparing base (dce200c) to head (3a94ed3).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/utils/tokenizer/index.ts | 6.66% | 14 Missing ⚠️ |
| src/server/context.ts | 0.00% | 3 Missing ⚠️ |
| src/hooks/useTokenCount.ts | 89.47% | 2 Missing ⚠️ |
| src/database/server/models/session.ts | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3754      +/-   ##
==========================================
- Coverage   91.80%   91.75%   -0.06%     
==========================================
  Files         452      453       +1     
  Lines       30020    30056      +36     
  Branches     2077     2913     +836     
==========================================
+ Hits        27559    27577      +18     
- Misses       2461     2479      +18     

| Flag | Coverage Δ |
| --- | --- |
| app | 91.75% <64.28%> (-0.06%) ⬇️ |
| server | 97.36% <0.00%> (ø) |

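arvinxx changed the title from "⚡️ perf: 优化渲染长文本时的渲染性能" (optimize rendering performance when rendering long text) to "⚡️ perf: 优化长文本时的渲染性能" (optimize rendering performance for long text) Sep 3, 2024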
arvinxx merged commit 51c6b62 into main Sep 3, 2024
9 of 11 checks passed
arvinxx deleted the fix/tokenizer branch September 3, 2024 18:22
@lobehubbot
Member

❤️ Great PR @arvinxx ❤️

The growth of the project is inseparable from user feedback and contributions. Thanks for your contribution! If you are interested in the LobeHub developer community, please join our Discord and then DM @arvinxx or @canisminor1990. They will invite you to our private developer channel, where we discuss lobe-chat development and share AI news from around the world.

arvinxx changed the title from "⚡️ perf: 优化长文本时的渲染性能" (optimize rendering performance for long text) to "⚡️ perf: improve performance on long context text" Sep 3, 2024
@lobehubbot
Member

🎉 This PR is included in version 1.15.10 🎉

The release is available on:

Your semantic-release bot 📦🚀

Labels
⚡️ Performance, released, size:L
Development

Successfully merging this pull request may close these issues.

[Bug] GPT-4o streaming: the web UI display speed cannot keep up with the transmission speed
2 participants