enhance:speedup xinference embedding & rerank #3587

Merged
merged 21 commits into from
Apr 18, 2024

Conversation

leslie2046
Contributor

Description

Enhancement: speed up Xinference embedding and rerank by moving the unnecessary verification call (/v1/cluster/auth) and the model-information fetch (/v1/models/) out of client creation.

restful_client.py
Whenever a Client is created, it always calls /v1/cluster/auth and /v1/models/{model_uid}, which takes a long time. I think these two APIs only need to be called when a user adds a model, so I moved them to validate_credentials().
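
A minimal sketch of the idea, not the actual code in restful_client.py: the constructor makes no network calls, and the two slow checks run only inside validate_credentials(). The class name `LazyClient`, the injected `fetch` callable, and the `/v1/embeddings` path used in `embed()` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical transport: takes a URL and an optional JSON payload,
# returns the parsed JSON response. A real client would wrap an HTTP call.
Fetch = Callable[[str, Optional[dict]], dict]


@dataclass
class LazyClient:
    """Illustrative client that does no network I/O when constructed."""

    base_url: str
    model_uid: str
    fetch: Fetch

    def validate_credentials(self) -> dict:
        """Run the two slow checks once, e.g. when a user adds the model."""
        auth = self.fetch(f"{self.base_url}/v1/cluster/auth", None)
        model = self.fetch(f"{self.base_url}/v1/models/{self.model_uid}", None)
        return {"auth": auth, "model": model}

    def embed(self, texts: list[str]) -> dict:
        """Per-request path: one call, no auth or model lookup beforehand.

        The endpoint path here is an assumption for the sketch.
        """
        payload = {"model": self.model_uid, "input": texts}
        return self.fetch(f"{self.base_url}/v1/embeddings", payload)
```

With this shape, each embedding or rerank request costs one round trip instead of three, and the checks still run when the model is first added.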

Fixes # (issue)

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update, included: Dify Document
  • Improvement, including but not limited to code refactoring, performance optimization, and UI/UX improvement
  • Dependency upgrade

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • TODO

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods
  • optional I have made corresponding changes to the documentation
  • optional I have added tests that prove my fix is effective or that my feature works
  • optional New and existing unit tests pass locally with my changes

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. 🐍 python 💪 enhancement New feature or request labels Apr 18, 2024
Collaborator

@Yeuoly Yeuoly left a comment

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 18, 2024
@Yeuoly Yeuoly merged commit 4365843 into langgenius:main Apr 18, 2024
6 checks passed
HSPK added a commit to puyuantech/dify-local that referenced this pull request Apr 22, 2024
* feat: increase read timeout of OpenAI Compatible API, Ollama, Nvidia LLM (langgenius#3538)

* feat: agent log (langgenius#3537)

Co-authored-by: jyong <718720800@qq.com>

* fix: typo of PublishConfig (langgenius#3540)

* fix: workflow delete edge (langgenius#3541)

* feat: filter empty content messages in llm node (langgenius#3547)

* fix: json-reader-json-output (langgenius#3552)

* fix: tool node show output text variable type error (langgenius#3556)

* feat: economical index support retrieval testing (langgenius#3563)

* optimize question classifier prompt and support keyword hit test (langgenius#3565)

* fix event/stream ping (langgenius#3553)

* enhance: preload general packages (langgenius#3567)

* added claude 3 opus (langgenius#3545)

* feat: code (langgenius#3557)

* feat: add workflow api in Node.js sdk (langgenius#3584)

* Fix: use debounce for switch (langgenius#3585)

* fix: json in raw text sometimes changed back to key value in HTTP node (langgenius#3586)

* test: add scripts for running tests on api module both locally and CI jobs (langgenius#3497)

* add-open-mixtral-8x22b (langgenius#3591)

* docs: Update README.md (langgenius#3577)

* enhance:speedup xinference embedding & rerank  (langgenius#3587)

* fix(openai_api_compatible): fixing the error when converting chunk to json (langgenius#3570)

* feat: stable diffusion 3 (langgenius#3599)

* Feat/enterprise sso (langgenius#3602)

* Add mixtral 8x22b (langgenius#3606)

* fix: copy invite link has duplicated origin (langgenius#3608)

* seucirty: http smuggling (langgenius#3609)

* chore: apply ruff rules on tests and app.py (langgenius#3605)

* feat: Vision switch functionality is provided on OpenRouter (langgenius#3564)

* get dict key indexing_technique in DocumentAddByFileApi (langgenius#3615)

Co-authored-by: songqijun <songqijun@qipeng.com>

* fix: chat rename (langgenius#3627)

* feat: moonshot fc (langgenius#3629)

* add-llama3-for-nvidia-api-catalog (langgenius#3631)

* content fix to continue (langgenius#3633)

Co-authored-by: xiaohan <fuck@qq.com>

* Fix error in [Update yaml and py file in Tavily Tool] (langgenius#3465)

Co-authored-by: Yeuoly <admin@srmxy.cn>

* feat: add file log (langgenius#3612)

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>

* fix: validate languages (langgenius#3638)

* Fix problem with scroll inside chat window (langgenius#3578)

* fix: in alembic's offline mode (db migrate with --sql option), skip data operations (langgenius#3533)

* fix: workflow_run_id not log_id in workflow api doc (langgenius#3658)

* Optimize README_CN (langgenius#3660)

* fix: delete tool parameters cache when sync draft workflow for run workflow use new parameter change in draft workflow  (langgenius#3637)

* python 3.12 support (langgenius#3652)

* version to 0.6.4 (langgenius#3670)

---------

Co-authored-by: takatost <takatost@users.noreply.github.com>
Co-authored-by: KVOJJJin <jzongcode@gmail.com>
Co-authored-by: jyong <718720800@qq.com>
Co-authored-by: Bowen Liang <liangbowen@gf.com.cn>
Co-authored-by: zxhlyh <jasonapring2015@outlook.com>
Co-authored-by: Yeuoly <45712896+Yeuoly@users.noreply.github.com>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: liuzhenghua <1090179900@qq.com>
Co-authored-by: Siddharth Jain <137015071+tellsiddh@users.noreply.github.com>
Co-authored-by: Joshua <138381132+joshua20231026@users.noreply.github.com>
Co-authored-by: Matheus Mondaini <matheus.mondaini@outlook.com>
Co-authored-by: 呆萌闷油瓶 <253605712@qq.com>
Co-authored-by: aniaan <hi@aniaan.dev>
Co-authored-by: Garfield Dai <dai.hai@foxmail.com>
Co-authored-by: jeessy2 <6205259+jeessy2@users.noreply.github.com>
Co-authored-by: sqj8899 <sqj8899@126.com>
Co-authored-by: songqijun <songqijun@qipeng.com>
Co-authored-by: fuckqqcom <9391575+fuckqqcom@users.noreply.github.com>
Co-authored-by: xiaohan <fuck@qq.com>
Co-authored-by: Richards Tu <142148415+richards199999@users.noreply.github.com>
Co-authored-by: Yeuoly <admin@srmxy.cn>
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
Co-authored-by: YidaHu <huyidada@gmail.com>
Co-authored-by: rmmedia <125268006+rmmedia-pl@users.noreply.github.com>
Co-authored-by: saga.rey <saga.rey@outlook.com>
Co-authored-by: xin.gao <34891602+gaoxin-pen@users.noreply.github.com>
dengpeng pushed a commit to dengpeng/dify that referenced this pull request Jun 16, 2024
HuberyHuV1 pushed a commit to HuberyHuV1/dify that referenced this pull request Jul 22, 2024
Labels
💪 enhancement New feature or request lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.
2 participants