enhance:speedup xinference embedding & rerank #3587

leslie2046 · 2024-04-18T04:12:00Z

Description

Speed up xinference embedding and rerank by moving the unnecessary verification API call (/v1/cluster/auth) and the model information fetch (/v1/models/) into validate_credentials().

restful_client.py
When a Client is created, it always calls /v1/cluster/auth and /v1/models/{model_uid}, which takes a long time. I think these two APIs only need to be called when a user adds a model, so this PR moves them to validate_credentials().
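The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `LazyXinferenceClient`, the injected `transport` callable, the `fake_transport` stub, the `/v1/embeddings` path, and the assumed `{"auth": <bool>}` response shape are all hypothetical and used only to show the constructor staying free of network calls while validate_credentials() performs the two lookups.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical transport: takes (method, path) and returns a decoded JSON dict.
Transport = Callable[[str, str], Dict]


class LazyXinferenceClient:
    """Illustrative client whose constructor performs no network requests."""

    def __init__(self, base_url: str, transport: Transport) -> None:
        self.base_url = base_url
        self._transport = transport
        # Every request is recorded here, for demonstration only.
        self.calls: List[Tuple[str, str]] = []

    def _request(self, method: str, path: str) -> Dict:
        self.calls.append((method, path))
        return self._transport(method, path)

    def check_cluster_auth(self) -> bool:
        # Assumed response shape: {"auth": <bool>}.
        return bool(self._request("GET", "/v1/cluster/auth").get("auth", False))

    def describe_model(self, model_uid: str) -> Dict:
        return self._request("GET", f"/v1/models/{model_uid}")

    def create_embedding(self, model_uid: str, text: str) -> Dict:
        # Hot path: one request, with no auth check or model lookup first.
        return self._request("POST", "/v1/embeddings")


def validate_credentials(client: LazyXinferenceClient, model_uid: str) -> Dict:
    """Run the two slower lookups only when a model is being added."""
    client.check_cluster_auth()
    return client.describe_model(model_uid)


def fake_transport(method: str, path: str) -> Dict:
    """Stand-in for real HTTP so the sketch runs without a server."""
    if path == "/v1/cluster/auth":
        return {"auth": False}
    if path.startswith("/v1/models/"):
        return {"model_type": "embedding"}
    return {"data": []}
```

With this shape, constructing a client per embedding or rerank request costs nothing beyond the request itself, and the two extra round trips happen once, when the model is added.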

Fixes # (issue)

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update, included: Dify Document
Improvement, including but not limited to code refactoring, performance optimization, and UI/UX improvement
Dependency upgrade

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

TODO

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
My changes generate no new warnings
I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods
optional I have made corresponding changes to the documentation
optional I have added tests that prove my fix is effective or that my feature works
optional New and existing unit tests pass locally with my changes


Yeuoly

LGTM

* feat: increase read timeout of OpenAI Compatible API, Ollama, Nvidia LLM (langgenius#3538)
* feat: agent log (langgenius#3537) Co-authored-by: jyong <718720800@qq.com>
* fix: typo of PublishConfig (langgenius#3540)
* fix: workflow delete edge (langgenius#3541)
* feat: filter empty content messages in llm node (langgenius#3547)
* fix: json-reader-json-output (langgenius#3552)
* fix: tool node show output text variable type error (langgenius#3556)
* feat: economical index support retrieval testing (langgenius#3563)
* optimize question classifier prompt and support keyword hit test (langgenius#3565)
* fix event/stream ping (langgenius#3553)
* enhance: preload general packages (langgenius#3567)
* added claude 3 opus (langgenius#3545)
* feat: code (langgenius#3557)
* feat: add workflow api in Node.js sdk (langgenius#3584)
* Fix: use debounce for switch (langgenius#3585)
* fix: json in raw text sometimes changed back to key value in HTTP node (langgenius#3586)
* test: add scripts for running tests on api module both locally and CI jobs (langgenius#3497)
* add-open-mixtral-8x22b (langgenius#3591)
* docs: Update README.md (langgenius#3577)
* enhance:speedup xinference embedding & rerank (langgenius#3587)
* fix(openai_api_compatible): fixing the error when converting chunk to json (langgenius#3570)
* feat: stable diffusion 3 (langgenius#3599)
* Feat/enterprise sso (langgenius#3602)
* Add mixtral 8x22b (langgenius#3606)
* fix: copy invite link has duplicated origin (langgenius#3608)
* seucirty: http smuggling (langgenius#3609)
* chore: apply ruff rules on tests and app.py (langgenius#3605)
* feat: Vision switch functionality is provided on OpenRouter (langgenius#3564)
* get dict key indexing_technique in DocumentAddByFileApi (langgenius#3615) Co-authored-by: songqijun <songqijun@qipeng.com>
* fix: chat rename (langgenius#3627)
* feat: moonshot fc (langgenius#3629)
* add-llama3-for-nvidia-api-catalog (langgenius#3631)
* content fix to continue (langgenius#3633) Co-authored-by: xiaohan <fuck@qq.com>
* Fix error in [Update yaml and py file in Tavily Tool] (langgenius#3465) Co-authored-by: Yeuoly <admin@srmxy.cn>
* feat: add file log (langgenius#3612) Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
* fix: validate languages (langgenius#3638)
* Fix problem with scroll inside chat window (langgenius#3578)
* fix: in alembic's offline mode (db migrate with --sql option), skip data operations (langgenius#3533)
* fix: workflow_run_id not log_id in workflow api doc (langgenius#3658)
* Optimize README_CN (langgenius#3660)
* fix: delete tool parameters cache when sync draft workflow for run workflow use new parameter change in draft workflow (langgenius#3637)
* python 3.12 support (langgenius#3652)
* version to 0.6.4 (langgenius#3670)

---------

Co-authored-by: takatost <takatost@users.noreply.github.com>
Co-authored-by: KVOJJJin <jzongcode@gmail.com>
Co-authored-by: jyong <718720800@qq.com>
Co-authored-by: Bowen Liang <liangbowen@gf.com.cn>
Co-authored-by: zxhlyh <jasonapring2015@outlook.com>
Co-authored-by: Yeuoly <45712896+Yeuoly@users.noreply.github.com>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Jyong <76649700+JohnJyong@users.noreply.github.com>
Co-authored-by: liuzhenghua <1090179900@qq.com>
Co-authored-by: Siddharth Jain <137015071+tellsiddh@users.noreply.github.com>
Co-authored-by: Joshua <138381132+joshua20231026@users.noreply.github.com>
Co-authored-by: Matheus Mondaini <matheus.mondaini@outlook.com>
Co-authored-by: 呆萌闷油瓶 <253605712@qq.com>
Co-authored-by: aniaan <hi@aniaan.dev>
Co-authored-by: Garfield Dai <dai.hai@foxmail.com>
Co-authored-by: jeessy2 <6205259+jeessy2@users.noreply.github.com>
Co-authored-by: sqj8899 <sqj8899@126.com>
Co-authored-by: songqijun <songqijun@qipeng.com>
Co-authored-by: fuckqqcom <9391575+fuckqqcom@users.noreply.github.com>
Co-authored-by: xiaohan <fuck@qq.com>
Co-authored-by: Richards Tu <142148415+richards199999@users.noreply.github.com>
Co-authored-by: Yeuoly <admin@srmxy.cn>
Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
Co-authored-by: YidaHu <huyidada@gmail.com>
Co-authored-by: rmmedia <125268006+rmmedia-pl@users.noreply.github.com>
Co-authored-by: saga.rey <saga.rey@outlook.com>
Co-authored-by: xin.gao <34891602+gaoxin-pen@users.noreply.github.com>

leslie2046 added 21 commits April 8, 2024 19:48

add ignore

d3c0d19

Merge branch 'main' of https://github.com/leslie2046/dify into main

6ff72ac

Merge branch 'langgenius:main' into main

ee7df7a

modify ignore

9eed048

Merge branch 'langgenius:main' into main

abf9e37

Merge branch 'langgenius:main' into main

635a203

Merge branch 'langgenius:main' into main

555e56a

Merge branch 'langgenius:main' into main

9efda0a

Merge branch 'langgenius:main' into main

2083fc3

Merge branch 'langgenius:main' into main

7a8873d

Merge branch 'langgenius:main' into main

eb7bbff

Merge branch 'langgenius:main' into main

01e8ca2

Merge branch 'langgenius:main' into main

c6cfb7b

Merge branch 'langgenius:main' into main

61dd836

Merge branch 'langgenius:main' into main

0f4685a

Merge branch 'langgenius:main' into main

2731299

Merge branch 'langgenius:main' into main

b513209

Merge branch 'langgenius:main' into main

8684808

Merge branch 'langgenius:main' into main

2b0dbdf

Merge branch 'langgenius:main' into main

3f81e50

enhance:speedup xinference embedding & rerank by removing calling unnecessary verification(/v1/cluster/auth) API and fetch model information(/v1/models/) API to validate_credentials() only

b683ead

dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.), 🐍 python, and 💪 enhancement (New feature or request) labels Apr 18, 2024

Yeuoly approved these changes Apr 18, 2024

dosubot bot added the lgtm (This PR has been approved by a maintainer) label Apr 18, 2024

Yeuoly merged commit 4365843 into langgenius:main Apr 18, 2024
6 checks passed

leslie2046 mentioned this pull request Apr 19, 2024

enhance:speedup xinference audio transcription #3636

Merged

14 tasks

dengpeng pushed a commit to dengpeng/dify that referenced this pull request Jun 16, 2024

enhance:speedup xinference embedding & rerank (langgenius#3587)

b7b77ea

HuberyHuV1 pushed a commit to HuberyHuV1/dify that referenced this pull request Jul 22, 2024

enhance:speedup xinference embedding & rerank (langgenius#3587)

17bae9a
