Commit 0d2281b

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
1 parent 33fe3b4 commit 0d2281b

File tree

1 file changed: +2 −2 lines
  • ChatQnA/docker_compose/intel/hpu/gaudi


ChatQnA/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 2 additions & 2 deletions
@@ -176,7 +176,7 @@ This deployment may allocate more Gaudi resources to the tgi-service to optimize
 
 ### compose_faqgen.yaml - FAQ generation Deployment
 
-The FAQs(frequently asked questions and answers) generation Deployment will generate FAQs instread of normally text generation. It add a new microservice called `llm-faqgen`, which is a microservice that interacts with the TGI/vLLM LLM server to generate FAQs from input text. Chatqna backend image change from `opea/chatqna:latest` to `opea/chatqna-faqgen:latest`, which depends on `llm-faqgen`.
+The FAQs(frequently asked questions and answers) generation Deployment will generate FAQs instead of normally text generation. It add a new microservice called `llm-faqgen`, which is a microservice that interacts with the TGI/vLLM LLM server to generate FAQs from input text. Chatqna backend image change from `opea/chatqna:latest` to `opea/chatqna-faqgen:latest`, which depends on `llm-faqgen`.
 
 The TGI (Text Generation Inference) deployment and the default deployment differ primarily in their service configurations and specific focus on handling large language models (LLMs). The TGI deployment includes a unique `tgi-service`, which utilizes the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is specifically configured to run on Gaudi hardware. This service is designed to handle LLM tasks with optimizations such as `ENABLE_HPU_GRAPH` and `USE_FLASH_ATTENTION`. The `chatqna-gaudi-backend-server` in the TGI deployment depends on the `tgi-service`, whereas in the default deployment, it relies on the `vllm-service`.
 
@@ -188,7 +188,7 @@ The TGI (Text Generation Inference) deployment and the default deployment differ
 | retriever | opea/retriever:latest | No |
 | tei-reranking-service | ghcr.io/huggingface/tei-gaudi:1.5.0 | 1 card |
 | vllm-service | opea/vllm-gaudi:latest | Configurable |
-| llm-faqgen | opea/llm-faqgen:latest | No | 
+| llm-faqgen | opea/llm-faqgen:latest | No |
 | chatqna-gaudi-backend-server | opea/chatqna-faqgen:latest | No |
 | chatqna-gaudi-ui-server | opea/chatqna-ui:latest | No |
 | chatqna-gaudi-nginx-server | opea/nginx:latest | No |
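For orientation, here is a minimal, hypothetical sketch of the kind of Compose entries the diffed paragraphs describe. Only the image names (`ghcr.io/huggingface/tgi-gaudi:2.0.6`, `opea/llm-faqgen:latest`, `opea/chatqna-faqgen:latest`), the two environment flags, and the dependency relationships come from the README text above; the `habana` runtime, device visibility, and port values are illustrative assumptions, not the repository's actual configuration.

```yaml
# Hypothetical sketch only, not the actual compose_faqgen.yaml from the repo.
# Image names, ENABLE_HPU_GRAPH / USE_FLASH_ATTENTION, and the depends_on
# relationships come from the README text above; everything else is assumed.
services:
  tgi-service:
    image: ghcr.io/huggingface/tgi-gaudi:2.0.6
    runtime: habana                    # assumption: Gaudi containers typically use the habana runtime
    environment:
      ENABLE_HPU_GRAPH: "true"         # HPU graph optimization named in the README
      USE_FLASH_ATTENTION: "true"      # flash-attention optimization named in the README
      HABANA_VISIBLE_DEVICES: all      # assumption: expose Gaudi cards to the container
    ports:
      - "8005:80"                      # assumption: host port is deployment-specific

  llm-faqgen:
    image: opea/llm-faqgen:latest      # microservice that generates FAQs from input text
    depends_on:
      - tgi-service                    # talks to the TGI (or vLLM) LLM server

  chatqna-gaudi-backend-server:
    image: opea/chatqna-faqgen:latest  # FAQ-generation variant of the ChatQnA backend
    depends_on:
      - llm-faqgen
```

In the default deployment, per the text above, the backend image is `opea/chatqna:latest` and `chatqna-gaudi-backend-server` depends on `vllm-service` rather than on `tgi-service` and `llm-faqgen`.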
