feat: Add ollama API support, run_for_ollama_api_in_M1_mac.sh
吴尔平 authored and 吴尔平 committed May 10, 2024
1 parent a66989f commit fe2076b
Showing 6 changed files with 21 additions and 1 deletion.
6 changes: 6 additions & 0 deletions README.md
@@ -61,6 +61,12 @@ bash scripts/run_for_openai_api_with_gpu_in_Linux_or_WSL.sh
bash scripts/run_for_openai_api_in_M1_mac.sh
```

## Run With Ollama API On M1 Mac

```bash
bash scripts/run_for_ollama_api_in_M1_mac.sh
```

## Run With 3B LLM (MiniChat-2-3B-INT8-GGUF) On M1 Mac
```bash
bash scripts/run_for_3B_in_M1_mac.sh
6 changes: 6 additions & 0 deletions README_zh.md
@@ -58,6 +58,12 @@ bash scripts/run_for_openai_api_with_gpu_in_Linux_or_WSL.sh
bash scripts/run_for_openai_api_in_M1_mac.sh
```

## Run With Ollama API On M1 Mac

```bash
bash scripts/run_for_ollama_api_in_M1_mac.sh
```

## Run With 3B LLM (MiniChat-2-3B-INT8-GGUF) On M1 Mac

```bash
6 changes: 5 additions & 1 deletion qanything_kernel/connector/llm/llm_for_openai_api.py
@@ -83,6 +83,7 @@ def num_tokens_from_messages(self, messages, model=None):
"gpt-4-32k-0613",
"gpt-4-32k",
# "gpt-4-1106-preview",
"qwen:32b",
}:
tokens_per_message = 3
tokens_per_name = 1
@@ -97,7 +98,10 @@ def num_tokens_from_messages(self, messages, model=None):
# gpt-4 may be updated over time; return the token count assuming gpt-4-0613 and log a warning
debug_logger.info("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
return self.num_tokens_from_messages(messages, model="gpt-4-0613")

elif "qwen:32b" in model:
# qwen may be updated over time; return the token count assuming qwen:32b and log a warning
debug_logger.info("Warning: qwen may update over time. Returning num tokens assuming qwen:32b.")
return self.num_tokens_from_messages(messages, model="qwen:32b")
else:
# 对于没有实现的模型,抛出未实现错误
raise NotImplementedError(
Expand Down
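The fallback pattern in the diff above — pin an unrecognized model string to a known baseline, otherwise raise — can be sketched as a minimal standalone function. This is a hypothetical simplification: the real method lives on a class, uses a tiktoken-style encoder rather than whitespace splitting, and logs through `debug_logger`.

```python
def num_tokens_from_messages(messages, model=None):
    # Models with known per-message framing costs (3 framing tokens per
    # message, plus 1 for an explicit "name" field), mirroring the diff.
    known_models = {"gpt-4-0613", "gpt-4-32k-0613", "gpt-4-32k", "qwen:32b"}
    if model in known_models:
        tokens_per_message, tokens_per_name = 3, 1
    elif model is not None and "gpt-4" in model:
        # gpt-4 may change over time; assume gpt-4-0613 as the baseline.
        return num_tokens_from_messages(messages, model="gpt-4-0613")
    elif model is not None and "qwen:32b" in model:
        # Likewise pin unrecognized qwen:32b variants to plain qwen:32b.
        return num_tokens_from_messages(messages, model="qwen:32b")
    else:
        raise NotImplementedError(f"token counting not implemented for {model!r}")
    total = 0
    for message in messages:
        total += tokens_per_message
        for key, value in message.items():
            # Whitespace split stands in for a real tokenizer's encode().
            total += len(str(value).split())
            if key == "name":
                total += tokens_per_name
    # Every reply is primed with assistant framing (+3), as in the
    # OpenAI token-counting recipe this method follows.
    return total + 3
```

The design point carried over from the diff is that unknown variants degrade gracefully to a pinned baseline instead of failing, with the warning making the assumption visible in logs.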
2 changes: 2 additions & 0 deletions scripts/run_for_ollama_api_in_M1_mac.sh
Original file line number Diff line number Diff line change

@@ -0,0 +1,2 @@
#!/bin/bash
bash scripts/base_run.sh -s "M1mac" -w 4 -m 19530 -q 8777 -o -b 'http://localhost:11434/v1' -k 'ollama' -n 'qwen:32b' -M '32B' -l '4096'
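
The script's flags wire an OpenAI-style client against Ollama's compatibility endpoint. The mapping can be sketched as below; this is an illustrative sketch only (`build_chat_request` is a hypothetical helper, not part of the repo), using the base URL, key, and model name taken directly from the flags above. Ollama requires an `Authorization` header to be present but does not validate the key's value.

```python
import json

OLLAMA_BASE = "http://localhost:11434/v1"  # -b: OpenAI-compatible base URL
API_KEY = "ollama"                         # -k: placeholder key; value is ignored by Ollama
MODEL = "qwen:32b"                         # -n: model name served by Ollama

def build_chat_request(prompt):
    """Return (url, headers, body) for a chat completion call to Ollama."""
    url = f"{OLLAMA_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body
```

Because the endpoint speaks the OpenAI wire format, the existing `llm_for_openai_api.py` connector can be pointed at it unchanged, which is why this commit only needed to teach the token counter about `qwen:32b`.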
