Leptonai integrate #4079

Merged
merged 8 commits on May 5, 2024
add-leptonai
leilei-jiang committed May 4, 2024
commit 535eeb473bce3f241681403c19b2caa3666602bc
@@ -0,0 +1,7 @@
+- gemma-7b
+- mistral-7b
+- mixtral-8x7b
+- llama2-7b
+- llama2-13b
+- llama3-70b
+-
20 changes: 20 additions & 0 deletions api/core/model_runtime/model_providers/leptonai/llm/gemma-7b.yaml
@@ -0,0 +1,20 @@
model: gemma-7b
label:
  zh_Hans: gemma-7b
  en_US: gemma-7b
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 8192
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    default: 1024
    min: 1
    max: 1024
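The parameter_rules blocks in these model YAMLs declare a default and the allowed bounds for each request parameter. A minimal sketch of how such rules could be applied, assuming plain dicts stand in for the parsed YAML and `apply_rules` is an illustrative helper, not Dify's actual implementation:

```python
# Illustrative only: rule dicts mirror the parameter_rules YAML above.
RULES = [
    {'name': 'temperature', 'use_template': 'temperature'},
    {'name': 'top_p', 'use_template': 'top_p'},
    {'name': 'max_tokens', 'use_template': 'max_tokens',
     'default': 1024, 'min': 1, 'max': 1024},
]

def apply_rules(params: dict, rules: list[dict]) -> dict:
    out = dict(params)
    for rule in rules:
        name = rule['name']
        if name not in out and 'default' in rule:
            out[name] = rule['default']            # fill in the default
        if name in out:
            if 'min' in rule:
                out[name] = max(rule['min'], out[name])  # clamp from below
            if 'max' in rule:
                out[name] = min(rule['max'], out[name])  # clamp from above
    return out

print(apply_rules({'max_tokens': 4096}, RULES))  # {'max_tokens': 1024}
```

Under this reading, a caller asking for 4096 tokens from gemma-7b would be clamped to the 1024 ceiling declared in the YAML.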
@@ -0,0 +1,20 @@
model: llama2-13b
label:
  zh_Hans: llama2-13b
  en_US: llama2-13b
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 4096
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    default: 1024
    min: 1
    max: 1024
@@ -15,11 +15,6 @@ parameter_rules:
     use_template: top_p
   - name: max_tokens
     use_template: max_tokens
-    default: 512
+    default: 1024
     min: 1
-    max: 4096
-pricing:
-  input: '0.01'
-  output: '0.01'
-  unit: '0.000001'
-currency: USD
+    max: 1024
@@ -0,0 +1,20 @@
model: llama3-70b
label:
  zh_Hans: llama3-70b
  en_US: llama3-70b
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 8192
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    default: 1024
    min: 1
    max: 1024
9 changes: 7 additions & 2 deletions api/core/model_runtime/model_providers/leptonai/llm/llm.py
@@ -7,8 +7,13 @@


 class LeptonAILargeLanguageModel(OAIAPICompatLargeLanguageModel):
-    MODEL_SUFFIX_MAP = {
+    MODEL_PREFIX_MAP = {
         'llama2-7b': 'llama2-7b',
+        'gemma-7b': 'gemma-7b',
+        'mistral-7b': 'mistral-7b',
+        'mixtral-8x7b': 'mixtral-8x7b',
+        'llama3-70b': 'llama3-70b',
+        'llama2-13b': 'llama2-13b',
     }
     def _invoke(self, model: str, credentials: dict,
                 prompt_messages: list[PromptMessage], model_parameters: dict,
@@ -25,5 +30,5 @@ def validate_credentials(self, model: str, credentials: dict) -> None:
     @classmethod
     def _add_custom_parameters(cls, credentials: dict, model: str) -> None:
         credentials['mode'] = 'chat'
-        credentials['endpoint_url'] = f'https://{cls.MODEL_SUFFIX_MAP[model]}.lepton.run/api/v1'
+        credentials['endpoint_url'] = f'https://{cls.MODEL_PREFIX_MAP[model]}.lepton.run/api/v1'
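The renamed MODEL_PREFIX_MAP drives the endpoint URL: each Lepton AI model is served from its own subdomain of lepton.run, so the model name doubles as the hostname prefix. A standalone sketch of that mapping logic, written as a module-level function rather than the actual Dify classmethod:

```python
# Standalone reproduction of the prefix-to-endpoint mapping; not the real class.
MODEL_PREFIX_MAP = {
    'llama2-7b': 'llama2-7b',
    'gemma-7b': 'gemma-7b',
    'mistral-7b': 'mistral-7b',
    'mixtral-8x7b': 'mixtral-8x7b',
    'llama3-70b': 'llama3-70b',
    'llama2-13b': 'llama2-13b',
}

def add_custom_parameters(credentials: dict, model: str) -> dict:
    # Force chat mode and derive the per-model endpoint from the prefix map.
    credentials['mode'] = 'chat'
    credentials['endpoint_url'] = f'https://{MODEL_PREFIX_MAP[model]}.lepton.run/api/v1'
    return credentials

creds = add_custom_parameters({}, 'llama3-70b')
print(creds['endpoint_url'])  # https://llama3-70b.lepton.run/api/v1
```

Because the map's keys and values are identical, the rename from SUFFIX to PREFIX is purely a naming fix: the model string is used as the leading (prefix) component of the hostname, not a trailing one.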

@@ -0,0 +1,20 @@
model: mistral-7b
label:
  zh_Hans: mistral-7b
  en_US: mistral-7b
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 8192
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    default: 1024
    min: 1
    max: 1024
@@ -0,0 +1,20 @@
model: mixtral-8x7b
label:
  zh_Hans: mixtral-8x7b
  en_US: mixtral-8x7b
model_type: llm
features:
  - agent-thought
model_properties:
  mode: chat
  context_size: 32000
parameter_rules:
  - name: temperature
    use_template: temperature
  - name: top_p
    use_template: top_p
  - name: max_tokens
    use_template: max_tokens
    default: 1024
    min: 1
    max: 1024