Add DeepSeek models (#147)
svilupp authored May 7, 2024
1 parent 4c2f945 commit 641c9a0
Showing 6 changed files with 106 additions and 3 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]

### Added
- Added support for [DeepSeek models](https://platform.deepseek.com/docs) via the `dschat` and `dscode` aliases. You can set the `DEEPSEEK_API_KEY` environment variable to your DeepSeek API key.
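  For example, a minimal sketch of the new aliases (assumes `DEEPSEEK_API_KEY` is set in your environment):

  ```julia
  using PromptingTools
  # "dschat" resolves to "deepseek-chat"; "dscode" resolves to "deepseek-coder"
  msg = aigenerate("Say hi!"; model = "dschat")
  ```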

### Fixed

2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "PromptingTools"
uuid = "670122d1-24a8-4d70-bfce-740807c42192"
authors = ["J S @svilupp and contributors"]
version = "0.23.0"
version = "0.24.0"

[deps]
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
55 changes: 55 additions & 0 deletions docs/src/frequently_asked_questions.md
@@ -415,3 +415,58 @@ Fine-tuning is a powerful technique to adapt a model to your specific use case (
2. Once it's time to finetune, create a bundle of ShareGPT-formatted conversations (a common finetuning format) in a single `.jsonl` file. Use `PT.save_conversations("dataset.jsonl", [conversation1, conversation2, ...])` (note the plural "conversationS" in the function name).

For an example of an end-to-end finetuning process, check out our sister project [JuliaLLMLeaderboard Finetuning experiment](https://github.com/svilupp/Julia-LLM-Leaderboard/blob/main/experiments/cheater-7b-finetune/README.md). It shows the process of finetuning for half a dollar with [JarvisLabs.ai](https://jarvislabs.ai/templates/axolotl) and [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).

## Can I see how my prompt is rendered / what is sent to the API?

Yes, there are two ways:
1) "dry run": the `ai*` function returns the prompt rendered in the style of the selected API provider
2) "partial render": for provider-agnostic purposes, you can run only the first step of the rendering pipeline to see the messages that will be sent (formatted as `SystemMessage` and `UserMessage`), which are easy to read and work with

1) Dry Run

Add the kwargs `dry_run = true` and `return_all = true` to your `ai*` function calls to see what would have been sent to the API (without `return_all`, there is nothing to show you).

Example for OpenAI:
```julia
dry_conv = aigenerate(:BlankSystemUser; system = "I exist", user = "say hi",
    model = "gpt3t", return_all = true, dry_run = true)
```

```plaintext
2-element Vector{Dict{String, Any}}:
Dict("role" => "system", "content" => "I exist")
Dict("role" => "user", "content" => "say hi")
```

2) Partial Render

Personally, I prefer to see the pretty formatting of the PromptingTools `*Message` types.
To see what will be sent to the model, you can `render` only the first stage of the rendering pipeline with the schema `NoSchema()` (it merely does the variable replacements and creates the necessary messages). This step is shared by all schemas/providers.

```julia
PT.render(PT.NoSchema(), "say hi, {{name}}"; name="John")
```

```plaintext
2-element Vector{PromptingTools.AbstractMessage}:
PromptingTools.SystemMessage("Act as a helpful AI assistant")
PromptingTools.UserMessage("say hi, John")
```

What about the prompt templates?
Prompt templates have an extra pre-rendering step that expands the symbolic `:name` (understood by PromptingTools as a reference to `AITemplate(:name)`) into a vector of Messages.

```julia
# expand the template into messages
tpl = PT.render(AITemplate(:BlankSystemUser))
# replace any variables and render the final messages
PT.render(PT.NoSchema(), tpl; system = "I exist", user = "say hi")
```

```plaintext
2-element Vector{PromptingTools.AbstractMessage}:
PromptingTools.SystemMessage("I exist")
PromptingTools.UserMessage("say hi")
```

For more information about the rendering pipeline and further examples, refer to the [Walkthrough Example for aigenerate](@ref).
14 changes: 14 additions & 0 deletions src/llm_interface.jl
@@ -192,6 +192,20 @@ Requires one environment variable to be set:
"""
struct GroqOpenAISchema <: AbstractOpenAISchema end

"""
DeepSeekOpenAISchema
Schema to call the [DeepSeek](https://platform.deepseek.com/docs) API.
Links:
- [Get your API key](https://platform.deepseek.com/api_keys)
- [API Reference](https://platform.deepseek.com/docs)
Requires one environment variables to be set:
- `DEEPSEEK_API_KEY`: Your API key (often starts with "sk-...")
"""
struct DeepSeekOpenAISchema <: AbstractOpenAISchema end
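A minimal usage sketch (assumes `DEEPSEEK_API_KEY` is set; passing the schema explicitly bypasses the model registry lookup):

```julia
using PromptingTools
const PT = PromptingTools
# Route the request through the DeepSeek schema directly
msg = aigenerate(PT.DeepSeekOpenAISchema(), "Say hi!"; model = "deepseek-chat")
```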

abstract type AbstractOllamaSchema <: AbstractPromptSchema end

"""
Expand Down
13 changes: 13 additions & 0 deletions src/llm_openai.jl
@@ -205,6 +205,19 @@ function OpenAI.create_chat(schema::GroqOpenAISchema,
        base_url = url)
    OpenAI.create_chat(provider, model, conversation; kwargs...)
end
function OpenAI.create_chat(schema::DeepSeekOpenAISchema,
        api_key::AbstractString,
        model::AbstractString,
        conversation;
        url::String = "https://api.deepseek.com/v1",
        kwargs...)
    # Build the corresponding provider object
    # Prefer DEEPSEEK_API_KEY over the provided api_key, because the default is the OpenAI key
    provider = CustomProvider(;
        api_key = isempty(DEEPSEEK_API_KEY) ? api_key : DEEPSEEK_API_KEY,
        base_url = url)
    OpenAI.create_chat(provider, model, conversation; kwargs...)
end
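A sketch of the key-resolution behavior above (assuming `aigenerate` forwards `api_key` down to `create_chat`, as it does for the other OpenAI-compatible schemas):

```julia
using PromptingTools
const PT = PromptingTools
# With DEEPSEEK_API_KEY set, the explicit api_key below is ignored in favor of it
msg = aigenerate(PT.DeepSeekOpenAISchema(), "What is 1 + 1?";
    model = "deepseek-chat", api_key = "sk-this-will-be-overridden")
```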
function OpenAI.create_chat(schema::DatabricksOpenAISchema,
        api_key::AbstractString,
        model::AbstractString,
24 changes: 22 additions & 2 deletions src/user_preferences.jl
@@ -20,6 +20,7 @@ Check your preferences by calling `get_preferences(key::String)`.
- `ANTHROPIC_API_KEY`: The API key for the Anthropic API. Get yours from [here](https://www.anthropic.com/).
- `VOYAGE_API_KEY`: The API key for the Voyage API. Free tier is up to 50M tokens! Get yours from [here](https://dash.voyageai.com/api-keys).
- `GROQ_API_KEY`: The API key for the Groq API. Free in beta! Get yours from [here](https://console.groq.com/keys).
- `DEEPSEEK_API_KEY`: The API key for the DeepSeek API. Get \$5 credit when you join. Get yours from [here](https://platform.deepseek.com/api_keys).
- `MODEL_CHAT`: The default model to use for aigenerate and most ai* calls. See `MODEL_REGISTRY` for a list of available models or define your own.
- `MODEL_EMBEDDING`: The default model to use for aiembed (embedding documents). See `MODEL_REGISTRY` for a list of available models or define your own.
- `PROMPT_SCHEMA`: The default prompt schema to use for aigenerate and most ai* calls (if not specified in `MODEL_REGISTRY`). Set as a string, eg, `"OpenAISchema"`.
@@ -46,6 +47,7 @@ Define your `register_model!()` calls in your `startup.jl` file to make them available
- `ANTHROPIC_API_KEY`: The API key for the Anthropic API. Get yours from [here](https://www.anthropic.com/).
- `VOYAGE_API_KEY`: The API key for the Voyage API. Free tier is up to 50M tokens! Get yours from [here](https://dash.voyageai.com/api-keys).
- `GROQ_API_KEY`: The API key for the Groq API. Free in beta! Get yours from [here](https://console.groq.com/keys).
- `DEEPSEEK_API_KEY`: The API key for the DeepSeek API. Get \$5 credit when you join. Get yours from [here](https://platform.deepseek.com/api_keys).
Preferences.jl takes priority over ENV variables, so if you set a preference, it will take precedence over the ENV variable.
@@ -64,6 +66,7 @@ const ALLOWED_PREFERENCES = ["MISTRALAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "VOYAGE_API_KEY",
    "GROQ_API_KEY",
    "DEEPSEEK_API_KEY",
    "MODEL_CHAT",
    "MODEL_EMBEDDING",
    "MODEL_ALIASES",
@@ -179,6 +182,10 @@ _temp = get(ENV, "GROQ_API_KEY", "")
const GROQ_API_KEY::String = @load_preference("GROQ_API_KEY",
    default=_temp);

_temp = get(ENV, "DEEPSEEK_API_KEY", "")
const DEEPSEEK_API_KEY::String = @load_preference("DEEPSEEK_API_KEY",
    default=_temp);
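To persist the key across sessions instead of relying on ENV, a sketch using the preference helpers (the docstring above names `get_preferences`; `set_preferences!` is assumed as its counterpart):

```julia
using PromptingTools
# Preferences.jl values take precedence over ENV variables on the next load
PromptingTools.set_preferences!("DEEPSEEK_API_KEY" => "<your-key>")
PromptingTools.get_preferences("DEEPSEEK_API_KEY")  # inspect what will be loaded
```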

_temp = get(ENV, "LOCAL_SERVER", "http://localhost:10897/v1")
## Address of the local server
const LOCAL_SERVER::String = @load_preference("LOCAL_SERVER",
@@ -341,7 +348,10 @@ aliases = merge(
"gl3" => "llama3-8b-8192",
"gllama370" => "llama3-70b-8192",
"gl70" => "llama3-70b-8192",
"gmixtral" => "mixtral-8x7b-32768"
"gmixtral" => "mixtral-8x7b-32768",
## DeepSeek
"dschat" => "deepseek-chat",
"dscode" => "deepseek-coder"
),
## Load aliases from preferences as well
@load_preference("MODEL_ALIASES", default=Dict{String, String}()))
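Aliases are plain dictionary entries, so a custom shorthand can be added via the `MODEL_ALIASES` preference or, as a sketch (assuming the `MODEL_ALIASES` Dict can be mutated at runtime):

```julia
using PromptingTools
const PT = PromptingTools
PT.MODEL_ALIASES["ds"] = "deepseek-chat"  # hypothetical extra alias
msg = aigenerate("Say hi!"; model = "ds")
```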
@@ -646,7 +656,17 @@ registry = Dict{String, ModelSpec}(
        GroqOpenAISchema(),
        2.7e-7,
        2.7e-7,
        "Mistral.ai Mixtral 8x7b, hosted by Groq. Max 32K context. See details [here](https://console.groq.com/docs/models)")
        "Mistral.ai Mixtral 8x7b, hosted by Groq. Max 32K context. See details [here](https://console.groq.com/docs/models)"),
    "deepseek-chat" => ModelSpec("deepseek-chat",
        DeepSeekOpenAISchema(),
        1.4e-7,
        2.8e-7,
        "Deepseek.com-hosted DeepSeekV2 model. Max 32K context. See details [here](https://platform.deepseek.com/docs)"),
    "deepseek-coder" => ModelSpec("deepseek-coder",
        DeepSeekOpenAISchema(),
        1.4e-7,
        2.8e-7,
        "Deepseek.com-hosted coding model. Max 16K context. See details [here](https://platform.deepseek.com/docs)")
)
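To register a DeepSeek model that is not in the registry (per the `register_model!()` mention in the docstring above), a hedged sketch assuming the kwargs mirror the `ModelSpec` fields:

```julia
using PromptingTools
const PT = PromptingTools
# Hypothetical model name; costs copied from the registry entries above
PT.register_model!(;
    name = "deepseek-chat-0524",
    schema = PT.DeepSeekOpenAISchema(),
    cost_of_token_prompt = 1.4e-7,
    cost_of_token_generation = 2.8e-7,
    description = "Example: manually registered DeepSeek model")
```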

"""
Expand Down
