Commit c4257ef

add embeddings endpoint

Author: Alex Kwiatkowski
1 parent b228d38 commit c4257ef

7 files changed, +198 -87 lines changed

README.md

Lines changed: 12 additions & 82 deletions
@@ -5,6 +5,7 @@ HTTP API for [LLM](https://github.com/simonw/llm) with OpenAI compatibility
 ## Usage
 
 ```shell
+> llm http-api --help
 Usage: llm http-api [OPTIONS]
 
 Run a FastAPI HTTP server with OpenAI compatibility
@@ -16,94 +17,23 @@ Options:
 --help Show this message and exit.
 ```
 
-## OpenAI Endpoints
-
-### Audio
-
-- [ ] `POST /v1/audio/speech`
-- [ ] `POST /v1/audio/transcriptions`
-- [ ] `POST /v1/audio/translations`
-
-### Chat
+```shell
+> curl http://localhost:8080/v1/embeddings -X POST -H "Content-Type: application/json" -d '{
+"input": "Hello world",
+"model": "jina-embeddings-v2-small-en"
+}'
+{"object":"embedding","embedding":[-0.47561466693878174,-0.4471365511417389,...],"index":0}
+```
 
-- [ ] `POST /v1/chat/completions`
+## OpenAI Endpoints
 
 ### Embeddings
 
-- [ ] `POST /v1/embeddings`
-
-### Fine Tuning
-
-- [ ] `POST /v1/fine_tuning/jobs`
-- [ ] `GET /v1/fine_tuning/jobs`
-- [ ] `GET /v1/fine_tuning/jobs/{fine_tuning_job_id}`
-- [ ] `POST /v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel`
-- [ ] `GET /v1/fine_tuning/jobs/{fine_tuning_job_id}/events`
-
-### Files
-
-- [ ] `POST /v1/files`
-- [ ] `GET /v1/files`
-- [ ] `GET /v1/files/{file_id}`
-- [ ] `DELETE /v1/files/{file_id}`
-- [ ] `GET /v1/files/{file_id}/content`
-
-### Images
-
-- [ ] `POST /v1/images/generations`
-- [ ] `POST /v1/images/edit`
-- [ ] `POST /v1/images/variations`
-
-### Models
-
-- [ ] `GET /v1/models`
-- [ ] `GET /v1/models/{model}`
-- [ ] `DELETE /v1/models/{model}`
-
-### Moderations
-
-- [ ] `POST /v1/moderations`
-- [ ] `GET /v1/models/{model}`
-
-### Assistants
-
-- [ ] `POST /v1/assistants`
-- [ ] `GET /v1/assistants`
-- [ ] `GET /v1/assistants/{assistant_id}`
-- [ ] `POST /v1/assistants/{assistant_id}`
-- [ ] `DELETE /v1/assistants/{assistant_id}`
-- [ ] `POST /v1/assistants/{assistant_id}/files`
-- [ ] `GET /v1/assistants/{assistant_id}/files`
-- [ ] `GET /v1/assistants/{assistant_id}/files/{file_id}`
-- [ ] `DELETE /v1/assistants/{assistant_id}/files/{file_id}`
-
-### Threads
-
-- [ ] `POST /v1/threads`
-- [ ] `GET /v1/threads/{thread_id}`
-- [ ] `POST /v1/threads/{thread_id}`
-- [ ] `DELETE /v1/threads/{thread_id}`
-
-### Messages
-
-- [ ] `POST /v1/threads/{thread_id}/messages`
-- [ ] `GET /v1/threads/{thread_id}/messages`
-- [ ] `GET /v1/threads/{thread_id}/messages/{message_id}`
-- [ ] `POST /v1/threads/{thread_id}/messages/{message_id}`
-- [ ] `GET /v1/threads/{thread_id}/messages/{message_id}/files`
-- [ ] `GET /v1/threads/{thread_id}/messages/{message_id}/files/{file_id}`
+- [x] [`POST /v1/embeddings`](./docs/endpoints/EMBEDDINGS.md)
 
-### Runs
+### Unimplemented
 
-- [ ] `POST /v1/threads/{thread_id}/runs`
-- [ ] `GET /v1/threads/{thread_id}/runs`
-- [ ] `GET /v1/threads/{thread_id}/runs/{run_id}`
-- [ ] `POST /v1/threads/{thread_id}/runs/{run_id}`
-- [ ] `POST /v1/threads/{thread_id}/runs/{run_id}/submit_tool_outputs`
-- [ ] `POST /v1/threads/{thread_id}/runs/{run_id}/cancel`
-- [ ] `POST /v1/threads/run`
-- [ ] `GET /v1/threads/{thread_id}/runs/{run_id}/steps/{step_id}`
-- [ ] `GET /v1/threads/{thread_id}/runs/{run_id}/steps`
+A detailed list of unimplemented OpenAI endpoints can be found [here](./docs/endpoints/UNIMPLEMENTED.md)
 
 ## Development
 

docs/endpoints/EMBEDDINGS.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# Endpoints/Embeddings
+
+## POST /v1/embeddings
+
+### [Embedding object](https://platform.openai.com/docs/api-reference/embeddings/object)
+
+- `index` - integer
+- `embedding` - array
+- `object` - string
+
+### [Request body](https://platform.openai.com/docs/api-reference/embeddings/create)
+
+- `input` - string or array - Required
+- `model` - string - Required
+- `encoding_format` - string - Optional - Defaults to float
+- `user` - string - Optional - Ignored
+
+### Returns
+
+A list of embedding objects.
+
+### Example
+
+```shell
+> curl http://localhost:8080/v1/embeddings -X POST -H "Content-Type: application/json" -d '{
+"input": "Hello world",
+"model": "jina-embeddings-v2-small-en"
+}'
+{"object":"embedding","embedding":[-0.47561466693878174,-0.4471365511417389,...],"index":0}
+```
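
For readers who prefer Python to curl, the same request can be sketched with the standard library alone. The helper names (`build_embedding_request`, `parse_embedding_response`, `fetch_embedding`) are illustrative and not part of this commit; the base URL comes from the curl example, and the parser assumes the single embedding object shown in that example response.

```python
import json
import urllib.request


def build_embedding_request(text, model, base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for /v1/embeddings."""
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding_response(body):
    """Return (index, vector) from a single embedding object (str or bytes)."""
    doc = json.loads(body)
    if doc.get("object") != "embedding":
        raise ValueError(f"unexpected object type: {doc.get('object')!r}")
    return doc["index"], [float(x) for x in doc["embedding"]]


def fetch_embedding(text, model, base_url="http://localhost:8080"):
    """Send the request; needs a running server, so it is not called here."""
    request = build_embedding_request(text, model, base_url)
    with urllib.request.urlopen(request) as response:
        return parse_embedding_response(response.read())
```

With the server running locally and the model plugin installed, `fetch_embedding("Hello world", "jina-embeddings-v2-small-en")` should return the index and the vector from the response.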

docs/endpoints/UNIMPLEMENTED.md

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
1+
# Endpoints/Unimplemented
2+
3+
### Audio
4+
5+
- [ ] `POST /v1/audio/speech`
6+
- [ ] `POST /v1/audio/transcriptions`
7+
- [ ] `POST /v1/audio/translations`
8+
9+
### Chat
10+
11+
- [ ] `POST /v1/chat/completions`
12+
13+
### Fine Tuning
14+
15+
- [ ] `POST /v1/fine_tuning/jobs`
16+
- [ ] `GET /v1/fine_tuning/jobs`
17+
- [ ] `GET /v1/fine_tuning/jobs/{fine_tuning_job_id}`
18+
- [ ] `POST /v1/fine_tuning/jobs/{fine_tuning_job_id}/cancel`
19+
- [ ] `GET /v1/fine_tuning/jobs/{fine_tuning_job_id}/events`
20+
21+
### Files
22+
23+
- [ ] `POST /v1/files`
24+
- [ ] `GET /v1/files`
25+
- [ ] `GET /v1/files/{file_id}`
26+
- [ ] `DELETE /v1/files/{file_id}`
27+
- [ ] `GET /v1/files/{file_id}/content`
28+
29+
### Images
30+
31+
- [ ] `POST /v1/images/generations`
32+
- [ ] `POST /v1/images/edit`
33+
- [ ] `POST /v1/images/variations`
34+
35+
### Models
36+
37+
- [ ] `GET /v1/models`
38+
- [ ] `GET /v1/models/{model}`
39+
- [ ] `DELETE /v1/models/{model}`
40+
41+
### Moderations
42+
43+
- [ ] `POST /v1/moderations`
44+
- [ ] `GET /v1/models/{model}`
45+
46+
### Assistants
47+
48+
- [ ] `POST /v1/assistants`
49+
- [ ] `GET /v1/assistants`
50+
- [ ] `GET /v1/assistants/{assistant_id}`
51+
- [ ] `POST /v1/assistants/{assistant_id}`
52+
- [ ] `DELETE /v1/assistants/{assistant_id}`
53+
- [ ] `POST /v1/assistants/{assistant_id}/files`
54+
- [ ] `GET /v1/assistants/{assistant_id}/files`
55+
- [ ] `GET /v1/assistants/{assistant_id}/files/{file_id}`
56+
- [ ] `DELETE /v1/assistants/{assistant_id}/files/{file_id}`
57+
58+
### Threads
59+
60+
- [ ] `POST /v1/threads`
61+
- [ ] `GET /v1/threads/{thread_id}`
62+
- [ ] `POST /v1/threads/{thread_id}`
63+
- [ ] `DELETE /v1/threads/{thread_id}`
64+
65+
### Messages
66+
67+
- [ ] `POST /v1/threads/{thread_id}/messages`
68+
- [ ] `GET /v1/threads/{thread_id}/messages`
69+
- [ ] `GET /v1/threads/{thread_id}/messages/{message_id}`
70+
- [ ] `POST /v1/threads/{thread_id}/messages/{message_id}`
71+
- [ ] `GET /v1/threads/{thread_id}/messages/{message_id}/files`
72+
- [ ] `GET /v1/threads/{thread_id}/messages/{message_id}/files/{file_id}`
73+
74+
### Runs
75+
76+
- [ ] `POST /v1/threads/{thread_id}/runs`
77+
- [ ] `GET /v1/threads/{thread_id}/runs`
78+
- [ ] `GET /v1/threads/{thread_id}/runs/{run_id}`
79+
- [ ] `POST /v1/threads/{thread_id}/runs/{run_id}`
80+
- [ ] `POST /v1/threads/{thread_id}/runs/{run_id}/submit_tool_outputs`
81+
- [ ] `POST /v1/threads/{thread_id}/runs/{run_id}/cancel`
82+
- [ ] `POST /v1/threads/run`
83+
- [ ] `GET /v1/threads/{thread_id}/runs/{run_id}/steps/{step_id}`
84+
- [ ] `GET /v1/threads/{thread_id}/runs/{run_id}/steps`

pyproject.toml

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,14 @@ llm_http_api = "llm_http_api"
3434

3535
[project.optional-dependencies]
3636
test = [
37+
"httpx >=0.25.2",
38+
"llm-clip >=0.1",
39+
"llm-embed-jina >=0.1.2",
40+
"llm-gpt4all >=0.2",
41+
"llama-cpp-python >=0.2.23",
42+
"llm-llama-cpp >=0.3",
43+
"llm-mlc >=0.5",
44+
"llm-sentence-transformers >=0.1.2",
3745
"pytest ~=7.4.0",
3846
"pytest-cov ~=4.1.0",
3947
"ruff ~=0.1.0",
@@ -64,3 +72,10 @@ exclude = [
6472
"node_modules",
6573
"venv",
6674
]
75+
76+
[tool.pytest.ini_options]
77+
filterwarnings = [
78+
"ignore:.*Pydantic V1 style `@validator` validators are deprecated.*:DeprecationWarning",
79+
"ignore:.*Support for class-based `config` is deprecated.*:DeprecationWarning",
80+
"ignore:.*pkg_resources is deprecated as an API.*:DeprecationWarning",
81+
]
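
The `filterwarnings` entries use the `action:message:category` form, and pytest treats the message part as a regular expression matched against the start of the warning text. As a rough illustration of the effect, the same three patterns can be applied with the standard `warnings` module. `collect_unfiltered` is a hypothetical helper for this sketch, not something the project defines.

```python
import warnings

# The message patterns from the pyproject.toml hunk above.
IGNORED_PATTERNS = [
    r".*Pydantic V1 style `@validator` validators are deprecated.*",
    r".*Support for class-based `config` is deprecated.*",
    r".*pkg_resources is deprecated as an API.*",
]


def collect_unfiltered(messages):
    """Emit each message as a DeprecationWarning and return those not ignored."""
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default...
        warnings.simplefilter("always")
        # ...then put the ignore filters in front, as the pytest config does.
        for pattern in IGNORED_PATTERNS:
            warnings.filterwarnings(
                "ignore", message=pattern, category=DeprecationWarning
            )
        for text in messages:
            warnings.warn(text, DeprecationWarning)
        return [str(w.message) for w in caught]
```

Only warnings whose text matches none of the patterns come back from `collect_unfiltered`, which is roughly what remains visible in the pytest warnings summary.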
Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,3 @@
1-
from llm_http_api.server.fastapi import app
1+
import importlib
22

3-
4-
@app.post("/v1/embeddings")
5-
async def create_embedding():
6-
return {"message": "TODO#POST embedding"}
3+
importlib.import_module("llm_http_api.server.embeddings.create_embedding")
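
The replacement relies on import side effects: importing the `create_embedding` module runs its `@app.post` decorator, which registers the route on the shared `app`. A minimal sketch of that mechanism without FastAPI follows. The `demo_registry` and `demo_handlers` modules and the `ROUTES` dict are hypothetical stand-ins written to a temporary directory, not files in this repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for the module that owns the shared app object.
REGISTRY_SRC = '''
ROUTES = {}

def post(path):
    """Return a decorator that records the handler under `path`."""
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register
'''

# Stand-in for a handler module; its decorator runs at import time.
HANDLERS_SRC = '''
from demo_registry import post

@post("/v1/embeddings")
def create_embedding():
    return {"object": "embedding"}
'''


def routes_before_and_after_import():
    """Show that importing the handler module is what registers the route."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_registry.py").write_text(REGISTRY_SRC)
        Path(tmp, "demo_handlers.py").write_text(HANDLERS_SRC)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            registry = importlib.import_module("demo_registry")
            before = sorted(registry.ROUTES)
            importlib.import_module("demo_handlers")
            after = sorted(registry.ROUTES)
        finally:
            # Clean up so the demo can be run again in the same process.
            sys.path.remove(tmp)
            sys.modules.pop("demo_registry", None)
            sys.modules.pop("demo_handlers", None)
        return before, after
```

The registry is empty until the handler module is imported, which is why the package only needs the `importlib.import_module(...)` call to expose the endpoint.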
Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
import llm
2+
from typing import Optional
3+
from pydantic import BaseModel
4+
from llm_http_api.server.fastapi import app
5+
6+
7+
class Embed(BaseModel):
8+
# input: str | list[str] | bytes | list[bytes]
9+
input: str | bytes
10+
model: str
11+
encoding_format: Optional[str] = "float"
12+
user: Optional[str] = None
13+
14+
15+
class EmbeddingResult(BaseModel):
16+
object: str = "embedding"
17+
embedding: list[float]
18+
index: int
19+
20+
21+
@app.post("/v1/embeddings")
22+
async def create_embedding(embed: Embed):
23+
model = llm.get_embedding_model(embed.model)
24+
embedding = model.embed(embed.input)
25+
return EmbeddingResult(
26+
object="embedding",
27+
embedding=embedding,
28+
index=0,
29+
)
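
The commented-out `input` annotation hints at list inputs, and the OpenAI API wraps embeddings in a list object with a `data` array rather than returning a bare embedding. One possible shape for that follow-up is sketched below, independent of FastAPI and `llm`. `build_embedding_list` and its `embed_fn` parameter are hypothetical, and the `usage` field of the OpenAI response is left out.

```python
from typing import Callable, Sequence, Union


def build_embedding_list(
    inputs: Union[str, Sequence[str]],
    embed_fn: Callable[[str], Sequence[float]],
    model: str,
) -> dict:
    """Embed one string or a batch and wrap the result in a list object."""
    # A bare string is treated as a batch of one.
    items = [inputs] if isinstance(inputs, str) else list(inputs)
    data = [
        {
            "object": "embedding",
            "embedding": [float(x) for x in embed_fn(text)],
            # index mirrors the position of the input in the request
            "index": position,
        }
        for position, text in enumerate(items)
    ]
    return {"object": "list", "data": data, "model": model}
```

In the handler, `embed_fn` could plausibly be the `embed` method of the model returned by `llm.get_embedding_model`, as used in the diff above.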
Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
from hamcrest import assert_that, is_not, empty, has_entries
2+
from fastapi.testclient import TestClient
3+
from llm_http_api.server.fastapi import app
4+
5+
client = TestClient(app)
6+
7+
8+
def test_create_embedding():
9+
response = client.post(
10+
"/v1/embeddings",
11+
json={
12+
"input": "Hello World",
13+
"model": "jina-embeddings-v2-small-en",
14+
},
15+
)
16+
assert response.status_code == 200
17+
assert_that(
18+
response.json(),
19+
has_entries(
20+
{
21+
"object": "embedding",
22+
"embedding": is_not(empty()),
23+
"index": 0,
24+
}
25+
),
26+
)
