This project is a gRPC server written in Rust.
It stores multiple text fragments inside audio files and provides lightning-fast semantic search over them.
Among other things, it can be used as a retriever part in a RAG system.
UPD: The project is not finished; improvements will be added as soon as possible.
- Index building: creating an HNSW search index from vector representations.
- Semantic search: performing fast vector similarity search on stored text fragments.
- Parallel processing: searches run in parallel for higher throughput.
- No database required: all data is stored locally in WAV audio files and JSON metadata.
Embeddings (vector representations of text) are created with a preloaded local model, without connecting to external AI APIs. For example, you can use the following models:
- EN
To use them, you need to download and put the following files into the `./model` directory of the project:
`model.onnx`, `config.json`, `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`.
In `config.yaml` the values for the following fields are set:

**Auth**
- `username` - username for basic auth (optional parameter).
- `password` - password for basic auth (optional parameter).

**Server**
- `host` - host to run the gRPC server on.
- `port` - port to run the gRPC server on.

**Logging**
- `log_level` - log/trace level.

**RateLimit**
- `capacity` - maximum number of tokens (bucket capacity).
- `refill_rate` - number of tokens added per refill interval (`refill_interval_ms`).
- `refill_interval_ms` - duration of the refill interval (in milliseconds).

**App**
- `model_dir` - directory to store model files (example: `./model`).
- `audio_dir` - directory to store audio files (example: `./output/audio`).
- `output_dir` - directory (parent) for indexing results (example: `./output`).
- `index_path` - path to the search index file (example: `./output/hnsw.idx`).
- `storage_path` - path to the file with metadata (example: `./output/storage.json`).
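Putting the fields above together, a complete `config.yaml` might look like the sketch below. The section names, key nesting, and all values are illustrative assumptions inferred from the field list, not the project's canonical file.

```yaml
# Hypothetical config.yaml layout -- nesting and values are illustrative.
auth:
  username: "admin"          # optional
  password: "secret"         # optional; prefer env-var overrides
server:
  host: "0.0.0.0"
  port: 9090
logging:
  log_level: "info"
rate_limit:
  capacity: 100              # bucket capacity (max tokens)
  refill_rate: 10            # tokens added per refill interval
  refill_interval_ms: 100    # refill interval in milliseconds
app:
  model_dir: "./model"
  audio_dir: "./output/audio"
  output_dir: "./output"
  index_path: "./output/hnsw.idx"
  storage_path: "./output/storage.json"
```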
Important!
For Auth, override credentials via environment variables to avoid storing secrets in YAML.
Environment variables:
- APP__AUTH__USERNAME="your_secure_auth_login"
- APP__AUTH__PASSWORD="your_secure_auth_password"
- Audio encoding: 16-bit WAV files (mono), 48 kHz sampling rate.
- Batch size: 500 text fragments per audio file (configurable in audio.rs).
- Embedding model: any embedding model can be used (examples above).
- Search algorithm: HNSW with cosine similarity + fallback parallel linear search.
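To illustrate the fallback path, here is a minimal sketch of a linear search with cosine similarity over stored embeddings. Function names are hypothetical, and the loop is sequential for brevity; the actual server parallelizes this scan and uses the HNSW index first.

```rust
// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Score every stored embedding, keep those above `min_similarity`,
// and return the `top_k` best matches as (index, score) pairs.
fn linear_search(
    query: &[f32],
    corpus: &[Vec<f32>],
    top_k: usize,
    min_similarity: f32,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = corpus
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .filter(|(_, s)| *s >= min_similarity)
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(top_k);
    scored
}

fn main() {
    let corpus = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]];
    let hits = linear_search(&[1.0, 0.0], &corpus, 2, 0.5);
    println!("{hits:?}"); // best match is index 0 with score 1.0
}
```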
Audio file batch_n structure (after encoding)
- First, the number of chunks (4 bytes)
- Then, for each chunk:
  - Length (4 bytes)
  - Data (N bytes)
- Trailing zeros (500)
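The layout above can be sketched as follows. The little-endian byte order and the function name are assumptions for illustration; they are not necessarily what `audio.rs` does.

```rust
// Sketch of the batch payload layout described above (before WAV wrapping):
// a 4-byte chunk count, then length-prefixed chunks, then zero padding.
fn encode_batch(chunks: &[&[u8]]) -> Vec<u8> {
    let mut buf = Vec::new();
    // Number of chunks (4 bytes, little-endian assumed).
    buf.extend_from_slice(&(chunks.len() as u32).to_le_bytes());
    for c in chunks {
        // Chunk length (4 bytes) followed by the chunk data (N bytes).
        buf.extend_from_slice(&(c.len() as u32).to_le_bytes());
        buf.extend_from_slice(c);
    }
    // Trailing zeros (500).
    buf.extend(std::iter::repeat(0u8).take(500));
    buf
}

fn main() {
    let batch = encode_batch(&[b"hello".as_slice(), b"world".as_slice()]);
    println!("encoded {} bytes", batch.len()); // 4 + (4+5)*2 + 500 = 522
}
```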
Rate limiting uses the Token Bucket algorithm.
Note that this algorithm allows a burst when tokens have accumulated (the bucket is full).
It is currently implemented via the third-party crate `rater`.
The rate limit is applied to all routes in total.
To calculate RPS, use the formula refill_rate * 1000 / refill_interval_ms.
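A minimal token-bucket sketch tying the config fields (`capacity`, `refill_rate`, `refill_interval_ms`) to the RPS formula above. This is an illustration of the algorithm, not the internals of the `rater` crate.

```rust
// Token bucket: requests consume tokens; a refill task adds
// `refill_rate` tokens every `refill_interval_ms`, capped at `capacity`.
struct TokenBucket {
    capacity: u64,
    tokens: u64,
    refill_rate: u64,
    refill_interval_ms: u64,
}

impl TokenBucket {
    // A request is admitted only if a token is available.
    fn try_acquire(&mut self) -> bool {
        if self.tokens > 0 { self.tokens -= 1; true } else { false }
    }

    // Called once per refill interval.
    fn refill(&mut self) {
        self.tokens = (self.tokens + self.refill_rate).min(self.capacity);
    }

    // Sustained throughput per the README formula:
    // refill_rate * 1000 / refill_interval_ms.
    fn rps(&self) -> u64 {
        self.refill_rate * 1000 / self.refill_interval_ms
    }
}

fn main() {
    let mut bucket = TokenBucket { capacity: 100, tokens: 100, refill_rate: 10, refill_interval_ms: 100 };
    assert!(bucket.try_acquire());
    println!("sustained throughput: {} RPS", bucket.rps()); // 10 * 1000 / 100 = 100
}
```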
The structure of the /output directory after building the index:
output/
├── audio/
│   ├── batch_0.wav   # First 500 text chunks
│   ├── batch_1.wav   # Next 500 text chunks
│   └── ...
├── hnsw.idx          # Search index with embeddings
└── storage.json      # Metadata and batch information
To send a request to the server, take text_indexer.proto (from the ./proto directory), and use it in your client.
You can check the functionality, for example, via Postman.
Request headers:
- `authorization: Basic <base64_token>` - username:password authentication token in base64 format.
- `correlation-id: <id>` - identifier for request tracing (if not specified, the server will generate its own).
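The `<base64_token>` value is simply `username:password` encoded in base64. A self-contained sketch (with a hand-rolled encoder so no crates are needed; the credentials are hypothetical):

```rust
// Standard base64 alphabet.
const TABLE: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal base64 encoder (with '=' padding) for building the header value.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = ((b[0] as u32) << 16) | ((b[1] as u32) << 8) | b[2] as u32;
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

fn main() {
    // Hypothetical credentials -- use your configured username/password.
    let token = base64_encode(b"admin:secret");
    println!("authorization: Basic {token}");
}
```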
If the text file is located on the server, you can specify the path to it:
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"file_path": "./articles_on_various_topics.txt",
"chunk_size": 150
}

Otherwise, you can pass the text in the request:
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"content": "text in base64 format",
"chunk_size": 150
}

As a result, the server will return JSON of the following type:
{
"id": "123e4567-e89b-12d3-a456-426614174000"
}

Request headers:
- `authorization: Basic <base64_token>` - username:password authentication token in base64 format.
- `correlation-id: <id>` - identifier for request tracing (if not specified, the server will generate its own).
{
"id": "123e4567-e89b-12d3-a456-426614174001",
"query": "Scientific discoveries of the Hubble Space Telescope",
"top_k": 5,
"min_similarity": 0.3
}

As a result, the server will return JSON of the following type:
{
"id": "123e4567-e89b-12d3-a456-426614174001",
"results": [
{
"text": "One of the Hubble Space Telescope's major discoveries is evidence that the expansion of the Universe is accelerating, driven by dark energy.",
"score": 0.9029404520988464
},
{
"text": "The Hubble Space Telescope has captured amazing star formation in nebulae such as Orion and Aquila.",
"score": 0.8357565402984619
},
{
"text": "Launched in 1990, the Hubble Space Telescope has become one of the most important instruments in the history of astronomy.",
"score": 0.7911539673805237
},
{
"text": "In three decades of operation, Hubble has helped to clarify the age of the Universe and prove the existence of dark matter.",
"score": 0.7869136929512024
},
{
"text": "In some cultures, crows are considered a symbol of wisdom, and science backs up that reputation.",
"score": 0.3231840753555298
}
]
}

- To install Rust on Unix-like systems (macOS, Linux, ...), run the command below in a terminal. After the download completes, you will get the latest stable version of Rust for your platform, as well as the latest version of Cargo.

  `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`

- To verify, run the following command in a terminal. If the installation (step 1) was successful, you will see something like `cargo 1.88.0 ...`.

  `cargo --version`

- Clone the project from GitHub, open it, and execute the following commands. Check that the code compiles (without running it):

  `cargo check`

  Then build and run the project (in release mode, with optimizations):

  `cargo run --release`

UPD: If you have Windows, see the instructions here.
To deploy the project locally in Docker, you need to:

- Make sure the Docker daemon is running.
- Make sure the embedding model is present in the `./model` directory of the project (files downloaded and added).
- Open a terminal in the root of the project and build the image, for example: `docker build -t disorder-server .`
- After the image is built, run it, for example: `docker run --rm -p 9090:9090 disorder-server`
- Enjoy using the service.
This project was inspired by memau, a project that stores data in audio files.
This project is licensed under either the MIT License or the Apache License 2.0, at your option.