Merged

Dev #25

101 changes: 100 additions & 1 deletion README.md
@@ -163,6 +163,105 @@ async fn main() -> vectorless::Result<()> {
| **Feedback Learning** | Improves from user feedback over time |
| **Multi-turn Queries** | Handles complex questions with decomposition |

## Configuration

### Zero Configuration (Recommended)

Just set `OPENAI_API_KEY` and you're ready to go:

```bash
export OPENAI_API_KEY="sk-..."
```

<details>
<summary><b>Python</b></summary>

```python
from vectorless import Engine

# Uses OPENAI_API_KEY from environment
engine = Engine(workspace="./data")
```

</details>

<details>
<summary><b>Rust</b></summary>

```rust
use vectorless::Engine;

let client = Engine::builder()
    .with_workspace("./workspace")
    .build().await?;
```

</details>

### Environment Variables

| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | LLM API key |
| `VECTORLESS_MODEL` | Default model (e.g., `gpt-4o-mini`) |
| `VECTORLESS_ENDPOINT` | API endpoint URL |
| `VECTORLESS_WORKSPACE` | Workspace directory |
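
To check which of these variables a process would pick up, they can be collected with the standard library. This is a minimal sketch: `read_env_settings` and the short setting names are hypothetical, and how vectorless maps these variables internally is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names come from the table above; the short keys are illustrative.
ENV_TO_SETTING = {
    "OPENAI_API_KEY": "api_key",
    "VECTORLESS_MODEL": "model",
    "VECTORLESS_ENDPOINT": "endpoint",
    "VECTORLESS_WORKSPACE": "workspace",
}


def read_env_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the settings present in the environment, skipping unset or empty ones."""
    env = os.environ if env is None else env
    return {
        setting: env[var]
        for var, setting in ENV_TO_SETTING.items()
        if env.get(var)
    }


if __name__ == "__main__":
    found = read_env_settings()
    # Report only which settings are present, never the API key itself.
    print("Settings found in environment:", sorted(found))
```

Passing a plain mapping instead of `os.environ` keeps the helper easy to test without touching the real environment.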

### Advanced Configuration

For fine-grained control, use a config file:

```bash
cp config.toml ./vectorless.toml
```

<details>
<summary><b>Python</b></summary>

```python
from vectorless import Engine

# Use full configuration file
engine = Engine(config_path="./vectorless.toml")

# Or override specific settings
engine = Engine(
    config_path="./vectorless.toml",
    model="gpt-4o",  # Override model from config
)
```

</details>

<details>
<summary><b>Rust</b></summary>

```rust
use vectorless::Engine;

// Use full configuration file
let client = Engine::builder()
    .with_config_path("./vectorless.toml")
    .build().await?;

// Or override specific settings
let client = Engine::builder()
    .with_config_path("./vectorless.toml")
    .with_model("gpt-4o", None) // Override model
    .build().await?;
```

</details>

### Configuration Priority

Sources are applied in the order below, and each later source overrides the ones before it:

1. Default configuration
2. Auto-detected config file (`vectorless.toml`, `config.toml`, `.vectorless.toml`)
3. Explicit config file (`config_path` / `with_config_path`)
4. Environment variables
5. Constructor/builder parameters (highest priority)
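
The ordering above amounts to a layered merge in which each later source overwrites keys set by earlier ones. The following is a minimal sketch in plain Python, assuming flat settings dictionaries; `resolve_settings` and the sample values are hypothetical and not part of the vectorless API.

```python
from typing import Any, Dict, Optional


def resolve_settings(*layers: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge settings layers in order; later layers override earlier ones.

    Hypothetical helper for illustration only, not part of vectorless.
    """
    resolved: Dict[str, Any] = {}
    for layer in layers:
        if not layer:
            continue  # a missing source (e.g. no config file found) contributes nothing
        # Keys explicitly set to None are treated as "not provided" and skipped.
        resolved.update({k: v for k, v in layer.items() if v is not None})
    return resolved


# Illustrative values only; real defaults may differ.
defaults = {"model": "gpt-4o-mini", "workspace": "./workspace"}
auto_detected = None                   # no auto-detected config file found
explicit_file = {"model": "gpt-4o"}    # from config_path / with_config_path
environment = {"workspace": "./data"}  # e.g. VECTORLESS_WORKSPACE
constructor = {"model": None}          # caller did not override the model

settings = resolve_settings(defaults, auto_detected, explicit_file, environment, constructor)
print(settings)
```

Skipping `None` values lets a caller omit a constructor argument without erasing what a lower-priority source already supplied.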

## Architecture

@@ -177,7 +276,7 @@ async fn main() -> vectorless::Result<()> {

## Examples

See the [examples/](examples/) directory.
See the [examples/](examples/) directory for more usage patterns.

## Contributing

51 changes: 1 addition & 50 deletions docs/samples/sample.md
@@ -29,53 +29,4 @@ The core module provides fundamental types:
The parser module handles document parsing:
- `MarkdownParser` — Parse Markdown files
- `PdfParser` — Parse PDF files (planned)
- `HtmlParser` — Parse HTML files (planned)

## Usage Examples

### Basic Usage

```rust
use vectorless::client::{Vectorless, VectorlessBuilder};

let client = VectorlessBuilder::new()
    .with_workspace("./workspace")
    .build()?;

let doc_id = client.index("./document.md").await?;
```

### Advanced Usage

You can customize the retrieval process:

```rust
use vectorless::{LlmNavigator, RetrieveOptions};

let retriever = LlmNavigator::with_defaults();
let options = RetrieveOptions::new()
    .with_top_k(5)
    .with_min_score(0.5);

let results = retriever.retrieve(&tree, "What is vectorless?", &options).await?;
```

## Configuration

The library can be configured via TOML files or programmatically.

### Configuration File

```toml
[summary]
model = "gpt-4"
max_tokens = 200

[retrieval]
model = "gpt-4"
top_k = 3
```

## API Reference

See the API documentation for detailed information about each function and type.
- `HtmlParser` — Parse HTML files (planned)
45 changes: 45 additions & 0 deletions examples/python/advanced/README.md
@@ -0,0 +1,45 @@
# Advanced Example - Full Configuration

Use a configuration file for fine-grained control.

## Setup

```bash
pip install vectorless

# Copy the example config
cp ../../../config.toml ./vectorless.toml

# Edit to customize your settings
vim vectorless.toml
```

## Run

```bash
python main.py
```

## Configuration File Structure

```toml
[llm]
api_key = "sk-..."

[llm.summary]
model = "gpt-4o-mini"
max_tokens = 200

[llm.retrieval]
model = "gpt-4o"
max_tokens = 100

[retrieval]
top_k = 5
beam_width = 3
max_iterations = 10

[storage]
workspace_dir = "./workspace"
cache_size = 100
```
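
Before running the example, it can help to confirm that the file contains the tables shown above. This is a minimal sketch that operates on an already-parsed configuration dictionary (for instance, the result of `tomllib.load` on Python 3.11+). `missing_sections` is a hypothetical helper, and treating these five tables as expected is an assumption based only on the sample.

```python
from typing import Any, Dict, List

# Dotted table paths taken from the sample file above.
EXPECTED_SECTIONS = ["llm", "llm.summary", "llm.retrieval", "retrieval", "storage"]


def missing_sections(config: Dict[str, Any]) -> List[str]:
    """Return the expected dotted table paths that are absent from `config`."""
    missing = []
    for path in EXPECTED_SECTIONS:
        node: Any = config
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(path)
                break
            node = node[part]
    return missing


# The sample file above, as the nested dict a TOML parser would produce.
sample = {
    "llm": {
        "api_key": "sk-...",
        "summary": {"model": "gpt-4o-mini", "max_tokens": 200},
        "retrieval": {"model": "gpt-4o", "max_tokens": 100},
    },
    "retrieval": {"top_k": 5, "beam_width": 3, "max_iterations": 10},
    "storage": {"workspace_dir": "./workspace", "cache_size": 100},
}

print(missing_sections(sample))       # []
print(missing_sections({"llm": {}}))  # ['llm.summary', 'llm.retrieval', 'retrieval', 'storage']
```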
115 changes: 115 additions & 0 deletions examples/python/advanced/main.py
@@ -0,0 +1,115 @@
#!/usr/bin/env python3
"""
Advanced example - Full Configuration File.

This example demonstrates how to use a full configuration file
for fine-grained control over all settings.

Usage:
    cp ../../../config.toml ./vectorless.toml
    # Edit vectorless.toml to customize settings
    python main.py
"""

import os
from vectorless import Engine, IndexContext

# Path to config file (relative to this script)
CONFIG_PATH = "./vectorless.toml"
WORKSPACE = "./workspace"


def main():
    print("=== Vectorless Advanced Example (Full Configuration) ===\n")

    # Check if config file exists
    if not os.path.exists(CONFIG_PATH):
        print(f"Error: Config file not found: {CONFIG_PATH}")
        print("\nCreate it by copying the example:")
        print(f" cp ../../../config.toml {CONFIG_PATH}")
        print("\nThen edit it to customize your settings.")
        return

    # Create engine with config file
    engine = Engine(config_path=CONFIG_PATH)

    print(f"✓ Engine created with config file: {CONFIG_PATH}\n")

    # Index a document
    content = """
# System Documentation

## Architecture

The system consists of three main components:

1. **Index Pipeline** - Parses documents and builds a navigable tree
2. **Retrieval Pipeline** - Queries and retrieves relevant content
3. **Pilot** - LLM-powered navigation guide

## Configuration Options

### LLM Settings
- `model`: The LLM model to use (e.g., "gpt-4o", "gpt-4o-mini")
- `endpoint`: API endpoint URL
- `api_key`: Your API key
- `temperature`: Generation temperature (0.0 for deterministic)

### Retrieval Settings
- `top_k`: Number of results to return
- `max_iterations`: Maximum search iterations
- `beam_width`: Beam width for multi-path search

### Storage Settings
- `workspace_dir`: Directory for persisted documents
- `cache_size`: LRU cache size
- `compression`: Enable/disable compression

## Performance Tuning

For faster retrieval:
- Use a smaller model like gpt-4o-mini
- Reduce max_iterations
- Enable caching

For higher accuracy:
- Use a more capable model like gpt-4o
- Increase beam_width
- Enable multi-turn decomposition
"""
    ctx = IndexContext.from_content(content, name="system_docs", format="markdown")
    doc_id = engine.index(ctx)
    print(f"✓ Indexed: {doc_id}\n")

    # Query examples
    questions = [
        "What are the main components?",
        "How can I improve retrieval speed?",
        "What settings are available?",
    ]

    for q in questions:
        result = engine.query(doc_id, q)
        print(f"Q: {q}")
        print(f"A: {result.content[:150]}...")
        print(f" Score: {result.score:.2f}\n")

    # Cleanup
    engine.remove(doc_id)
    print("✓ Cleaned up")

    # Print configuration info
    print("\n" + "=" * 60)
    print("Configuration Priority")
    print("=" * 60)
    print("""
1. Default configuration
2. Auto-detected config file (vectorless.toml, config.toml)
3. Explicit config file (config_path parameter)
4. Environment variables (OPENAI_API_KEY, etc.)
5. Constructor parameters (api_key, model, etc.)
""")


if __name__ == "__main__":
    main()
11 changes: 11 additions & 0 deletions examples/python/advanced/pyproject.toml
@@ -0,0 +1,11 @@
[project]
name = "vectorless-advanced-example"
version = "0.1.0"
requires-python = ">=3.9"
dependencies = [
    "vectorless",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
16 changes: 16 additions & 0 deletions examples/python/basic/README.md
@@ -0,0 +1,16 @@
# Basic Example - Zero Configuration

The simplest way to use Vectorless.

## Setup

```bash
pip install vectorless
export OPENAI_API_KEY="sk-..."
```

## Run

```bash
python main.py
```