6 changes: 3 additions & 3 deletions .github/workflows/deploy-testpypi.yml
@@ -87,9 +87,9 @@ jobs:
fi

# Run bump_version.sh script with new version
if [ -f "bump_version.sh" ]; then
chmod +x bump_version.sh
./bump_version.sh "$NEW_NUMERIC_VERSION"
if [ -f "setup/bump_version.sh" ]; then
chmod +x setup/bump_version.sh
./setup/bump_version.sh "$NEW_NUMERIC_VERSION"
echo "Ran bump_version.sh with version: $NEW_NUMERIC_VERSION"
else
echo "Warning: bump_version.sh not found, skipping version bump"
8 changes: 5 additions & 3 deletions .github/workflows/manual-deploy.yml
@@ -22,6 +22,7 @@ jobs:
actions: read
outputs:
new_version: ${{ steps.determine_version.outputs.new_version }}
+new_numeric_version: ${{ steps.determine_version.outputs.new_numeric_version }}
new_branch: ${{ steps.determine_version.outputs.new_branch }}
steps:
- name: Checkout code
@@ -93,9 +94,9 @@ jobs:
fi

# Run bump_version.sh script with new version
if [ -f "bump_version.sh" ]; then
chmod +x bump_version.sh
./bump_version.sh "$NEW_NUMERIC_VERSION"
if [ -f "setup/bump_version.sh" ]; then
chmod +x setup/bump_version.sh
./setup/bump_version.sh "$NEW_NUMERIC_VERSION"
echo "Ran bump_version.sh with version: $NEW_NUMERIC_VERSION"
else
echo "Warning: bump_version.sh not found, skipping version bump"
@@ -193,3 +194,4 @@ jobs:
echo "🎉 Successfully published version: ${{ needs.version-and-build.outputs.new_version }}"
echo "📦 Package available at: https://pypi.org/project/rapidfireai/"
echo "🌿 Created branch: ${{ needs.version-and-build.outputs.new_branch }}"
echo "Access the package at https://pypi.org/project/rapidfireai/${{ needs.version-and-build.outputs.new_numeric_version }}/"
19 changes: 10 additions & 9 deletions CLAUDE.md
@@ -96,9 +96,9 @@ git push origin test0.10.2

```bash
# Kill services on specific ports if conflicts occur
-lsof -t -i:8081 | xargs kill -9 # dispatcher
-lsof -t -i:5002 | xargs kill -9 # mlflow
-lsof -t -i:3000 | xargs kill -9 # frontend
+lsof -t -i:8851 | xargs kill -9 # dispatcher
+lsof -t -i:8852 | xargs kill -9 # mlflow
+lsof -t -i:8853 | xargs kill -9 # frontend
```
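The `kill -9` one-liners are the documented cleanup. A gentler sketch over the same new port numbers (not part of the project's docs) sends SIGTERM first and escalates only if a process survives:

```bash
# Graceful variant: try SIGTERM first, escalate to SIGKILL only if needed.
for port in 8851 8852 8853; do
  pid="$(lsof -t -i:"$port" || true)"
  if [ -n "$pid" ]; then
    kill $pid 2>/dev/null || true
    sleep 1
    kill -0 $pid 2>/dev/null && kill -9 $pid  # still alive? force it
  fi
done
```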

## Architecture
@@ -217,7 +217,7 @@ RapidFire wraps MLflow for experiment tracking:
- Runs tracked with metrics, parameters, artifacts
- Checkpoints saved as MLflow artifacts
- UI extends MLflow with IC Ops panel
-- Access MLflow directly at `http://localhost:5002`
+- Access MLflow directly at `http://localhost:8852`

## Development Notes

@@ -232,7 +232,7 @@ The frontend is a fork of MLflow. For frontend-specific guidance, see `rapidfire
To run frontend in development mode with hot reload:
```bash
cd rapidfireai/frontend
-node ./yarn/releases/yarn-4.9.1.cjs start # Runs on localhost:3000
+node ./yarn/releases/yarn-4.9.1.cjs start # Runs on localhost:8853
```

### Database Schema
@@ -247,7 +247,8 @@ Defined in `db/*.sql` files. Tables include:

- `RF_EXPERIMENT_PATH`: Base path for experiments (default: `./rapidfire_experiments`)
- `RF_TUTORIAL_PATH`: Path for tutorial notebooks (default: `./tutorial_notebooks`)
-- `MLFLOW_URL`: MLflow tracking server URL (default: `http://localhost:5002`)
+- `RF_MLFLOW_HOST`: MLflow tracking server host (default: `localhost`)
+- `RF_MLFLOW_PORT`: MLflow tracking server port (default: `8852`)
- `USE_SHARED_MEMORY`: Enable shared memory for checkpoints (default: True)
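Since the single `MLFLOW_URL` is split into a host/port pair, clients presumably compose the tracking URI themselves. A sketch of that composition (`MLFLOW_TRACKING_URI` is MLflow's standard variable; the `http` scheme is an assumption):

```bash
# Compose the MLflow tracking URI from the new split variables (sketch).
export RF_MLFLOW_HOST="${RF_MLFLOW_HOST:-localhost}"
export RF_MLFLOW_PORT="${RF_MLFLOW_PORT:-8852}"
export MLFLOW_TRACKING_URI="http://${RF_MLFLOW_HOST}:${RF_MLFLOW_PORT}"
```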

### Logging
@@ -343,9 +344,9 @@ Run `rapidfireai doctor` to diagnose:
### Port Conflicts

Common ports:
-- 3000: Frontend dashboard
-- 5002: MLflow tracking server
-- 8081: Dispatcher API
+- 8853: Frontend dashboard
+- 8852: MLflow tracking server
+- 8851: Dispatcher API

Use port killing commands above if conflicts occur.

16 changes: 8 additions & 8 deletions README.md
@@ -85,7 +85,7 @@ rapidfireai start
# It should print about 50 lines, including the following:
# ...
# RapidFire Frontend is ready
-# Open your browser and navigate to: http://0.0.0.0:3000
+# Open your browser and navigate to: http://0.0.0.0:8853
# ...
# Press Ctrl+C to stop all services

@@ -103,9 +103,9 @@ rapidfireai doctor
If you encounter port conflicts, you can kill existing processes:

```bash
-lsof -t -i:5002 | xargs kill -9 # mlflow
-lsof -t -i:8081 | xargs kill -9 # dispatcher
-lsof -t -i:3000 | xargs kill -9 # frontend server
+lsof -t -i:8852 | xargs kill -9 # mlflow
+lsof -t -i:8851 | xargs kill -9 # dispatcher
+lsof -t -i:8853 | xargs kill -9 # frontend server
```

## Documentation
@@ -282,12 +282,12 @@ chmod +x ./rapidfireai/start_dev.sh
# head to settings in Cursor/VSCode and search for venv and add the path - $HOME/rapidfireai/.venv
# we cannot run a Jupyter notebook directly since there are restrictions on Jupyter being able to create child processes

-# VSCode can port-forward localhost:3000 where the rf-frontend server will be running
+# VSCode can port-forward localhost:8853 where the rf-frontend server will be running

# for port clash issues -
-lsof -t -i:8081 | xargs kill -9 # dispatcher
-lsof -t -i:5002 | xargs kill -9 # mlflow
-lsof -t -i:3000 | xargs kill -9 # frontend
+lsof -t -i:8851 | xargs kill -9 # dispatcher
+lsof -t -i:8852 | xargs kill -9 # mlflow
+lsof -t -i:8853 | xargs kill -9 # frontend
```

## Community & Governance
172 changes: 97 additions & 75 deletions pyproject.toml
@@ -26,83 +26,14 @@ classifiers = [
"Topic :: Software Development :: Libraries :: Application Frameworks",
]
dependencies = [
-# fit and evals dependencies
-# Core ML/AI Framework
-"torch==2.5.1",
-"torchvision==0.20.1",
-"torchaudio==2.5.1",
-"datasets==3.6.0",
-"sentence-transformers==5.1.0",
-"gpustat==1.1.1",
-
-# Distributed Computing
-"ray==2.44.1",
-
-# LLM Inference
-"transformers==4.56.1",
-
-# OpenAI API
-"openai==1.106.1",
-"tiktoken==0.11.0",
-
-# LangChain Ecosystem
-"langchain",
-"langchain-classic",
-"langchain-core",
-"langchain-community",
-"langchain-openai",
-"langchain-huggingface",
-
-# Vector Search
-"faiss-gpu-cu12==1.12.0",
-
-# Statistical Analysis
-"scipy==1.16.1",
-
-# Data Manipulation & Display
-# "pandas>=2.0.0,<2.3.0", # Colab compatibility (2.2.2) and cudf constraint
-"pandas==2.3.2",
-"pyarrow==21.0.0",
-"numpy==1.26.4",
-"unstructured==0.18.15",
-
-# REST API (Dispatcher)
-"flask>=3.1.1",
-"flask-cors>=6.0.1",
-"waitress>=3.0.2",
-
-# Notebook
-"ipykernel==6.30.1",
-# "ipykernel",
-"ipywidgets>=7.3.4,<9.0.0", # Support both v7 (Colab) and v8 (Jupyter)
-"tensorboard>=2.11.0",
-
-# JSON Query Tool
-"jq==1.10.0",
-
-# Other
-"psutil==7.0.0",
-"tqdm==4.67.1",
-"typing-extensions>=4.0.0",
-"peft>=0.17.0",
-"trl==0.21.0",
-"bitsandbytes>=0.47.0",
-"nltk>=3.9.1",
-"evaluate>=0.4.5",
-"rouge-score>=0.1.2",
-"sentencepiece>=0.2.1",
-"dill>=0.3.8",
-"mlflow>=3.2.0",
-"pytest>=8.4.1",
-"requests>=2.32.0", # Relaxed for Colab (2.32.4)
-"loguru>=0.7.3",
-"ipython>=7.34.0", # Colab compatibility (7.34.0)
-"jupyter>=1.1.1",
-"uv>=0.8.14",

# Colab compatibility
# # fit and evals dependencies
# # Core ML/AI Framework
# "torch==2.5.1",
# "torchvision==0.20.1",
# "torchaudio==2.5.1",
# "datasets==3.6.0",
# "sentence-transformers==5.1.0",
# "gpustat==1.1.1",

# # Distributed Computing
# "ray==2.44.1",
@@ -126,8 +57,13 @@ dependencies = [
# "faiss-gpu-cu12==1.12.0",

# # Statistical Analysis
# "scipy==1.16.1",

# # Data Manipulation & Display
# # "pandas>=2.0.0,<2.3.0", # Colab compatibility (2.2.2) and cudf constraint
# "pandas==2.3.2",
# "pyarrow==21.0.0",
# "numpy==1.26.4",
# "unstructured==0.18.15",

# # REST API (Dispatcher)
@@ -136,12 +72,98 @@ dependencies = [
# "waitress>=3.0.2",

# # Notebook
# "ipykernel==6.30.1",
# # "ipykernel",
# "ipywidgets>=7.3.4,<9.0.0", # Support both v7 (Colab) and v8 (Jupyter)
# "tensorboard>=2.11.0",

# # JSON Query Tool
# "jq==1.10.0",

# # Other
# "psutil==7.0.0",
# "tqdm==4.67.1",
# "typing-extensions>=4.0.0",
# "peft>=0.17.0",
# "trl==0.21.0",
# "bitsandbytes>=0.47.0",
# "nltk>=3.9.1",
# "evaluate>=0.4.5",
# "rouge-score>=0.1.2",
# "sentencepiece>=0.2.1",
# "dill>=0.3.8",
# "mlflow>=3.2.0",
# "pytest>=8.4.1",
# "requests>=2.32.0", # Relaxed for Colab (2.32.4)
# "loguru>=0.7.3",
# "ipython>=7.34.0", # Colab compatibility (7.34.0)
# "jupyter>=1.1.1",
# "uv>=0.8.14",

+# Colab compatibility
+# Core ML/AI Framework
+# "sentence-transformers==5.1.0",
+"sentence-transformers>=5.1.2",
+
+# Distributed Computing
+# "ray==2.44.1",
+"ray",
+
+# LLM Inference
+# "transformers==4.56.1",
+"transformers>=4.57.1",
+
+# OpenAI API
+# "openai==1.106.1",
+# "tiktoken==0.11.0",
+"openai>=1.109.1",
+"tiktoken>=0.12.0",
+
+# LangChain Ecosystem
+"langchain>=0.3.27",
+# "langchain-classic",
+"langchain-core>=0.3.79",
+"langchain-community>=0.3.22",
+"langchain-openai>=0.3.35",
+"langchain-huggingface>=0.3.1",
+
+# Vector Search
+# "faiss-gpu-cu12==1.12.0",
+
+# Statistical Analysis
+
+# Data Manipulation & Display
+# "unstructured==0.18.15",
+"unstructured",
+
+# REST API (Dispatcher)
+# "flask>=3.1.1",
+# "flask-cors>=6.0.1",
+# "waitress>=3.0.2",
+"flask>=3.1.2",
+"flask-cors",
+"waitress",
+
+# Notebook
+
+# JSON Query Tool
+# "jq==1.10.0",
+"jq",
+
+# Other
+"pandas>=2.2.2",
+"mlflow",
+"dill>=0.3.8",
+"loguru",
+"ipykernel>=6.17.1",
+"peft>=0.17.1",
+"trl",
+"ipywidgets>=7.7.1",
+"bitsandbytes",
+"sentencepiece>=0.2.1",
+"click>=8.3.0",
+"requests>=2.32.4",
+# "numpy>=2.0.2",
]

[project.optional-dependencies]
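With so many pins relaxed to `>=` ranges, a fresh-environment resolve is the quickest regression check. A sketch (the venv path and import probe are hypothetical; assumes `uv` is on PATH):

```bash
# Sanity-check that the relaxed dependency ranges still resolve (sketch).
uv venv /tmp/rf-check
source /tmp/rf-check/bin/activate
uv pip install -e .              # resolve against the edited pyproject.toml
python -c "import rapidfireai"   # hypothetical import smoke test
```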
9 changes: 7 additions & 2 deletions rapidfireai/cli.py
@@ -326,8 +326,12 @@ def install_packages(evals: bool = False):

if not evals:
# Upgrading pytorch to 2.7.0 for fit
print("Upgrading pytorch to 2.7.0 for fit")
packages.append({"package": "torch==2.7.0", "extra_args": ["--upgrade"]})
# print("Upgrading pytorch to 2.7.0 for fit")
# packages.append({"package": "torch==2.7.0", "extra_args": ["--upgrade","--index-url", "https://download.pytorch.org/whl/cu126"]})
# packages.append({"package": "torchvision==0.22.0", "extra_args": ["--upgrade","--index-url", "https://download.pytorch.org/whl/cu126"]})
# packages.append({"package": "torchaudio==2.7.0", "extra_args": ["--upgrade","--index-url", "https://download.pytorch.org/whl/cu126"]})
# packages.append({"package": "transformers==4.57.1", "extra_args": ["--upgrade"]})
pass

## TODO: re-enable for fit once trl has fix
if evals and cuda_major == 12:
@@ -336,6 +340,7 @@ def install_packages(evals: bool = False):
packages.append({"package": "torchvision==0.20.1", "extra_args": ["--upgrade", "--index-url", "https://download.pytorch.org/whl/cu124"]})
packages.append({"package": "torchaudio==2.5.1", "extra_args": ["--upgrade", "--index-url", "https://download.pytorch.org/whl/cu124"]})
packages.append({"package": "vllm==0.7.2", "extra_args": ["--torch-backend=cu124"]})
packages.append({"package": "faiss-gpu-cu12==1.12.0", "extra_args": []})
packages.append({"package": "flashinfer-python==0.2.5", "extra_args": ["--index-url", "https://flashinfer.ai/whl/cu124/torch2.5/"]})
# elif cuda_major == 11:
# print(f"\n🎯 Detected CUDA {cuda_major}.x")
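The `extra_args` strings above (including `--torch-backend=cu124`, a `uv pip` flag) suggest roughly the following installer invocations for the evals path. A sketch only, since the real commands are assembled inside `install_packages`:

```bash
# Approximate uv-pip equivalents of the evals package specs above (sketch).
uv pip install --upgrade --index-url https://download.pytorch.org/whl/cu124 \
  torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1
uv pip install --torch-backend=cu124 vllm==0.7.2
uv pip install faiss-gpu-cu12==1.12.0
uv pip install --index-url https://flashinfer.ai/whl/cu124/torch2.5/ flashinfer-python==0.2.5
```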