Visual sklearn Pipeline Builder - A drag-and-drop interface for creating scikit-learn machine learning pipelines.
PipeLearn is a visual tool that helps ML engineers and data scientists build scikit-learn pipelines through an intuitive drag-and-drop interface. No more writing boilerplate code - just drag components, connect them, configure parameters, and generate production-ready Python code.
- Visual Pipeline Builder: Drag and drop sklearn components onto a canvas
- 50+ Components: Comprehensive catalog including:
- Preprocessing (StandardScaler, MinMaxScaler, OneHotEncoder, etc.)
- Feature Selection (SelectKBest, VarianceThreshold, RFE)
- Decomposition (PCA, TruncatedSVD)
- Estimators (LogisticRegression, RandomForest, SVC, etc.)
- Imputation (SimpleImputer, KNNImputer)
- Parameter Configuration: Easy-to-use UI for configuring component parameters
- Pipeline Validation: Real-time validation with helpful error messages
- Code Generation: Generates clean, production-ready Python code
- Export Options: Copy to clipboard or download as .py file
PipeLearn/
├── backend/ # FastAPI backend
│ ├── app.py # Main API application
│ ├── components.py # sklearn component catalog
│ ├── pipeline_generator.py # Code generation logic
│ └── requirements.txt
│
└── frontend/ # React frontend
├── public/
├── src/
│ ├── components/ # React components
│ │ ├── ComponentPanel.js
│ │ ├── PipelineEditor.js
│ │ ├── ComponentNode.js
│ │ ├── ParameterConfig.js
│ │ └── CodeViewer.js
│ ├── App.js
│ └── index.js
└── package.json
- Python 3.8+
- Node.js 16+
- npm or yarn
git clone https://github.com/yourusername/PipeLearn.git
cd PipeLearn
Install the backend dependencies:
cd backend
pip install -r requirements.txt
Install the frontend dependencies (from the project root):
cd frontend
npm install
Start the backend (from the project root):
cd backend
python app.py
The API will be available at http://localhost:8000
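Optionally, from another terminal you can confirm the API is up by requesting the component catalog (assuming curl is installed; the endpoint is the same one the frontend uses):

```
curl http://localhost:8000/api/components
```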
In a new terminal:
cd frontend
npm start
The application will open in your browser at http://localhost:3000
- Add Components: Drag components from the left panel onto the canvas
- Connect Components: Click and drag from the bottom handle of one component to the top handle of another
- Configure Parameters: Click the gear icon (⚙️) on any component to configure its parameters
- Build Pipeline: Click the "Build Pipeline" button to generate Python code
- Export Code: Copy to clipboard or download the generated code
Here's a simple example of creating a classification pipeline:
- Drag StandardScaler (Preprocessing)
- Drag PCA (Decomposition)
- Drag RandomForestClassifier (Estimators)
- Connect them in order: StandardScaler → PCA → RandomForestClassifier
- Configure parameters:
- PCA: n_components = 5
- RandomForestClassifier: n_estimators = 200, random_state = 42
- Click "Build Pipeline"
Generated code:
# Generated by PipeLearn
# sklearn Pipeline Builder
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
# Create the pipeline
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    RandomForestClassifier(n_estimators=200, random_state=42)
)
# Usage example:
# pipeline.fit(X_train, y_train)
# predictions = pipeline.predict(X_test)
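The generated code drops straight into a normal scikit-learn workflow. As an end-to-end check, here is a minimal sketch that trains the example pipeline on scikit-learn's built-in breast cancer dataset (the dataset choice and the train/test split are illustrative additions, not part of the generated code):

```
# Minimal usage sketch for the generated pipeline.
# The dataset and train/test split are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    RandomForestClassifier(n_estimators=200, random_state=42),
)

pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
print("Test accuracy:", pipeline.score(X_test, y_test))
```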
GET /api/components: Returns the catalog of available sklearn components.
Response:
{
"preprocessing": {
"StandardScaler": {
"class": "sklearn.preprocessing.StandardScaler",
"description": "Standardize features...",
"parameters": {...}
}
}
}
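For programmatic access, the catalog can also be fetched from Python. The snippet below is a small sketch using the third-party requests library (not a PipeLearn dependency); it simply walks the documented response structure above:

```
# Sketch: fetch the component catalog and list the components per category.
# Uses the `requests` package, which is an extra dependency.
import requests

response = requests.get("http://localhost:8000/api/components")
response.raise_for_status()
catalog = response.json()

for category, components in catalog.items():
    print(category, "->", ", ".join(components.keys()))
```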
Generates Python code from the visual pipeline.
Request Body:
{
"nodes": [...],
"edges": [...]
}
Response:
{
"success": true,
"code": "# Generated pipeline code...",
"pipeline_structure": [...]
}
Validates the pipeline structure.
Response:
{
"valid": true,
"errors": [],
"warnings": []
}
Supported components include:
Preprocessing:
- StandardScaler, MinMaxScaler, RobustScaler
- Normalizer, OneHotEncoder, LabelEncoder
- PolynomialFeatures
Feature Selection:
- SelectKBest, VarianceThreshold, RFE
Decomposition:
- PCA, TruncatedSVD
Classifiers:
- LogisticRegression
- RandomForestClassifier
- SVC (Support Vector Classification)
- GradientBoostingClassifier
Regressors:
- LinearRegression
- RandomForestRegressor
Imputation:
- SimpleImputer, KNNImputer
To add a new sklearn component:
- Edit backend/components.py
- Add the component to the appropriate category:
"YourComponent": {
"class": "sklearn.module.YourComponent",
"description": "Component description",
"parameters": {
"param_name": {
"type": "number|boolean|select|string",
"default": default_value,
"description": "Parameter description"
}
},
"input": "numeric|categorical",
"output": "numeric|prediction"
}
- Restart the backend server
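To illustrate the schema above, here is what a hypothetical entry for scikit-learn's Binarizer could look like. Binarizer is not part of the current catalog, and the exact category key and accepted type/input/output values should be checked against backend/components.py:

```
# Hypothetical catalog entry for sklearn.preprocessing.Binarizer,
# following the schema above. Not currently shipped with PipeLearn.
"Binarizer": {
    "class": "sklearn.preprocessing.Binarizer",
    "description": "Binarize features (0/1) according to a threshold",
    "parameters": {
        "threshold": {
            "type": "number",
            "default": 0.0,
            "description": "Values at or below the threshold map to 0, values above it map to 1"
        }
    },
    "input": "numeric",
    "output": "numeric"
}
```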
Create a .env file in the frontend directory:
REACT_APP_API_URL=http://localhost:8000
Run the backend tests:
cd backend
pytest
Run the frontend tests:
cd frontend
npm test
For production, run the backend with gunicorn (FastAPI is an ASGI app, so gunicorn needs uvicorn's worker class):
cd backend
pip install gunicorn uvicorn
gunicorn app:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
Build the frontend for production:
cd frontend
npm run build
Deploy the build/ directory to your hosting service.
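As one simple option for previewing or hosting the static build (this uses the third-party serve package and is not a project requirement):

```
npx serve -s build
```

Note that REACT_APP_API_URL is read at build time, so set it to your deployed backend URL before running npm run build.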
If the frontend cannot connect to the backend:
- Ensure the backend server is running on port 8000
- Check the CORS settings in app.py (a typical configuration is sketched after this list)
- Verify the backend API is accessible at http://localhost:8000/api/components
- Check the browser console for errors
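If CORS is the culprit, a typical FastAPI setup allows the dev frontend origin explicitly. The snippet below is a sketch of what such a configuration generally looks like; it is not copied from backend/app.py, and the allowed origin assumes the default React dev server on port 3000:

```
# Sketch of a typical FastAPI CORS configuration (not taken from app.py).
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # React dev server
    allow_methods=["*"],
    allow_headers=["*"],
)
```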
If the pipeline fails to validate:
- Ensure all components are connected properly
- Only one estimator should be present, and it must be the last step in the pipeline
- Check for circular connections
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Add more sklearn components (clustering, ensemble methods)
- Pipeline templates for common use cases
- Export to Jupyter notebook format
- Pipeline performance visualization
- Integration with MLflow for experiment tracking
- Support for custom transformers
- Collaborative editing features
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with React Flow for the visual editor
- Powered by FastAPI and scikit-learn
- Inspired by the need for faster ML pipeline prototyping
For issues, questions, or contributions, please open an issue on GitHub.
Made with ❤️ for the ML community