This repository contains sample pipelines developed using Ploomber.
Note: We recommend going through the first tutorial to learn the basics of Ploomber.
Run the examples in Colab, or run them locally:
pip install ploomber
# list examples
ploomber examples
# download example with name
ploomber examples --name {name}
# example
ploomber examples --name templates/mlflow

Each example contains a README.md file that describes it; a README.ipynb is also available with the same contents but in Jupyter notebook format and with command outputs. In addition, files for pip (requirements.txt) and conda (environment.yml) are provided for local execution.
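Because each example ships both dependency files, setting one up locally comes down to picking pip or conda. The sketch below is only an illustration: `install_cmd` is a hypothetical helper (not part of Ploomber) that prints the install command for an example directory, and the scratch directory stands in for a downloaded example.

```shell
#!/bin/sh
# Hypothetical helper: print the command that installs an example's
# dependencies. Usage: install_cmd EXAMPLE_DIR [pip|conda]
install_cmd() {
    example_dir="$1"
    tool="${2:-pip}"   # default to pip when no tool is given
    case "$tool" in
        pip)
            dep_file="$example_dir/requirements.txt"
            cmd="pip install -r $dep_file" ;;
        conda)
            dep_file="$example_dir/environment.yml"
            cmd="conda env create --file $dep_file" ;;
        *)
            echo "unknown tool: $tool" >&2
            return 2 ;;
    esac
    # Refuse to print a command for a file that is not there.
    if [ ! -f "$dep_file" ]; then
        echo "missing $dep_file" >&2
        return 1
    fi
    echo "$cmd"
}

# Demo on a scratch directory that mimics a downloaded example.
demo_dir="$(mktemp -d)"
touch "$demo_dir/requirements.txt" "$demo_dir/environment.yml"
install_cmd "$demo_dir"
install_cmd "$demo_dir" conda
```

Once the dependencies are installed, running `ploomber build` from the example's directory executes the pipeline.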
Templates: starting points for common use cases. Use them to ramp up a project quickly.
- templates/etl: Download a data file, upload it to a database, process it, and plot with Python and R.
- templates/exploratory-analysis: Sample pipeline that explores penguins data.
- templates/google-cloud: Use Google Cloud and Ploomber to develop a scalable and production-ready pipeline.
- templates/ml-advanced: ML pipeline using the Python API. Shows how to create a Python package, test it with pytest, and train models in parallel.
- templates/ml-basic: Download data, clean it, generate features, and train a model.
- templates/ml-intermediate: Training and serving ML pipelines with integration testing to evaluate training data quality.
- templates/ml-online: Load data, generate features, train a model, and deploy the model with Flask.
- templates/mlflow: Train a grid of models and log them to MLflow.
- templates/python-api: Load, clean, and plot data using the Python API.
- templates/pytorch: Using GPUs to train models in Ploomber Cloud.
- templates/shell: Create a pipeline with shell scripts as tasks.
- templates/spec-api-directory: Create a pipeline from a directory with scripts (without a pipeline.yaml file).
- templates/spec-api-r: Load, clean, and plot data with R.
- templates/spec-api-sql: Use SQL scripts to manipulate data in a database, dump a table, and plot it with Python.
Cookbook: short and to-the-point examples showing how to use a specific feature.
- cookbook/dynamic-params: Pipeline parameters whose values are computed at runtime.
- cookbook/file-client: Upload a task's products upon execution (local, S3, GCloud storage).
- cookbook/grid: An example showing how to create a grid of tasks to train models with different parameters.
- cookbook/hooks: Task hooks.
- cookbook/incremental: A pipeline that processes new records from a database and uploads them.
- cookbook/nested-cv: Nested cross-validation for model selection and hyperparameter tuning.
- cookbook/python-load: Load a pipeline.yaml file in a Python session to customize initialization.
- cookbook/report-generation: Generating HTML/PDF reports.
- cookbook/serialization: Shows how to use the serializer and unserializer decorators.
- cookbook/sql-dump: A minimal example showing how to dump a table from a SQL database.
- cookbook/variable-number-of-products: Shows how to create tasks whose number of products depends on runtime conditions.
Guides: in-depth tutorials for learning. These are part of the documentation.
- guides/cron: This guide shows how to schedule Ploomber pipelines using cron.
- guides/debugging: Tutorial showing techniques for debugging pipelines.
- guides/first-pipeline: Introductory tutorial to learn the basics of Ploomber.
- guides/intro-to-ploomber: Introductory tutorial to learn the basics of Ploomber.
- guides/logging: Tutorial showing how to add logging to a pipeline.
- guides/parametrized: Tutorial showing how to parametrize pipelines and change parameters from the command-line.
- guides/refactor: Using Soorgeon to convert a notebook into a Ploomber pipeline.
- guides/serialization: Tutorial explaining how the serializer and unserializer fields in a pipeline.yaml file work.
- guides/sql-templating: Introductory tutorial teaching how to develop modular SQL pipelines.
- guides/testing: Tutorial showing how to use a task's on_finish hook to test data quality.
- guides/versioning: A tutorial showing how to version pipeline products.
The simplest way to get started with Ploomber is via the Spec API, which lets you describe pipelines using a pipeline.yaml file; most examples in this repository use it. However, if you want more flexibility, you may write pipelines with Python.
The templates/python-api/ directory contains a project written using the Python API, and the python-api-examples/ directory includes tutorials and more examples.
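For comparison with the Python API, a pipeline.yaml for the Spec API might look roughly like the file written by the sketch below. The script names and output paths are hypothetical placeholders, and the scripts themselves are not created here, so this shows the shape of the file rather than a runnable pipeline.

```shell
#!/bin/sh
# Write a minimal, illustrative pipeline.yaml into a scratch directory.
# Script names and product paths are hypothetical placeholders.
project_dir="$(mktemp -d)"

cat > "$project_dir/pipeline.yaml" <<'EOF'
tasks:
  # Each task names the script to run and the product(s) it generates.
  - source: scripts/load.py
    product:
      nb: output/load.ipynb
      data: output/data.csv

  - source: scripts/plot.py
    product: output/plot.ipynb
EOF

echo "wrote $project_dir/pipeline.yaml"
```

With real scripts in place, running `ploomber build` from the project directory would execute the tasks.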
In Ploomber 0.21, we introduced a simplified API to write pipelines in a single Jupyter notebook (or .py) file. This is a great option for small projects.
You can find the examples in the micro-pipelines/ directory.