mabel-dev/flows

flows

GitHub Actions-style data pipeline execution engine.

Terminology

  • Flow: A single executable instance of a pipeline.
  • Operator: A reusable processing task.
  • Pipeline: The sequence of operators (tasks) to be performed.
  • Step: A specific instance of an operator, configured and executed as part of a flow.
  • Tenant: A permission boundary for access to resources.
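
To make the relationship between these terms concrete, here is a minimal Python sketch. It is not the project's actual implementation: the class names, the `(records, config)` operator signature, and the `keep_adults` operator are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Records = List[Dict[str, Any]]

# Operator: a reusable processing task. Assumed here to be a plain function
# that takes records plus a config dict and returns new records.
Operator = Callable[[Records, Dict[str, Any]], Records]


@dataclass
class Step:
    """A specific, configured use of an operator."""
    name: str
    operator: Operator
    config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """The ordered sequence of steps, owned by a tenant."""
    name: str
    tenant: str
    steps: List[Step]


@dataclass
class Flow:
    """A single executable instance (one run) of a pipeline."""
    pipeline: Pipeline

    def run(self, records: Records) -> Records:
        # Feed the output of each step into the next one, in order.
        for step in self.pipeline.steps:
            records = step.operator(records, step.config)
        return records


def keep_adults(records: Records, config: Dict[str, Any]) -> Records:
    """Hypothetical operator: keep records at or above a minimum age."""
    return [r for r in records if r["age"] >= config["min_age"]]


pipeline = Pipeline(
    name="demo",
    tenant="acme_corp",
    steps=[Step("adults_only", keep_adults, {"min_age": 18})],
)
result = Flow(pipeline).run(
    [{"name": "Ann", "age": 30}, {"name": "Bob", "age": 12}]
)
```

In this sketch the same `keep_adults` operator could appear in several steps with different `min_age` values, which is the distinction the terminology draws between an operator and a step.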

Example Pipeline Definition

Below is an example of a pipeline definition in YAML format:

name: user_data_pipeline
tenant: acme_corp

schema:
  - name: name
    description: Full name of the user
    type: varchar
    expectations:
      not_null: true
      min_length: 2

  - name: age
    description: Age of the user
    type: integer
    expectations:
      min: 0

steps:
  - name: load_data
    uses: internal/read@1.0.0
    config:
      path: "gs://data.csv"

  - name: filter_data
    uses: internal/filter@latest
    config:
      conditions:
        - [["length", ">=", 4], ["status", "==", "approved"]]
        - [["is_published", "==", true]]

  - name: save_results
    uses: internal/save@1.0.0
    config:
      endpoint: "https://{{ environment.HOST }}/upload"
      username: "{{ secrets.API_USER }}"
      password: "{{ secrets.API_PASSWORD }}"
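
The definition above relies on three conventions that the README does not spell out: the `namespace/operator@version` form of `uses`, the `{{ scope.KEY }}` placeholders, and the nested `conditions` lists. The Python sketch below shows one plausible reading of each. Every function name here (`parse_uses`, `render`, `matches`) is hypothetical, and the interpretation of `conditions` as "OR across the outer list, AND within each inner list" is an assumption, not documented behaviour of the engine.

```python
import operator
import re
from typing import Any, Dict, List, Tuple


def parse_uses(ref: str) -> Tuple[str, str, str]:
    """Split 'namespace/operator@version' into its three parts."""
    path, at, version = ref.partition("@")
    namespace, slash, name = path.partition("/")
    if not (at and slash and namespace and name and version):
        raise ValueError(f"malformed operator reference: {ref!r}")
    return namespace, name, version


# Matches placeholders such as '{{ secrets.API_USER }}'.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render(value: str, context: Dict[str, Dict[str, str]]) -> str:
    """Replace each placeholder with context[scope][key].

    Raises KeyError if the scope or key is unknown.
    """
    return _PLACEHOLDER.sub(lambda m: context[m.group(1)][m.group(2)], value)


_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def matches(record: Dict[str, Any], conditions: List[List[list]]) -> bool:
    """Assumed semantics: a record passes if ANY group has ALL tests true."""
    return any(
        all(
            key in record and _OPS[op](record[key], value)
            for key, op, value in group
        )
        for group in conditions
    )


# Values mirror the example definition; the context contents are made up.
context = {"environment": {"HOST": "example.test"},
           "secrets": {"API_USER": "svc"}}
conditions = [
    [["length", ">=", 4], ["status", "==", "approved"]],
    [["is_published", "==", True]],
]

parsed = parse_uses("internal/filter@latest")
endpoint = render("https://{{ environment.HOST }}/upload", context)
approved = matches({"length": 5, "status": "approved"}, conditions)
published = matches({"is_published": True}, conditions)
rejected = matches({"length": 2, "status": "approved"}, conditions)
```

Under these assumptions, a record passes the `filter_data` step either by being long enough and approved, or by being published. A missing field is treated as a failed test rather than an error; a real engine might reasonably choose otherwise.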
