flAPI is a powerful service that automatically generates read-only APIs for datasets by utilizing SQL templates. Built on top of DuckDB and leveraging its SQL engine and extension ecosystem, flAPI offers a seamless way to connect to various data sources and expose them as RESTful APIs.
- Automatic API Generation: Create APIs for your datasets without coding
- MCP (Model Context Protocol) Support: Declarative creation of AI tools alongside REST endpoints
- Multiple Data Sources: Connect to BigQuery, SAP ERP & BW (via ERPL), Parquet, Iceberg, Postgres, MySQL, and more
- SQL Templates: Use Mustache-like syntax for dynamic queries
- Caching: DuckLake-backed cache with full refresh and incremental sync
- Security: Implement row-level and column-level security with ease
- Easy deployment: Deploy flAPI with a single binary file
The easiest way to get started with flAPI is to use the pre-built docker image.
> docker pull ghcr.io/datazoode/flapi:latest
The image is pretty small and mainly contains the flAPI binary, which is statically linked against DuckDB v1.4.1. Details about the docker image can be found in the Dockerfile.
Once you have pulled the image, you can run flAPI by executing the following command:
> docker run -it --rm -p 8080:8080 -p 8081:8081 -v $(pwd)/examples/:/config ghcr.io/datazoode/flapi -c /config/flapi.yaml
The different arguments in this docker command are:
- `-it --rm`: Run the container in interactive mode and remove it after the process has finished
- `-p 8080:8080`: Exposes port 8080 of the container to the host, making the REST API available at `http://localhost:8080`
- `-p 8081:8081`: Exposes port 8081 for the MCP server (when enabled)
- `-v $(pwd)/examples/:/config`: Mounts the local `examples` directory to the `/config` directory in the container, which is where the flAPI configuration file is expected to be found
- `ghcr.io/datazoode/flapi`: The docker image to use
- `-c /config/flapi.yaml`: An argument to the flAPI application telling it to use the `flapi.yaml` file in the `/config` directory as the configuration file
To enable MCP support, you can either:
Option A: Use the command line flag
> docker run -it --rm -p 8080:8080 -p 8081:8081 -v $(pwd)/examples/:/config ghcr.io/datazoode/flapi -c /config/flapi.yaml --enable-mcp
Option B: Configure in flapi.yaml
mcp:
  enabled: true
  port: 8081
  # ... other MCP configuration
If everything is set up correctly, you should be able to access the API at the URL specified in the configuration file.
> curl 'http://localhost:8080/'
___
___( o)> Welcome to
\ <_. ) flAPI
`---'
Fast and Flexible API Framework
powered by DuckDB
The flAPI server embeds a Swagger UI which provides an overview of the available endpoints and allows you to test them.
You should see the familiar Swagger UI page:
The raw Swagger 2.0 YAML is also available at http://localhost:8080/doc.yaml
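To fetch it, e.g. to feed a client generator:

> curl 'http://localhost:8080/doc.yaml'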
If MCP is enabled, you can test the MCP server as well:
# Check MCP server health
> curl 'http://localhost:8081/mcp/health'
{"status":"healthy","server":"flapi-mcp-server","version":"0.3.0","protocol_version":"2024-11-05","tools_count":0}
# Initialize MCP connection
> curl -X POST http://localhost:8081/mcp/jsonrpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "id": 1, "method": "initialize"}'
# List available tools
> curl -X POST http://localhost:8081/mcp/jsonrpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "id": 2, "method": "tools/list"}'flAPI now supports the Model Context Protocol (MCP) in a unified configuration approach. Every flAPI instance automatically runs both a REST API server and an MCP server concurrently, allowing you to create AI tools alongside your REST endpoints using the same configuration files and SQL templates.
- Unified Configuration: Single YAML files can define REST endpoints, MCP tools, and MCP resources
- Automatic Detection: Configuration type is determined by the presence of `url-path` (REST), `mcp-tool` (MCP tool), or `mcp-resource` (MCP resource)
- Shared Components: MCP tools and resources use the same SQL templates, parameter validation, authentication, and caching as REST endpoints
- Concurrent Servers: REST API (port 8080) and MCP server (port 8081) run simultaneously
- Declarative Definition: Define everything using YAML configuration with SQL templates
- Tool Discovery: Automatic tool discovery and schema generation
- Security Integration: Reuse existing authentication, rate limiting, and caching features
The MCP server exposes two HTTP endpoints:
- `POST /mcp/jsonrpc` - Main JSON-RPC endpoint for tool calls (see the example reply below)
- `GET /mcp/health` - Health check endpoint
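A `tools/call` reply wraps results in the MCP content envelope; a sketch of the shape under the 2024-11-05 protocol reported by the health endpoint (flAPI's exact payload may differ):

{"jsonrpc": "2.0", "id": 3, "result": {"content": [{"type": "text", "text": "...query results as JSON..."}], "isError": false}}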
MCP is now automatically enabled - no separate configuration needed! Every flAPI instance runs both REST API and MCP servers concurrently.
Configuration files can define multiple entity types:
# Single configuration file serves as BOTH REST endpoint AND MCP tool
url-path: /customers/ # Makes this a REST endpoint

mcp-tool: # Also makes this an MCP tool
  name: get_customers
  description: Retrieve customer information by ID
  result_mime_type: application/json

request:
  - field-name: id
    field-in: query
    description: Customer ID
    required: false
    validators:
      - type: int
        min: 1
        max: 1000000
        preventSqlInjection: true

template-source: customers.sql
connection: [customers-parquet]

rate-limit:
  enabled: true
  max: 100
  interval: 60

auth:
  enabled: true
  type: basic
  users:
    - username: admin
      password: secret
      roles: [admin]

# MCP Resource example
mcp-resource:
  name: customer_schema
  description: Customer database schema definition
  mime_type: application/json
template-source: customer-schema.sql
connection: [customers-parquet]
Once MCP is enabled, you can interact with tools using JSON-RPC 2.0:
# Check MCP server health
curl 'http://localhost:8081/mcp/health'
# Initialize MCP connection
curl -X POST http://localhost:8081/mcp/jsonrpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "id": 1, "method": "initialize"}'
# List available tools (discovered from unified configuration)
curl -X POST http://localhost:8081/mcp/jsonrpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "id": 2, "method": "tools/list"}'
# Call a tool (same SQL template used for both REST and MCP)
curl -X POST http://localhost:8081/mcp/jsonrpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "id": 3, "method": "tools/call", "params": {"name": "get_customers", "arguments": {"id": "123"}}}'Here's a simple example of how to create an API endpoint using flAPI:
flAPI uses the popular YAML format to configure the API endpoints. A basic configuration file looks like this:
project_name: example-flapi-project
project_description: An example flAPI project demonstrating various configuration options

template:
  path: './sqls' # The path where SQL templates and API endpoint configurations are stored
  environment-whitelist: # Optional: list of regular expressions for whitelisting envvars which are available in the templates
    - '^FLAPI_.*'

duckdb: # Configuration of the DuckDB instance embedded in flAPI
  db_path: ./flapi_cache.db # Optional: remove or comment out for an in-memory database; we use this store also as cache
  access_mode: READ_WRITE # See https://duckdb.org/docs/configuration/overview for more details
  threads: 8
  max_memory: 8GB
  default_order: DESC

connections: # A YAML map of database connection configurations; an API endpoint needs to reference one of these connections
  bigquery-lakehouse:
    # SQL commands to initialize the connection (e.g., installing, loading and configuring the BigQuery DuckDB extension)
    init: |
      INSTALL 'bigquery' FROM 'http://storage.googleapis.com/hafenkran';
      LOAD 'bigquery';
    properties: # A YAML map of connection-specific properties (accessible in templates via {{ context.conn.property_name }})
      project_id: 'my-project-id'
  customers-parquet:
    properties:
      path: './data/customers.parquet'

heartbeat:
  enabled: true # The heartbeat worker is a background thread which can be used to periodically trigger endpoints
  worker-interval: 10 # The interval in seconds at which the heartbeat worker will trigger endpoints

enforce-https:
  enabled: false # Whether to force HTTPS for API connections; we strongly recommend using a reverse proxy for SSL termination
  # ssl-cert-file: './ssl/cert.pem'
  # ssl-key-file: './ssl/key.pem'
After that, ensure that the template path (./sqls in this example) exists.
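A minimal project layout for this guide might look like this (a sketch using only the file names from these examples):

.
├── flapi.yaml          # root configuration shown above
└── sqls/
    ├── customers.yaml  # endpoint configuration (created next)
    └── customers.sql   # SQL template referenced via template-source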
Each endpoint is at least defined by a YAML file and a corresponding SQL template in the template path. For our example we will create the file ./sqls/customers.yaml:
url-path: /customers/ # The URL path at which the endpoint will be available

request: # The request configuration for the endpoint; this defines the parameters that can be used in the query
  - field-name: id
    field-in: query # The location of the parameter; options are 'path', 'query' and 'body'
    description: Customer ID # A description of the parameter, used in the auto-generated API documentation
    required: false # Whether the parameter is required
    validators: # A list of validators that will be applied to the parameter
      - type: int
        min: 1
        max: 1000000
        preventSqlInjection: true

template-source: customers.sql # The path to the SQL template that will be used to generate the endpoint
connection:
  - customers-parquet # The connection that will be used to execute the query

rate-limit:
  enabled: true # Whether rate limiting is enabled for the endpoint
  max: 100 # The maximum number of requests per interval
  interval: 60 # The interval in seconds

auth:
  enabled: true # Whether authentication is enabled for the endpoint
  type: basic # The type of authentication; options are 'basic' and 'bearer'
  users: # The users that are allowed to access the endpoint
    - username: admin
      password: secret
      roles: [admin]
    - username: user
      password: password
      roles: [read]

heartbeat:
  enabled: true # Whether the heartbeat worker (if enabled globally) will trigger this endpoint periodically
  params: # A YAML map of parameters that the heartbeat worker passes to the endpoint
    id: 123
There are many more configuration options available, see the full documentation for more details.
After creating the YAML endpoint configuration, we need to add the SQL template that connects the endpoint to the data connection. The template files use the Mustache templating language to dynamically generate the SQL query.
SELECT * FROM '{{{conn.path}}}'
WHERE 1=1
{{#params.id}}
AND c_custkey = {{{ params.id }}}
{{/params.id}}
The above template uses the `path` property defined in the connection configuration to directly query a local Parquet file. If the `id` parameter is provided, it is used to filter the results.
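For example, with `id=123` and the `customers-parquet` connection above, the Mustache expansion yields roughly this query (a sketch):

SELECT * FROM './data/customers.parquet'
WHERE 1=1
AND c_custkey = 123

Without `id`, everything between `{{#params.id}}` and `{{/params.id}}` is dropped and all rows are returned.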
To test the endpoint and see if everything worked, we can use curl. We should also provide the correct basic auth credentials (admin:secret in this case). To make the JSON result easier to read, we pipe the output to jq.
> curl -X GET -u admin:secret "http://localhost:8080/customers?id=123" | jq .
{
"next": "",
"total_count": 1,
"data": [
{
"c_mktsegment": "BUILDING",
"c_acctbal": 5897.82999999999992724,
"c_phone": "15-817-151-1168",
"c_address": "YsOnaaER8MkvK5cpf4VSlq",
"c_nationkey": 5,
"c_name": "Customer#000000123",
"c_comment": "ependencies. regular, ironic requests are fluffily regu",
"c_custkey": 123
}
]
}
flAPI uses the DuckDB DuckLake extension to provide modern, snapshot-based caching. You write the SQL to define the cached table, and flAPI manages schemas, snapshots, retention, scheduling, and audit logs.
- Configure DuckLake globally (alias is `cache` by default):

ducklake:
  enabled: true
  alias: cache
  metadata_path: ./examples/data/cache.ducklake
  data_path: ./examples/data/cache.ducklake
  data_inlining_row_limit: 10 # Enable data inlining for small changes (optional)
  retention:
    max_snapshot_age: 14d
  compaction:
    enabled: false
  scheduler:
    enabled: true
- Add a cache block to your endpoint (no `primary-key`/`cursor` → full refresh):
url-path: /publicis
template-source: publicis.sql
connection: [bigquery-lakehouse]
cache:
  enabled: true
  table: publicis_cache
  schema: analytics
  schedule: 5m
  retention:
    max_snapshot_age: 14d
  template_file: publicis/publicis_cache.sql
- Write the cache SQL template (CTAS):
-- publicis/publicis_cache.sql
CREATE OR REPLACE TABLE {{cache.catalog}}.{{cache.schema}}.{{cache.table}} AS
SELECT
p.country,
p.product_category,
p.campaign_type,
p.channel,
sum(p.clicks) AS clicks
FROM bigquery_scan('{{{conn.project_id}}}.landing__publicis.kaercher_union_all') AS p
GROUP BY 1, 2, 3, 4;
- Query from the cache in your main SQL:
-- publicis.sql
SELECT
p.country,
p.product_category,
p.campaign_type,
p.channel,
p.clicks
FROM {{cache.catalog}}.{{cache.schema}}.{{cache.table}} AS p
WHERE 1=1
Notes:
- The cache schema (`cache.analytics`) is created automatically if missing.
- Regular GET requests never refresh the cache. Refreshes happen on warmup, on schedule, or via the manual API.
- Data Inlining: when `data_inlining_row_limit` is configured, small cache changes (at or below the configured row limit) are written directly to DuckLake metadata instead of creating separate Parquet files. This improves performance for small incremental updates.
DuckLake supports writing very small inserts directly into the metadata catalog instead of creating a Parquet file for every micro-batch. This is called "Data Inlining" and can significantly speed up small, frequent updates.
- Enable globally: configure once under the top-level `ducklake` block:

ducklake:
  enabled: true
  alias: cache
  metadata_path: ./examples/data/cache.ducklake
  data_path: ./examples/data/cache.ducklake
  data_inlining_row_limit: 10 # inline inserts up to 10 rows
- Behavior:
  - Inserts with rows ≤ `data_inlining_row_limit` are inlined into the catalog metadata.
  - Larger inserts automatically fall back to normal Parquet file writes.
  - Inlining applies to all caches (global setting); there is no per-endpoint toggle.
- Manual flush (optional): you can flush inlined data to Parquet files at any time using DuckLake's `ducklake_flush_inlined_data` function. Assuming your DuckLake alias is `cache`:

-- Flush all inlined data in the catalog
CALL ducklake_flush_inlined_data('cache');
-- Flush only a specific schema
CALL ducklake_flush_inlined_data('cache', schema_name => 'analytics');
-- Flush only a specific table (default schema "main")
CALL ducklake_flush_inlined_data('cache', table_name => 'events_cache');
-- Flush a specific table in a specific schema
CALL ducklake_flush_inlined_data('cache', schema_name => 'analytics', table_name => 'events_cache');
- Notes:
  - This feature is provided by DuckLake and is currently marked experimental upstream. See the DuckLake docs for details: Data Inlining.
  - If you don't set `data_inlining_row_limit`, flAPI won't enable inlining and DuckLake will use regular Parquet writes.
The engine infers sync mode from your YAML:
- No `primary-key`, no `cursor` → full refresh (CTAS)
- With `cursor` only → incremental append
- With `primary-key` + `cursor` → incremental merge (upsert)
Example YAMLs:
# Incremental append
cache:
  enabled: true
  table: events_cache
  schema: analytics
  schedule: 10m
  cursor:
    column: created_at
    type: timestamp
  template_file: events/events_cache.sql

# Incremental merge (upsert)
cache:
  enabled: true
  table: customers_cache
  schema: analytics
  schedule: 15m
  primary-key: [id]
  cursor:
    column: updated_at
    type: timestamp
  template_file: customers/customers_cache.sql
Cache template variables available to your SQL:
- `{{cache.catalog}}`, `{{cache.schema}}`, `{{cache.table}}`, `{{cache.schedule}}`
- `{{cache.snapshotId}}`, `{{cache.snapshotTimestamp}}` (current)
- `{{cache.previousSnapshotId}}`, `{{cache.previousSnapshotTimestamp}}` (previous)
- `{{cache.cursorColumn}}`, `{{cache.cursorType}}`
- `{{cache.primaryKeys}}`
- `{{params.cacheMode}}` is available with values `full`, `append`, or `merge`
Incremental append example:
-- events/events_cache.sql
INSERT INTO {{cache.catalog}}.{{cache.schema}}.{{cache.table}}
SELECT *
FROM source_events
WHERE {{#cache.previousSnapshotTimestamp}} event_time > TIMESTAMP '{{cache.previousSnapshotTimestamp}}' {{/cache.previousSnapshotTimestamp}}
Incremental merge example:
-- customers/customers_cache.sql
MERGE INTO {{cache.catalog}}.{{cache.schema}}.{{cache.table}} AS t
USING (
SELECT * FROM source_customers
WHERE {{#cache.previousSnapshotTimestamp}} updated_at > TIMESTAMP '{{cache.previousSnapshotTimestamp}}' {{/cache.previousSnapshotTimestamp}}
) AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET
name = s.name,
email = s.email,
updated_at = s.updated_at
WHEN NOT MATCHED THEN INSERT (id, name, email, updated_at) VALUES (s.id, s.name, s.email, s.updated_at);
Refreshes are triggered as follows:
- Startup warmup: flAPI refreshes caches for endpoints with cache enabled.
- Scheduled refresh: controlled by `cache.schedule` on each endpoint (e.g., `5m`).
- Manual refresh: call the refresh API (see below).
- Regular GET requests do not refresh the cache.
flAPI maintains an audit table inside DuckLake at cache.audit.sync_events and provides control endpoints:
- Manual refresh:
curl -X POST "http://localhost:8080/api/v1/_config/endpoints/publicis/cache/refresh"
- Audit logs (endpoint-specific and global):
curl "http://localhost:8080/api/v1/_config/endpoints/publicis/cache/audit"
curl "http://localhost:8080/api/v1/_config/cache/audit"- Garbage collection (retention):
Retention can be configured per endpoint under `cache.retention`:

cache:
  retention:
    max_snapshot_age: 7d # time-based retention
    # keep_last_snapshots: 3 # version-based retention (subject to DuckLake support)
The system applies retention after each refresh, and you can also trigger GC manually:
curl -X POST "http://localhost:8080/api/v1/_config/endpoints/publicis/cache/gc"
- Compaction:
If enabled in the global `ducklake.scheduler`, periodic file merging is performed via DuckLake's `ducklake_merge_adjacent_files`.
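A minimal sketch of the knobs involved, reusing only the keys from the global configuration shown earlier (where compaction ships disabled):

ducklake:
  scheduler:
    enabled: true # background scheduler must be running
  compaction:
    enabled: true # periodically merge adjacent data files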
Use these variables inside your cache templates and main queries:
- Identification
  - `{{cache.catalog}}` → usually `cache`
  - `{{cache.schema}}` → e.g., `analytics` (auto-created if missing)
  - `{{cache.table}}` → your cache table name
- Mode and scheduling
  - `{{params.cacheMode}}` → `full` | `append` | `merge`
  - `{{cache.schedule}}` → if set in YAML
- Snapshots
  - `{{cache.snapshotId}}`, `{{cache.snapshotTimestamp}}`
  - `{{cache.previousSnapshotId}}`, `{{cache.previousSnapshotTimestamp}}`
- Incremental hints
  - `{{cache.cursorColumn}}`, `{{cache.cursorType}}`
  - `{{cache.primaryKeys}}` → comma-separated list, e.g., `id,tenant_id`
Authoring tips:
- Full refresh: use `CREATE OR REPLACE TABLE ... AS SELECT ...`.
- Append: `INSERT INTO cache.table SELECT ... WHERE event_time > previousSnapshotTimestamp`.
- Merge: `MERGE INTO cache.table USING (SELECT ...) ON pk ...`.
- Do not create schemas in templates; flAPI does that automatically.
- Cache refresh happens on every request: per-request refreshes are disabled by design. Ensure you're not calling the manual refresh endpoint from a client, and check that your logs show scheduled or warmup refreshes only.
- Schema not found: verify `cache.schema` is set; flAPI will auto-create it.
- Retention errors: use time-based `max_snapshot_age` first; version-based retention depends on DuckLake support.
flAPI extends plain YAML with lightweight include and environment-variable features so you can keep configurations modular and environment-aware.
- Write environment variables as `{{env.VAR_NAME}}` anywhere in your YAML.
- Only variables that match the whitelist in your root config are substituted:

template:
  path: './sqls'
  environment-whitelist:
    - '^FLAPI_.*' # allow all variables starting with FLAPI_
    - '^PROJECT_.*' # optional additional prefixes

- If the whitelist is empty or omitted, all environment variables are allowed.
Examples:
# Substitute inside strings
project_name: "{{env.PROJECT_NAME}}"
# Build include paths dynamically
template:
path: "{{env.CONFIG_DIR}}/sqls"You can splice content from another YAML file directly into the current document.
- Basic include: `{{include from path/to/file.yaml}}`
- Section include: `{{include:top_level_key from path/to/file.yaml}}` includes only that key
- Conditional include: append `if <condition>` to either form
Conditions supported:
- `true` or `false`
- `env.VAR_NAME` (include if the variable exists and is non-empty)
- `!env.VAR_NAME` (include if the variable is missing or empty)
Examples:
# Include another YAML file relative to this file
{{include from common/settings.yaml}}
# Include only a section (top-level key) from a file
{{include:connections from shared/connections.yaml}}
# Conditional include based on an environment variable
{{include from overrides/dev.yaml if env.FLAPI_ENV}}
# Use env var in the include path
{{include from {{env.CONFIG_DIR}}/secrets.yaml}}
Resolution rules and behavior:
- Paths are resolved relative to the current file first; absolute paths are supported.
- Includes inside YAML comments are ignored (e.g., lines starting with `#`).
- Includes are expanded before the YAML is parsed.
- Includes do not recurse: include directives within included files are not processed further.
- Circular includes are guarded against within a single expansion pass; avoid cycles.
Tips:
- Prefer section includes (`{{include:...}}`) to avoid unintentionally overwriting unrelated keys; see the sketch below.
- Keep shared blocks in small files (e.g., `connections.yaml`, `auth.yaml`) and include them where needed.
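For instance, a shared connections file and a root config that splices it in (a hypothetical layout; file names are illustrative):

# shared/connections.yaml
connections:
  customers-parquet:
    properties:
      path: './data/customers.parquet'

# flapi.yaml
project_name: example-flapi-project
{{include:connections from shared/connections.yaml}}

Before parsing, the include directive is replaced by the `connections` map from the shared file, leaving the rest of the root config untouched.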
The source code of flAPI is written in C++, and the build closely resembles the DuckDB build process. A good reference for the build is the GitHub Action in build.yaml. In essence, a few prerequisites need to be met:
- Install the dependencies: `sudo apt-get install -y build-essential cmake ninja-build`
- Check out the repository and submodules: `git clone --recurse-submodules https://github.com/datazoode/flapi.git`
- Build the project: `make release`
The build process will download and build DuckDB v1.1.2 and install the vcpkg package manager. We depend on the following vcpkg ports:
- `argparse` - Command-line argument parser
- `crow` - Our REST web framework and JSON handling
- `yaml-cpp` - YAML parser
- `jwt-cpp` - JSON Web Token library
- `openssl` - Crypto library
- `catch2` - Testing framework
Note: MCP support is built-in and doesn't require additional dependencies beyond what's already included.
For more detailed information, check out our full documentation.
We welcome contributions! Please see our Contributing Guide for more details.
flAPI is licensed under the Apache License 2.0. See the LICENSE file for more details.
If you have any questions or need help, please open an issue or join our community chat.

