Description
TLDR
Every destination provides a DestinationCapabilitiesContext instance that tells dlt how to behave internally when normalizing data for and loading data to this destination. This object holds a lot of information that we could use to automatically generate a config section for the website that explains to the user which features this destination supports. For example, preferred_loader_file_format tells dlt which file format is used for loading data if none is explicitly selected, and supports_tz_aware_datetime tells dlt whether the destination supports datetime columns that store a timezone.
Steps
Unfortunately, all our docs scripts are currently written in JS, so we can't integrate these changes into docs/website/tools/preprocess_docs.js and instead need to create an additional Python script that runs after this preprocess step. Ideally, preprocess docs would be written in Python; maybe we can do this soonish too.
- Familiarize yourself with the docs build process described in docs/website/README.md in the core repo. If you see any outdated info in there, please update the README file. You can check out package.json to see which scripts are run when the website is built and deployed. There are some tools that modify the markdown files to insert code snippets.
- Create a new script, website/tools/insert_destination_capabilities.py. When run, this script should inspect all files in the folder docs/website/docs_processed. These are the processed markdown files that are served as our docs at dlthub.com/docs, or locally when you run npm start. If a marker named <!--@@@DLT_DESTINATION_CAPABILITIES <destination_name>--> is found, a new markdown table with information about this destination should be inserted at that position. See how something similar is done with <!--@@@DLT_SNIPPET <snippet_name>--> markers.
- You can get the capabilities object of each destination like this (duckdb example):

      from dlt.destinations import duckdb
      caps = duckdb.capabilities()

  This might only work if destination credentials are provided, so consider making the _raw_capabilities() method public and static, and use that to get the capabilities of a destination.
- Add a <!--@@@DLT_DESTINATION_CAPABILITIES <destination_name>--> marker on each destination page, such as docs/website/docs/dlt-ecosystem/destinations/duckdb.md.
- For a start, render a table there that includes the following information for each destination:
  - preferred_loader_file_format
  - supported_loader_file_formats
  - preferred_staging_file_format
  - supported_staging_file_formats
  - has_case_sensitive_identifiers
  - supported_merge_strategies
  - supported_replace_strategies
  - supports_tz_aware_datetime
  - supports_naive_datetime
- Add this new script to package.json so that it runs right after every call to tools/preprocess_docs.js.
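The marker replacement step described in the list above could be sketched roughly as follows. This is only a minimal illustration, not the final script: the names `MARKER`, `insert_capabilities`, and `process_folder` are hypothetical, the `*.md` glob is an assumption about which files live in the processed docs folder, and `render_table` stands in for whatever function ends up building the table for a destination name.

```python
import re
from pathlib import Path
from typing import Callable

# Matches e.g. <!--@@@DLT_DESTINATION_CAPABILITIES duckdb--> and captures "duckdb".
MARKER = re.compile(r"<!--@@@DLT_DESTINATION_CAPABILITIES\s+(\w+)\s*-->")


def insert_capabilities(markdown: str, render_table: Callable[[str], str]) -> str:
    """Replace every capabilities marker with the table for its destination."""
    # A function replacement is used so the table text is inserted literally.
    return MARKER.sub(lambda match: render_table(match.group(1)), markdown)


def process_folder(folder: Path, render_table: Callable[[str], str]) -> int:
    """Rewrite markdown files under `folder`; return the number of changed files."""
    changed = 0
    for path in sorted(folder.rglob("*.md")):  # assumption: only .md files matter
        original = path.read_text(encoding="utf-8")
        updated = insert_capabilities(original, render_table)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Keeping the string transformation (`insert_capabilities`) separate from the file handling makes it easy to test the marker logic without touching the docs folder.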
Example destination capabilities section:
| Feature | Value | More |
|---|---|---|
| Default loader file format | parquet | (link to loader file format info in docs) |
| Supported loader file formats | parquet, csv | (link to loader file format info in docs) |
| Supports timezone in timestamps | True | (link to timezone information in docs) |
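A table like the one above could be rendered from a capabilities object along these lines. This is a hedged sketch: `FEATURES`, `format_value`, and `render_capabilities_table` are hypothetical names, most labels are illustrative, and a `SimpleNamespace` stands in for the real capabilities object so the example runs without dlt installed. The "More" column is left out because the link targets are not decided yet.

```python
from types import SimpleNamespace
from typing import Any

# (attribute name, label) pairs for the attributes listed in this issue.
# Labels not taken from the example table are illustrative only.
FEATURES = [
    ("preferred_loader_file_format", "Default loader file format"),
    ("supported_loader_file_formats", "Supported loader file formats"),
    ("preferred_staging_file_format", "Default staging file format"),
    ("supported_staging_file_formats", "Supported staging file formats"),
    ("has_case_sensitive_identifiers", "Has case sensitive identifiers"),
    ("supported_merge_strategies", "Supported merge strategies"),
    ("supported_replace_strategies", "Supported replace strategies"),
    ("supports_tz_aware_datetime", "Supports timezone in timestamps"),
    ("supports_naive_datetime", "Supports naive timestamps"),
]


def format_value(value: Any) -> str:
    """Turn a capability value into a short string for a table cell."""
    if value is None:
        return "n/a"
    if isinstance(value, (set, frozenset)):
        value = sorted(value, key=str)  # stable order for unordered collections
    if isinstance(value, (list, tuple)):
        return ", ".join(str(item) for item in value) or "n/a"
    return str(value)


def render_capabilities_table(caps: Any) -> str:
    """Build a two-column markdown table; missing attributes render as n/a."""
    lines = ["| Feature | Value |", "|---|---|"]
    for attr, label in FEATURES:
        lines.append(f"| {label} | {format_value(getattr(caps, attr, None))} |")
    return "\n".join(lines)


# Stand-in object with made-up values, used only for illustration.
example_caps = SimpleNamespace(
    preferred_loader_file_format="parquet",
    supported_loader_file_formats=["parquet", "csv"],
    supports_tz_aware_datetime=True,
)
print(render_capabilities_table(example_caps))
```

Using `getattr` with a default keeps the renderer tolerant of destinations that do not define every attribute.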