avoids passing naming conventions as modules #3229
Conversation
```py
import dlt
# you should be able to import sql_cs_latin2 here!
```
Maybe remove this line? It's confusing: the tip says not to import naming conventions as modules, and then this comment says "you should be able to import `sql_cs_latin2` here!"
`dlt` will import `tests.common.cases.normalizers.sql_upper` and use the `NamingConvention` class found in it as the naming convention.

:::tip
Do not pass custom naming conventions as modules when you configure them explicitly. We recommend the pattern below:
:::
So I'm creating my own module and then passing it as a string in the destination config, and dlt will attempt to import it? If I understood correctly, this needs a bit more explanation of what's happening. Right now users might think that `sql_cs_latin2` is one of the predefined naming conventions, especially if they do not read carefully.
Also, should this be mentioned in the ## Write your own naming convention section? I would expect this information to be there.
Commits in this PR:

* avoids passing naming conventions as modules (#3229)
* adds /home/rudolfix/src/dlt to sys.path when running dlt commands and a cli flag to disable it
* adds cli docs check to lint
* avoids passing custom naming as modules in docs
* removes cli docs check due to Python 3.9
* fixes deploy cli
* adds pokemon table count consts
* improves custom naming convention docs
Description
This PR adds $PWD to `sys.path` when running cli commands, reproducing Python's behavior when it runs scripts. The example problem: custom naming modules were not importable, so any pipeline inspection command was failing.
This also fixes the reported problems:
fixes #1622
fixes #2998
where custom naming modules were directly used. Now the documentation and examples use module names, which are picklable.