external tables support #216

Merged: 8 commits, Jun 30, 2025
Changes from all commits
18 changes: 18 additions & 0 deletions .github/workflows/ci-integration-tests-csae.yml
@@ -265,6 +265,18 @@ jobs:
timeout_seconds: 300
priority: interactive
retries: 1
dbt_external_tables:
type: teradata
host: $DBT_TERADATA_SERVER_NAME
user: $DBT_TERADATA_USERNAME
password: $DBT_TERADATA_PASSWORD
logmech: TD2
schema: dbt_external_tables
tmode: TERA
threads: 4
timeout_seconds: 300
priority: interactive
retries: 1
EOF
env:
DBT_TERADATA_SERVER_NAME: ${{ steps.create-csae-environments.outputs.teradata-server-name }}
@@ -291,6 +303,12 @@ jobs:
cd $GITHUB_WORKSPACE/test/valid_history_test
chmod 777 run.sh
./run.sh

- name: Run external tables tests
run: |
cd $GITHUB_WORKSPACE/test/dbt_external_tables_test/integration_tests
chmod 777 run.sh
./run.sh

- name: Run nopi tests
run: |
78 changes: 78 additions & 0 deletions README.md
@@ -771,6 +771,84 @@ If no query_band is set by the user, the default query_band comes into play, which is:

> In Teradata, reusing the same alias across multiple common table expressions (CTEs) or subqueries within a single model is not permitted and results in parsing errors; assign a unique alias to each CTE or subquery to ensure proper query execution.

## dbt-external-tables
* [dbt-external-tables](https://github.com/dbt-labs/dbt-external-tables) is supported by dbt-teradata from v1.9.3 onwards.
* Under the hood, dbt-teradata uses the concept of foreign tables to create tables from external sources. More information can be found [here](https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/SQL-Data-Definition-Language-Syntax-and-Examples/Table-Statements/CREATE-FOREIGN-TABLE).
* Users need to add the dbt-external-tables package as a dependency; it is resolved with the `dbt deps` command:
```yaml
packages:
- package: dbt-labs/dbt_external_tables
version: [">=0.9.0", "<1.0.0"]
```
* Users need to add a dispatch config so the project picks up the overridden macros from the dbt-teradata package:
```yaml
dispatch:
- macro_namespace: dbt_external_tables
search_order: ['dbt', 'dbt_external_tables']
```
* To define `STOREDAS` and `ROWFORMAT` in dbt-external-tables, one of the options below can be used:
    * use the standard dbt-external-tables configs `file_format` and `row_format`, respectively, or
    * add them in the `USING` config as described in Teradata's [documentation](https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/SQL-Data-Definition-Language-Syntax-and-Examples/Table-Statements/CREATE-FOREIGN-TABLE/CREATE-FOREIGN-TABLE-Syntax-Elements/USING-Clause).

* For external sources that require authentication, users need to create an authorization object and pass it in `tbl_properties` as an `EXTERNAL SECURITY` clause.
For more information on authorization objects, please follow this [link](https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/SQL-Data-Definition-Language-Syntax-and-Examples/Authorization-Statements-for-External-Routines/CREATE-AUTHORIZATION-and-REPLACE-AUTHORIZATION).
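
  For illustration only, an authorization object like the `MyAuthObj` referenced in the samples below might be created roughly as follows, based on Teradata's `CREATE AUTHORIZATION` syntax; the object name and credential values here are placeholders, not part of this PR:
  ```sql
  -- Sketch: create an authorization object with placeholder credentials,
  -- then reference it from tbl_properties as: EXTERNAL SECURITY MyAuthObj
  CREATE AUTHORIZATION MyAuthObj
  AS DEFINER TRUSTED
  USER 'my_access_key_id'          -- placeholder: object-store access key
  PASSWORD 'my_secret_access_key'; -- placeholder: object-store secret key
  ```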

* Sample external sources are provided below as references:
```yaml
version: 2
sources:
- name: teradata_external
schema: "{{ target.schema }}"
loader: S3

tables:
- name: people_csv_partitioned
external:
location: "/s3/s3.amazonaws.com/dbt-external-tables-testing/csv/"
file_format: "TEXTFILE"
row_format: '{"field_delimiter":",","record_delimiter":"\n","character_set":"LATIN"}'
using: |
PATHPATTERN ('$var1/$section/$var3')
tbl_properties: |
MAP = TD_MAP1
,EXTERNAL SECURITY MyAuthObj
partitions:
- name: section
data_type: CHAR(1)
columns:
- name: id
data_type: int
- name: first_name
data_type: varchar(64)
- name: last_name
data_type: varchar(64)
- name: email
data_type: varchar(64)
```

```yaml
version: 2
sources:
- name: teradata_external
schema: "{{ target.schema }}"
loader: S3

tables:
- name: people_json_partitioned
external:
location: '/s3/s3.amazonaws.com/dbt-external-tables-testing/json/'
using: |
STOREDAS('TEXTFILE')
ROWFORMAT('{"record_delimiter":"\n", "character_set":"cs_value"}')
PATHPATTERN ('$var1/$section/$var3')
tbl_properties: |
MAP = TD_MAP1
,EXTERNAL SECURITY MyAuthObj
partitions:
- name: section
data_type: CHAR(1)
```
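
Once the sources are defined, the external tables can be created with the staging operation provided by the dbt-external-tables package:

```sh
# resolve the dbt-external-tables package dependency
dbt deps

# create (or refresh) the foreign tables for all sources with an external: config
dbt run-operation stage_external_sources
```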

## Fallback Schema
dbt-teradata internally creates temporary tables to fetch the metadata of views for manifest and catalog creation.
If the user does not have permission to create tables in the schema they are working in, they can define a `fallback_schema` (on which they have the required CREATE/DROP privileges) as a variable in `dbt_project.yml`.
@@ -0,0 +1,5 @@
{%- macro teradata__create_external_schema(source_node) -%}
{{ exceptions.raise_compiler_error(
"Creating external schema is not implemented for the Teradata adapter"
) }}
{%- endmacro -%}
@@ -0,0 +1,33 @@
{% macro teradata__create_external_table(source_node) %}

{# https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/SQL-Data-Definition-Language-Syntax-and-Examples/Table-Statements/CREATE-FOREIGN-TABLE/CREATE-FOREIGN-TABLE-Syntax #}

{%- set columns = source_node.columns.values() -%}
{%- set external = source_node.external -%}
{%- set partitions = external.partitions -%}

CREATE FOREIGN TABLE {{source(source_node.source_name, source_node.name)}}
{{ ', ' + external.tbl_properties if external.tbl_properties }}
{% if columns -%}
(
{% for column in columns %}
{%- set column_quoted = adapter.quote(column.name) if column.quote else column.name %}
{{column_quoted}} {{column.data_type}} {{- ',' if not loop.last -}}
{%- endfor %}
)
{%- endif %}
USING (
LOCATION('{{ external.location }}')
{% if external.file_format -%} STOREDAS ('{{ external.file_format }}') {%- endif %}
{% if external.row_format -%} ROWFORMAT ('{{ external.row_format }}') {%- endif %}
{{ external.using }}
)
{% if partitions -%}
PARTITION BY (
{%- for partition in partitions -%}
{{ partition.name }} {{ partition.data_type if partition.data_type }} {{', ' if not loop.last}}
{%- endfor -%}
)
{%- endif %}

{% endmacro %}
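
As a rough illustration (not part of the PR), rendering this macro for the `people_csv_partitioned` sample source from the README would be expected to produce DDL of this shape, with `<schema>` standing in for the resolved target schema and exact whitespace differing:

```sql
CREATE FOREIGN TABLE <schema>.people_csv_partitioned
, MAP = TD_MAP1
,EXTERNAL SECURITY MyAuthObj
(
    id int,
    first_name varchar(64),
    last_name varchar(64),
    email varchar(64)
)
USING (
    LOCATION('/s3/s3.amazonaws.com/dbt-external-tables-testing/csv/')
    STOREDAS ('TEXTFILE')
    ROWFORMAT ('{"field_delimiter":",","record_delimiter":"\n","character_set":"LATIN"}')
    PATHPATTERN ('$var1/$section/$var3')
)
PARTITION BY (
    section CHAR(1)
)
```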
@@ -0,0 +1,24 @@
{% macro teradata__get_external_build_plan(source_node) %}

{% set build_plan = [] %}

{% set old_relation = adapter.get_relation(
database = source_node.database,
schema = source_node.schema,
identifier = source_node.identifier
) %}

{% set create_or_replace = (old_relation is none or var('ext_full_refresh', false)) %}

{% if create_or_replace %}
{% set build_plan = build_plan + [
dbt_external_tables.dropif(source_node),
dbt_external_tables.create_external_table(source_node)
] %}
{% else %}
{% set build_plan = build_plan + dbt_external_tables.refresh_external_table(source_node) %}
{% endif %}

{% do return(build_plan) %}

{% endmacro %}
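
The `ext_full_refresh` var checked above can be passed on the command line to force the drop-and-recreate branch of the build plan, per standard dbt-external-tables usage:

```sh
# drop and recreate the foreign tables even if they already exist
dbt run-operation stage_external_sources --vars "ext_full_refresh: true"
```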
20 changes: 20 additions & 0 deletions dbt/include/teradata/macros/dbt_external_tables/helpers/dropif.sql
@@ -0,0 +1,20 @@
{% macro teradata__dropif(node) %}

{% set old_relation = adapter.get_relation(
database = none,
schema = node.schema,
identifier = node.identifier
) %}

{% set drop_relation = old_relation is not none %}

{% set ddl = '' %}
{% if drop_relation %}
{% set ddl %}
DROP FOREIGN TABLE {{ old_relation.include(database=False) }}
{% endset %}
{% endif %}

{{ return(ddl) }}

{% endmacro %}
2 changes: 2 additions & 0 deletions setup.py
@@ -41,6 +41,8 @@
'macros/*.sql',
'macros/materializations/**/*.sql',
'macros/utils/*.sql',
'macros/dbt_external_tables/helpers/*.sql',
'macros/dbt_external_tables/*.sql',
'dbt_project.yml',
'sample_profiles.yml',
],
7 changes: 7 additions & 0 deletions test/dbt_external_tables_test/README.md
@@ -0,0 +1,7 @@
dbt-external-tables support is based on https://github.com/dbt-labs/dbt-external-tables

Testing instructions:
1. Create a Teradata dbt profile called 'teradata'.
2. Create a Python venv with dbt-teradata installed.
3. Change directory to integration_tests.
4. Run the run.sh file.
7 changes: 7 additions & 0 deletions test/dbt_external_tables_test/integration_tests/.gitignore
@@ -0,0 +1,7 @@

target/
dbt_packages/
logs/
.env/
profiles.yml
package-lock.yml
33 changes: 33 additions & 0 deletions test/dbt_external_tables_test/integration_tests/dbt_project.yml
@@ -0,0 +1,33 @@
name: 'dbt_external_tables_integration_tests'
version: '1.0'

profile: 'teradata'

config-version: 2

model-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]

target-path: "target"
clean-targets:
- "target"
- "dbt_packages"

dispatch:
- macro_namespace: dbt_external_tables
search_order: ['dbt', 'dbt_external_tables', 'dbt_external_tables_integration_tests']
- macro_namespace: dbt_utils
search_order: ['teradata_utils', 'dbt_utils']

seeds:
+quote_columns: false


sources:
dbt_external_tables_integration_tests:
plugins:
teradata:
+enabled: "{{ target.type == 'teradata' }}"
@@ -0,0 +1,8 @@
{% macro prep_external() %}
{{ return(adapter.dispatch('prep_external', 'dbt_external_tables')()) }}
{% endmacro %}

{% macro default__prep_external() %}
{% do log('No prep necessary, skipping', info = true) %}
{# noop #}
{% endmacro %}
@@ -0,0 +1,10 @@
version: 2

sources:
- name: not_external
tables:
- name: take_no_action
columns:
- name: id
data_type: varchar(255)
description: "primary key"
@@ -0,0 +1,102 @@
version: 2

sources:
- name: teradata_external
schema: "{{ target.schema }}"
loader: S3

tables:
- name: people_csv_unpartitioned
external: &csv-people
location: "/s3/s3.amazonaws.com/dbt-external-tables-testing/csv/"
file_format: "TEXTFILE"
using: |
-- PATHPATTERN ('$var1/$section/$var3')
tbl_properties: |
MAP = TD_MAP1
-- ,EXTERNAL SECURITY MyAuthObj
columns: &cols-of-the-people
- name: id
data_type: int
- name: first_name
data_type: varchar(64)
- name: last_name
data_type: varchar(64)
- name: email
data_type: varchar(64)
data_tests: &equal-to-the-people
- dbt_utils.equality:
compare_model: ref('people')
compare_columns:
- id
- first_name
- last_name
- email

- name: people_csv_partitioned
external:
<<: *csv-people
partitions:
- name: section
data_type: CHAR(1)
columns: *cols-of-the-people
data_tests:
- <<: *equal-to-the-people
- dbt_utils.at_least_one:
column_name: section


- name: people_pq_unpartitioned
external:
location: "/s3/s3.amazonaws.com/dbt-external-tables-testing/parquet/"
using: |
STOREDAS('PARQUET')
PATHPATTERN ('$var1/$section/$var3')
tbl_properties: |
MAP = TD_MAP1
-- ,EXTERNAL SECURITY MyAuthObj
columns:
- name: id
data_type: int
- name: first_name
data_type: varchar(64)
- name: last_name
data_type: varchar(64)
- name: email
data_type: varchar(64)
data_tests: *equal-to-the-people

- name: people_pq_partitioned
external:
location: "/s3/s3.amazonaws.com/dbt-external-tables-testing/parquet/"
partitions:
- name: COLUMN
- name: section
data_type: CHAR(1)
data_tests:
- <<: *equal-to-the-people
- dbt_utils.at_least_one:
column_name: section

- name: people_json_unpartitioned
external:
location: "/s3/s3.amazonaws.com/dbt-external-tables-testing/json/"
file_format: "TEXTFILE"
using: |
-- PATHPATTERN ('$var1/$section/$var3')
tbl_properties: |
MAP = TD_MAP1
-- ,EXTERNAL SECURITY MyAuthObj

- name: people_json_partitioned
external:
location: '/s3/s3.amazonaws.com/dbt-external-tables-testing/json/'
using: |
STOREDAS('TEXTFILE')
-- PATHPATTERN ('$var1/$section/$var3')
tbl_properties: |
MAP = TD_MAP1
-- ,EXTERNAL SECURITY MyAuthObj
partitions:
- name: section
data_type: CHAR(1)