Skip to content

[Bug] dbt-snowflake raises an error when caching less than the maximum number of objects in a schema #1234

Closed
@mikealfare

Description

Is this a new bug in dbt-snowflake?

  • I believe this is a new bug in dbt-snowflake
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I'm seeing this error when running dbt-snowflake on a schema with 12,500 objects:

16:39:51  Encountered an error:
Compilation Error in macro list_relations_without_caching (macros/adapters/metadata.sql)
  dbt will list a maximum of 100000 objects in schema <DATABASE>.<SCHEMA>.
             Your schema exceeds this limit. Please contact support@getdbt.com for troubleshooting tips,
             or review and reduce the number of objects contained.
  
  > in macro snowflake__get_paginated_relations_array (macros/adapters.sql)
  > called by macro snowflake__list_relations_without_caching (macros/adapters.sql)
  > called by macro list_relations_without_caching (macros/adapters/metadata.sql)
  > called by macro list_relations_without_caching (macros/adapters/metadata.sql)

I can confirm that there are less than 100,000 objects in this schema.

-- use dbt service account role
show objects in <DATABASE>.<SCHEMA>;
result: 12,500

Expected Behavior

I should be able to run dbt-snowflake on a schema with 12,500 objects and the cache should reflect all 12,500 objects with no error.

Steps To Reproduce

  1. Create a project with 12,500 objects in a single schema (or more than 10K really)
  2. Run dbt run on the full project

Relevant log output

No response

Environment

- OS: dbt Cloud
- Python: 3.9
- dbt-core: versionless
- dbt-snowflake: versionless

Additional Context

Initial analysis suggests this is the result of the 2024_07 Snowflake bundle. In particular, it looks like part of this change. We use the object name as a watermark when paginating. This is no longer deterministic, meaning that we could (and very likely do) get duplicates of relations and miss relations because we're picking a random starting point in a randomized list. In the past, this return recordset was ordered, making the pagination deterministic. We need to figure out a new watermark method.

Metadata

Assignees

Labels

type:bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions