Skip to content

Conversation

@ChenyuLInx
Copy link
Contributor

@ChenyuLInx ChenyuLInx commented Sep 21, 2022

resolves #5733

Description

Add .dbtignore function

Checklist

@cla-bot cla-bot bot added the cla:yes label Sep 21, 2022
@github-actions
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much simpler than I was afraid it would be. Big thanks to the maintainer of pathspec!

In some local testing, this seems to work just fine with partial parsing. Only real risk here is degraded parsing performance—adding more overhead to read_files_elapsed—which I haven't observed to a noticeable degree.

We'll want to document this, I just opened an issue so we don't lose track: dbt-labs/docs.getdbt.com#2043

@@ -1,3 +1,5 @@
import os
import pathspec # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New dependency to add in setup.py. Checked and confirmed that license is good to use

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!!

@ChenyuLInx ChenyuLInx marked this pull request as ready for review September 21, 2022 21:36
@ChenyuLInx ChenyuLInx requested review from a team as code owners September 21, 2022 21:36
reobj = re.compile(regex, re.IGNORECASE)

for relative_path_to_search in relative_paths_to_search:
# potential speedup for ignore_spec
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some potential things we can do, not so sure we need to do them before we run into a performance issue here

[".sql"],
ParseFileType.Macro,
saved_files,
dbt_ignore_spec,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't really like the pass through here but we either do this or need to refactor the whole read files part of code I think

Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signatures for the 'read_files' calls are getting kind of long, but I suspect that we'll be doing some major refactoring in that area sometime in the next year, so I think we can leave it until then to deal with it.

# if ignore_spec.matches(relative_path_to_search):
# continue
absolute_path_to_search = os.path.join(root_path, relative_path_to_search)
walk_results = os.walk(absolute_path_to_search)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should really avoid using os.walk or os.path in favor of Pathlib. All the os.path stuff is old AF and has weird operating system specific gotchas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree! I think we probably leave it to the larger refactor? Same as the signatures for read_files calls are getting long. I also don't like that but feels like right now might be okay to leave it.

@ChenyuLInx ChenyuLInx merged commit 207cc03 into main Sep 22, 2022
@ChenyuLInx ChenyuLInx deleted the feature/dbtignore branch September 22, 2022 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CT-1102] [Feature] .dbtignore

5 participants