Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repo checks: check documented methods exist #32320

Merged
merged 3 commits into from
Sep 3, 2024

Conversation

gante
Copy link
Member

@gante gante commented Jul 30, 2024

What does this PR do?

Adds a check to confirm that publicly documented methods exist. For instance, commenting out a publicly documented method like DynamicCache.to_legacy_cache triggers this check, which in essence is a reminder to be extra careful with methods tagged as documented.

In the process, also removed some old, unused logic.

@gante gante requested a review from LysandreJik July 30, 2024 10:43
@@ -413,22 +413,15 @@ def get_model_modules() -> List[str]:
"modeling_auto",
"modeling_encoder_decoder",
"modeling_marian",
"modeling_mmbt",
"modeling_outputs",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've changed the model structure since 4 years ago, when this list was first created: transformers.models exclusively contains folders with models.

As such, we don't need to manually ignore these files.

@@ -442,8 +435,7 @@ def get_model_modules() -> List[str]:
for submodule in dir(model_module):
if submodule.startswith("modeling") and submodule not in _ignore_modules:
modeling_module = getattr(model_module, submodule)
if inspect.ismodule(modeling_module):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant check in our current folder structure

Comment on lines -910 to -922
for doc_file in Path(PATH_TO_DOC).glob("**/*.rst"):
with open(doc_file, "r", encoding="utf-8", newline="\n") as f:
content = f.read()
raw_doc_objs = re.findall(r"(?:autoclass|autofunction):: transformers.(\S+)\s+", content)
documented_obj += [obj.split(".")[-1] for obj in raw_doc_objs]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have rst doc files for quite a long time

@@ -1094,50 +1126,6 @@ def check_model_type_doc_match():
)


# Re pattern to catch :obj:`xx`, :class:`xx`, :func:`xx` or :meth:`xx`.
Copy link
Member Author

@gante gante Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have rst doc files for quite a long time

"""Check all models are properly tested and documented."""
print("Checking all models are included.")
"""Check all models are tested and documented."""
print(" - checking all models are included.")
Copy link
Member Author

@gante gante Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: with the addition of spacing, we can clearly see that these prints belong to check_repo.py in make fixup, as opposed to a blob of text 🤗

Screenshot 2024-07-30 at 11 48 10

@@ -1054,7 +1053,7 @@ def ignore_undocumented(name: str) -> bool:

def check_all_objects_are_documented():
"""Check all models are properly documented."""
documented_objs = find_all_documented_objects()
documented_objs, documented_methods_map = find_all_documented_objects()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I intend to use this new mapping in check_docstrings, to perform basic quality checks on the docstring of publicly documented methods

@gante
Copy link
Member Author

gante commented Jul 30, 2024

arf, this is getting docker image issues due to imports

@gante gante closed this Jul 30, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante reopened this Aug 2, 2024
@gante gante force-pushed the check_documented_methods branch 2 times, most recently from 9230164 to 69ef594 Compare August 7, 2024 15:36
@gante gante force-pushed the check_documented_methods branch 2 times, most recently from a72b443 to 2846753 Compare August 22, 2024 10:22
@LysandreJik
Copy link
Member

Cool tool @gante! LMK when you want me to take a look!

@gante
Copy link
Member Author

gante commented Sep 2, 2024

@LysandreJik the PR is now ready 🙌 (it was failing before because the check repository CI image didn't have all needed dependencies to check methods in objects)

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, clean! And love the indentation, I agree it makes everything cleaner.

Thanks for the comments, helped understanding how the code work.

@gante gante merged commit d6534f9 into huggingface:main Sep 3, 2024
8 checks passed
@gante gante deleted the check_documented_methods branch September 3, 2024 16:40
itazap pushed a commit to NielsRogge/transformers that referenced this pull request Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants