@geographika
Contributor

Running docstub on a single file, or on multiple files, was fine, but as soon as an __init__.py file (even an empty one) was present, the docstub run . --no-cache command hung.

I thought it might be crashing due to the size of the library, but after debugging, it appears there is a break missing in the module_name_from_path function.

We are looking at generating stub files for the GDAL project - thanks for creating this tool.

@lagru added the "fix" label (Addresses regressions & bugs) on Oct 10, 2025
@lagru
Member

lagru commented Oct 10, 2025

Hmm, this fails Test_module_name_from_path::test_basic.

Thanks for the feedback and it's great that GDAL is trying out docstub. Very happy to listen to good and bad experiences. :)

if is_in_package:
    name_parts.insert(0, directory.name)
    directory = directory.parent
    break
Member

@lagru commented Oct 10, 2025

Given the failing test, this seems like it may not be the correct fix.

Could you perhaps provide more context on how the original bug occurred? Maybe we can construct a minimal reproducing example for it?

Member

At least in the context of the test case, it seems to deal fine with the presence of a __init__.py.

Member

In OSGeo/gdal#13198, in what directory did you run docstub? It's meant to be passed the path to a package, including the root directory of that package.

Contributor Author

Ah, that explains it. I'm running docstub on Python files generated by SWIG, from within the folder that contains them along with an __init__.py file. I've updated the PR to add a check that avoids the hang when run from within that folder, but it also works fine when pointed at the package folder: docstub run ./osgeo --no-cache --config C:\docs\gdal\swig\python\pyproject.toml.

For me it works if an absolute path is provided to docstub, but if I run it from the module directory itself (with docstub run .), then we get an infinite loop because the parent of . is ., apparently.
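
The observation that the parent of . is . can be checked with pathlib alone. The following is a minimal sketch; `steps_to_fixed_point` is a hypothetical helper used only for illustration and is not part of docstub.

```python
from pathlib import PurePosixPath


def steps_to_fixed_point(start: PurePosixPath, limit: int = 100) -> int:
    """Count how often `.parent` changes the path before it stops changing."""
    current = start
    for steps in range(limit):
        parent = current.parent
        if parent == current:  # `.parent` made no progress
            return steps
        current = parent
    return limit


# The relative path "." is its own parent, so a loop that relies on
# `.parent` eventually leaving the package never advances.
print(PurePosixPath(".").parent)                      # .
print(steps_to_fixed_point(PurePosixPath(".")))       # 0
print(steps_to_fixed_point(PurePosixPath("/a/b/c")))  # 3
```

An absolute path reaches the filesystem root after a finite number of steps, whereas "." is already a fixed point before the first step.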

Member

That fits. I tried to make docstub preserve relative paths because that makes output and (error) messages a bit more readable. But absolute paths are the more robust option.

I definitely think docstub should handle this case more gracefully: docstub run . where . is inside a package.

Is there some particular behavior you would expect in that case? I'm thinking about that myself right now.

You could convert the relative path to an absolute path with path = path.resolve() near the top of the function?
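
A rough sketch of that suggestion, assuming a simplified loop that walks up while an __init__.py is present. `module_name_sketch` is a hypothetical stand-in, not docstub's actual module_name_from_path, and the extra root check is an added defensive guard.

```python
import tempfile
from pathlib import Path


def module_name_sketch(path: Path) -> str:
    """Hypothetical, simplified stand-in for docstub's module_name_from_path."""
    path = path.resolve()  # absolute path: `.parent` can now reach the root
    name_parts = [] if path.name == "__init__.py" else [path.stem]
    directory = path.parent
    # Walk up while the directory is a package; stop at the filesystem root.
    while (directory / "__init__.py").is_file() and directory != directory.parent:
        name_parts.insert(0, directory.name)
        directory = directory.parent
    return ".".join(name_parts)


with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "osgeo"
    package.mkdir()
    (package / "__init__.py").touch()
    (package / "gdal.py").touch()
    print(module_name_sketch(package / "gdal.py"))      # osgeo.gdal
    print(module_name_sketch(package / "__init__.py"))  # osgeo
```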

Member

@lagru commented Oct 10, 2025

Oh, I'm more asking if it makes sense that docstub supports running inside or only on part of a Python package at all. Which types are matched depends on what types are collected throughout the package.

Right now I'm thinking, support this use case but warn that running on partial packages may lead to incomplete results.
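
One way such a warning could look, as a sketch only. `warn_if_subpackage` is a hypothetical name, it assumes the argument is a directory, and the check docstub actually uses may differ.

```python
import tempfile
import warnings
from pathlib import Path


def warn_if_subpackage(root: Path) -> bool:
    """Warn when the directory `root` sits inside a larger package.

    Hypothetical helper for illustration; assumes `root` is a directory.
    """
    root = root.resolve()
    parent_is_package = (root.parent / "__init__.py").is_file()
    if parent_is_package:
        warnings.warn(
            f"{root} appears to be part of a larger package; "
            "types defined elsewhere in that package will not be collected.",
            stacklevel=2,
        )
    return parent_is_package


with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "pkg"
    subpackage = package / "sub"
    subpackage.mkdir(parents=True)
    (package / "__init__.py").touch()
    (subpackage / "__init__.py").touch()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        print(warn_if_subpackage(package))     # False
        print(warn_if_subpackage(subpackage))  # True
    print(len(caught))  # 1
```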

Contributor Author

> Oh, I'm more asking if it makes sense that docstub supports running inside or only on part of a Python package at all. Which types are matched depends on what types are collected throughout the package.
>
> Right now I'm thinking, support this use case but warn that running on partial packages may lead to incomplete results.

For info, I get the same output if I run within the folder (using . which works with the latest commit), or outside with a full or relative path (./osgeo). All .py files are in the same folder though, so maybe this is more relevant to projects with .py files in different subfolders?

Both approaches require some types to be added to the pyproject.toml (I'm not sure why as the same types in some docstrings don't throw errors): https://github.com/OSGeo/gdal/blob/b8d0f72f306e7fc0b5a511d96221277797301be5/swig/python/pyproject.toml#L47

Member

I don't really understand how the osgeo package is structured or populated. I assume something like the osgeo.osr module is created during build time? So I'm not sure if docstub is missing something or this is simply a setup that docstub can't really support.

@geographika
Contributor Author

> Thanks for the feedback and it's great that GDAL is trying out docstub. Very happy to listen to good and bad experiences. :)

We didn't realise the GDAL Python docstrings had quite so many issues until running numpydoc lint and then docstub!
Once these were cleaned up, the stubs were generated successfully. We hope to add docstub to the CI to do this automatically. It seems perfect for the GDAL use case: a lot of time has gone into docstrings and parameters, so adding annotations to the code manually would be too time-consuming.

lagru added 2 commits October 18, 2025 15:24
`Path(".").parent` can't move past the "." in a relative path and will
just return `Path(".")` again. This led to `while True` never breaking.

To fix this, we make sure that absolute paths are used. We need to
resolve the `path` before `lru_cache` sees it. Otherwise, `lru_cache`
might return a wrong cached result in case the current working directory
changes. That should never happen in docstub, but I think it's still
good to be defensive here.

This change also adds a few other defensive guards and asserts.

Long term, it might be the least error-prone to resolve all paths to
absolute ones as soon as possible. However, we'd have to do some
additional work to shorten paths that are within the current working
directory. Otherwise, users might see unnecessarily long paths in
their output.
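
The caching concern described in the commit message above can be illustrated with a small sketch. The function names here are hypothetical and only demonstrate why the path is resolved before `lru_cache` sees it; they are not taken from docstub.

```python
import os
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def _describe_resolved(path: Path) -> str:
    # Cached on the absolute path, so the cache key is unambiguous.
    return str(path)


def describe(path: Path) -> str:
    """Resolve first, then delegate to the cached helper."""
    return _describe_resolved(path.resolve())


@lru_cache(maxsize=None)
def describe_naive(path: Path) -> str:
    # Cached on the path as given; Path(".") is one key for every directory.
    return str(path.resolve())


original_cwd = Path.cwd()
try:
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        os.chdir(first)
        naive_first, safe_first = describe_naive(Path(".")), describe(Path("."))
        os.chdir(second)
        naive_second, safe_second = describe_naive(Path(".")), describe(Path("."))
        os.chdir(original_cwd)  # leave the temp dirs before they are removed
finally:
    os.chdir(original_cwd)

print(naive_first == naive_second)  # True: stale cached result
print(safe_first == safe_second)    # False: each directory resolved anew
```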
Member

@lagru left a comment

I finally got back to working on this. I took the liberty to implement and push a hopefully more robust fix that also accounts for this function using lru_cache which might interfere with relative paths.

I also added a warning in case docstub is invoked on a subpackage only. That should hopefully clear up some confusion. In my book we can merge this if the CI passes. Let me know if this addresses the bug on your side.

Off topic: If I get time I might look into using only absolute paths in docstub. That should be the most robust option. But it requires additional work to shorten paths again before showing them to users.
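
Shortening absolute paths for display could be sketched roughly as follows. `display_path` is a hypothetical helper, not an existing docstub function, and pure POSIX paths are used so the example behaves the same on every platform.

```python
from pathlib import PurePath, PurePosixPath


def display_path(path: PurePath, cwd: PurePath) -> PurePath:
    """Return `path` relative to `cwd` when it lies inside it, else unchanged."""
    try:
        return path.relative_to(cwd)
    except ValueError:  # `path` is not located below `cwd`
        return path


cwd = PurePosixPath("/home/user/project")
print(display_path(PurePosixPath("/home/user/project/osgeo/gdal.py"), cwd))  # osgeo/gdal.py
print(display_path(PurePosixPath("/usr/lib/python3/os.py"), cwd))            # /usr/lib/python3/os.py
```

Paths outside the working directory stay absolute, so messages remain unambiguous while the common case stays short.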

@lagru added this to the v0.5 milestone on Oct 18, 2025
@geographika
Contributor Author

@lagru - I tested this locally, and it no longer hangs when run in a subdirectory. Thanks for looking into this.

I've created a PR to add docstub to the GDAL CI (OSGeo/gdal#13270) to check all annotations are valid. Next steps will be to look into refining the stub files.

@lagru
Member

lagru commented Oct 23, 2025

Good to know and thanks for testing! Merging now. I plan to ship it in a release soonish.

@lagru merged commit d2ba71d into scientific-python:main on Oct 23, 2025
8 checks passed