-
Notifications
You must be signed in to change notification settings - Fork 736
Fix BaseGithubClient and _generate_documents #526
Conversation
Noticed a few quick issue with type checking and tests and figured I'd send this over. Let me know if you have any other ideas on how to handle it or if you have any questions about the original issues. Thanks for maintaining these. Very much appreciated. 🍹 |
if branch is None: | ||
if branch_name is None: | ||
raise ValueError("Either branch or branch_name must be provided.") | ||
branch = branch_name | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is a nice one
but why can't you use branch_name directly? instead of adding a new parameter
await self.request(
"getBranch", "GET", owner=owner, repo=repo, branch=branch_name
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question, and I likely should have gone into some detail about why that was done.
In the BaseGithubClient
class the get_branch
method parameter name is branch_name
https://github.com/emptycrown/llama-hub/blob/cdb6b30ce8fbb765f9b8ee5d47a5d23f51b2b9ba/llama_hub/github_repo/github_client.py#L196
In the GithubClient
class the get_branch
method parameter name is branch
(note it does not have the _name
)
https://github.com/emptycrown/llama-hub/blob/cdb6b30ce8fbb765f9b8ee5d47a5d23f51b2b9ba/llama_hub/github_repo/github_client.py#L325
GithubRepositoryReader
uses BaseGithubClient
as the type for github_client
to enforce the interface that a github client would need to conform to.
https://github.com/emptycrown/llama-hub/blob/cdb6b30ce8fbb765f9b8ee5d47a5d23f51b2b9ba/llama_hub/github_repo/base.py#L71
Here is how the GithubClient
class is used with GithubRepositoryReader
:
https://github.com/emptycrown/llama-hub/blob/cdb6b30ce8fbb765f9b8ee5d47a5d23f51b2b9ba/tests/tests_github_repo/test_github_reader.py#L876
github_client
would typically be GithubClient
which should conform to BaseGithubClient
but currently does not due to the parameter names. If we were just to update either BaseGithubClient
or GithubClient
to align with the other by changing branch
to branch_name
, we might cause an issue with code in the wild that is using the parameter name branch
or branch_name
as opposed to passing the argument in via position. That is the reason for adding both as Optional[str] = None
and then throwing a ValueError
if neither are set.
I understand it a bit of an odd situation and hope that helps to add some additional context for the change. There also may be a preferred way of dealing with this situation based your preferences as the maintainer and I'd be more than happy to adjust for that.
Thank you again for taking a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood now, thanks for the detailed answer!
@mkly sounds good, can you update with main please? |
* Add `branch` parameter to `get_branch` definition due to `GithubClient` implementation `get_branch` definition making use of `branch_name` as opposed to `branch`. This issue is most prevelent when type checker runs and they do not match * Throw exception if neither `branch` or `branch_name` is not defined. Potentionally could also add deprecation messaging. * Fix definition of `_generate_documents` id parameter. Default to empty string `""` as `os.path.join` will omit an empty string from the path list, but will fail on `None` * Fix `test__generate_documents` `url` value likely added after test addition * Fix `test_github_client` test to `loader_hub` as `llama_hub` did not exist at commit sha selected in test * Add `pytest-asyncio` dev dependency as this is required for async tests in `test_github_client` that make use of the `@pytest.mark.asyncio` decorator _Note_: Tests in `tests_github_repo` are currently all skipped unless the skip declaration is removed or commented out from `tests_github_repo/__init__.py`
9de7c02
to
b015af8
Compare
Nice one, Thank you! |
branch
parameter toget_branch
definition due toGithubClient
implementationget_branch
definition making use ofbranch_name
as opposed tobranch
. This issue is most prevelent when type checker runs and they do not matchbranch
orbranch_name
is not defined. Potentionally could also add deprecation messaging._generate_documents
id parameter. Default to empty string""
asos.path.join
will omit an empty string from the path list, but will fail onNone
test__generate_documents
url
value likely added after test additiontest_github_client
test toloader_hub
asllama_hub
did not exist at commit sha selected in testpytest-asyncio
dev dependency as this is required for async tests intest_github_client
that make use of the@pytest.mark.asyncio
decoratorNote: Tests in
tests_github_repo
are currently all skipped unless the skip declaration is removed or commented out fromtests_github_repo/__init__.py