-
Notifications
You must be signed in to change notification settings - Fork 849
Add optional cache to whoami
#3568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Wauplin
wants to merge
3
commits into
main
Choose a base branch
from
add-caching-to-whoami
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -75,6 +75,7 @@ | |||||||||||||||||||||||||||||||
| BadRequestError, | ||||||||||||||||||||||||||||||||
| GatedRepoError, | ||||||||||||||||||||||||||||||||
| HfHubHTTPError, | ||||||||||||||||||||||||||||||||
| LocalTokenNotFoundError, | ||||||||||||||||||||||||||||||||
| RemoteEntryNotFoundError, | ||||||||||||||||||||||||||||||||
| RepositoryNotFoundError, | ||||||||||||||||||||||||||||||||
| RevisionNotFoundError, | ||||||||||||||||||||||||||||||||
|
|
@@ -1694,6 +1695,9 @@ def __init__( | |||||||||||||||||||||||||||||||
| self.headers = headers | ||||||||||||||||||||||||||||||||
| self._thread_pool: Optional[ThreadPoolExecutor] = None | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| # /whoami-v2 is the only endpoint for which we may want to cache results | ||||||||||||||||||||||||||||||||
| self._whoami_cache: dict[str, dict] = {} | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| def run_as_future(self, fn: Callable[..., R], *args, **kwargs) -> Future[R]: | ||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||
| Run a method in the background and return a Future instance. | ||||||||||||||||||||||||||||||||
|
|
@@ -1735,39 +1739,76 @@ def run_as_future(self, fn: Callable[..., R], *args, **kwargs) -> Future[R]: | |||||||||||||||||||||||||||||||
| return self._thread_pool.submit(fn, *args, **kwargs) | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| @validate_hf_hub_args | ||||||||||||||||||||||||||||||||
| def whoami(self, token: Union[bool, str, None] = None) -> dict: | ||||||||||||||||||||||||||||||||
| def whoami(self, token: Union[bool, str, None] = None, *, cache: bool = False) -> dict: | ||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||
| Call HF API to know "whoami". | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| If passing `cache=True`, the result will be cached for subsequent calls for the duration of the Python process. This is useful if you plan to call | ||||||||||||||||||||||||||||||||
| `whoami` multiple times as this endpoint is heavily rate-limited for security reasons. | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| Args: | ||||||||||||||||||||||||||||||||
| token (`bool` or `str`, *optional*): | ||||||||||||||||||||||||||||||||
| A valid user access token (string). Defaults to the locally saved | ||||||||||||||||||||||||||||||||
| token, which is the recommended method for authentication (see | ||||||||||||||||||||||||||||||||
| https://huggingface.co/docs/huggingface_hub/quick-start#authentication). | ||||||||||||||||||||||||||||||||
| To disable authentication, pass `False`. | ||||||||||||||||||||||||||||||||
| cache (`bool`, *optional*): | ||||||||||||||||||||||||||||||||
| Whether to cache the result of the `whoami` call for subsequent calls. | ||||||||||||||||||||||||||||||||
| If an error occurs during the first call, it won't be cached. | ||||||||||||||||||||||||||||||||
| Defaults to `False`. | ||||||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||||||
| # Get the effective token using the helper function get_token | ||||||||||||||||||||||||||||||||
| effective_token = token or self.token or get_token() or True | ||||||||||||||||||||||||||||||||
| token = token or self.token | ||||||||||||||||||||||||||||||||
| if token is False: | ||||||||||||||||||||||||||||||||
| raise ValueError("Cannot use `token=False` with `whoami` method as it requires authentication.") | ||||||||||||||||||||||||||||||||
| if token is True or token is None: | ||||||||||||||||||||||||||||||||
| effective_token = get_token() | ||||||||||||||||||||||||||||||||
| if effective_token is None: | ||||||||||||||||||||||||||||||||
| raise LocalTokenNotFoundError( | ||||||||||||||||||||||||||||||||
| "Token is required to call the /whoami-v2 endpoint, but no token found. You must provide a token or be logged in to " | ||||||||||||||||||||||||||||||||
| "Hugging Face with `hf auth login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens." | ||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||
| token = effective_token | ||||||||||||||||||||||||||||||||
|
Comment on lines
+1764
to
+1771
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (nit) not sure if we need an intermediate variable here
Suggested change
|
||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| if cache: | ||||||||||||||||||||||||||||||||
| if token in self._whoami_cache: | ||||||||||||||||||||||||||||||||
| return self._whoami_cache[token] | ||||||||||||||||||||||||||||||||
|
Comment on lines
+1773
to
+1775
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. just an excuse to use more Walrus operators in the codebase :)
Suggested change
|
||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| # Call Hub | ||||||||||||||||||||||||||||||||
| output = self._inner_whoami(token=token) | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| # Cache result and return | ||||||||||||||||||||||||||||||||
| if cache: | ||||||||||||||||||||||||||||||||
| self._whoami_cache[token] = output | ||||||||||||||||||||||||||||||||
| return output | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
| def _inner_whoami(self, token: str) -> dict: | ||||||||||||||||||||||||||||||||
| r = get_session().get( | ||||||||||||||||||||||||||||||||
| f"{self.endpoint}/api/whoami-v2", | ||||||||||||||||||||||||||||||||
| headers=self._build_hf_headers(token=effective_token), | ||||||||||||||||||||||||||||||||
| headers=self._build_hf_headers(token=token), | ||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||
| try: | ||||||||||||||||||||||||||||||||
| hf_raise_for_status(r) | ||||||||||||||||||||||||||||||||
| except HfHubHTTPError as e: | ||||||||||||||||||||||||||||||||
| if e.response.status_code == 401: | ||||||||||||||||||||||||||||||||
| error_message = "Invalid user token." | ||||||||||||||||||||||||||||||||
| # Check which token is the effective one and generate the error message accordingly | ||||||||||||||||||||||||||||||||
| if effective_token == _get_token_from_google_colab(): | ||||||||||||||||||||||||||||||||
| if token == _get_token_from_google_colab(): | ||||||||||||||||||||||||||||||||
| error_message += " The token from Google Colab vault is invalid. Please update it from the UI." | ||||||||||||||||||||||||||||||||
| elif effective_token == _get_token_from_environment(): | ||||||||||||||||||||||||||||||||
| elif token == _get_token_from_environment(): | ||||||||||||||||||||||||||||||||
| error_message += ( | ||||||||||||||||||||||||||||||||
| " The token from HF_TOKEN environment variable is invalid. " | ||||||||||||||||||||||||||||||||
| "Note that HF_TOKEN takes precedence over `hf auth login`." | ||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||
| elif effective_token == _get_token_from_file(): | ||||||||||||||||||||||||||||||||
| elif token == _get_token_from_file(): | ||||||||||||||||||||||||||||||||
| error_message += " The token stored is invalid. Please run `hf auth login` to update it." | ||||||||||||||||||||||||||||||||
| raise HfHubHTTPError(error_message, response=e.response) from e | ||||||||||||||||||||||||||||||||
| if e.response.status_code == 429: | ||||||||||||||||||||||||||||||||
| error_message = ( | ||||||||||||||||||||||||||||||||
| "You've hit the rate limit for the /whoami-v2 endpoint, which is intentionally strict for security reasons." | ||||||||||||||||||||||||||||||||
| " If you're calling it often, consider caching the response with `whoami(..., cache=True)`." | ||||||||||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||||||||||
| raise HfHubHTTPError(error_message, response=e.response) from e | ||||||||||||||||||||||||||||||||
| raise | ||||||||||||||||||||||||||||||||
| return r.json() | ||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a change in behavior, right? Previously, if
token=False, we could still get the token fromget_token(), but now we raise an exception?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, since
token=Falseexplicitly means "do not retrieve the token locally". The breaking change is that previously it would raise an HTTP 401 unauthorized issue, and now aValueError. I don't believe though that it's better like this to fail early.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I had to step away earlier but to clarify, I do agree that this change is the way to go and is inline with the docstring. But what I was saying was that previously, you would actually use the locally-saved token, even if a user passed
token=False, becausetoken or self.token or get_token()would keep going until a truthy value was found, so the logic would still fall back to a local token viaget_token().Unless I'm misunderstanding something, this would be a breaking change -- although I do agree that this is the right way to go. In fact, maybe a nit but I'd say we go even farther, and replace
token = self.tokenwithtoken = self.token if token is None else token, because again, this would mean that if a user doeswhoami(token=False), but has atokensaved to the class would end up using that token, which is contrary to the docstring.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^agree that we should do
token = self.token if token is None else tokensincetokencan beFalse(as we do in_build_hf_headers).