-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated _parse_operation_id regex to allow for non-standard Document Intelligence Endpoints #38696
base: main
Are you sure you want to change the base?
Conversation
…Intelligence endpoints.
Thank you for your contribution @jcbartle! We will review the pull request and get back to you soon. |
API change check API changes are not detected in this pull request. |
/azp run python - documentintelligence - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
Thanks for the contribution @jcbartle! Could you possibly add a unit test for this change? |
Thanks for the contribution @jcbartle - we may need to wait for the SDK maintain (@YalinLi0312 ) to return from vacation to provide us with some context on why a regex was being used here. Looking at the code it seems to me that this should just be using url parsing to extract that path, probably something like this should work? |
Good morning, @annatisch. Thank you for your response. I actually like your suggestion over mine, as it's more clear what is happening than with a regular expression. @YalinLi0312 - how do you feel about Anna's proposed change? |
Hey, @jcbartle. I think the urlparse approach would be preferred, as it is indeed more clear (and also more a bit quicker vs regex parsing). Are you still able to make the change? If not, one of us can take over, as this is still something that should be fixed. |
@pvaneck - good evening. I'm happy to make the changes, but I want to ensure I'm giving proper credit. @annatisch came up with the better code, but I discovered the "bug". What's the preferred path forward? |
@jcbartle, thanks for the willingness and considering attribution! For this, you can just go ahead and make the change as suggested since you found the issue, initiated the original change, and are reacting to reviewer feedback. To make this ready for merge, can you do the following:
|
Hey @jcbartle, checking in again. Still able to contribute here? |
Description
The regex within the _parse_operation_id method assumes the following URL format for Document Intelligence endpoints when interpreting the Operation-Location header value:
https://xxxxxxxx.cognitiveservices.azure.com/
The full Operation-Location header value would be something like (i.e., only one forward-slash character between the domain and "documentintelligence"):
https://xxxxxxxx.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-read/analyzeResults/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx?api-version=2024-07-31-preview
While this works fine when accessing Document Intelligence resources directly, this will break if the endpoint is abstracted away behind an API gateway (in our case, Azure API Management). In this configuration, your endpoint value is probably something like this:
https://apim.example.com/azure/document-intelligence/
With the full Operation-Location header value being something like (i.e., more than one forward-slash character between the domain and "documentintelligence"):
https://apim.example.com/azure/document-intelligence/documentintelligence/documentModels/prebuilt-read/analyzeResults/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx?api-version=2024-07-31-preview
Adjusting the regex to allow for any character (as opposed to any character except forward-slashes) between https:// and /documentintelligence/ enables this scenario to work correctly and, in my testing, did not impact the use of standard endpoint URLs.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines