Open
Description
- Package Name: azure-ai-documentintelligence
- Package Version: 1.0.0b4
- Operating System: Windows
- Python Version: 3.12.7
Describe the bug
poller.continuation_token() raises a TypeError if the input file is passed as an octet-stream to the initial begin_analyze_document call.
The exception is shown below:
Traceback (most recent call last):
  File "C:\temp\SDK_issue_report.py", line 25, in <module>
    poller_continuation_token = poller.continuation_token() # get continuation token FAILS
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\anatolip\AppData\Local\anaconda3\envs\py12\Lib\site-packages\azure\core\polling\_poller.py", line 224, in continuation_token
    return self._polling_method.get_continuation_token()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\anatolip\AppData\Local\anaconda3\envs\py12\Lib\site-packages\azure\core\polling\base_polling.py", line 651, in get_continuation_token
    return base64.b64encode(pickle.dumps(self._initial_response)).decode("ascii")
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot pickle 'BufferedReader' instances
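The pickling failure itself can be shown without the SDK. The sketch below is only an illustration: make_token is a hypothetical helper that mimics the base64.b64encode(pickle.dumps(...)) step from the traceback, and the placeholder bytes stand in for a real document. It assumes (not confirmed in the SDK source) that the initial response still holds a reference to the open file object that was passed as the request body.

```python
import base64
import io
import pickle


def make_token(payload):
    """Hypothetical helper: mimics the base64(pickle.dumps(...)) step in the traceback."""
    return base64.b64encode(pickle.dumps(payload)).decode("ascii")


# Placeholder content standing in for a real PDF (assumption for illustration).
data = b"%PDF-1.7 placeholder bytes"

# An in-memory BufferedReader, the same type that open(path, "rb") returns.
stream = io.BufferedReader(io.BytesIO(data))

# Plain bytes pickle without problems and survive a round trip.
bytes_token = make_token(data)

# A BufferedReader cannot be pickled, so the same step raises TypeError.
try:
    make_token(stream)
    stream_error = None
except TypeError as exc:
    stream_error = exc

print("bytes round trip ok:", pickle.loads(base64.b64decode(bytes_token)) == data)
print("stream error:", repr(stream_error))
```

If that assumption holds, any object graph that still references the file handle will fail in the same way, which would match the difference between the bytes_source call and the octet-stream call.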
To Reproduce
The code below shows that continuation_token() works if the file is passed to begin_analyze_document as bytes via AnalyzeDocumentRequest(bytes_source=...), but fails if the file is passed as an octet-stream.
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest
import base64, pickle, os

endpoint = os.environ["AZURE_DI_ENDPOINT"]
key = os.environ["AZURE_DI_KEY"]
path_to_sample_documents = r'c:\temp\SampleInvoice26andLineItems.pdf'
client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Case 1: file content passed as bytes via AnalyzeDocumentRequest
with open(path_to_sample_documents, "rb") as f:
    poller = client.begin_analyze_document("prebuilt-layout", AnalyzeDocumentRequest(bytes_source=f.read()))
poller_continuation_token = poller.continuation_token() # get continuation token WORKS
decoded_token = pickle.loads(base64.b64decode(poller_continuation_token))
resultUrl = decoded_token.http_response.headers.get('Operation-Location')
print(f"Results URL from continuation_token: {resultUrl} \n")

# Case 2: open file object passed as an octet-stream
with open(path_to_sample_documents, "rb") as f:
    poller = client.begin_analyze_document("prebuilt-layout", f, content_type="application/octet-stream")
print("ERROR during continuation_token!!!\n")
poller_continuation_token = poller.continuation_token() # get continuation token FAILS