Description
opened on Jan 18, 2024
Is your feature request related to a problem? Please describe.
I want to analyze base64-encoded documents with the new azure-ai-documentintelligence package.
It seems like I should do something like this:
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult, AnalyzeDocumentRequest
client = DocumentIntelligenceClient("my_endpoint", AzureKeyCredential("my_key"))
my_base64 = b"..."
poller = client.begin_analyze_document(
    "prebuilt-layout",
    analyze_request=AnalyzeDocumentRequest(base64_source=my_base64),
    content_type="application/octet-stream",
)
But no matter what I do, I'm greeted with the following error:
{
    "code": "InvalidContent",
    "message": "The file is corrupted or format is unsupported. Refer to documentation for the list of supported formats."
}
Since the documentation for AnalyzeDocumentRequest (Link) says that base64_source is of type Optional[bytes], I thought it would work like this.
However, all of the samples call this method in one of two ways:
- Passing a _io.BufferedReader like so:
with open("path/to/local/file", "rb") as f:
    poller = client.begin_analyze_document(
        "prebuilt-layout",
        analyze_request=f,
        content_type="application/octet-stream",
    )
- Passing a URL like so:
url = "https://some.url/Invoice_1.pdf"
poller = client.begin_analyze_document(
    "prebuilt-layout",
    AnalyzeDocumentRequest(url_source=url),
)
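One possible explanation for the InvalidContent error with base64_source, offered only as an assumption and not confirmed anywhere in this issue: if the client itself base64-encodes a bytes field when it builds the request body, then handing it bytes that are already base64 would produce a doubly encoded payload. The sketch below uses only the standard library to show what the receiving side would see in that case. The sample payload and all variable names are hypothetical.

```python
import base64

# Stand-in for the raw bytes of a real document (hypothetical sample data).
raw_document = b"%PDF-1.7 example payload"

# What the caller in this issue holds: the document already encoded as base64.
my_base64 = base64.b64encode(raw_document)

# ASSUMPTION: the client base64-encodes bytes fields when serializing the request.
wire_if_given_base64 = base64.b64encode(my_base64)
wire_if_given_raw = base64.b64encode(raw_document)

# The receiving side decodes the wire value exactly once.
seen_if_given_base64 = base64.b64decode(wire_if_given_base64)
seen_if_given_raw = base64.b64decode(wire_if_given_raw)

# Only the second case yields bytes that look like a PDF.
print(seen_if_given_base64.startswith(b"%PDF"))  # False
print(seen_if_given_raw.startswith(b"%PDF"))     # True
```

If that assumption holds, the field would need the raw document bytes (for example, the result of base64.b64decode(my_base64)) and not the base64 text.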
Describe the solution you'd like
I'd like examples or tutorials covering each of the ways to upload a document to Document Intelligence with the newly released package.
I'm especially interested in uploading a base64 bytes object.
Describe alternatives you've considered
I also tried wrapping the bytes in a buffer, using from io import BytesIO and with BytesIO(my_base64) as buf, but this didn't work either.
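A likely reason the BytesIO attempt behaves the same way is that the buffer still holds base64 text, whereas the file-based sample streams the raw file bytes. A minimal sketch of decoding first and wrapping the result, using only the standard library, follows. The helper name base64_to_stream and the sample payload are hypothetical, and it is an assumption that the octet-stream path expects raw document bytes.

```python
import base64
from io import BytesIO


def base64_to_stream(encoded: bytes) -> BytesIO:
    """Decode base64 input and return a readable stream of the raw bytes."""
    # validate=True raises binascii.Error if the input contains characters
    # outside the base64 alphabet (including embedded newlines).
    raw = base64.b64decode(encoded, validate=True)
    return BytesIO(raw)


# Hypothetical sample: base64 of a tiny PDF-like payload.
my_base64 = base64.b64encode(b"%PDF-1.7 example payload")
buf = base64_to_stream(my_base64)
```

The resulting buf could then be passed as analyze_request with content_type="application/octet-stream", in the same way as the open file object in the sample above. That call is not exercised here.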