Accelerate your Form Recognizer solution to production with this Solution Accelerator. It uses an Azure Function and a set of Logic Apps to split multi-page PDF files into single-page PDF files and send each one to the REST API endpoint of a trained custom document model in Form Recognizer.
This solution implements two capabilities that are commonly required when working with a trained custom document model:
- Splitting multi-page PDF documents into individual, single-page PDF documents
- Analyzing the results of documents sent to the Form Recognizer REST API endpoint of a trained custom document model
See this blog post for detailed, step-by-step instructions on implementing this solution. We are also working on adding the same step-by-step instructions to this repository.
Step 1: Deploy core resources to Azure
Use the button below to deploy six Azure resources:
- Storage account
- Function app
- App Service plan
- Form Recognizer
- Logic app (x2)
Step 2: Create containers & upload data
Create the containers in the new storage account, then download the sample data from this repository and upload it to them.
Step 3: Train custom document model
Open the Form Recognizer Studio and train a custom document model.
Step 4: Deploy code to Function App
Deploy open-source Python code to your Function App to split multi-page PDF files into single-page PDF files.
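As a rough illustration of the splitting step, the sketch below writes each page of a PDF to its own single-page PDF in memory. It assumes the third-party `pypdf` library; the code this accelerator actually deploys may use a different library. The names `split_pdf` and `page_blob_name`, and the `_page_N.pdf` naming scheme, are hypothetical.

```python
import io


def page_blob_name(source_name: str, page_number: int) -> str:
    """Build an output name for one page (hypothetical naming scheme)."""
    has_pdf_suffix = source_name.lower().endswith(".pdf")
    stem = source_name[:-4] if has_pdf_suffix else source_name
    return f"{stem}_page_{page_number}.pdf"


def split_pdf(pdf_bytes: bytes, source_name: str) -> dict:
    """Return a mapping of output name -> bytes of a single-page PDF."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(io.BytesIO(pdf_bytes))
    pages = {}
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)      # copy exactly one page into a new document
        buffer = io.BytesIO()
        writer.write(buffer)       # serialize the one-page PDF to memory
        pages[page_blob_name(source_name, number)] = buffer.getvalue()
    return pages
```

In a Function App, a function like `split_pdf` would typically be called from the HTTP trigger handler, with the returned bytes handed back to the Logic App for storage.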
Step 5: Create Logic App to split multi-page PDF files
Create a Logic App that calls your Azure Function App with a multi-page PDF file as input and saves the resulting single-page PDF files.
Step 6: Configure Logic App to send single-page PDF document data to REST API endpoint of trained custom document model
Configure the second Logic App to send each single-page PDF file to the REST API endpoint of the trained custom document model in Form Recognizer and save the JSON response.
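For reference, the call that the Logic App makes can be sketched in Python with only the standard library. This is an approximation, not the Logic App definition itself: it assumes the `2022-08-31` REST API version and key-based authentication, and the function names are hypothetical. The analyze call is asynchronous, so the service replies with an `Operation-Location` URL that is polled until the analysis finishes.

```python
import json
import time
import urllib.request

API_VERSION = "2022-08-31"  # assumption: adjust to the version your resource uses


def build_analyze_request(endpoint, model_id, api_key, pdf_bytes):
    """Build the POST request that submits one PDF to a custom model."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    headers = {
        "Content-Type": "application/pdf",
        "Ocp-Apim-Subscription-Key": api_key,
    }
    return urllib.request.Request(url, data=pdf_bytes, headers=headers, method="POST")


def submit(request):
    """Send the request and return the URL to poll for the result."""
    with urllib.request.urlopen(request) as response:
        return response.headers["Operation-Location"]


def poll_result(operation_url, api_key, delay_seconds=2.0, max_attempts=30):
    """Poll the operation URL until the analysis succeeds or fails."""
    request = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": api_key}
    )
    for _ in range(max_attempts):
        with urllib.request.urlopen(request) as response:
            body = json.load(response)
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(delay_seconds)
    raise TimeoutError("analysis did not finish in time")
```

The endpoint, model ID, and key come from your own Form Recognizer resource and trained model.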
Step 7: Verify the results
Upload a multi-page PDF file and verify that the first Logic App produces single-page PDF files. Then, verify that the second Logic App sends each file to the custom model endpoint in Form Recognizer and saves the resulting JSON.
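One way to spot-check the saved JSON is to list each extracted field with its confidence score. The sketch below assumes the saved file follows the v3.0 analyze response shape (`analyzeResult.documents[].fields`), or contains just the `analyzeResult` object; the function name and the 0.8 review threshold are hypothetical choices.

```python
import json


def summarize_fields(result_json, min_confidence=0.8):
    """List extracted fields and flag those below a confidence threshold."""
    payload = json.loads(result_json)
    # Assumption: the file holds either the full operation response
    # or only its "analyzeResult" object.
    analyze_result = payload.get("analyzeResult", payload)
    rows = []
    for document in analyze_result.get("documents", []):
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            rows.append({
                "field": name,
                "content": field.get("content"),
                "confidence": confidence,
                # Treat a missing confidence score as needing review.
                "needs_review": confidence is None or confidence < min_confidence,
            })
    return rows
```

Fields flagged with `needs_review` are candidates for manual inspection or for adding more labeled training documents.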