
Azure OCR

Description

Use this step to extract text and data from PDF and image files using Azure Document Intelligence. You can send extraction requests directly to Azure endpoints or route them securely through the AE Gateway.

Prerequisites:

  1. Create a Document Intelligence resource in the Azure portal. This step is mandatory. See https://portal.azure.com/#home
  2. Obtain your Key and Endpoint from the Resource Management section of the resource in Azure, as shown in the following image:
AzureOCR

Processing Details:

  • Polling and Retries: Document processing is asynchronous. When you submit a request, the step waits for the service to finish. If the document requires more time, the step makes additional attempts to retrieve the result. If processing remains incomplete after these attempts, the step raises an exception.
  • Supported API Versions: 2024-11-30 and 2023-07-31.
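
The polling behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the step's actual implementation: `poll_for_result` and `fetch_status` are hypothetical names, the defaults mirror the Retry Count (6) and Retry Delay In Seconds (2) fields, and the sketch assumes the first attempt counts toward the retry count. The status values `succeeded` and `failed` are the ones Azure Document Intelligence returns for an analyze operation.

```python
import time
from typing import Any, Callable, Dict


def poll_for_result(
    fetch_status: Callable[[], Dict[str, Any]],
    retry_count: int = 6,
    retry_delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the analysis succeeds, fails, or attempts run out.

    fetch_status is a hypothetical callable that performs the GET request
    for the analyze result and returns the parsed JSON body.
    """
    for attempt in range(1, retry_count + 1):
        result = fetch_status()
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError(f"Analysis failed: {result.get('error')}")
        # Still "notStarted" or "running": wait before the next attempt.
        if attempt < retry_count:
            sleep(retry_delay_seconds)
    raise TimeoutError(f"Analysis still incomplete after {retry_count} attempts")


# Usage with canned responses and a no-op sleep, so nothing is sent over the network.
canned = iter([
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"content": "hello"}},
])
final = poll_for_result(lambda: next(canned), sleep=lambda seconds: None)
print(final["analyzeResult"]["content"])
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays.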

AzureOCR

Example Usage: Set the Input File Path parameter to the path of the file to process. To extract data from a specific page, enter 1 in the Pages field. Specify en for the Locale. Barcodes are decoded if present in the document. Provide the API Endpoint and Key created in the Azure portal. In this example, the prebuilt-read model is used for text extraction.

AzureOCR

Common Troubleshooting:

  • 401 Access Denied: Your subscription key or API endpoint is invalid. Verify your key and ensure you are using the correct regional API endpoint for your Azure resource.
    {
      "error": {
        "code": "401",
        "message": "Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource."
      }
    }
  • ModelNotFound: The requested model does not exist. Verify that you entered the correct prebuilt model name (e.g., prebuilt-document) or custom model ID.
    {
      "error": {
        "code": "NotFound",
        "message": "Resource not found.",
        "innererror": {
          "code": "ModelNotFound",
          "message": "The requested model was not found."
        }
      }
    }

To learn more about error codes, see: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/v3-error-guide?view=doc-intel-4.0.0
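
When handling these errors in a downstream script, it can help to pull out the most specific code from the error body. The following Python sketch shows one way to do that for the two response shapes listed above; `summarize_error` is a hypothetical helper, not something the step provides, and it assumes the body is valid JSON with an `error` object.

```python
import json
from typing import Optional, Tuple


def summarize_error(body: str) -> Tuple[Optional[str], str]:
    """Return (code, message), preferring the inner error when one is present."""
    payload = json.loads(body)
    error = payload.get("error", {})
    inner = error.get("innererror") or {}
    code = inner.get("code") or error.get("code")
    message = inner.get("message") or error.get("message", "")
    return code, message


# Usage with the ModelNotFound body shown in the troubleshooting list.
model_not_found = json.dumps({
    "error": {
        "code": "NotFound",
        "message": "Resource not found.",
        "innererror": {
            "code": "ModelNotFound",
            "message": "The requested model was not found.",
        },
    }
})
print(summarize_error(model_not_found))
```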

Reference Links:
API – Azure POST Request
https://learn.microsoft.com/en-us/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-v4.0%20(2024-11-30)&viewFallbackFrom=rest-aiservices-2023-07-31&preserve-view=true&tabs=HTTP#contentformat

API – Azure GET Request
https://learn.microsoft.com/en-us/rest/api/aiservices/document-models/get-analyze-result?view=rest-aiservices-v4.0%20(2024-11-30)&tabs=HTTP

Azure Document Intelligence Service
https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/?view=doc-intel-4.0.0

Configurations

| No. | Field Name | Description |
|---|---|---|
| 1 | Step Name | Name of the step. This name must be unique in a single workflow. The field is mandatory. |

Input tab

Use the Input tab to specify the document file path and define your extraction preferences, such as specific pages or language hints.

| No. | Field Name | Description |
|---|---|---|
| 1 | Input File Path | Specify the local path to the file you want to process. Use workflow variables (e.g., ${invoicePath}) to pass file paths dynamically. The field accepts a static or environment variable value. The field is mandatory. |
| 2 | Pages (Ex 1..2,4) | Specify the 1-based page numbers or page ranges to analyze, separated by commas. Example: 1-3,5 or 7-9. The field is mandatory. |
| 3 | Locale | Specify a language hint to improve text recognition using a language code (e.g., en) or BCP 47 language tag (e.g., en-US). Leave blank to auto-detect languages. The field is mandatory. |
| 4 | Features | Specify the features to extract. For example: Enter barcodes to decode barcodes and QR codes. Enter keyValuePairs to extract text as key-value pairs. Note: This requires using prebuilt-layout or prebuilt-document in the ModelId field. The field is mandatory. |
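
To make the Pages syntax concrete, the following Python sketch expands a value such as 1-3,5 into individual 1-based page numbers. It is a hedged illustration only: `parse_pages` is a hypothetical helper, and how the step itself validates this field is not documented here.

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a page specification such as "1-3,5" into sorted page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Page numbers are 1-based, and a range must not run backwards.
        if start < 1 or end < start:
            raise ValueError(f"Invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_pages("1-3,5"))
```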

Connection

Use the Connection tab to configure network routing, authentication credentials, and API behaviors.

| Field Name | Description |
|---|---|
| Use AE Gateway | Select to route requests through the AE Gateway. When selected, the Key field is disabled and the gateway token is used for authentication. Note: Supports Google Vertex AI and Azure OpenAI LLM providers. |
| OCR Analyze Endpoint | Specify the Azure Document Intelligence endpoint URL. If using the AE Gateway, specify the gateway URL. The field is mandatory. |
| OCR Result Retrieval Endpoint | Specify the AE Gateway URL used to retrieve the final analysis results. Note: The field is available only if the Use AE Gateway checkbox is selected. |
| Key | Specify your Azure Document Intelligence service key. If the Accept Value as variable/static checkbox is selected, the password field appears as a text box and accepts static or variable values. If the checkbox is cleared, the password field appears as a dropdown from which you can select a field from the previous steps. The field is mandatory. |
| API Version | Specify the Azure API version to use. Supported versions are V4 (2024-11-30) and V3.1 (2023-07-31). Default value: 2024-11-30 |
| Retry Count | Specify the number of retries for Azure API calls. Default value: 6. The field is mandatory. |
| Retry Delay In Seconds | Specify the delay, in seconds, between Azure API retries. Default value: 2. The field is mandatory. |
| ModelId | Specify the model to use for OCR. For a list of supported models, see the Azure Document Intelligence overview. The field is mandatory. |
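
For orientation, the Python sketch below shows how these connection settings could map onto a direct Azure analyze request. It is an assumption-laden illustration, not the step's code: `build_analyze_url` and `auth_headers` are hypothetical helpers, the endpoint is assumed to be the resource base URL, and the gateway URL layout is not covered. The path prefixes and the `Ocp-Apim-Subscription-Key` header follow the public Azure REST reference for the two supported API versions.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# URL path prefix used by each supported API version (per the Azure REST reference).
PATH_PREFIX = {
    "2024-11-30": "documentintelligence",
    "2023-07-31": "formrecognizer",
}


def build_analyze_url(
    endpoint: str,
    model_id: str,
    api_version: str = "2024-11-30",
    pages: Optional[str] = None,
    locale: Optional[str] = None,
    features: Optional[str] = None,
) -> str:
    """Build the analyze URL for a direct (non-gateway) call."""
    if api_version not in PATH_PREFIX:
        raise ValueError(f"Unsupported API version: {api_version}")
    params = {"api-version": api_version}
    for name, value in (("pages", pages), ("locale", locale), ("features", features)):
        if value:
            params[name] = value
    base = endpoint.rstrip("/")
    query = urlencode(params, safe=",")
    return f"{base}/{PATH_PREFIX[api_version]}/documentModels/{model_id}:analyze?{query}"


def auth_headers(use_gateway: bool, key: Optional[str] = None,
                 gateway_token: Optional[str] = None) -> Dict[str, str]:
    """Pick the authentication header for direct or gateway mode."""
    if use_gateway:
        if not gateway_token:
            raise ValueError("Gateway mode needs a gateway token")
        return {"Authorization": f"Bearer {gateway_token}"}
    if not key:
        raise ValueError("Direct mode needs the service key")
    return {"Ocp-Apim-Subscription-Key": key}


# Usage with a placeholder endpoint and the values from the example usage.
print(build_analyze_url("https://example.cognitiveservices.azure.com/", "prebuilt-read",
                        pages="1", locale="en", features="barcodes"))
```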

Headers

Add custom HTTP headers to include with every API call.

| Field Name | Description |
|---|---|
| Header Key | Specify a unique HTTP header key (for example, api-version, api-key, or x-tenant-id). Important: Do not add an Authorization key if you use AE Gateway mode. The system automatically sets the Authorization: Bearer <gatewayToken> header. Adding it manually causes a duplicate-header error. Note: Header key must not be null/empty. Duplicate header keys are rejected. |
| Header Value | Specify the corresponding value for the header key. Note: Header value must not be null/empty. |
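
The header rules above can be checked before a workflow runs. The Python sketch below is one possible pre-check, not the step's own validation: `validate_headers` is a hypothetical helper, and comparing header keys case-insensitively is an assumption based on how HTTP treats header names.

```python
from typing import Dict, List, Optional, Tuple


def validate_headers(
    headers: List[Tuple[Optional[str], Optional[str]]],
    use_gateway: bool,
) -> Dict[str, str]:
    """Apply the documented rules: no empty keys or values, no duplicates,
    and no manual Authorization header in AE Gateway mode."""
    validated: Dict[str, str] = {}
    seen = set()
    for key, value in headers:
        if key is None or not key.strip():
            raise ValueError("Header key must not be null/empty")
        if value is None or not value.strip():
            raise ValueError(f"Header value for {key!r} must not be null/empty")
        normalized = key.strip().lower()  # assumption: keys compare case-insensitively
        if normalized in seen:
            raise ValueError(f"Duplicate header key: {key!r}")
        if use_gateway and normalized == "authorization":
            raise ValueError("Authorization is set automatically in AE Gateway mode")
        seen.add(normalized)
        validated[key.strip()] = value
    return validated


print(validate_headers([("x-tenant-id", "tenant-42")], use_gateway=True))
```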

Output tab

Use the Output tab to define the workflow variables where the system saves the extracted results.

| Field Name | Description |
|---|---|
| JSON Output Var | Specify the variable name to store the complete JSON response. Default value: JSONResult |
| Text Output Var | Specify the variable name to store the extracted plain text. Default value: OCRTextResult |
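
To show how the two outputs relate, the Python sketch below derives a plain-text value and any decoded barcode values from a complete JSON response. It assumes the plain text corresponds to the `analyzeResult.content` field of the Azure response and that barcodes appear under `analyzeResult.pages[].barcodes[]`. Whether the step fills OCRTextResult in exactly this way is an assumption, and `extract_outputs` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def extract_outputs(response_body: str) -> Dict[str, Any]:
    """Split a complete analyze response into the JSON and text outputs."""
    parsed = json.loads(response_body)
    analyze_result = parsed.get("analyzeResult", {})
    barcodes = [
        barcode.get("value")
        for page in analyze_result.get("pages", [])
        for barcode in page.get("barcodes", [])
    ]
    return {
        "JSONResult": response_body,                          # complete JSON response
        "OCRTextResult": analyze_result.get("content", ""),   # extracted plain text
        "barcodes": barcodes,                                 # decoded values, if any
    }


# Usage with a trimmed, hand-written sample response.
sample = json.dumps({
    "status": "succeeded",
    "analyzeResult": {
        "content": "Invoice 42",
        "pages": [
            {"pageNumber": 1,
             "barcodes": [{"kind": "QRCode", "value": "https://example.com"}]}
        ],
    },
})
outputs = extract_outputs(sample)
print(outputs["OCRTextResult"], outputs["barcodes"])
```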