
Surface Actions: Configure OCR provider

  1. In the step, click Configure Provider. The Surface Actions dialog appears.


  1. In the Tool list, select the OCR provider. Available options are:
  • Tesseract
  • Google Vision
  • Microsoft Azure


  1. The configuration fields change depending on the tool you select.

Tesseract

Important

Before configuring Tesseract, download the required locale (.traineddata) file from the web. The System Administrator can upload the file through File Management → Files on the AutomationEdge Server. The English language file is available by default.

For Tesseract, configure the following field details:

Field Name | Description
Locale | Select the required language for OCR processing from the list.

Notes:
• By default, only the English language file (eng.traineddata) is included in the tool.
• To add other languages to the Locale list, place the additional language files in the tessdata folder, which is located in the same folder as the JAR. For example, add hin.traineddata for Hindi. The languages of the added files then appear in the Locale list.
• Make sure each language file follows the allowed format and is compatible with Tesseract OCR.

Allowed format: <lang-code>.traineddata

For details, see: https://github.com/tesseract-ocr/tessdata/tree/main
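
The naming rule above can be checked before files are placed in the tessdata folder. The following is a minimal sketch and not part of the product: the function name `available_locales` and the regular expression are assumptions made for illustration, and the check covers only the file name, not whether the file content is compatible with Tesseract.

```python
import re
from pathlib import Path

# Assumed pattern for <lang-code>.traineddata. Codes such as "eng", "hin",
# or "chi_sim" follow this shape in the tessdata repository.
TRAINEDDATA_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*\.traineddata$")


def available_locales(tessdata_dir):
    """Return the sorted language codes found in a tessdata folder."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p.name[: -len(".traineddata")]
        for p in folder.iterdir()
        if p.is_file() and TRAINEDDATA_NAME.match(p.name)
    )
```

Calling `available_locales("tessdata")` on a folder that contains eng.traineddata and hin.traineddata would return the codes `eng` and `hin`, which correspond to the entries expected in the Locale list.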

Google Vision

For Google Vision, configure the following field details:


References:

Field Name | Description
Api Key | Specify a valid Google Cloud Vision API key.

Note:
• Select Accept value as variable/static to use the API key as a static value or from an environment variable in the API key box.
• Clear the checkbox to use the API key from the previous step.
Locale | Specify the language code, for example, en.

Reference link: https://cloud.google.com/translate/docs/languages
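
How the step calls Google Vision internally is not described here. As a rough illustration of how an API key and a locale are typically passed to the public Cloud Vision REST method `images:annotate`, the sketch below builds such a request without sending it. The helper name `build_vision_request` and the environment variable name `GOOGLE_VISION_API_KEY` are hypothetical, and mapping the Locale field to `languageHints` is an assumption.

```python
import base64
import json
import os
from urllib.parse import urlencode

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes, locale="en", api_key=None):
    """Return (url, body) for a TEXT_DETECTION request; nothing is sent."""
    # Static value if given, otherwise a (hypothetical) environment variable.
    key = api_key or os.environ.get("GOOGLE_VISION_API_KEY", "")
    if not key:
        raise ValueError("No API key supplied")
    payload = {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
                # Assumption: the Locale field is used as a language hint.
                "imageContext": {"languageHints": [locale]},
            }
        ]
    }
    url = f"{VISION_URL}?{urlencode({'key': key})}"
    return url, json.dumps(payload)
```

The returned URL and JSON body could then be posted with any HTTP client. Keeping the key in an environment variable rather than in the workflow mirrors the variable option described in the note above.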

Microsoft Azure

For Microsoft Azure, configure the following field details:


To create resources, see: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/create-document-intelligence-resource?view=doc-intel-4.0.0

For Microsoft Azure, the following fields are available to configure:

Field Name | Description
Api Key | Specify a valid Azure Document Intelligence service key.

Notes:
• Select Accept value as variable/static to use the API key as a static value or from an environment variable in the API key box.
• Clear the checkbox to use the API key from the previous step.
API Endpoint | Specify a valid Document Intelligence service endpoint.

Notes:
• Select Accept value as variable/static to use the API Endpoint as a static value or from an environment variable in the API Endpoint box.
• Clear the checkbox to use the API Endpoint from the previous step.
Locale | Specify the language code, for example, en, or a BCP 47 language tag, for example, en-US.

For details, see: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/language-support/ocr?view=doc-intel-4.0.0&tabs=read-print%2Clayout-print%2Cgeneral
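
To show how the three Azure fields fit together, the sketch below assembles a Document Intelligence analyze request from an endpoint, a key, and a locale, without sending it. It is an illustration under stated assumptions: the helper name `build_read_request` is hypothetical, the `prebuilt-read` model and the `2024-11-30` API version are assumed rather than confirmed for this step, and the locale check is a simplified shape test, not a full BCP 47 validator.

```python
import re
from urllib.parse import urlencode

# Simplified shape check: "en" or a tag such as "en-US".
LOCALE_SHAPE = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$")


def build_read_request(endpoint, api_key, locale="en", api_version="2024-11-30"):
    """Return (url, headers) for an analyze call; nothing is sent."""
    if not endpoint.startswith("https://"):
        raise ValueError("Endpoint should be an https:// URL")
    if not LOCALE_SHAPE.match(locale):
        raise ValueError(f"Unexpected locale format: {locale!r}")
    query = urlencode({"api-version": api_version, "locale": locale})
    url = (
        f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
        f"prebuilt-read:analyze?{query}"
    )
    headers = {
        # The service key is sent in this header, never in the URL.
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    }
    return url, headers
```

A malformed locale or a non-HTTPS endpoint is rejected before any request is built, which is a cheap way to catch configuration mistakes early.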