Google Vision
Description
OCR: Google Vision plugin step detects and extracts text from an image and provides text output in JSON format.
Prerequisites:
- Create a Google Cloud Vision API key: https://cloud.google.com/docs/authentication/api-keys?hl=en&visit_id=637051029162974596-3924725435&rd=1#creating_an_api_key
- Add restrictions to API keys: https://cloud.google.com/docs/authentication/api-keys#api_key_restrictions
- Fill the details under the following as seen in the snapshot below, Billing -> Payment Settings and Billing -> Payment Method for API Key to work.
Configurations
No. | Field Name | Description |
---|---|---|
1 | Step Name | Name of the step. This name has to be unique in a single workflow. |
2 | API Key | |
3 | Accept Value as variable/static | Leave checkbox unchecked to accept API Key value from a field in the previous steps of the stream using a drop down list. Else enable checkbox for API Key field to appear as Text box. |
4 | API Key | Specify the API Key for authentication to Google Cloud Platform. This field is mandatory. API Key is encrypted and is not stored in the .psw file. API Key is entered using a widget. The widget handles both Text (static value or environment variable) and Combo (drop down containing values from previous steps). If checkbox above is enabled API Key field appears as Text box. Else if checkbox above is disabled API Key field appears as a drop down to select fields from previous steps. |
10 | Button: Test Connection | Test connection with the API provided. Verifies whether the connection is available or not. Note: If the connection fields are provided from previous step, then Test Connection Button does not work. |
Input Fields | ||
1 | Path/URL | Specify the path of the image file to be converted to text or click the Browse button to browse the file path. |
2 | Button: Browse | Clicking on this button brings up the dialog to browse the image file to be converted to text format. |
3 | Type | Specify an annotation feature that support optical character recognition (OCR). Specify one of the following annotation features: - ‘‘DOCUMENT_TEXT_DETECTION’ also extracts text from an image, but the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information. - TEXT_DETECTION’ detects and extracts text from any image. For example, a photograph might contain a street sign or a traffic sign. The JSON includes the entire extracted string, as well as individual words, and their bounding boxes. - ‘CROP_HINTS’ suggests vertices for a crop region on an image. - ‘FACE_DETECTION’ detects multiple faces within an image along with the associated key facial attributes such as emotional state or wearing headwear. - The ‘IMAGE_PROPERTIES’ feature detects general attributes of the image, such as dominant color. - ‘LABEL_DETECTION’ detects and extracts information about entities in an image, across a broad group of categories. Labels can identify general objects, locations, activities, animal species, products, and more. - ‘LANDMARK_DETECTION’ detects popular natural and human-made structures within an image. - ‘LOGO_DETECTION’ detects popular product logos within an image. - ‘OBJECT_LOCALIZATION’ detects multiple objects in an image and provides information about the objects and where the object was found in the image. - ‘SAFE_SEARCH_DETECTION’ detects explicit content such as adult content or violent content within an image. This feature uses five categories (adult, spoof, medical, violence, and racy) and returns the likelihood that each is present in a given image. - ‘WEB_DETECTION’ detects Web references to an image. Notes - Allowed values: ‘DOCUMENT_TEXT_DETECTION’, ‘TEXT_DETECTION’, ‘CROP_HINTS’, ‘FACE_DETECTION’, ‘IMAGE_PROPERTIES’, ‘LABEL_DETECTION’, ‘LANDMARK_DETECTION’, ‘LOGO_DETECTION’, ‘OBJECT_LOCALIZATION’, ‘SAFE_SEARCH_DETECTION’, ‘WEB_DETECTION. - You can also provide comma-separated (,) values. |
Output Fields | ||
1 | Result | Specify an output field to hold converted json text on successful plugin execution. The default value is OutputText. |