Utility: Digital PDF
Description
Digital PDF is a Java utility method that creates a searchable digital PDF by combining Azure OCR JSON output with the original source file. Use this utility within a User Defined Java Class step when your workflow needs to convert scanned images or non-searchable PDFs into text-searchable PDF documents. It takes the Azure OCR JSON (generated by the Azure OCR plugin), the original input file (PDF, JPEG, JPG, or PNG), and an optional DPI setting, then produces a digital PDF at the specified output path.
Example:
The following sample code calls the Digital PDF utility method from a User Defined Java Class step.
import com.automationedge.ps.docedge.utils.*;

// Create the utility class object
PDFUtils rpdf = new PDFUtils();

// Read the Azure OCR JSON from the incoming row field
String azureJson = get(Fields.In, "azureJson").getString(r);
String inputFile = "E:\\Files\\Issues\\replacePDF\\data\\Aadhar1.pdf";
String outputFile = "E:\\Files\\Issues\\replacePDF\\data\\Aadhar1_digital.pdf";

try {
    // Convert the Azure OCR JSON and the source file into a digital PDF at 200 DPI
    String outputFilePath = rpdf.convertAzureJsonToDigitalPDF(inputFile, azureJson, outputFile, 200);
} catch (Exception e) {
    // Propagate the failure to the caller
    throw e;
}
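The sample above hard-codes an output path that follows an `<input name>_digital.pdf` pattern. As a minimal sketch, assuming you want the same naming for arbitrary input files, the output path could be derived from the input path as shown below. The class `DigitalPdfPaths` and the method `deriveOutputPath` are hypothetical helpers written for this illustration; they are not part of `PDFUtils`.

```java
class DigitalPdfPaths {

    // Hypothetical helper (not part of PDFUtils): builds "<input name>_digital.pdf"
    // next to the input file. Handles both '\' and '/' as path separators.
    static String deriveOutputPath(String inputFile) {
        // Position of the last path separator, or -1 if the path has none
        int sep = Math.max(inputFile.lastIndexOf('/'), inputFile.lastIndexOf('\\'));
        // Position of the last dot, which marks the extension only if it
        // comes after the first character of the file name
        int dot = inputFile.lastIndexOf('.');
        String stem = dot > sep + 1 ? inputFile.substring(0, dot) : inputFile;
        return stem + "_digital.pdf";
    }
}
```

The result of `deriveOutputPath(inputFile)` could then be passed as the `outputFile` argument of `convertAzureJsonToDigitalPDF`.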
Method Input / Output Parameters
| No. | Field Name | Data type | Supported Format | Description |
|---|---|---|---|---|
| 1 | file | String | PDF, JPEG, JPG, PNG | Input file path. |
| 2 | azureJson | String | NA | Azure OCR JSON, generated by the Azure OCR plugin. |
| 3 | outputFile | String | PDF | Output file path. |
| 4 | dpi | float | NA | DPI value used for the internal conversion operation. This parameter is optional; its default value is 200. |
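The constraints in the table can be checked before the utility is called. The following is a minimal sketch, assuming the supported formats are identified by file extension and matched case-insensitively (the source does not state how the utility detects the format). `DigitalPdfInputCheck`, `isSupportedInput`, and `resolveDpi` are hypothetical names introduced for this illustration and are not part of `PDFUtils`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class DigitalPdfInputCheck {

    // Input formats listed in the parameter table
    static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("pdf", "jpeg", "jpg", "png"));

    // Default DPI documented for the optional dpi parameter
    static final float DEFAULT_DPI = 200f;

    // Assumption: the format is judged by file extension, ignoring case
    static boolean isSupportedInput(String file) {
        if (file == null) {
            return false;
        }
        int dot = file.lastIndexOf('.');
        if (dot < 0 || dot == file.length() - 1) {
            return false; // no extension present
        }
        String ext = file.substring(dot + 1).toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(ext);
    }

    // Falls back to the documented default when no usable DPI is supplied
    static float resolveDpi(Float dpi) {
        return (dpi == null || dpi <= 0) ? DEFAULT_DPI : dpi;
    }
}
```

A workflow could call `isSupportedInput(inputFile)` first and skip or flag the row when it returns `false`, so that unsupported files are caught before the conversion is attempted.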