Utility: Digital PDF

Description

Digital PDF is a Java utility method that creates a searchable digital PDF by combining Azure OCR JSON output with the original source file. Use this utility within a User Defined Java Class step when your workflow needs to convert scanned images or non-searchable PDFs into text-searchable PDF documents. It takes the Azure OCR JSON (generated by the Azure OCR plugin), the original input file (PDF, JPEG, JPG, or PNG), and an optional DPI setting, then produces a digital PDF at the specified output path.

Example:
The sample code below shows how to call the Digital PDF utility method from a User Defined Java Class step.

import com.automationedge.ps.docedge.utils.*;

// Create the utility class object
PDFUtils rpdf = new PDFUtils();

// Read the Azure OCR JSON from the incoming row field "azureJson"
String azureJson = get(Fields.In, "azureJson").getString(r);

// Original source file and the path where the digital PDF is written
String inputFile = "E:\\Files\\Issues\\replacePDF\\data\\Aadhar1.pdf";
String outputFile = "E:\\Files\\Issues\\replacePDF\\data\\Aadhar1_digital.pdf";

try {
    // Convert the Azure JSON to a digital PDF; the last argument is the DPI (200)
    String outputFilePath = rpdf.convertAzureJsonToDigitalPDF(inputFile, azureJson, outputFile, 200);
} catch (Exception e) {
    // Rethrow so that the step fails and reports the error
    throw e;
}

Method Input / Output Parameters

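| No. | Field Name | Data Type | Supported Format | Description |
|-----|------------|-----------|------------------|-------------|
| 1 | file | String | pdf, jpeg, jpg, png | Input file path. |
| 2 | azureJson | String | NA | Azure OCR JSON. It can be generated using the Azure OCR plugin. |
| 3 | outputFile | String | pdf | Output file path. |
| 4 | dpi | float | NA | DPI value for the internal conversion operation. This parameter is optional; its default value is 200. |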
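
The parameter constraints above (supported input formats, a PDF output path, and a default DPI of 200) can be checked before calling the utility. The following is a minimal sketch only: the class `DigitalPdfInputs` and its methods are hypothetical helpers, not part of `PDFUtils`, and the `_digital.pdf` naming convention is an assumption taken from the file names in the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT part of PDFUtils: validates inputs for the Digital PDF utility.
final class DigitalPdfInputs {

    // Supported input formats, as listed in the parameter table.
    static final List<String> SUPPORTED = Arrays.asList("pdf", "jpeg", "jpg", "png");

    // Documented default for the optional dpi parameter.
    static final float DEFAULT_DPI = 200f;

    // Returns the lower-cased file extension, or "" if the file name has none.
    static String extensionOf(String path) {
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        if (dot <= sep || dot == path.length() - 1) {
            return "";
        }
        return path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True if the input file has one of the supported extensions.
    static boolean isSupported(String path) {
        return SUPPORTED.contains(extensionOf(path));
    }

    // Derives an output path by replacing the extension with "_digital.pdf".
    // The naming convention is an assumption based on the example file names.
    static String digitalOutputPath(String inputPath) {
        if (!isSupported(inputPath)) {
            throw new IllegalArgumentException("Unsupported input format: " + inputPath);
        }
        return inputPath.substring(0, inputPath.lastIndexOf('.')) + "_digital.pdf";
    }

    // Falls back to the documented default when no positive DPI is supplied.
    static float dpiOrDefault(Float dpi) {
        return (dpi == null || dpi <= 0f) ? DEFAULT_DPI : dpi;
    }
}
```

The values returned by `digitalOutputPath` and `dpiOrDefault` could then be passed as the `outputFile` and `dpi` arguments of `convertAzureJsonToDigitalPDF`, so that an unsupported file is rejected before the conversion is attempted.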