Skip to main content

Microsoft Excel Input

Description

Microsoft Excel Input is an Input step in AutomationEdge workflows.

The Microsoft Excel Input step extracts data from Excel files and converts it into rows for AutomationEdge workflows. This step serves as the data source for automation, ETL, or reporting tasks by passing processed rows to subsequent workflow steps.

Key Capabilities:

  • Reads data from multiple Excel sheets and formats it into structured rows.
  • Processes large datasets and handles multiple files simultaneously.
  • Supports dynamic file names and specific error-handling scenarios.
  • Functions primarily as the initial step in a workflow to initiate data movement.

Configurations

Field NameDescription
Step nameSpecify a unique name for the step, which will help you identify and reference it easily when debugging or linking steps in the workflow.

For example, Read Employee Excel File

The field is mandatory.

Files Tab:Use the Files tab to select the Excel processing engine, define source file paths, and configure file filtering logic. You can specify static file paths or populate filenames dynamically from previous workflow steps.
Spread sheet type (engine)Select the engine type for the input file(s) that best matches your input file format and system resources.

- EXCEL Streamer: Select the option when you work with large .xlsx files. The engine uses significantly less memory than Apache POI Streaming.

- Excel 2007 XLSX (Apache POI Streaming): Select the option when you work with very large .xlsx files, run on low-memory machines, or need continuous data streaming.

    Note: Excel streaming engines, such as Apache POI Streaming, are used for large .xlsx files and consume less memory. Apache POI is faster but may fail on low-memory systems.
    Advantages:
    - Uses very little memory.
    - Provides stable performance.
    Limitations:
    - Slightly slower than the standard Apache POI engine.
    - Does not support all features (major features are supported).

- Excel 2007 XLSX (Apache POI): Select the option for standard .xlsx files. It provides fast reading performance, supports password-protected files, and works well for medium-sized datasets. This option requires more memory.

- Excel 97-2003 XLS (JXL): The engine serves as the default selection for legacy compatibility. Use it strictly for processing older Excel 97 through 2003 .xls files. It does not support the modern .xlsx format.

- Excel Binary XLSB: Select the option when you work with .xlsb (Excel Binary) files, especially when you process very large datasets (10k–1M+ rows).

    Advantages:
    - It reads and writes sheets slightly faster, making it suitable for large spreadsheets.
    Notes:
    - It provides fast performance but reads all columns as strings.
    - Some content and error-handling functions are disabled.
    - Use this option only when the dataset is very large and string-only columns are acceptable. You can change types in the Fields tab.
    - The step does not support runtime file type parameters or the Add filenames to result option.

- Open Office ODS (ODFDOM): Select the option only when your requirement specifically involves processing .ods files created in OpenOffice or LibreOffice.

Default selection: EXCEL 97-2003 XLS (JXL)

File or directorySpecify path of a file.

Or

Specify the path of a folder that can have multiple files.

Note: Add the file or folder path to the Selected Files table.

Button: AddClick Add to move the currently configured file or folder path, regular expression (wildcard or exclude) and passwords into the Selected files table.

Note: You must add the directories to the table. If not added, the step cannot read input file details such as sheet names, field names, and other information.

Button: FolderClick Folder to browse and select a folder.

Use the option to read multiple files from the selected folder.

Notes:
- The step reads files from the specified folder based on the regular expression you provide.
- The files must be of same type.
- In case password protected, all files must have same password.
Button: FileClick File to browse and select the file.

Note: You can select single file to read.

Regular ExpressionSpecify a regular expression to read file(s) from the specified folder.

Use when the folder is selected and includes only those files that matches the specified regex pattern.

For example,
- .*\.xlsx - Include only XLSX files
- Sales_.*\.xls - Files starting with Sales_
- 2023.* - Any file starting with 2023

Note: The regex is required to identify which files to process; without it, the files in the folder will not be read.

Exclude Regular ExpressionSpecify the regular expression to exclude the unwanted files from folder.

The step will not read files that matched the specified expression.

For example,
- .*backup\.xlsx - Skip backup files
- temp.* - Skip temporary files

Field excludes all files that match the pattern defined in the regular expression.

For example:
- To read all XLSX files, specify the regular expression: .*\.xlsx
- To exclude XLSX files that start with Backup, specify the exclude regular expression: Backup_.*\.xlsx
File Password?Specify the password if the file(s) is password protected.

The step uses the specified password to open secured password-protected XLSX files.

Notes:
This field is available only when the sheet type (engine) is set to:
- Excel 2007 XLSX (Apache POI) OR
- Excel 2007 XLSX (Apache POI Streaming).

Important:

When reading a folder, all files must have the same password. If any file has a different password, the step fails.
Selected filesTable shows the list of added file(s) or folder(s) path along with specified wildcard expressions (Regular or Exclude), password and other information.

You can also specify the file or folder path manually, wildcard expression, file password, whether the file is required, and whether it contains subfolders.

Notes:
- If a required file is not available or missing from specified location, the workflow fails.
- When using a folder, all files must be accessible, and all encrypted files must use the same password, else step fails.
RequiredSelect Y to fail the workflow if the file is missing. Select N to skip missing files without error.
Include SubfoldersSelect Y to search all subdirectories within a selected folder; otherwise, select N.
File Password?Click File Password? to open the password dialog.

Enter the password, and when you select OK, the encrypted password appears in the field.

DeleteClick Delete to remove the file or folder. You must first select the row you want to delete.
EditClick Edit to modify the selected file or folder details in table fields.
Accept filename from previous steps:Use this configuration to read filenames dynamically from a preceding step in your workflow, such as Generate Rows, Data Grid, Get File Names, and so on.

This allows the step to process files identified during runtime rather than relying solely on static file paths.

Accept filenames from previous stepSelect the checkbox to read file name from previous step. You can choose any input step that outputs file paths, such as Generate Rows, Data Grid, Get File Names and so on.
Step to read filenames fromSelect the step name from which you want to read the file names.
Field in the input to use as filenameSelect the field from the previous step that contains the input file path. The plugin uses this data to locate the file.
Field in the input to use as passwordSelect the field that contains the password for the specific file. Configure this field when you process multiple files that require different passwords.

Notes:

- The file name must be provided by the previous field.
- Leave the field blank for files that are not password protected.
- The field is enabled only when the sheet type is Excel 2007 XLSX (Apache POI Streaming) or Excel 2007 XLSX (Apache POI).
Button: Show filenamesClick Show filenames button to preview the final list of files. The list displays the result of combining manually added files, dynamic filenames from previous steps, and any applied regular expression filters.
Sheets Tab:Set which sheet to read and specify the starting point for reading.
Sheet nameSpecify sheet name of the input excel file.

You can specify sheet name as, a single sheet, Multiple sheets, Sheet name variables (for example, ${SHEET}),

Dynamic sheet names from a previous step, Sheet index (when selecting by index)

Note: If no sheets are specified, the step reads all sheets.

Select BySpecify how the step reads the sheet from the input file. Choose one of the following options:

Two options are available:

- Index: Read the sheet using order numbers (0, 1, 2).
- Name: Read the sheet by its name.

Step reads the sheet based on the provided name or index.

Start rowSpecify the row number to start reading from.

Default, step reads from row 0.

Note: You can also specify the start row as a variable or field.

Start columnSpecify the column number to start reading from. The step begins reading from this row.

For example,

- 0 = Column A
- 1 = Column B
- 2 = Column C

Notes:

- You can also specify the start row as a variable or field.
- Use this when you want to skip the first few columns or read only a specific portion of the sheet.
Button: Get Sheet name(s)Click to select and add required sheets to the list.

When you click, the Enter List window opens. Add the sheets to the Your Selection panel and click OK. The selected sheets then appear in the list.

Content Tab:Controls how the step reads data from the Excel file, including options for handling header, empty rows, limit, encoding and other content-related settings.
HeaderSelect the option to treat the first row of the input file as the header.

When selected:

- The step treats the first row as a header and excludes it from the output.
- The step extracts field names from the header row.

When not selected:

- The step treats the first row as data and includes it in the output.
No empty rowsSelect the checkbox to skip empty rows in the Excel file; otherwise, empty rows appear in the output.

The checkbox is selected by default.

Stop on empty rowSelect the checkbox to stop reading as soon as it finds the first completely empty row.

Use this when your sheet has an empty row after the valid data.

LimitSpecify the maximum number of rows to read from the file

For example:

- Limit = 100 → reads only the first 100 rows
- Limit = blank → reads all rows

This is useful for testing.

EncodingSelect the character encoding type from the list to ensure text renders correctly.

Change this setting only if the file uses a different specific encoding or if you observe incorrect characters (such as ) in the output.

Default value: UTF-8.

Enable BatchingSelect Enable Batching to read large XLSB files in smaller chunks rather than loading the entire file into memory.

Note: The checkbox is enabled only if Spread sheet type is set to Excel Binary (XLSB).

Batch SizeSpecify how many rows to read in each batch.

Specify the number of rows to process in each batch. This optimizes memory usage for large datasets.

For example, 10,000 → memory-friendly reading chunks.

Note: This field is available only when Enable Batching is selected.

Result filenames:
Add filenames to resultSelect this checkbox to add the filename of each processed file to the Workflow Result stream. This allows subsequent steps to access the list of processed files for logging or auditing purposes.
Error Handling Tab:Use the Error Handling tab to define how the step handles invalid or unreadable data during execution. You can configure the step to skip faulty rows, stop the workflow entirely, or log specific errors for debugging.

When configuring the Warning files, Error files, or Failing line numbers directories, you can specify the location using one of two methods:

- Variable: Enter an environment variable to define the path dynamically. This is useful for portability across different environments (for example, using ${Internal.Entry.Current.Directory} or project-level variables).
- Browse: Click Browse to select a static system directory from the local file system.
Strict types?Select the checkbox to enforce strict data type validation. The step reports an error if a value does not match the expected data type, such as text appearing in a numeric column. Use this option to ensure high data integrity for critical datasets like financial records.
Ignore errors?Select the checkbox to continue workflow execution even if data reading errors occur. The step ignores the error and replaces the specific incorrect values with null or default entries. Use this setting when maximum data extraction is preferred over strict accuracy and the data is not critical.
Skip error lines?Select the checkbox to completely exclude any row that contains an error from the output. The step discards the faulty row and continues processing the remaining valid rows. Use this option to ensure the final output contains only clean data.
Warning files directorySpecify the folder path to store warning logs. These files record non-fatal issues that do not stop execution but warrant review. You can browse to select a static system directory or use environment variables (such as internal directory or project name) to define the location dynamically.
Error files directorySpecify the folder path to store error logs. These files contain rows that caused fatal errors, accompanied by descriptions to assist in debugging. You can browse to select a static system directory or use environment variables to define the location dynamically.
Failing line numbers files directorySpecify the folder path to store files listing the line numbers of failed rows. These logs help you locate and highlight specific errors within the source Excel file. You can browse to select a static system directory or use environment variables to define the location dynamically.
Fields Tab (Important):Use the Fields tab to define the specific Excel columns to extract and configure their output properties. You can map columns by name or by index, and set data types, formatting, and text processing rules. This configuration determines the final structure of the data flowing to the next workflow step.
Read Based on Column Index:Select the checkbox to map input fields based on their numerical index (position) rather than the column header name.

Get fields from header row: Click this button to scan the first row of the Excel file. The step automatically detects the headers and populates the Get Fields Based on Column Index table with the corresponding names and default settings.

NameSpecify the name of the column as it appears in the output rows. If you use Get fields from header row, this defaults to the Excel header name.
TypeSelect the data type for the field to ensure correct processing.

Available data types:

- Number
- String
- Date
- Boolean
- Integer
- BigNumber
- Binary
- Timestamp
- Internet Address
LengthSpecify the maximum allowable length for the field. For strings, this represents the character count; for numbers, it represents the total number of significant digits.

For example, Set Length to 10 to allow a maximum of 10 characters, such as INV1234567.

PrecisionSpecify the number of decimal places allowed for numeric values.

For example, Set Precision to 2 to store values like 123.45

Trim TypeSelect a trimming method to remove unwanted whitespace from the data:

- None
- Left
- Right
- Both

RepeatSelect Y (Yes) to enable the repeat function. When enabled, the step fills empty cells with the value from the immediately preceding row. This is essential for processing vertically grouped data.

For Example:

If Apple appears once and the next rows are blank, if Repeat is set to Y, the step will repeat Apple for those rows.

FormatSpecify the format to parse the incoming data correctly.

The format is used to read or display field value based on the selected type.

For example, Select MM/dd/yyyy to read a date value like 12/25/2025 correctly.

CurrencySpecify the currency symbol used in the input data, such as ₹, $, or €.

For example, enter to display amounts as ₹5000.

DecimalSpecify the character used as the decimal separator, for example, a dot( . ) or a comma (,).
GroupingSpecify the character used to separate thousands or other distinct groups within large numbers. For example, a dot( . ) or a comma (,).
Read Based on Column Name:Select checkbox Read Based on Column Name to get inputs based on column name. If not selected, columns are read using their index (0, 1, 2...) and Get Fields Based on Column Index table is available.

When you click Get fields from header row, the Excel header names are automatically copied into the Get Fields Based on Column Name table.

Advantages:
- Column order changes do not break the workflow.
- Fields are easier to understand and maintain.

Notes:

- If this checkbox is not selected, the table uses column index instead of names.
- This feature is also supported for both standard Excel formats and binary .xlsb files.
Column NameSpecify the name of column.

The column name is the header text from the Excel file.

You can specify the column name in different ways, such as static value, field, parameter, or global/local PS variable.

The column name must follow the format: SheetName.ColumnName

For example: If the sheet name is Country and the column name is Capital:

- Create a parameter Param1 = Country.Capital
- Then use ${Param1} in the Get Fields Based on Column Name table. If the value is coming from a field, simply select that field in the table.
Field NameSpecify internal name used for the defined column in the workflow output.
Alias NameSpecify an alternate column name that the system uses when the specified column name is not available in the input file. During execution, the system attempts to match the field using the alias name if the primary column name is missing.

Note: You can configure a maximum of five alias names per field.

Example: If the specified column name is customerID and it is not present in the input file, the system uses the alias name (for example, customer_id, customerid, CustomerId) to locate and read the corresponding column.

TypeSelect the data type for the field to ensure correct processing.

Available data types:

- Number
- String
- Date
- Boolean
- Integer
- BigNumber
- Binary
- Timestamp
- Internet Address
LengthSpecify the maximum allowable length for the field. For strings, this represents the character count; for numbers, it represents the total number of significant digits.

For example, Set Length to 10 to allow a maximum of 10 characters, such as INV1234567.

PrecisionSpecify the number of decimal places allowed for numeric values.

For example, Set Precision to 2 to store values like 123.45

Trim typeSelect a trimming method to remove unwanted whitespace from the data:

- None
- Left
- Right
- Both

RepeatSelect Y (Yes) to enable the repeat function. When enabled, the step fills empty cells with the value from the immediately preceding row. This is essential for processing vertically grouped data.

For Example:

If Apple appears once and the next rows are blank, if Repeat is set to Y, the step will repeat Apple for those rows.

FormatSpecify the format to parse the incoming data correctly.

The format is used to read or display field value based on the selected type.

For example, Select MM/dd/yyyy to read a date value like 12/25/2025 correctly.

CurrencySpecify the currency symbol used in the input data, such as ₹, $, or €.

For example, enter to display amounts as ₹5000.

DecimalSpecify the character used as the decimal separator, for example, a dot( . ) or a comma (,).
GroupingSpecify the character used to separate thousands or other distinct groups within large numbers. For example, a dot( . ) or a comma (,).
Fail on column name not foundSelect this checkbox to stop the workflow and report an error if a specified column is missing from the input file. If unselected, the step continues execution and assigns a null value to the missing column, ensuring the workflow does not break due to minor schema changes.
Ignore case for Column name and Alias namesSelect this checkbox to match column names regardless of capitalization. This allows the step to identify columns even if the header case in the file differs from the configuration (for example, matching ID with id ).
Get fields from header rowClick this button to automatically scan the source file and populate the table with available field names.

This feature reduces manual effort, especially when working with large files containing many columns.

Additional Outputs fields Tab:Use the Additional Output Fields tab to capture file metadata alongside your extracted data. These fields do not exist within the Excel sheet content but provide essential system details—such as file paths, sizes, and timestamps—that are useful for logging, auditing, and troubleshooting.
Full filename fieldSpecify field name to store the complete file path and name.

Example, C:/Data/Sales_Report_Jan.xlsx

Sheet name fieldSpecify the field name to store the name of the Excel worksheet from which the current row was read.
Sheet row nr fieldSpecify the field name to record the physical row number of the data as it appears in the source Excel sheet.
Row nr written fieldSpecify the field name to record the sequence number of the row in the workflow output. This is a running count of rows processed by the step, useful for validating data volume.
Short filename fieldSpecify the field name to store only the file name, excluding the directory path

For example, Sales_Report_Jan.xlsx.

Extension fieldSpecify the field name to store the file extension

For example, .xlsx, .xls

Path fieldSpecify the field name to store only the directory path where the file is located

For example, C:/Data/

Size fieldSpecify the field name to store the file size

Note: Field size returns in bytes.

Is hidden fieldSpecify the field name to indicate if the source file has the "Hidden" attribute in the file system.

This field returns a Boolean value (Y for yes, N for no).

Last modification fieldSpecify the field name to record the timestamp of the file's last modification.
Uri fieldSpecify the field name to store the complete Uniform Resource Identifier (URI) of the file

For example, file: D:/ExcelInput/TestFiles/Report_2023_Feb.xlsx

Root uri fieldSpecify the field name to store only the root component of the URI

For example, file: D:/