Execute Python Script
Description
Execute Python Script is a workflow step in the Scripting Plugin for Process Studio. The step runs Python scripts directly inside your automation workflows and brings the full power of Python's data science ecosystem into your process orchestration layer.
The step automates the data pipeline in three continuous actions:
- Ingests Data: Data automatically flows from previous steps directly into pandas DataFrames. You don't need to reference individual fields such as ${FIELD_NAME}; instead, you access the data through the DataFrame you declare in the Input Frames table on the Configure tab.
- Processes Logic: Your embedded Python script transforms, calculates, or filters the data.
- Outputs Results: The processed data moves immediately to your next workflow step.
You do not need to move files manually or write extra code to connect your data.
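The three actions above can be sketched as a single script body. This is a minimal sketch, assuming an input frame mapped to the name `orders` and an output variable `summary` (both names are hypothetical, and the in-memory data stands in for whatever the upstream step delivers):

```python
import pandas as pd

# Ingest: hypothetical stand-in for the DataFrame the step injects
# from an input frame mapped under the name "orders".
orders = pd.DataFrame({
    "region": ["East", "West", "East"],
    "amount": [100.0, 250.0, 75.0],
})

# Process: aggregate revenue per region.
summary = orders.groupby("region", as_index=False)["amount"].sum()

# Output: list "summary" in "Python Variables to get" so the next
# workflow step receives it.
print(summary)
```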
When to use the step
Use the Execute Python Script step when your workflow needs more flexibility than the built-in steps provide — when standard transformations, lookups, or rules fall short and you need the power of code.
The following scenarios show where this step fits naturally into a workflow:
- Generate files and dynamic reports
Your script reads a dynamic file path from a previous step, processes the data, and writes results to a new file — a filtered CSV, a summary report, or a formatted output for an external system. This works especially well for scheduled workflows that generate daily or weekly reports.
Python script:

```python
import pandas as pd

# Generate a filtered report from a dynamic path.
# "config" is the DataFrame mapped from the previous step.
dynamic_path = config['file_path'].iloc[0]
raw_data = pd.read_csv(dynamic_path)
report = raw_data[raw_data['status'] == 'active']
report.to_csv('/output/active_users_report.csv', index=False)
```

Business value: Your compliance team gets an automated daily report of active users, delivered to a shared folder every morning, without anyone running a manual export.
- Analyze and score data
When your process requires calculations beyond simple arithmetic — moving averages, percentiles, standard deviations, or custom scoring models — Python's numerical libraries handle this natively. You enrich your workflow data with calculated fields before passing it to a reporting step or a downstream system.
Python script:

```python
import pandas as pd

# "sales" is the DataFrame mapped from the previous step.
# Add a percentile rank to each record, then bucket it into tiers.
sales['percentile'] = sales['revenue'].rank(pct=True)
sales['tier'] = pd.cut(sales['percentile'], bins=[0, 0.25, 0.75, 1.0],
                       labels=['Low', 'Mid', 'High'])
```

Business value: Your sales managers receive pre-scored lead lists every morning. The workflow ranks, segments, and delivers, with no spreadsheet gymnastics required.
The common thread: In each scenario, the Python step acts as a flexible processing engine inside your workflow. Data arrives automatically from upstream steps, your script does the work, and results flow downstream — all without leaving Process Studio.
How the step works
Input Frames: Bridge your workflow to Python
The Configure tab connects your workflow data to the Python environment. Use the Input Frames Table to map the output of a previous step to a named pandas DataFrame. Once you create the mapping, the data appears in your script by that name. You don't need imports, file reads, or boilerplate.
Note: You can map multiple steps to multiple DataFrames. Each mapping creates a separate variable in your script's namespace.
| Previous Step Name | Pandas Frame Name | What Happens |
|---|---|---|
| CSV_Input | customers | All rows from CSV_Input appear as a DataFrame named customers. |
| DB_Lookup | orders | All rows from DB_Lookup appear as a DataFrame named orders. |
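Because each mapping becomes its own variable, a script can combine mapped frames directly. A minimal sketch, with hypothetical in-memory data standing in for the CSV_Input and DB_Lookup steps from the table above:

```python
import pandas as pd

# Hypothetical stand-ins for the two mapped input frames.
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ana", "Ben"]})
orders = pd.DataFrame({"customer_id": [1, 1, 2], "total": [50.0, 20.0, 30.0]})

# Both variables exist in the script's namespace, so joining them is
# an ordinary pandas merge.
enriched = orders.merge(customers, on="customer_id", how="left")
print(enriched)
```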
Automatic Type Conversions
The plugin maps workflow types to their closest Python/pandas equivalents automatically:
| Workflow Type | Python / Pandas Type | Notes |
|---|---|---|
| Numbers / Integers | Converted to numeric types | The plugin preserves precision. |
| Dates/Timestamps | Converted to datetime64 | The plugin adjusts for local timezone offsets. |
| Booleans | Converted to bool | - |
| Strings | Converted to object/string types | - |
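You can confirm how your columns arrived by inspecting `df.dtypes` at the top of the script. A small sketch, with hypothetical data mimicking the converted types described in the table:

```python
import pandas as pd

# Hypothetical frame mimicking a converted input frame.
df = pd.DataFrame({
    "amount": [10.5, 20.0],                                  # numeric
    "created": pd.to_datetime(["2024-01-01", "2024-01-02"]),  # datetime64
    "active": [True, False],                                  # bool
    "name": ["a", "b"],                                       # object/string
})

# Checking dtypes up front avoids surprises in later
# arithmetic or date logic.
print(df.dtypes)
```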
Usage Example: Dynamic Content
A powerful pattern uses a previous step to pass configuration values — file paths, thresholds, date ranges, or feature flags — into your script. Every field from the previous step becomes a column in the DataFrame. You read these values and use them to drive your script's behavior dynamically.
Scenario: A previous step outputs a single row with two fields: threshold and file_path. Map the step to a DataFrame named config.
Python Script:

```python
import pandas as pd

# Access the dynamic values from the first row of the input
dynamic_path = config['file_path'].iloc[0]
limit = config['threshold'].iloc[0]

# Use these values in your logic
raw_data = pd.read_csv(dynamic_path)
final_result = raw_data[raw_data['value'] > limit]
```
Note: Use .iloc[0] when the previous step provides configuration parameters in a single row. If your configuration step outputs multiple rows, iterate over them or filter as needed.
This pattern keeps your scripts reusable. The same script processes different files or applies different thresholds depending on what the workflow passes in — you don't edit a single line of code.
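If the configuration step outputs several rows, the same pattern extends to a loop with one iteration per configuration row. A sketch with hypothetical paths and thresholds (the `pd.read_csv` call is left as a comment because the files exist only in the real workflow):

```python
import pandas as pd

# Hypothetical multi-row config frame: one row per file to process.
config = pd.DataFrame({
    "file_path": ["/data/east.csv", "/data/west.csv"],
    "threshold": [100, 250],
})

results = []
for row in config.itertuples(index=False):
    # In the real step you would read and filter each file here, e.g.:
    #   raw = pd.read_csv(row.file_path)
    #   results.append(raw[raw["value"] > row.threshold])
    results.append((row.file_path, row.threshold))

print(results)
```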
Prerequisites and Installation
Recommendation: For best results, use Python 3.10.0 or later.
Choose the installation method:
Three methods are available to set up Python for the step. Choose the one that best fits your environment.
| Method | Best for | Trade-off |
|---|---|---|
| python.exe | Standard environments that already use Python. | You manage the installation and dependencies yourself. |
| Anaconda | Data science teams that need a rich package ecosystem. | Larger disk footprint; you manage the installation and dependencies yourself. |
| AE_Python | A dedicated, conflict-free setup for Process Studio. | Windows only; smaller footprint, no PATH conflicts (AE-Specific option). |
Note: For standard Python and Anaconda installations, you must install dependencies and troubleshoot setup issues manually.
Method 1: Standard Python Installation
- Download and install Python from: https://www.python.org/downloads.
- Use pip to install required third-party libraries.
- Add the following paths to your PATH Environment Variable:
<path_till_python_directory>\Python310;
<path_till_python_directory>\Python310\Scripts;
Method 2: Anaconda Installation
- Download and install Anaconda from: https://www.anaconda.com/download.
- Use conda, pip, or the Anaconda GUI to install required third-party libraries.
- Add the following paths to your PATH Environment Variable:
<ANACONDA_BASE>;
<ANACONDA_BASE>\Library\mingw-w64\bin;
<ANACONDA_BASE>\Library\usr\bin;
<ANACONDA_BASE>\Library\bin;
<ANACONDA_BASE>\Scripts;
Method 3: AE_Python Installation (Windows Only)
AutomationEdge provides a pre-packaged Python distribution (AE_Python) as a .zip file. Choose this method when you want a lightweight, isolated Python environment specifically for Process Studio.
Important: The AE_Python.zip package is available only for Windows. If you use another operating system, contact your System Administrator.
Why choose AE_Python?
Smaller footprint: AE_Python uses significantly less disk space than a full Anaconda installation.
No conflicts: AE_Python runs alongside your existing system Python without modifying your system's PATH variable (when you use the AE-Specific option). If your machine already runs Python 3.x for other projects, AE_Python doesn't interfere.
Option A: Default AE_Python Installation
- Get the Python setup zip file (for example, Python310.zip) from AutomationEdge.
- Extract the zip file.
- Add the following paths to your PATH Environment Variable:
<path_till_python_directory>\Python310;
<path_till_python_directory>\Python310\Scripts;
Option B: AE Specific Installation
Use this option to install a dedicated Python instance that only Process Studio and the AE Agent use. This approach doesn't change your system's PATH variable.
- Get the Python setup zip file (for example, Python310.zip) from AutomationEdge.
- Stop any running Process Studio instances or AE Agents that use the Machine Learning plugin.
- Create a case-sensitive folder named python in your <Process Studio root> and/or <Agent root> directory.
- Extract the zip file into the new folders:
  <Process Studio root>/python/
  <Agent root>/python/
- Confirm that your folder structure looks like this:
  <Process Studio root>/python/AE_Python<Version> (for example, D:/process-studio/python/AE_Python310)
  <Agent root>/python/AE_Python<Version> (for example, D:/ae-agent/python/AE_Python310)
- Start (or restart) your Process Studio and Agent instances to apply the changes.
Notes:
- <Process Studio root> is the main folder that appears when you unzip the Process Studio package from AEUI. The default folder name is process-studio.
- <Agent root> is the main folder that appears when you unzip the Agent package. The default folder name is ae-agent.
Configurations
| Field Name | Description |
|---|---|
| Step name | Specify the name of the step as it appears in the workflow workspace. This name must be unique in a single workflow. |
| Configure tab: | The Configure tab acts as the bridge between the workflow and the Python environment. Use the Configure tab to dictate how the workflow chunks data, samples incoming streams, and maps workflow steps to pandas DataFrames. |
| Row Handling: | |
| No. of rows to process | Specifies the execution frequency and the amount of data passed to the script at one time. Choose from the following options: - ALL: Waits for all rows from previous steps, creates a single DataFrame, and executes the script once. Example: You have a list of 1,000 employees. You want Python to sort the entire list alphabetically. You select ALL to send the complete list to Python at the same time. - Batch (N rows): Executes the script in batches. The DataFrame contains exactly N rows per execution. Example: You have 10,000 sales records. To prevent your computer from running out of memory, you enter 1000. The script runs 10 separate times, processing a group of 1,000 records each time. - Row by Row: Executes the script for every individual row. The DataFrame contains exactly one row per execution. Example: You have 50 phone numbers. You want Python to check if each phone number is valid. You select Row by Row, so the script processes exactly one phone number at a time. |
| Sampling Options | Use the sampling options to extract a randomized subset of data from large incoming streams, optimizing processing time and memory usage. |
| Reservoir Sampling | Select the check box to randomly sample rows from the incoming data stream. When you select this option, you must specify a numeric value in the Size field. Example: You receive a file containing 100,000 customer reviews, but you only need to check 100 of them. You select this check box and type 100 in the Size field. The workflow ignores the rest of the data and passes exactly 100 random reviews to your Python script. |
| Random seed | Specifies the value used to seed the random number generator. This option is available only when Reservoir Sampling is selected. Using the same seed reproduces the same random sample across workflow runs; changing the seed generates a different sample. Example: You need 50 random rows from a large file. You type 10 in the Random seed field. The workflow picks 50 random rows. - If you run the workflow again with the seed set to 10, you get the exact same 50 rows.- If you change the seed to 20, the workflow picks a different group of 50 rows. |
| Options: | |
| Include Input Fields as Output Fields | Select the check box to pass all incoming data fields directly through as output fields. Example: Your input data contains CustomerID and OrderTotal. Your Python script calculates a new DiscountRate. Selecting the check box ensures the next step in your workflow receives all three columns: CustomerID, OrderTotal, and DiscountRate. |
| Input Frames: | Use the Input Frames table to map incoming data streams to pandas DataFrames. |
| Step name | Select the input step whose data you want to use. The data from this step is passed as a DataFrame. |
| Pandas frame name | Specify the variable name for the pandas DataFrame. You can use this name in your Python script. If left blank, the system assigns a default name. For example, ps_data0. Example: If you name the frame customer_df, access it in your script as print(customer_df.head()) to view the first rows of the incoming data. |
| Python Script tab: | Use the Python Script tab to provide the execution logic. You can either write the code directly within the step or load an external .py file dynamically during runtime. |
| Load Script from file at Runtime | Select this check box to load a Python script dynamically from an external file when the workflow runs. Example: You already have a Python file named format_dates.py saved on your computer. Select the check box to run that exact file instead of retyping the code inside the workflow. |
| Script file Location | Specify the file path, or click Browse to locate the Python script file. |
| Manual Python Script | Enter your Python code directly into the text editor. Use the field when you want to embed the script directly within the workflow instead of loading an external file. |
| Python Variables to get | Specify the exact names of the output variables or pandas DataFrames generated by your Python script that you want to return to the workflow pipeline. Example: Your Python script processes the input and stores the result in a DataFrame named clean_customers. You enter clean_customers in this field so the subsequent workflow steps can process that specific data. |
| Continue on unset variables | Select this check box to allow the workflow to continue running even if the Python script fails to generate the variables you specified in Python Variables to get field. If you clear this check box, the workflow throws an error when variables are missing. |
| Output Fields tab: | Use the Output Fields tab to return data to the workflow. To do this, specify the name of the variable you created in your script in the Python Variables to get field. The plugin returns data in one of two ways: - DataFrame Output: If the variable is a pandas DataFrame, each column becomes a separate output field in the workflow.- Single Variable: If the variable is a string or an image (for example, a Matplotlib plot), it returns as a single field. Click Get Fields in the Fields tab to automatically populate the metadata based on a test run of your script. |
| Name | Specify the name of the output field that the workflow receives. |
| Type | Specify the workflow data type for the corresponding field. Example: Select Number from the list for the tax_amount field to ensure the workflow processes it mathematically. |
| (Button) Vars to Fields | Click Vars to Fields to execute the script and automatically retrieve the available column names from your pandas DataFrame to populate the table. Example: Click Vars to Fields, and the plugin reads your clean_customers DataFrame, automatically adding first_name, last_name, and email as individual output fields. |
| (Button) Get Frame Fields | Click Get Frame Fields to execute a test run of your script and automatically retrieve the specific column names from your pandas DataFrame to populate the table. |
| Include frame row index as an output field checkbox | Select the checkbox to pass the internal pandas DataFrame row index as a separate, visible data column in your workflow pipeline. Example: You send a list of 10 names to Python, and your script sorts them alphabetically. Select this check box to add a new column that shows the original starting row number (0 through 9) for each name. This helps you track exactly where each name came from before the script changed the order. |
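Putting the tabs together: a typical configuration maps one input frame (hypothetically named customers here), cleans it in the script body, and lists clean_customers in the Python Variables to get field so each column becomes an output field. A minimal sketch, with in-memory data standing in for the upstream step:

```python
import pandas as pd

# Hypothetical input frame, as mapped in the Input Frames table
# under the name "customers".
customers = pd.DataFrame({
    "first_name": [" Ana ", "Ben", None],
    "email": ["ana@example.com", "ben@example.com", None],
})

# Clean the data: drop incomplete rows, then trim whitespace.
clean_customers = customers.dropna().copy()
clean_customers["first_name"] = clean_customers["first_name"].str.strip()

# Enter "clean_customers" in the "Python Variables to get" field; each
# column then becomes a separate output field in the workflow.
print(clean_customers)
```

Clicking Get Frame Fields (or Vars to Fields) after a test run would then populate the Output Fields table with first_name and email.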