
Memory Group By

Description

Memory Group By is a step in the Statistics Plugin for Process Studio Workflows. It groups rows on the specified fields and builds aggregates in the same way as the Group by step. Unlike Group by, it does not require sorted input, because it processes all rows in memory. When the number of rows is too large to fit into memory, use the combination of the Sort rows and Group by steps instead.
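The difference between the two approaches can be sketched in Python. This illustrates the general technique only and is not the step's actual implementation; the function names and sample rows are hypothetical. A hash-based grouping keeps one accumulator per key in memory, so input order does not matter, whereas a streaming grouping only merges adjacent rows and therefore needs sorted input.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows, deliberately unsorted on "region".
rows = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
    {"region": "west", "amount": 1},
]


def memory_group_sum(rows, key, field):
    """Hash-based grouping: one running total per key, all held in memory."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[field]
    return dict(totals)


def streaming_group_sum(rows, key, field):
    """Streaming grouping: only adjacent rows with the same key are merged."""
    return [
        (k, sum(r[field] for r in grp))
        for k, grp in groupby(rows, key=itemgetter(key))
    ]


# Order-independent: each key ends up with a single total.
in_memory = memory_group_sum(rows, "region", "amount")

# On unsorted input the streaming variant splits each key into several groups.
unsorted_stream = streaming_group_sum(rows, "region", "amount")

# Sorting first (the "Sort rows + Group by" pattern) gives one group per key.
sorted_rows = sorted(rows, key=itemgetter("region"))
sorted_stream = streaming_group_sum(sorted_rows, "region", "amount")
```

The in-memory variant needs space proportional to the number of distinct groups (or to all rows, for aggregates such as median that need every value), which is why very large inputs are better served by sorting and streaming.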

Configurations

No. | Field Name | Description
1 | Step name | Specify the name of the step as it appears in the workflow workspace. This name must be unique within a single workflow.
2 | Always give back a result row | If you enable this option, the step always returns a result row, even if there is no input row. This is useful when you want to count the number of rows; without this option you would never get a count of zero (0).
3 | The fields that make up the group | Click Get Fields to add all fields from the input stream(s).

- Group field: Specify the fields over which you want to group.

4 | Aggregates | Specify the fields to aggregate, the aggregation method, and the name of the resulting new field.

• Name: Specify the name of the new field in the output stream.

• Subject: Specify the field that you want to aggregate.

• Type: Select the aggregation method. The available types are:

- Sum

- Average (Mean)

- Median

- Percentile

- Minimum

- Maximum

- Number of values (N)

- Concatenate strings separated by , (comma)

- First non-null value

- Last non-null value

- First value (including null)

- Last value (including null)

- Cumulative sum (all rows option only!)

- Cumulative average (all rows option only!)

- Standard deviation

- Concatenate strings separated by <Value>: specify the separator in the Value column

- Number of distinct values

- Number of rows (without field argument)
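The way the Name, Subject, and Type columns combine, and the effect of "Always give back a result row", can be sketched in Python. This is a minimal illustration under stated assumptions, not the step's implementation: `group_rows`, `AGGREGATORS`, and the sample data are hypothetical, only a few of the listed types are covered, nulls are modelled as `None`, and the choices to skip nulls and to return `None` for an aggregate with no values are assumptions that the source does not confirm.

```python
import statistics
from collections import defaultdict


def _or_none(func):
    """Apply func to the non-null values; return None if there are none."""
    def wrapper(values):
        present = [v for v in values if v is not None]
        return func(present) if present else None
    return wrapper


# Hypothetical mapping from some of the listed type names to Python callables.
AGGREGATORS = {
    "Sum": _or_none(sum),
    "Average (Mean)": _or_none(statistics.mean),
    "Median": _or_none(statistics.median),
    "Minimum": _or_none(min),
    "Maximum": _or_none(max),
    "Number of values (N)": lambda values: sum(v is not None for v in values),
    "Number of distinct values": lambda values: len({v for v in values if v is not None}),
    "Number of rows (without field argument)": len,
    "First non-null value": _or_none(lambda present: present[0]),
    "Last non-null value": _or_none(lambda present: present[-1]),
}


def group_rows(rows, group_fields, aggregates, always_return_row=False):
    """Group rows in memory and compute (name, subject, type) aggregates."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in group_fields)].append(row)

    # Assumed behaviour: with no input, emit one empty group so counts are 0.
    if not groups and always_return_row:
        groups[tuple(None for _ in group_fields)] = []

    result = []
    for key, members in groups.items():
        out = dict(zip(group_fields, key))
        for name, subject, agg_type in aggregates:
            # A subject of None yields one None per row, so len() counts rows.
            values = [m.get(subject) for m in members]
            out[name] = AGGREGATORS[agg_type](values)
        result.append(out)
    return result


rows = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": None},
    {"region": "east", "amount": 7},
]
aggregates = [
    ("total", "amount", "Sum"),
    ("mean_amount", "amount", "Average (Mean)"),
    ("n_values", "amount", "Number of values (N)"),
    ("n_rows", None, "Number of rows (without field argument)"),
]
grouped = group_rows(rows, ["region"], aggregates)

row_count = [("n_rows", None, "Number of rows (without field argument)")]
empty_default = group_rows([], [], row_count)
empty_forced = group_rows([], [], row_count, always_return_row=True)
```

With the option off, empty input produces no output rows at all; with it on, the sketch returns a single row whose row count is zero, which matches the use case described for that option.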