
Regex Evaluation

Description

Regex Evaluation is a step in the Scripting Plugin for Process Studio Workflows. It matches the string value of an input field against a text pattern defined by a regular expression. Optionally, the step can extract the substrings that match parenthesized parts of the pattern and store them in new output fields. This is known as "capturing".
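The matching and capturing described above can be sketched in plain Java with `java.util.regex`. This is only an illustration, not the step's actual implementation: the class and method names are hypothetical, and it assumes the whole field value has to match the pattern (as `Matcher.matches()` requires), which this page does not state explicitly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCaptureSketch {
    // Returns the captured substrings if the input matches, or null if it does not.
    // Assumption: the entire input must match the pattern (Matcher.matches()).
    static String[] evaluate(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        // Group 0 is the whole match; capturing groups are numbered from 1.
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Three pairs of parentheses give three capturing groups: year, month, day.
        String[] parts = evaluate("(\\d{4})-(\\d{2})-(\\d{2})", "2024-03-15");
        System.out.println(String.join(" | ", parts));
    }
}
```

In the step itself, the true/false outcome would go to the result field, and each captured substring would go to its own output field.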

Configurations

Settings Tab:

| No. | Field Name | Description |
|---|---|---|
| 1 | Step name | Specify the name of the step as it appears in the workflow workspace. This name must be unique within a single workflow. |
| 2 | Field to evaluate | Specify the name of the field from the incoming stream that is to be matched against the regular expression. |
| 3 | Result Fieldname | Specify the name of the result output field (boolean). This field is added to the output stream and indicates whether the value of the input field matched the regular expression (Y/N). |
| 4 | Create fields for capture groups | Enable this to create new fields based on capture groups in the regular expression. Capturing groups are the parts of the regular expression pattern that are enclosed in a pair of left and right parentheses. When this option is enabled, the substrings of the input field value that correspond to the capturing groups are extracted and stored in new output fields, and the "Capture group fields" grid must define one field for each capturing group. |
| 5 | Replace previous fields | This option is available only when the "Create fields for capture groups" option is enabled. When checked, fields created for capturing groups replace existing fields with the same name in the incoming stream. When unchecked, a new field is added to the output stream for each capturing group field. |
| 6 | Regular expression | Enter the regular expression to match. See the java.util.regex.Pattern javadoc for reference documentation of the regular expression syntax used by this step. |
| 7 | Use variable substitution | Enable this if your regular expression contains variable references. When enabled, variable references are expanded to their values before the regular expression pattern is evaluated. |
| 8 | Capture group fields | Specify the new fields for the substrings captured by the regular expression from the input string. If the "Create fields for capture groups" option is enabled, use this grid to enter a field definition for each capturing group in the regular expression. The order of the fields is the same as the order of the capturing groups in the regular expression. The columns in the grid let you convert each value to the required data type right away. |
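As a rough sketch of how the "Capture group fields" grid relates to the pattern, the Java below maps each capturing group, in order, to a named and typed output field. `FieldDef`, `Type`, the `"result"` key, and the choice to reject a mismatched field count are all assumptions made for illustration. They are not part of Process Studio.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureFieldSketch {
    enum Type { STRING, INTEGER }

    // Hypothetical stand-in for one row of the "Capture group fields" grid.
    static final class FieldDef {
        final String name;
        final Type type;
        FieldDef(String name, Type type) { this.name = name; this.type = type; }
    }

    // Builds an output row: a boolean result field plus one field per capturing group.
    static Map<String, Object> evaluate(String regex, String input, FieldDef... fields) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // The grid has to define exactly one field for each capturing group.
        if (m.groupCount() != fields.length) {
            throw new IllegalArgumentException("need one field per capturing group");
        }
        boolean matched = m.matches(); // assumption: whole-value match
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("result", matched);
        for (int i = 0; i < fields.length; i++) {
            // Field i takes the text of group i + 1; fields stay null when there is no match.
            String text = matched ? m.group(i + 1) : null;
            Object value = text;
            if (text != null && fields[i].type == Type.INTEGER) {
                value = Long.valueOf(text); // convert to the declared data type
            }
            row.put(fields[i].name, value);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("(\\w+)=(\\d+)", "retries=3",
                new FieldDef("key", Type.STRING),
                new FieldDef("value", Type.INTEGER)));
    }
}
```

Because fields are assigned by position, reordering the groups in the pattern without reordering the grid rows would put values in the wrong fields.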
Content Tab:

| No. | Field Name | Description |
|---|---|---|
| 1 | Ignore differences in Unicode encodings | Enable this checkbox to ignore differences in Unicode encodings. Note: This may improve performance, but be sure your data only contains US-ASCII characters. |
| 2 | Enables case-insensitive matching | When enabled, the pattern is matched without regard to case. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by selecting the "Enable Unicode-aware case folding" option in conjunction with this option. |
| 3 | Permit whitespace and comments in pattern | When enabled, the step ignores whitespace in the pattern, as well as embedded comments that start with # and run to the end of the line. In this mode, you must use the \s token to match whitespace. (If this option is not enabled, any whitespace characters appearing in the regular expression are matched as-is.) |
| 4 | Enable dotall mode | When enabled, the expression '.' matches any character, including a line terminator. By default, this expression matches any character except line terminators. |
| 5 | Enable multiline mode | When enabled, the expressions '^' and '$' match just after or just before, respectively, a line terminator or the end of the input sequence. By default, these expressions only match at the beginning and the end of the entire input sequence. |
| 6 | Enable Unicode-aware case folding | When enabled in conjunction with the "Enables case-insensitive matching" option, case-insensitive matching is done in a manner consistent with the Unicode standard. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. |
| 7 | Enables Unix lines mode | When enabled, only the '\n' line terminator is recognized in the behavior of '.', '^', and '$'. |
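Most of the Content tab options read like the standard `java.util.regex.Pattern` flags. The sketch below assumes that mapping (`CASE_INSENSITIVE`, `UNICODE_CASE`, `COMMENTS`, `DOTALL`, `MULTILINE`, `UNIX_LINES`), which this page does not confirm, and shows each flag's effect in plain Java. The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

class ContentFlagSketch {
    // True if the whole input matches the pattern under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    // True if the pattern occurs anywhere in the input under the given flags.
    static boolean find(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        int ci = Pattern.CASE_INSENSITIVE;

        // Case-insensitive matching is ASCII-only unless UNICODE_CASE is also set.
        System.out.println(matches("\u00e9", ci, "\u00c9"));                        // false
        System.out.println(matches("\u00e9", ci | Pattern.UNICODE_CASE, "\u00c9")); // true

        // COMMENTS: pattern whitespace and "# ..." comments are ignored; \s matches a space.
        String commented = "\\d+   # leading digits\n\\s\\w+";
        System.out.println(matches(commented, Pattern.COMMENTS, "42 abc"));         // true

        // DOTALL: '.' also matches a line terminator.
        System.out.println(matches("a.b", 0, "a\nb"));                              // false
        System.out.println(matches("a.b", Pattern.DOTALL, "a\nb"));                 // true

        // MULTILINE: '^' and '$' also match at line boundaries inside the input.
        System.out.println(find("^b$", 0, "a\nb"));                                 // false
        System.out.println(find("^b$", Pattern.MULTILINE, "a\nb"));                 // true

        // UNIX_LINES: only '\n' counts as a line terminator, so '.' can match '\r'.
        System.out.println(matches("a.b", 0, "a\rb"));                              // false
        System.out.println(matches("a.b", Pattern.UNIX_LINES, "a\rb"));             // true
    }
}
```

Flags combine with bitwise OR, which mirrors ticking several checkboxes on the Content tab at once.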