Fuzzy Match
Description
Fuzzy Match is a step in the Lookup Plugin for Process Studio Workflows. It finds strings that potentially match by using duplicate-detecting algorithms that calculate the similarity between values in two streams of data. The step returns the matching values, either as a single closest match or as a separated list, restricted to similarity scores between the user-defined minimal and maximal values.
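To make the idea of a similarity calculation concrete, here is a minimal Python sketch of the Levenshtein edit distance, one of the algorithms this step offers. It is an illustration only and not Process Studio's implementation; the function name `levenshtein` is an assumption made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`.

    Illustrative sketch only; not the code used by the Fuzzy Match step.
    """
    # Keep the shorter string in `b` so each row of the table stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A smaller distance means the two strings are more alike; identical strings have a distance of 0.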
Configurations
No. | Field Name | Description |
---|---|---|
7 | Step name | Specify the name of the step as it appears in the workflow workspace. The name must be unique within a workflow. |
Lookup stream(source): | ||
2 | Lookup step | Specify the step that contains the fields to match. |
3 | Lookup field | Specify the field in the Lookup step above to match. |
Main Stream: | ||
4 | Main stream field | Specify the field in the main stream to match against the Lookup field. |
Settings: | ||
5 | Algorithm | Identifies which string-matching algorithm to use. Options are: Jaro, Jaro Winkler, Pair letters similarity, Levenshtein, Damerau-Levenshtein, Needleman Wunsch, Metaphone, Double Metaphone, SoundEx, and Refined SoundEx. |
6 | Case sensitive | Select this checkbox to treat uppercase and lowercase letters as different when comparing values. Clear it to ignore case. Applies only to the Levenshtein algorithms. |
7 | Get closer value | When checked, returns a single result with the highest similarity score. When unchecked, returns all matches that satisfy the minimal and maximal value settings as a list, separated by the values separator. |
8 | Minimum value | Specify the lowest similarity score a value can have and still be returned as a match. |
9 | Maximal value | Specify the highest similarity score a value can have and still be returned as a match. |
10 | Values separator | Specify the string that separates the matches. Only available for specific algorithms and when the Get closer value option is unchecked. |
Fields Tab: | ||
Output fields: | ||
1 | Match field | Defines the name of the column that contains the comparison value. |
2 | Value field | Defines the name of the column that contains the similarity score of the match. |
3 | Additional fields: | Specify the list of additional fields to retrieve from the lookup stream. |
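
The way the settings above interact can be sketched in Python. This is a hedged illustration under stated assumptions, not the step's actual code: `similarity` and `fuzzy_lookup` are hypothetical names, and the score comes from the standard library's `difflib.SequenceMatcher` as a stand-in for the algorithms listed in the table. The case-sensitivity flag is applied to that stand-in score purely for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b, case_sensitive=False):
    """Stand-in similarity score between 0.0 and 1.0 (assumption:
    difflib's ratio, not one of the step's own algorithms)."""
    if not case_sensitive:
        a, b = a.lower(), b.lower()
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_lookup(main_value, lookup_values, minimal=0.0, maximal=1.0,
                 get_closer_value=True, separator=",", case_sensitive=False):
    """Return (match, score), mimicking the Match field and Value field.

    Hypothetical helper: parameter names mirror the settings in the table.
    """
    scored = [(similarity(main_value, candidate, case_sensitive), candidate)
              for candidate in lookup_values]
    # Keep only candidates whose score lies within the configured range.
    in_range = [(score, candidate) for score, candidate in scored
                if minimal <= score <= maximal]
    if not in_range:
        return None, None
    if get_closer_value:
        # Single result with the highest similarity score.
        best_score, best_match = max(in_range, key=lambda pair: pair[0])
        return best_match, best_score
    # All matches in range, joined by the values separator.
    return separator.join(candidate for _, candidate in in_range), None


names = ["John Smith", "Jane Smyth", "Bob Jones"]
closest, score = fuzzy_lookup("Jon Smith", names, minimal=0.6)
all_matches, _ = fuzzy_lookup("Jon Smith", names, minimal=0.6,
                              get_closer_value=False, separator=";")
print(closest)
print(all_matches)
```

In this sketch the separated list carries no score, because a single score column cannot describe several matches at once; how the actual step populates the Value field in that case is not specified here.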