Fuzzy Match
Description
Fuzzy Match is a step in the Lookup Plugin for Process Studio Workflows. It finds strings that potentially match by using duplicate-detection algorithms that calculate the similarity between values from two streams of data. Depending on the settings, the step returns either the single closest match or all matches that fall within the user-defined minimal and maximal values, as a separated list.
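To make "similarity of two strings" concrete, here is a minimal sketch of Levenshtein distance, one of the algorithms this step offers. It is a textbook version written for illustration, not the step's actual implementation. The function name `levenshtein` and the `case_sensitive` parameter are hypothetical, and lowercasing the inputs is only an assumed way to model the idea behind the Case sensitive setting.

```python
def levenshtein(a: str, b: str, case_sensitive: bool = True) -> int:
    """Count the single-character edits needed to turn a into b."""
    if not case_sensitive:
        # Assumption: ignoring case is modeled by lowercasing both inputs.
        a, b = a.lower(), b.lower()
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))                     # 3
print(levenshtein("Apple", "apple"))                        # 1
print(levenshtein("Apple", "apple", case_sensitive=False))  # 0
```

A lower distance means the strings are closer, so two identical strings score 0.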
Configurations
| No. | Field Name | Description |
|---|---|---|
| 1 | Step name | Specify the name of the step as it appears in the workflow workspace. This name has to be unique in a single workflow. |
| Lookup stream (source): | ||
| 2 | Lookup step | Specify the step that contains the fields to match. |
| 3 | Lookup field | Specify the field in the Lookup step above to match. |
| Main Stream: | ||
| 4 | Main stream field | Specify the field in the main stream to match against the Lookup field. |
| Settings: | ||
| 5 | Algorithm | Identifies which string-matching algorithm to use. Options include: Jaro, Jaro Winkler, Pair letters similarity, Levenshtein, Damerau-Levenshtein, Needleman Wunsch, Metaphone, Double Metaphone, SoundEx, and Refined SoundEx. |
| 6 | Case sensitive | Select or clear the checkbox to determine whether the comparison distinguishes uppercase from lowercase letters. Applies only to the Levenshtein algorithms. |
| 7 | Get closer value | When checked, returns the single result with the highest similarity score. When unchecked, returns all matches that satisfy the minimal and maximal value settings as a list separated by the values separator. |
| 8 | Minimum value | Specify the lowest similarity score that counts as a match. |
| 9 | Maximal value | Specify the highest similarity score that counts as a match. |
| 10 | Values separator | Specify the string that separates the matches. Only available for specific algorithms and when the Get closer value option is unchecked. |
| Fields Tab: | ||
| Output fields: | ||
| 1 | Match field | Defines the name of the column that contains the comparison value. |
| 2 | Value field | Defines the name of the column that contains the similarity score. |
| 3 | Additional fields: | Specify the list of additional fields to retrieve from the lookup stream. |
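
The interaction between the Get closer value, minimal value, maximal value, and Values separator settings can be sketched in Python as follows. This is an illustration under stated assumptions, not the step's implementation. The helper names `similarity` and `fuzzy_match` are hypothetical, and the score comes from the standard library's `difflib.SequenceMatcher`, which is a stand-in and not one of the algorithms the step lists. The returned pair loosely corresponds to the Match field and Value field outputs. Leaving the score empty when several matches are joined is also an assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in score in [0, 1]; NOT one of the step's listed algorithms."""
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_match(main_value, lookup_values, minimal=0.0, maximal=1.0,
                get_closer_value=True, separator=","):
    """Return (match, score) for one main-stream value (hypothetical helper)."""
    scored = [(candidate, similarity(main_value, candidate))
              for candidate in lookup_values]
    # Keep only candidates whose score lies inside [minimal, maximal].
    in_range = [(c, s) for c, s in scored if minimal <= s <= maximal]
    if not in_range:
        return None, None
    if get_closer_value:
        # Checked: a single result, the one with the highest score.
        return max(in_range, key=lambda pair: pair[1])
    # Unchecked: every in-range match, joined by the values separator.
    # Assumption: no single score is reported for a joined list.
    return separator.join(c for c, _ in in_range), None


lookup = ["John Smith", "Jon Smyth", "Jane Doe"]
best, score = fuzzy_match("Jon Smith", lookup, minimal=0.8)
all_matches, _ = fuzzy_match("Jon Smith", lookup, minimal=0.8,
                             get_closer_value=False, separator=";")
print(best)         # John Smith
print(all_matches)  # John Smith;Jon Smyth
```

With a minimal value of 0.8, "Jane Doe" is filtered out in both modes. The only difference between the two calls is whether one best candidate or the whole in-range list is returned.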