Fuzzy Match

Description

Fuzzy Match is a step in the Lookup Plugin for Process Studio Workflows. It finds strings that potentially match by using duplicate-detection algorithms that calculate the similarity between values in two data streams. The step returns either the single closest match or a separated list of all matches whose scores fall between the user-defined minimum and maximum values.
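To illustrate what a similarity calculation looks like, the following is a minimal Python sketch of Levenshtein distance, one of the algorithms this step offers. It is not the step's own implementation. The function names `levenshtein` and `similarity`, the `case_sensitive` parameter, and the normalization of the distance to a score between 0 and 1 are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str, case_sensitive: bool = True) -> float:
    """Assumed normalization: 1.0 means identical, 0.0 means no overlap."""
    if not case_sensitive:
        a, b = a.casefold(), b.casefold()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))                   # 3
print(similarity("Smith", "SMITH", case_sensitive=False))  # 1.0
```

With `case_sensitive=True`, "Smith" and "SMITH" differ in four of five characters, so the score drops sharply. This is the kind of difference a case-sensitivity setting controls.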

Configurations

| No. | Field Name | Description |
|---|---|---|
| 1 | Step name | Specify the name of the step as it appears in the workflow workspace. The name must be unique within a workflow. |
| | Lookup stream (source): | |
| 2 | Lookup step | Specify the step that contains the fields to match. |
| 3 | Lookup field | Specify the field in the Lookup step above to match. |
| | Main Stream: | |
| 4 | Main stream field | Specify the field in the main stream to match against the Lookup field. |
| | Settings: | |
| 5 | Algorithm | Select the string-matching algorithm to use. Options: Jaro, Jaro Winkler, Pair letters similarity, Levenshtein, Damerau-Levenshtein, Needleman Wunsch, Metaphone, Double Metaphone, SoundEx, Refined SoundEx. |
| 6 | Case sensitive | Select the checkbox to treat uppercase and lowercase letters as different when comparing values. Applies only to the Levenshtein algorithms. |
| 7 | Get closer value | When selected, returns the single result with the highest similarity score. When cleared, returns all matches that satisfy the Minimum value and Maximal value settings as a list separated by the Values separator. |
| 8 | Minimum value | Specify the lowest similarity score accepted as a match. |
| 9 | Maximal value | Specify the highest similarity score accepted as a match. |
| 10 | Values separator | Specify the string that separates the matches. Available only for specific algorithms and when the Get closer value option is cleared. |

Fields Tab:

Output fields:

| No. | Field Name | Description |
|---|---|---|
| 1 | Match field | Specify the name of the output column that contains the comparison value. |
| 2 | Value field | Specify the name of the output column that contains the similarity score. |
| 3 | Additional fields | Specify the list of additional fields to retrieve from the lookup stream. |
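The interaction between the Get closer value, Minimum value, Maximal value, and Values separator settings can be sketched in Python as follows. This is a hypothetical illustration of the behavior described above, not the step's actual code. The names `fuzzy_lookup` and `stand_in_score` are invented for this example, the range bounds are assumed to be inclusive, and the default scorer uses the standard library's `difflib.SequenceMatcher`, which is not one of the algorithms listed for this step.

```python
from difflib import SequenceMatcher


def stand_in_score(a, b):
    # Stand-in scorer from the Python standard library.
    # It is NOT one of the algorithms the step offers.
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_lookup(main_value, lookup_values, minimum, maximum,
                 get_closer=True, separator=",", score=stand_in_score):
    """Return (match, value) for one main stream value.

    Assumption: a candidate qualifies when minimum <= score <= maximum.
    """
    scored = [(candidate, score(main_value, candidate))
              for candidate in lookup_values]
    in_range = [(c, s) for c, s in scored if minimum <= s <= maximum]
    if not in_range:
        return None, None  # nothing qualified
    if get_closer:
        # Single best candidate together with its score.
        return max(in_range, key=lambda pair: pair[1])
    # All qualifying candidates joined with the separator; no single score.
    return separator.join(c for c, _ in in_range), None


names = ["Jon", "John", "Jane"]
print(fuzzy_lookup("Jhon", names, 0.5, 1.0, get_closer=True))
print(fuzzy_lookup("Jhon", names, 0.5, 1.0, get_closer=False, separator=";"))
```

Passing the scorer as a parameter keeps the selection logic separate from the choice of algorithm, mirroring how the step lets the algorithm be chosen independently of the range and separator settings.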