Analytic Query
Description
Analytic Query executes analytic functions over a sorted dataset, enabling you to look forward or backward across rows using operations such as LEAD, LAG, FIRST, and LAST. Use this step when you need to compute values based on neighboring rows within a group — for example, calculating the time between consecutive events or comparing each row's value with the previous row's value. The input data must be sorted on the relevant fields before reaching this step, and you define grouping fields to control the scope of each analytic window.
Configurations
| Field Name | Description |
|---|---|
| Step name | Specify the name of the step as it appears in the workflow workspace. This name has to be unique in a single workflow. |
| Group fields table | Specify the fields you want to group. Click Get Fields to add all fields from the input stream(s). The step will do no additional sorting, so in addition to the grouping identified (for example CUSTOMER_ID) here you must also have the data sorted (for example ORDER_DATE). |
| Analytic | |
| Functions table: | Specify the analytic functions to be solved. Examples of common use cases are: - Calculate the "time between orders" by ordering rows by order date, and LAGing 1 row back to get previous order time. - Calculate the "duration" of a web page view by LEADing 1 row ahead and determining how many seconds the user was on this page. |
| New Field Name | Specify the name you want this new field to be named on the stream (for example PREV_ORDER_DATE). |
| Subject | Specify the existing field to grab (for example ORDER_DATE) . |
| Type | Specify the type of analytic function: Lead - Go forward N rows and get the value of Subject. Lag - Go backward N rows and get the value of Subject. |
| N | Specify the number of rows to offset (backwards or forwards). |