Part 1 of this series described a way to use recursive SQL to build up a single comma-separated field from a set of rows that were all part of the same grouping. Part 2 describes how to do just the inverse: take a field that has a comma-separated list of values and produce one row for each of the items in that list. What I refer to as "string normalization."
In the code below, the seed statement is simply a select * from our source table. The recursive statement takes the field to be pivoted into multiple rows and uses the POSITION() function to repeatedly pull off just the first item in the list. The recursion continues until our POSITION() calculation in the "dpos" field tells us the field is blank.
The context for this example is a set of survey responses in a table that contains a RESPONSE_KEY (just some primary key) and RESPONSE_TXT (a comma-separated list of responses). The SQL below parses that RESPONSE_TXT field into a single elements and outputs one row per element along with the associated RESPONSE_KEY.
WITH RECURSIVE parse_list (response_key, delim_pos, item_num, element, remainder) AS ( SELECT response_key, 0, 0, CAST('' AS VARCHAR(100)), response_txt FROM response UNION ALL SELECT response_key, CASE WHEN POSITION(',' IN remainder) > 0 THEN POSITION(',' IN remainder) ELSE CHARACTER_LENGTH(remainder) END dpos, item_num + 1, TRIM(BOTH ‘,’ FROM SUBSTR(remainder, 0, dpos+1)), TRIM(SUBSTR(remainder, dpos+1)) FROM parse_list WHERE dpos > 0 ) SELECT response_key, element FROM parse_list p WHERE item_num > 0 ORDER BY response_key, item_num;
If the input RESPONSE table had the following data:
RESPONSE_KEY RESPONSE_TXT 643 eggs, milk, toast 872 cereal, juice, bagel, cream cheese
Then the output would look like this:
RESPONSE_KEY ELEMENT 643 eggs 643 milk 643 toast 872 cereal 872 juice 872 bagel 872 cream cheese
If you're new to recursive SQL, take the time to read, try, and experiment with these first two uses. Part 3 of this series will describe a way to consolidate multiple overlapping time segments, using some specified order or precidence, to create a single consolidated time line. That sounds confusing and potentially worthless until you see the real world example.
very good explanation with clear sample query
ReplyDelete