If I had a nickel for everytime I had to convert a comma-separated string into individual rows in SQL Server, I'd be filthy rich right now.💰
No really, I have had to do this enough times that I finally decided to write about my most preferred ways to achieve this.
Lets first create a dummy table with some dummy comma-separated strings as data that we can then use for all our queries.
1CREATE TABLE tblCsvData (2 id INT,3 csv VARCHAR(MAX)4)5INSERT tblCsvData SELECT 1, '18,20,45,68';6INSERT tblCsvData SELECT 2, 'Jon,Sansa,Arya,Robb,Bran,Rickon';7INSERT tblCsvData SELECT 3, '56';
This is my most preferred way since it is simple and easily understandeable.
If you don't know how to write Common Table Expressions or CTEs in SQL Server, I would highly suggest that you read this article first to help you get up to speed with them before proceeding.
1-- create the CTE2WITH cte_split(id, split_values, csv) AS3(4 -- anchor member5 SELECT6 id,7 LEFT(csv, CHARINDEX(',', csv + ',') - 1),8 STUFF(csv, 1, CHARINDEX(',', csv + ','), '')9 FROM tblCsvData1011 UNION ALL1213 -- recursive member14 SELECT15 id,16 LEFT(csv, CHARINDEX(',', csv + ',') - 1),17 STUFF(csv, 1, CHARINDEX(',', csv + ','), '')18 FROM cte_split19 -- termination condition20 WHERE csv > ''21)22-- use the CTE and generate the final result set23SELECT id, split_values24FROM cte_split25ORDER BY id;2627/*28id split_values29----------- ------------301 18311 20321 45331 68342 Jon352 Sansa362 Arya372 Robb382 Bran392 Rickon403 5641*/
Lets go through this step-by-step by first looking at why we are using
LEFT() will basically extract the individual values from CSV strings that we'll then use to generate the rows. So in the anchor member, it will extract
18 out of
Jon out of
STUFF() will kind of do the opposite. It will replace the individual row value extracted by
LEFT() with a
'' meaning it'll remove the individual row values from the CSV data which will then be used by the future iterations of the CTE. So in the anchor member, it will extract
20,45,68 out of
Sansa,Arya,Robb,Bran,Rickon out of
Jon,Sansa,Arya,Robb,Bran,Rickon. These extracted values will be passed on to the next iteration of the CTE.
Consider the below example with a modified anchor member query to better understand the role of
1-- anchor member2SELECT3 id,4 csv,5 LEFT(csv, CHARINDEX(',', csv + ',') - 1) as 'split_values',6 STUFF(csv, 1, CHARINDEX(',', csv + ','), '') as 'future_cte_iteration_value'7FROM tblCsvData89/*10id csv split_values future_cte_iteration_value11----------- ----------------------------------- ------------- ------------------------------121 18,20,45,68 18 20,45,68132 Jon,Sansa,Arya,Robb,Bran,Rickon Jon Sansa,Arya,Robb,Bran,Rickon143 56 5615*/
Wondering why we are concatenating a
, with the csv data in
CHARINDEX()? Try removing the concat operators and running the query. You get an error. The concatenation helps the query handle single values with no commas like
If you want to understand how each iteration of the CTE processes and produces different rows, then run this modified version of the first query.
1-- create the CTE2WITH cte_split(id, split_values, csv, iteration) AS3(4 -- anchor member5 SELECT6 id,7 LEFT(csv, CHARINDEX(',', csv + ',') - 1),8 STUFF(csv, 1, CHARINDEX(',', csv + ','), ''),9 0 as iteration10 FROM tblCsvData1112 UNION ALL1314 -- recursive member15 SELECT16 id,17 LEFT(csv, CHARINDEX(',', csv + ',') - 1),18 STUFF(csv, 1, CHARINDEX(',', csv + ','), ''),19 iteration + 1 as 'iteration'20 FROM cte_split21 WHERE22 csv > ''23)24-- use the CTE and generate the final result set25SELECT *26FROM cte_split27ORDER BY iteration,id;2829/*30id split_values csv iteration31----------- ------------ ------------------------------ -----------321 18 20,45,68 0332 Jon Sansa,Arya,Robb,Bran,Rickon 0343 56 0351 20 45,68 1362 Sansa Arya,Robb,Bran,Rickon 1371 45 68 2382 Arya Robb,Bran,Rickon 2391 68 3402 Robb Bran,Rickon 3412 Bran Rickon 4422 Rickon 543*/
CROSS APPLYon XML
This method is shorter but uses XQuery which I personally am not a big fan of. But you can use this if you want. In the query, just replace
tblCsvData with your table and the column
csv with your column that has delimited values.
1-- main query2SELECT3 temp.id,4 Split.a.value('.', 'VARCHAR(100)') AS 'split_values'5FROM (6 SELECT7 id,8 CAST('<M>' + REPLACE(csv, ',', '</M><M>') + '</M>' AS XML) AS xml_data9 FROM tblCsvData10 ) AS temp CROSS APPLY xml_data.nodes('/M') AS Split(a);1112/*13id split_values14----------- --------------151 18161 20171 45181 68192 Jon202 Sansa212 Arya222 Robb232 Bran242 Rickon253 5626*/
This method has been introduced in SQL Server 2016 and seems to be a dedicated solution to our problem. I have never had a chance to use it as the project servers that I work on(and my own computer) are not using SQL Server 2016 but if you are using that version then this will probably be a better and more straight forward option than using a CTE or XML.
You can find more information about it in Microsoft's documentation.
If the above "SQL Server specific" ways don't work for you, we can always rely on old-school application programming methods like using a LOOP.
So we'll create a function that will take in a comma separated string and return a set of rows of the split values.
1CREATE FUNCTION [dbo].[udf_split_strings_with_while] (2 @p_csv_string varchar(100)3)4RETURNS @tblSplitData TABLE (5 split_values varchar(50)6)7AS8BEGIN9 WHILE len(@p_csv_string) > 010 BEGIN11 DECLARE @split_value varchar(50);12 DECLARE @first_comma_index int = CHARINDEX(',', @p_csv_string);1314 -- if a comma exists in the csv string, then...15 IF @first_comma_index > 016 BEGIN17 -- extract the single left most value from the csv string and use that as the split value.18 SET @split_value = LEFT(@p_csv_string, @first_comma_index - 1);19 -- extract the rest of the csv string leaving out the split value.20 SET @p_csv_string = STUFF(@p_csv_string, 1, @first_comma_index, '');21 END22 ELSE23 BEGIN24 -- use the entire string as the split value25 SET @split_value = @p_csv_string;26 -- set the original csv string as blank27 SET @p_csv_string = '';28 END2930 -- insert the split value into the return table.31 INSERT INTO @tblSplitData VALUES( @split_value );32 END3334 RETURN;35END
We'll then use this function in a query and pass it comma-separated string data one row at a time. For each row of comma-separated values, the function will return one or more rows of split values.
1-- query that uses our split function2SELECT a.id, b.split_values3FROM tblCsvData a4CROSS APPLY udf_split_strings_with_while( a.csv ) b;56/*7id split_values8----------- -------------91 18101 20111 45121 68132 Jon142 Sansa152 Arya162 Robb172 Bran182 Rickon193 5620*/
CROSS APPLY is similar to an
INNER JOIN but it has the advantage that it can work with complex expressions instead of simple result set producing queries. We have used
CROSS APPLY because an
INNER JOIN wouldn't work with a function call on the right which uses a column value(
a.csv) from the table on the left.
This is a popular approach for this problem. For the splitting use-cases that I have encountered, this seemed like somewhat of an overkill in my opinion. But you can try this out if it suits your needs and if none of the above discussed methods work for you the way you want them to. Here is an answer on stackoverflow illustrating the use of this method in splitting delimited strings.
This stackoverflow question helped me out a great deal in writing this article. Do check it out in case you need more insight and tips on how to split delimited strings into rows. Thanks for reading and keep on rocking!🤘
Hi!👋, I'm Saurabh Misra, a full-stack software developer.