11 Sep

SQL Server: Parsing A String Into Rows

I haven’t written a SQL article in, well, quite awhile, and I guess this won’t really count as one either.  Anyways, I tossed this up over on the SQL Team forum and I liked it so I’m putting it here.

The solution relied on a couple of arcane concepts and functions.  First, I used PARSENAME which parses a string separated by periods.  It’s basically the same idea as SUBSTRING_INDEX in MySQL only a bit less flexible because of the period.

I’ve talked about walking the tree before but, in essence, it breaks down a delimited string by joining a table of numbers and using that sequence to get, say, the third delimited value.

You have the following string.


The first part is the ORDER_ID and the rest of the string is the SKU’s attached to that order.  Yeah, it’s a bit weird but you see things like this in converted older system or in some really ad hoc reporting tools.

Now, we want to convert that string as follows:

Order_ID	SKU
12345678 	SRPPR             
12345678 	SCNRQ
12345678 	SCINC

Obviously, that isn’t easy but here’s one way to do it.

DECLARE @var char(30);
DECLARE @var1 char(30);
SET @var = '12345678 SCINC, SCNRQ, SRPPR';
SET @var1 = REPLACE(REPLACE(@var,SUBSTRING(@var,1,CHARINDEX(' ',@var)),''),', ','.')

SELECT SUBSTRING(@var,1,CHARINDEX(' ',@var)-1) AS Order_ID,
PARSENAME(@var1,zombie) AS SKU
FROM walkers
WHERE LEN(@var) - LEN(REPLACE(@var,' ','')) >= zombie

First, I created a walkers table as follows:

CREATE TABLE walkers (zombie int);

Which I populated with a running sequence of numbers starting at 1.

SELECT * FROM walkers


Here’s what it does. @var1 does two things: it converts the comma/space to a period so PARSENAME can use it and it also removes the ORDER_ID field and using the variable makes it easier to explain. This is what it does:

SELECT REPLACE(REPLACE(@var,SUBSTRING(@var,1,CHARINDEX(' ',@var)),''),', ','.')


This code:

SUBSTRING(@var,1,CHARINDEX(' ',@var)-1) AS Order_ID

is a straight-forward grab of the order id from the string. Basically everything from the first space backwards.

Note: My original post or the forum didn’t include the -1.

Finally, this code

PARSENAME(@var1,zombie) AS SKU

uses the PARSENAME function and the zombie field from the walkers table.

This might make more sense if we change the code slightly to include the zombies field.

SELECT SUBSTRING(@var,1,CHARINDEX(' ',@var)-1) AS Order_ID,
PARSENAME(@var1,zombie) AS SKU, zombie
FROM walkers
WHERE LEN(@var) - LEN(REPLACE(@var,' ','')) >= zombie

Order_ID	SKU	zombie
12345678	SRPPR   1
12345678	SCNRQ	2
12345678	SCINC	3

Finally, this was for a single record. If we had full order entry system you would simply join the walkers table to the order data and limit the results to the number SKU’s for each record. My code does this but since it’s only for a single record it might seem confusing. Anyways, a full outer join or a non equi join would fit the bill.

I hope this isn’t too confusing. I did it on the fly and didn’t sequence it as well as I have in most of my examples. Honestly, I just liked the solution and wanted to add it here. Too often, I think, you see people run to a function to solve this kind of problem, especially on the SQL Server side of things when there are other ways to solve the problem.