A couple of weeks ago, I posted about using User Defined functions and Cross Apply to parse data in SQL server. The second part of the import I was working on required me to take that parsed data, perform some manipulations, and then recombine that data. Surprisingly, I found that putting that data together was much more difficult then pulling it apart in the first place. After much experimentation, this is what I came up with…
As you may recall from my previous post, I created a Table-Valued function called Parsedata to pull apart the data from a field. The next step was to join each parsed data value, and join it to another table holding those values as well as an ID for each value. I then needed to combine the ID’s from that second table.
My Original Data Looked Like:
1 ABC, DEF
2 HIJ, KLM
3 NOP, QRS
First, I created a view using the ParseData function: (This is what I did in my previous blog post)
Create View [dbo].[vDataSplitSource] as
select Table.ID, s.Value from Table
dbo.ParseData(Table.Value, ‘,’) s
After parsing the data, I get a view containing:
That view functions as my source. My target is a hash table which contains IDs for each value, similar to:
I create a simple view joining my parsed data to the hash table data:
Create View [dbo].[vJoinedData2] as
select distinct a.ID, b.Value
from vDataSplitSource a inner join HashTable b
on a.Value = b.Value
Now I have a view containing values from both the view and the table.
For the next step, I created a new Scalar-Valued Function to actually put the data back together. In this particular case, my result set needed to not only have commas as delimiters, but also needed commas at the beginning and the end of the string as well. The function requires that you pass the ID for each row as well:
Create FUNCTION [dbo].[CombineData] ( @c INT )
RETURNS VARCHAR(MAX) AS
DECLARE @p VARCHAR(MAX) ;
SET @p = ‘,’ ;
SELECT @p = @p + rtrim(ltrim(Value)) + ‘,’
WHERE Id = @c ;
Finally, It was just a matter of calling the function to return the data as I needed it:
select id, cast(dbo.CombineData(id) as nchar(40)) as RetValue
This last data returns:
Unfortunately, I found that the whole process runs very slowly. I actually used the final statement here as a view, and then joined that view to another view by IDs which proved to be unmanageable when working with larger datasets. Ultimately, I ended up stuffing the final returned data into a temporary table and joining to that instead.
Anyway… It look me a while to figure out this whole processes, and most examples I found didn’t exactly conform to my particular requirements, so I hope that someone out there can find this useful.
Thanks for reading!