I recently had to monitor a MySQL field containing rather long URI strings with 7 GET parameters, and after some research I found a way to extract each parameter in separate SELECT fields, thanks to a somewhat unrelated Stack Overflow answer.
The Solution
Input string
http://domain.ext/page?param1=value1¶m2=value2¶m3=value3¶m4=value4...
Intended result
param1 | param2 | param3 | param4 | ... |
---|---|---|---|---|
value1 | value2 | value3 | value4 | ... |
Final query
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(string, '&', 1), '&', -1), '=', -1) AS param1, SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(string, '&', 2), '&', -1), '=', -1) AS param2, SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(string, '&', 3), '&', -1), '=', -1) AS param3, SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(string, '&', 4), '&', -1), '=', -1) AS param4, ...
Breakdown
MySQL natively offers very few functions to manage delimiter-separated strings. There's no split
function that would effectively transform a string in an array that we would also have trouble to handle in a query. This is expected behavior from a database software whose very job is to store lists of values.
Fortunately, the SUBSTRING_INDEX(string, delimiter, count)
allows to manipulate a delimiter-separated string. In essence, it returns a substring of string
until the count
-th delimiter
has been found in string
. If count
is positive, it starts from the left of the string. If it's negative, it starts from the right.
A chain of two such functions allows to get the N-th element in a delimiter-separated string, the third one is there only to remove the GET parameter name from the output.
For example, to get param3 value:
The first call makes sure that the third parameter is at the end of the string.
> SUBSTRING_INDEX(string, '&', 3) http://domain.ext/page?param1=value1¶m2=value2¶m3=value3
This is where the magic happens. Once the third parameter is at the end of the string, we just have to extract the last part of the string delimited by ampersands ( & ).
> SUBSTRING_INDEX(SUBSTRING_INDEX(string, '&', 3), '&', -1) param3=value3
This is just the cherry on the cake to extract only the righthand part of param3=value3
.
> SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(string, '&', 3), '&', -1), '=', -1) value3
Astute readers will point out that I could skip the second SUBSTRING_INDEX call and directly take what's right of the last =
, but this wouldn't make a good general example on how to handle delimiter-separated strings in MySQL.
I wanted to see if I could do the same with Microsoft SQL Server. Unfortunately there is no equivalent to the behavior of the
SUBSTRING_INDEX
function. So I tried to write my own query even if I would recommend to create a dedicated function for better readability. This query is using CTE so it is only compatible with SQL Server 2005+.NULL
.SELECT
is the recursion initialization. The RemainingValue column is the remaining part of the url to process, on the initial state the url is reversed to be in a situation of value-first.ValueDelimiterIdx
gives the first position of the character '=' in RemainingValue.ParamDelimiterIdx
gives the first position of the character '&' in RemainingValue.InvertedOrder
= 0 as it does not contains any value.Just to help the comprehension here is the raw results given for an url with four parameters :
4444=4marap&333=3marap&22=2marap&1=1marap?lmth.egap/rf.ikatsotot//:ptth
333=3marap&22=2marap&1=1marap?lmth.egap/rf.ikatsotot//:ptth
22=2marap&1=1marap?lmth.egap/rf.ikatsotot//:ptth
1=1marap?lmth.egap/rf.ikatsotot//:ptth
lmth.egap/rf.ikatsotot//:ptth
Boy am I glad not to have to work with SQL Server. Thanks for trying though!
Like I said if you factorize the logic in a function it became way simpler and easier to use.
As a matter of fact, in a function I would not use the same algorithm because here the challenge is to do the job in one query/instruction while in a function I can use loop and such
Still, I prefer a one-liner over a SQL function any time of day!
I'm not sure what you meant by that since in this case you used a (built-in) function to achieve your goals which SQL Server allows you to do the same once you created your user function.
Even better you can create functions that will returns a table-like result rather than a scalar value. And if you're really hardcore you can create your function in a .NET DLL (using C# or VB.NET rather than T-SQL) and importing it in SQL Server to create CLR functions. Though I'm not using that because I'm a little concerned performance-wise and the importation requires high privileges that usually are not provided if you're using a shared hosting platform.
In that case, I needed a quick way to monitor a table being fed complete URLs. I didn't need to do any processing on those URL, so writing a complete SQL function would have been rather overkill.
Experience proved to me that sometimes you should really spend the time to build debug/monitoring tools in order to save times in subsequent debugging/monitoring sessions. Especially if you found yourself repeating a lot of times those actions.
Wise words I can abide by, however in this particular case it was a one-time task and MySQL proved itself up to the challenge.
Fair enough. Still that gave me some kind of challenge to make with SQL Server instead of working on that pain-in-the-ass project that come back every year for Christmas and on which I(we) should have coded some utility functions years ago. To which I, finally, decided to do this year.
Good for you! \o/