Extracting Numbers Between Backslashes in SQL
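【Title】: TSQL Extracting Numbers between backslashes 【Posted】: 2017-09-07 08:15:53 【Question】: I am trying to extract the first group of digits, which always appears in the same place but can vary in length. I have tried different approaches with SUBSTRING, CHARINDEX, PATINDEX, and REVERSE, but I still can't crack it.
Here is the format of the strings: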
\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf
\zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf
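So the results for these two would be 13030 and 141590.
【Comments】:
I think a better approach is a CLR function that uses regular expressions.
【Answer 1】: Given that the numbers "appear in the same place":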
declare @temp table (val varchar(250));
insert @temp values
('\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf'),
('\zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf');
declare @start int = 51;
select substring(val, @start, charindex('\', val, @start) - @start) num
from @temp;
Output:
num
------
13030
141590
Does this code work for you?
【Comments】:
Why declare @start as 51 when the question says the string length can vary?
【Answer 2】:
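If the text in front of the number can vary in length, the hard-coded 51 could be replaced by a position computed from the backslashes themselves. The following is only a sketch under one assumption that is not stated in the question: the wanted number is always the segment between the 4th and 5th backslash, and every string contains at least five backslashes. The aliases p1 to p5 are illustrative names.

```sql
declare @temp table (val varchar(250));
insert @temp values
('\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf'),
('\zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf');

-- Assumption: the number sits between the 4th and 5th backslash.
-- Each CROSS APPLY finds the next backslash after the previous one.
select substring(t.val, p4.pos + 1, p5.pos - p4.pos - 1) as num
from @temp t
cross apply (select charindex('\', t.val) as pos) p1
cross apply (select charindex('\', t.val, p1.pos + 1) as pos) p2
cross apply (select charindex('\', t.val, p2.pos + 1) as pos) p3
cross apply (select charindex('\', t.val, p3.pos + 1) as pos) p4
cross apply (select charindex('\', t.val, p4.pos + 1) as pos) p5;
```

For the two sample rows this should return the same values as the fixed-offset query. Rows with fewer than five backslashes would need an extra guard, because CHARINDEX returns 0 when nothing is found.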
Have you tried the cast-to-XML approach?
SELECT CAST('<x>' + REPLACE('\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf','\','</x><x>') + '</x>' AS XML).value('/x[5]','int');
SELECT CAST('<x>' + REPLACE('\zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf','\','</x><x>') + '</x>' AS XML).value('/x[5]','int');
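The same idea can be applied to a table column rather than a literal. This is a minimal sketch, assuming the paths live in a table variable named @temp (a hypothetical name borrowed from the first answer) and that the only XML-sensitive character likely to appear in a path is the ampersand, which is escaped before the cast:

```sql
declare @temp table (val varchar(250));
insert @temp values
('\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf'),
('\zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf');

-- Turn each backslash-separated segment into an <x> element, then read the 5th one.
-- The leading backslash produces an empty first element, so the number is element 5.
select x.doc.value('(/x)[5]', 'int') as num
from @temp t
cross apply (
    select cast('<x>'
                + replace(replace(t.val, '&', '&amp;'), '\', '</x><x>')
                + '</x>' as xml) as doc
) x;
```

The cast to int fails if the fifth segment is not numeric, so this only suits data that reliably follows the format in the question.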
【Comments】:
【Answer 3】: You can try this:
DECLARE @value NVARCHAR(4000) = N'\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf'
-- 12743.pdf
SELECT REVERSE(SUBSTRING(REVERSE(@value), 0, CHARINDEX('\', REVERSE(@value))))
-- 12743
SELECT REVERSE(SUBSTRING(REVERSE(@value), PATINDEX('%[0-9]%', REVERSE(@value)), CHARINDEX('\', REVERSE(@value)) - PATINDEX('%[0-9]%', REVERSE(@value))))
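If you are always dealing with .pdf or some other known extension, you can use the first variant and replace the extension. If you need to do this on the fly, you have to locate the first digit so that it can be used as the starting index in the SUBSTRING function.
【Comments】: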
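The "first variant plus replace" option mentioned above could look like the sketch below. It assumes the extension is always the literal .pdf; any other extension would need its own REPLACE or a more general approach.

```sql
DECLARE @value NVARCHAR(4000) = N'\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf';

-- Take everything after the last backslash (the file name), then strip the assumed '.pdf' extension.
SELECT REPLACE(
           REVERSE(SUBSTRING(REVERSE(@value), 0, CHARINDEX('\', REVERSE(@value)))),
           '.pdf', ''
       ) AS num;
-- 12743
```

Note that this returns the number in the file name, not the folder number between the 4th and 5th backslash that the question asks for.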
【Answer 4】: I developed a very fast T-SQL function for this kind of thing using NGrams8K.
The function:
CREATE FUNCTION dbo.SubstringBetweenChar8K
(
  @string    varchar(8000),
  @start     tinyint,
  @stop      tinyint,
  @delimiter char(1)
)
/*****************************************************************************************
Purpose:
Takes an input string (@string) and returns the text between two instances of a delimiter
(@delimiter); the location of the delimiters is defined by @start and @stop.
For example: if @string = 'xx.yy.zz.abc', @start=1, @stop=3, and @delimiter = '.' the
function will return the text: yy.zz; this is the text between the first and third
instance of "." in the string "xx.yy.zz.abc".
Compatibility:
SQL Server 2008+
Syntax:
--===== Autonomous use
SELECT sb.token, sb.position
FROM dbo.SubstringBetweenChar8K(@string, @start, @stop, @delimiter) sb;
--===== Use against a table
SELECT sb.token, sb.position
FROM SomeTable st
CROSS APPLY dbo.SubstringBetweenChar8K(st.SomeColumn1, 1, 2, '.') sb;
Parameters:
@string = varchar(8000); Input string to parse
@start = tinyint; the instance of @delimiter to search for; this is where the output
should start. When @start is 0 then the function will return everything from
the beginning of @string until @stop.
@stop = tinyint; the last instance of @delimiter to search for; this is where the
output should end. When @stop is 0 then the function will return everything
from @start until the end of the string.
@delimiter = char(1); this is the delimiter used to determine where the output starts/ends
Return Types:
Inline Table Valued Function returns:
token = varchar(8000); the substring between the two instances of @delimiter
defined by @start and @stop
position = smallint; the location of where the substring begins
------------------------------------------------------------------------------------------
Developer Notes:
1. Requires NGrams8K. The code for NGrams8K can be found here:
http://www.sqlservercentral.com/articles/Tally+Table/142316/
2. This function is what is referred to as an "inline scalar UDF". Technically it's an
inline table valued function (iTVF) but performs the same task as a scalar valued user
defined function (UDF); the difference is that it requires the APPLY table operator
to accept column values as a parameter. For more about "inline" scalar UDFs see this
article by SQL MVP Jeff Moden: http://www.sqlservercentral.com/articles/T-SQL/91724/
and for more about how to use APPLY see this article by SQL MVP Paul White:
http://www.sqlservercentral.com/articles/APPLY/69953/.
Note the above syntax example and usage examples below to better understand how to
use the function. Although the function is slightly more complicated to use than a
scalar UDF it will yield notably better performance for many reasons. For example,
unlike scalar UDFs or multi-statement table valued functions, the inline scalar UDF does
not restrict the query optimizer's ability to generate a parallel query execution plan.
3. dbo.SubstringBetweenChar8K is deterministic; for more about deterministic and
nondeterministic functions see https://msdn.microsoft.com/en-us/library/ms178091.aspx
Examples:
-- beginning of string to 2nd delimiter, 2nd delimiter to end of the string
DECLARE @string varchar(100) = 'abc.defg.hi.jk.lmnop.qrs.tuv';
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,0,2, '.');
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,2,0, '.');
-- Between the 1st & 2nd, then 2nd & 5th delimiters
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,1,2, '.');
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,2,5, '.');
-- dealing with NULLs, delimiters that don't exist and when @start = @stop
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,2,10,'.');
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,1,NULL,'.');
SELECT string=@string, token, position FROM dbo.SubstringBetweenChar8K(@string,NULL,1,'.');
---------------------------------------------------------------------------------------
Revision History:
Rev 00 - 20160720 - Initial Creation - Alan Burstein
Rev 01 - 20160821 - Re-wrote a single-char version (this); removed tokenLen
****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
WITH
chars AS
(
  SELECT instance = 0, position = 0 WHERE @start = 0
  UNION ALL
  SELECT ROW_NUMBER() OVER (ORDER BY position), position
  FROM dbo.NGrams8k(@string,1)
  WHERE token = @delimiter
  UNION ALL
  SELECT -1, DATALENGTH(@string)+1 WHERE @stop = 0
)
SELECT
  token = SUBSTRING
          (
            @string,
            MIN(position)+1,
            NULLIF(MAX(position),MIN(position)) - MIN(position)-1
          ),
  position = CAST
             (
               CASE WHEN NULLIF(MAX(position),MIN(position)) - MIN(position)-1 > 0
               THEN MIN(position)+1 END AS smallint
             )
FROM chars
WHERE instance IN (@start, NULLIF(@stop,0), -1);
GO
An example using your data:
declare @sometable table (someid int identity, someString varchar(1000));
insert @sometable(someString) values
('\zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf'),
('\zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf');
select *
from @sometable s
cross apply dbo.SubstringBetweenChar8K(s.someString, 4, 5, '\');
Results:
someid someString                                                                       token  position
------ -------------------------------------------------------------------------------- ------ --------
1      \zfilemgr3-00\Corporate\On the Market Information\13030\12743\Contract\12743.pdf 13030  51
2      \zfilemgr3-00\Corporate\On the Market Information\141590\Contract\141590.pdf     141590 51
【Comments】: