Extract consecutive integers from a record

【Title】: Extract consecutive integers from a record 【Posted】: 2021-04-01 19:47:24 【Question】:

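I am trying to extract the number from records that contain a run of 4 or more consecutive digits, so I can use it as a reference into another table.

So far I have tried PATINDEX, but it does not really work the way I need. If a record contains more than one such number, I only need the first one extracted.

What I have tried: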

SELECT SUBSTRING(nvt.AdditionalInformation, PATINDEX('%[0-9]%', nvt.AdditionalInformation), LEN(nvt.AdditionalInformation))
FROM dbo.NSReportVtest nvt;

SELECT PATINDEX('%[0-9]%', nvt.AdditionalInformation) AdditionalInformation
FROM dbo.NSReportVersionsTest nvt

Data in the table:

Column 1
Added the COB 1.1 & COB 5 Learning types to the report.
Added the previous agreement year and previous agreement final score fields to the report.
Demo Certificate TP35356
TP45905, TP46379, TP44804, TP46432 - Added HasTalentAssessment, AssessmentDate, AssessmentCompleted, PMScore, ValueSurveyScore, OverallPMSCore, Drivers Licence
TP38298 - Removed the Sales Support and Customer Service Consultant - CIC job titles from the report.

Expected result:

Column 2
35356
45905
38298

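【Answer 1】:

The following appears to do what you want. Find the start of the first run of at least four digits, then find the first non-digit character after that point, and take the substring between the two positions.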

declare @Test table (AdditionalInformation nvarchar(max));

insert into @Test (AdditionalInformation)
values
('Added the COB 1.1 & COB 5 Learning types to the report.'),
('Added the previous agreement year and previous agreement final score fields to the report.'),
('Demo Certificate TP35356'),
('TP45905, TP46379, TP44804, TP46432 - Added HasTalentAssessment, AssessmentDate, AssessmentCompleted, PMScore, ValueSurveyScore, OverallPMSCore, Drivers Licence'),
('TP38298 - Removed the Sales Support and Customer Service Consultant - CIC job titles from the report.');

select T.Original
  -- If a run of digits was found, take the substring between the 2 calculated positions
  , case when M.FirstMatch > 0 then substring(T.AdditionalInformation, M.FirstMatch, N.SecondMatch - 1) else null end [Matching Number]
from (
  select T.AdditionalInformation Original
    -- Add a char before and after so there is always a non-digit on either side of the number, even if it is first or last
    , ':' + T.AdditionalInformation + ':' AdditionalInformation
  from @Test T
) T
-- Find the start of the first run of at least 4 digits (0 if there is none)
cross apply (values (patindex('%[0-9][0-9][0-9][0-9]%', T.AdditionalInformation))) M (FirstMatch)
-- Find the first non-digit at or after the start of that run, relative to FirstMatch
cross apply (values (patindex('%[^0-9]%', substring(T.AdditionalInformation, M.FirstMatch, len(T.AdditionalInformation))))) N (SecondMatch);

Returns:

Original | Matching Number
Added the COB 1.1 & COB 5 Learning types to the report. | NULL
Added the previous agreement year and previous agreement final score fields to the report. | NULL
Demo Certificate TP35356 | 35356
TP45905, TP46379, TP44804, TP46432 - Added HasTalentAssessment, AssessmentDate, AssessmentCompleted, PMScore, ValueSurveyScore, OverallPMSCore, Drivers Licence | 45905
TP38298 - Removed the Sales Support and Customer Service Consultant - CIC job titles from the report. | 38298
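
To make the two positions easier to follow, here is a minimal sketch that traces the same steps on a single value taken from the sample data. The variable names (@s, @start, @len) are only illustrative and are not part of the query above. It assumes that padding one ':' at the end is enough for a standalone trace, since that guarantees a non-digit exists after a number that sits at the very end of the string.

```sql
-- Illustrative trace only; @s, @start and @len are hypothetical names.
-- Append ':' so a non-digit always follows the number, even at the end of the text.
declare @s nvarchar(max) = N'Demo Certificate TP35356' + N':';

-- Position of the first run of at least 4 digits (0 when there is none).
declare @start bigint = patindex('%[0-9][0-9][0-9][0-9]%', @s);

-- Position of the first non-digit counted from @start, minus 1, gives the run length.
declare @len bigint = patindex('%[^0-9]%', substring(@s, @start, len(@s))) - 1;

select @start as StartPos                  -- 20
     , @len as DigitCount                  -- 5
     , case when @start > 0
            then substring(@s, @start, @len)
       end as Extracted;                   -- 35356
```

Because the four-digit pattern only locates where the run starts, the second PATINDEX is what lets the result grow to five or more digits. The case expression guards the no-match situation, where @start is 0 and the substring would otherwise be meaningless.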
