T-SQL replace function missing some occurrences with nvarchar(max)

Posted: 2019-04-05 14:40:52

[Question]:

Given the following T-SQL code:

declare @lf char(1) set @lf = char(10);
declare @cr char(1) set @cr = char(13);

select  replace(isnull(note,''), @cr+@lf,@lf)  from T

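Is there any case in which @cr+@lf would not be replaced by @lf at every occurrence in the note column?

I am trying to troubleshoot a scenario where this does in fact happen.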

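The note column is defined as nvarchar(max). The documentation says:

If string_expression is not of type varchar(max) or nvarchar(max), REPLACE truncates the return value at 8,000 bytes. To return values greater than 8,000 bytes, string_expression must be explicitly cast to a large-value data type.

If I understand that correctly, no cast is needed, because note is already the right data type to allow a return value larger than 8,000 bytes.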

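I thought perhaps the isnull function was not returning nvarchar(max), but the documentation says it returns the type of the value being tested:

...Return Types: Returns the same type as check_expression.

And the returned value is not being truncated; it is just that some of the CRLF pairs are being ignored.

I must be overlooking something.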

declare @t table( notes nvarchar(max));

insert @t(notes)
values
(
'From: kkkkkkk@aaaaaaaaa.com <kkkkkkk@aaaaaaaaa.com> 
Sent: Monday, May 00, 0008 00:55 PM
To: Jan Zzzz <sZzzz@dddd.dd.com>
Subject: RE: [Secure Message] aaaaaaaaa ABC ddddddddddddd--XXXXX-X

Hi Jan, 

The ddddddddddddd is valid for one year.  I have attached the Zzzzzzz Rrrrrrrr which you will find behind the  blank cover page and ddddddddddddd form.  Please let me know if this is what you need.  

Best Regards, 

Yyyyyy 

Kkkkkkkk Kkkkkk, ABC, DEF
ABC Mmmmmmmm
P 800.007.0000 ext 000 | F 600.000.0070 



Electronic mail is not considered to be a secure form of communication for Protected Health Information. If you are concerned about privacy, please call aaaaaaaaa directly at 0-800-007-0000. If you would like to review our privacy policy, it can be found on our website: www.ddddddddddddd.com.

This email, including any attached files, may contain confidential and privileged information for the sole use of the intended recipient(s). Any review, use, distribution or disclosure by others is strictly prohibited. If you are not the addressee indicated in this message (or authorized to receive information for the recipient), please contact the sender by reply e-mail and delete all copies of this message (including any attachments).

From: Jan Zzzz <sZzzz@dddd.dd.com> 
Sent: Monday, May 00, 0008 8:56 AM
To: Kkkkkkkk Kkkkkk <kkkkkkk@aaaaaaaaa.com>; Jan Zzzz <sZzzz@dddd.dd.com>
Subject: Re: [Secure Message] aaaaaaaaa ABC ddddddddddddd--XXXXX-X

Hi, this expired, I need a copy of the aaaaaaa aaaa so I can submit my aaaaaaa aaa aaaaaaaa. Thank you. SZzzz

On 0/00/0008 8:00 AM, Jan Zzzz wrote:
Thank you for the dddddddd, I am mmmmmmm mmm today.

On 0/0/0008 6:05 PM, Kkkkkkkk Kkkkkk wrote:

[Secure Message] aaaaaaaaa ABC ddddddddddddd--XXXXX-X    

Kkkkkkkk Kkkkkk has sent you a secure message.
Subject:    aaaaaaaaa ABC ddddddddddddd--XXXXX-X
From:   Kkkkkkkk Kkkkkk <kkkkkkk@aaaaaaaaa.com>

To: Jan Zzzz <sZzzz@dddd.dd.com>

Expires:    May 00, 0008

View Secure Message 



Note: If you have any concerns about the authenticity of this message please contact the original sender directly.'   
)

select notes from @t;
select replace(notes, char(13),'') from @t;
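
The reasoning about return types and truncation can be checked directly rather than inferred from the documentation. The following is a minimal sketch, not part of the original question: the variable names and the generated test string are made up for illustration, and sys.dm_exec_describe_first_result_set is assumed to be available (SQL Server 2012 or later). The first query should report the result type of the ISNULL expression; the second compares byte lengths before and after REPLACE on a value well past 8,000 bytes and looks for any CR left behind.

```sql
-- Hypothetical test data: 10,000 repetitions of 'x' + CR + LF (30,000 characters).
DECLARE @crlf nchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @note nvarchar(max) =
    REPLICATE(CAST(N'x' + @crlf AS nvarchar(max)), 10000);

-- 1) Ask the server which type ISNULL returns for an nvarchar(max) input.
SELECT system_type_name
FROM sys.dm_exec_describe_first_result_set(
         N'SELECT ISNULL(CAST(NULL AS nvarchar(max)), N'''') AS v', NULL, 0);

-- 2) Compare sizes before and after the replacement and look for leftover CRs.
--    If nothing is truncated or skipped, bytes_after is two thirds of
--    bytes_before and first_cr_after is 0.
SELECT DATALENGTH(@note) AS bytes_before,
       DATALENGTH(REPLACE(ISNULL(@note, N''), @crlf, NCHAR(10))) AS bytes_after,
       CHARINDEX(NCHAR(13),
                 REPLACE(ISNULL(@note, N''), @crlf, NCHAR(10))) AS first_cr_after;
```

DATALENGTH is used instead of LEN so that the comparison is in bytes, the unit the documentation's 8,000-byte limit is stated in.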

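[Comments]:

"Troubleshoot a scenario where this does in fact happen." Can you create a reproducible example?

@UnhandledExcepSean I am putting something together, but as this is a time-sensitive project, I was hoping the issue would "jump out" at someone who had run into it before.

Tip: you can display the data type of an expression: select SQL_Variant_Property( IsNull( Note, '' ), 'BaseType' ). See SQL_Variant_Property for the other data available.

@HABO: Error: "Operand type clash: nvarchar(max) is incompatible with sql_variant". But I guess the information you wanted is in the error :)

Does not reproduce. I think you have a different issue, such as CRs being added back by the viewer in SSMS or by whatever consumes the stripped string, or you are concatenating strings and replacing the CR in only one part. I ran this to show that the CRs are removed correctly: UPDATE @t SET notes=replace(notes, char(13),'');SELECT * FROM @t WHERE notes like '%' + CHAR(13) + '%'

[Answer 1]: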

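If @cr+@cr+@lf appears in note, your SELECT will leave a @cr+@lf behind, because the pair is replaced by @lf and the leading @cr then sits in front of it. Unless you need to keep @cr where it appears on its own, you may be better off doing this: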

declare @cr char(1) set @cr = char(13);

select  replace(isnull(note,''), @cr,'')  from T
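
To see the difference between the two approaches byte by byte, the results can be cast to varbinary, which avoids relying on how a client renders line endings. This is a small sketch with made-up sample rows (the table variable @samples and its labels are hypothetical): one row holds four CRLF pairs in a row, the other holds a stray CR in front of a pair.

```sql
DECLARE @cr nchar(1) = NCHAR(13), @lf nchar(1) = NCHAR(10);

-- Hypothetical sample rows for illustration only.
DECLARE @samples TABLE (label varchar(20), note nvarchar(max));
INSERT @samples (label, note)
VALUES ('four pairs', REPLICATE(@cr + @lf, 4)),
       ('stray CR',   @cr + @cr + @lf);

-- In the hex output, CR shows as 0D00 and LF as 0A00 (UTF-16, little endian).
SELECT label,
       CAST(note AS varbinary(50))                         AS original_hex,
       CAST(REPLACE(note, @cr + @lf, @lf) AS varbinary(50)) AS pair_replaced_hex,
       CAST(REPLACE(note, @cr, N'') AS varbinary(50))       AS cr_stripped_hex
FROM @samples;
```

For the 'four pairs' row both replacement columns should be identical; for the 'stray CR' row only the pair-replacing version still contains a 0D00.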

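[Comments]:

I can't replace @cr where it appears on its own. Only the pair should become a single LF. The data in the field is @cr@lf@cr@lf@cr@lf@cr@lf: four pairs in a row.

Could you provide some sample data showing when the problem occurs? Note: if the data in note is @cr@lf@cr@lf@cr@lf@cr@lf, both approaches will change it to @lf@lf@lf@lf.

For the field that has the problem, please provide the hex value: CAST(note AS VARBINARY(50)) where id = whatever

@MichaelEvanchik: This is real production data. When you cast the varbinary back to nvarchar(max), it is the email address of someone in our organization.

@Tim - I understand that you don't want to share production data, but without a minimal viable example I don't think we can get much further. Could you change all the letters to a? If the error is still there, share it with us, e.g. ryan@sparks.rocks => aaaa@aaaaaa.aaaaa => 0x610061006100610040006100610061006100610061002E00610061006100610061000D000A00

[Answer 2]: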

If it is just a single line, this should do it: strip them all and put one back

set @note = replace(replace(isnull(@note,''),@cr,''),@lf,'')+@lf; --or whatever line endings you want

If it is multiple lines, try something like this

declare @note as nvarchar(max)
declare @lf char(1) set @lf = char(10);
declare @cr char(1) set @cr = char(13);

set @note = 'A'+char(10)+char(13)+char(10)+char(13)+char(10)+char(13)+char(10)+char(13)+'A'+char(10)+char(13)

set @note = replace(isnull(@note,''),@cr,'')

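--not sure if you want to keep all the user lf's but if you want only one try this?
--patindex takes (pattern, expression) and needs % wildcards; loop because a single pass only halves each run of lf's
while (patindex('%' + @lf + @lf + '%', isnull(@note,'')) > 0)
begin
set @note = replace(isnull(@note,''),@lf+@lf,@lf)
end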

select @note
select cast(@note as VARBINARY(100))
select len(@note)

[Comments]:

This only works for a single field; looking at his example, @note did not quite register in my brain ::grabs more coffee::

[Answer 3]:

Every occurrence is being replaced, but the replacement itself may be creating some CrLfs. See the example below and how to mitigate it.

DECLARE @Cr CHAR(1)=CHAR(13)
DECLARE @Lf CHAR(1)=CHAR(10)
DECLARE @CrLf CHAR(2)=CHAR(13)+CHAR(10)
DECLARE @NoteTbl TABLE(Note NVARCHAR(MAX))
INSERT INTO @NoteTbl (Note) SELECT @Cr + @CrLf

--example can result in CrLF being created
SELECT [NewNote],LEN([NewNote]) FROM (SELECT replace(isnull(note,''), @CrLf,@lf) AS [NewNote] FROM @NoteTbl) AS a

--Option 1: Replace all Cr with nothing; this is effectively the same as replacing CrLf with Lf
SELECT [NewNote],LEN([NewNote]) FROM (SELECT replace(isnull(note,''), @Cr,'') AS [NewNote] FROM @NoteTbl) AS a

--Option 2: insert the notes into a table and loop until CrLf is gone, this might be useful if you need to do multiple different  data scrubs
DECLARE @NotesCleaned TABLE(Note NVARCHAR(MAX))
INSERT INTO @NotesCleaned (Note) SELECT Note FROM @NoteTbl
WHILE EXISTS(
    SELECT * FROM @NotesCleaned WHERE Note Like '%' + @CrLf + '%'
)
    BEGIN
        UPDATE @NotesCleaned SET Note=replace(isnull(note,''), @CrLf,@lf)
    END

SELECT Note,LEN(Note) FROM @NotesCleaned 

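[Comments]:

I have confirmed that CRLF pairs are not being created as a side effect of the removal, e.g. nothing like this exists in the data: CR CRLF LF.

[Answer 4]: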

I believe I may have found a partial answer. In SSMS:

Tools->Options->SQL Server->Results to Grid

[ x ]  Retain CR/LF on copy or save

will actually restore the CRs that your call to replace() had removed.
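
Because the grid's copy and save behaviour can alter line endings, one way to rule the client out is to count the CRs on the server and return only numbers. The following is a rough sketch under that assumption; the table variable and its single sample row are invented for illustration and stand in for the real notes column.

```sql
-- Hypothetical stand-in for the real table: two lines, each ending in CR + LF.
DECLARE @t TABLE (notes nvarchar(max));
INSERT @t (notes)
VALUES (N'line 1' + NCHAR(13) + NCHAR(10) + N'line 2' + NCHAR(13) + NCHAR(10));

SELECT
    -- Number of CR characters in the stored value (2 bytes per nvarchar character).
    (DATALENGTH(notes) - DATALENGTH(REPLACE(notes, NCHAR(13), N''))) / 2 AS cr_before,
    -- Position of the first CR remaining after the replacement; 0 means none.
    CHARINDEX(NCHAR(13), REPLACE(notes, NCHAR(13), N'')) AS first_cr_after_replace
FROM @t;
```

If first_cr_after_replace is 0 for every row while pasted output still shows CRs, the extra characters are being introduced after the query has run.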
