Remove duplicates from comma separated list with regexp [duplicate]
【Posted】: 2018-01-07 08:26:59
【Question】: I have
contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2,
paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract,
clause 2
I want to get
contract, clause 1, Subsection 1.1, Subsection 1.2, paragraph (a), paragraph
(b), clause 2
I found out that regexp can do this, but I can't work out which pattern to use.
Please help.
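【Comments on the question】:
People may be reluctant to help unless you try something yourself and post your attempt here.
【Answer 1】: Based on this link about splitting comma-separated values into rows, I split the string into rows, keep the position of each value's first occurrence, and re-aggregate the values in that order.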
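with test_string as (
  select 1 as id,
         'contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract, clause 2' val
  from dual)
-- 3) re-aggregate the distinct words, ordered by the position of their first occurrence
select id, listagg(word, ', ') within group (order by position) as deduped
from (
  -- 2) keep one row per (id, word), tagged with the position where the word first appears
  --    (partitioning by id as well keeps rows with different ids independent)
  select distinct id,
         first_value(position) over (partition by id, word order by position) position,
         word
  from (
    -- 1) split val into one row per comma-separated item (number of items = number of commas + 1)
    select distinct t.id,
           levels.column_value as position,
           trim(regexp_substr(t.val, '[^,]+', 1, levels.column_value)) as word
    from test_string t,
         table(cast(multiset(select level from dual connect by level <= length(regexp_replace(t.val, '[^,]+')) + 1) as sys.OdciNumberList)) levels
  )
)
group by id;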
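The "keep the first occurrence" step can also be expressed with a plain GROUP BY and MIN instead of an analytic function. The following is only a sketch under the same assumptions as the query above (Oracle 11gR2 or later for listagg, and no empty items in the list). The names words, first_pos and deduped are illustrative and not part of the original answer, and regexp_count is swapped in for the length/regexp_replace comma count.

```sql
with test_string as (
  select 1 as id,
         'contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract, clause 2' as val
  from dual
),
words as (
  -- one row per comma-separated item, with its 1-based position in the list
  select t.id,
         levels.column_value as position,
         trim(regexp_substr(t.val, '[^,]+', 1, levels.column_value)) as word
  from test_string t,
       table(cast(multiset(
         select level from dual
         connect by level <= regexp_count(t.val, ',') + 1
       ) as sys.OdciNumberList)) levels
)
select id,
       -- order the surviving words by where each one first appeared
       listagg(word, ', ') within group (order by first_pos) as deduped
from (
  -- collapse duplicates: one row per (id, word) with its earliest position
  select id, word, min(position) as first_pos
  from words
  group by id, word
)
group by id;
```

For the sample string this should return the list requested in the question: contract, clause 1, Subsection 1.1, Subsection 1.2, paragraph (a), paragraph (b), clause 2.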
If you are not interested in preserving the original order:
with test_string as (
  select 1 as id,
         'contract, clause 1, Subsection 1.1, contract, clause 1, Subsection 1.2, paragraph (a), contract, clause 1, Subsection 1.2, paragraph (b), contract, clause 2' val
  from dual)
-- ordering by word (rather than the constant 1) makes the output deterministic: alphabetical, not original order
select id, listagg(word, ', ') within group (order by word) as deduped
from (
  -- split val into one row per comma-separated item; DISTINCT removes the duplicates
  select distinct t.id,
         trim(regexp_substr(t.val, '[^,]+', 1, levels.column_value)) as word
  from test_string t,
       table(cast(multiset(select level from dual connect by level <= length(regexp_replace(t.val, '[^,]+')) + 1) as sys.OdciNumberList)) levels
)
group by id;
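Both queries above rely on the same split step. In isolation, for a single literal string, it can be sketched as follows. The literal 'a, b, a, c' is purely illustrative, and regexp_count (available from Oracle 11g) is used here in place of the length/regexp_replace comma count.

```sql
-- Split one literal list into rows: one row per comma-separated item.
-- CONNECT BY LEVEL generates the numbers 1..(number of commas + 1);
-- regexp_substr then picks the item at that occurrence, and trim removes
-- the blank that follows each comma.
select level as position,
       trim(regexp_substr('a, b, a, c', '[^,]+', 1, level)) as word
from dual
connect by level <= regexp_count('a, b, a, c', ',') + 1;
```

Note that the pattern '[^,]+' skips empty items, so a list such as 'a,,b' would produce a trailing NULL word instead of an empty item in the middle; the queries in this answer assume the list has no empty items.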
【Comments】:
There may be a simpler solution, but I couldn't think of one.
I hadn't noticed that the answer here is exactly the same (***.com/questions/40259200/…); it may be better.
Thank you very much @LauDec, you solved my problem...