Regexp_substr to parse a string into pieces

Posted: 2014-02-25 19:00:02

Question:

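I've been stuck on this all morning and would appreciate some help. I've been reading whatever I can find, but I'm having trouble applying it to my situation.

I have records like: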

A123-700
A123-700 / WORD-8
A123 / A456
WORD-8 / A456-800

I need to split them into a "type" and a "series", and ignore "WORD-8".

For example:

A123-300 would be type=A123, series=300
A123-300 / WORD-8 would be type=A123, series=300
A123 / A456 would be type=A123, type=A456
WORD-8 / A456-200 would be type=A456, series=200

So far I have something like this:

WITH gen AS
  ( select 'A123-700' x from dual
  UNION ALL
  select 'A123-700 / WORD-8' x from dual
  union all
  select 'A123 / A456' x from dual
  union all
  select 'WORD-8 / A456-800' x from dual
  )
SELECT x ,
  regexp_substr(x, '[^/]+')              as first_slash,
  regexp_substr(x, '[^-]+')              as first_type,
  regexp_substr(x, '-\w*')               as first_series,
  regexp_substr(x, '[^/][^DASH]+', 1, 2) as second_slash,
  regexp_substr(x, '[^/]+', 1, 2)        as second_type,
  regexp_substr(x, '-\w+', 1, 2)         as second_series
FROM gen;

But the results aren't what I was hoping for. I don't want the "-" included, and my "second" values aren't coming out correctly.

X                 FIRST_SLASH FIRST_TYPE  FIRST_SERIES  SECOND_SLASH  SECOND_TYPE SECOND_SERIES
A123-700          A123-700    A123        -700          (null)        (null)      (null)
A123-700 / WORD-8 A123-700    A123        -700          D-8           WORD-8      -8
A123 / A456       A123        A123 / A456 (null)        A456          A456        (null)
WORD-8 / A456-800 WORD-8      WORD        -8            D-8 /         A456-800    -800

Can someone point me in the right direction?

Thanks!

Answer 1:
WITH 
  gen AS ( 
    select 'A123-700' x from dual
    UNION ALL
    select 'A123-700 / WORD-8' x from dual
    union all
    select 'A123 / A456' x from dual
    union all
    select 'WORD-8 / A456-800' x from dual
  ),
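  t_slash as (
    -- Split on the slash; NULLIF turns the filler token 'WORD-8' into NULL.
    SELECT x ,
      -- Drop everything from the slash onward to keep the first piece.
      nullif(regexp_replace(x, '\s*/.*$'),'WORD-8') as first_slash,
      -- Drop everything up to and including the slash to keep the second piece.
      nullif(regexp_replace(x, '^[^/]*/?\s*'),'WORD-8') as second_slash
    FROM gen
  )
select x, first_slash, 
  -- Type: the text before the first dash.
  regexp_substr(first_slash, '^[^-]*') as first_type,
  -- Series: what remains after removing the type and the dash.
  regexp_replace(first_slash, '^[^-]*-?') as first_series,
  second_slash,
  regexp_substr(second_slash, '^[^-]*') as second_type,
  regexp_replace(second_slash, '^[^-]*-?') as second_series
from t_slash;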
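
If the filler token can show up as either WORD-8 or WORD 8, a NULLIF comparison against a single literal only catches one spelling. Below is a minimal sketch of one way around that, not part of the original answer. It assumes the filler always matches the pattern ^WORD[- ]8$; that pattern and the sample row containing 'WORD 8' are illustrative assumptions. An outer regexp_replace blanks out a piece that consists entirely of the filler, and because Oracle treats an empty string as NULL, that piece ends up NULL just as it does with NULLIF.

```sql
WITH
  gen AS (
    -- Hypothetical sample row using the space variant of the filler.
    SELECT 'A123-700 / WORD 8' x FROM dual
    UNION ALL
    SELECT 'WORD-8 / A456-800' x FROM dual
  ),
  t_slash AS (
    SELECT x,
      -- Inner call isolates the piece; outer call empties it when it is only the filler.
      regexp_replace(regexp_replace(x, '\s*/.*$'), '^WORD[- ]8$')     AS first_slash,
      regexp_replace(regexp_replace(x, '^[^/]*/?\s*'), '^WORD[- ]8$') AS second_slash
    FROM gen
  )
SELECT x, first_slash,
  regexp_substr(first_slash, '^[^-]*')     AS first_type,
  regexp_replace(first_slash, '^[^-]*-?')  AS first_series,
  second_slash,
  regexp_substr(second_slash, '^[^-]*')    AS second_type,
  regexp_replace(second_slash, '^[^-]*-?') AS second_series
FROM t_slash;
```

With these two rows, the first should yield type A123 and series 700 in the "first" columns, and the second should yield type A456 and series 800 in the "second" columns, with the filler pieces coming back as NULL.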

Comments:

This is really helpful, thanks! One question: what if it could be either "WORD-8" or "WORD 8"?

@froglander - You can use any string in place of WORD-8.

Answer 2:
WITH gen AS
  ( select 'A123-700' x from dual
  UNION ALL
  select 'A123-700 / WORD-8' x from dual
  union all
  select 'A123 / A456' x from dual
  union all
  select 'WORD-8 / A456-800' x from dual
  )
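SELECT x ,
  -- The third argument is the start position, not the occurrence.
  regexp_substr(x,'A[[:alnum:]]+',1) as first_type,
  -- Searching again from position 2 skips a type at the very start of x;
  -- NULLIF discards the result when it is the same type found above.
  NULLIF( regexp_substr(x,'A[[:alnum:]]+',2),
          regexp_substr(x,'A[[:alnum:]]+',1))  as second_type,
  regexp_substr(x,'(A[[:alnum:]]+)-([[:digit:]]+)',1) as full,
  -- Returns the whole match, so the series keeps its leading dash (e.g. -700).
  regexp_substr(regexp_substr(x,'(A[[:alnum:]]+)-([[:digit:]]+)',1),
                '-([[:digit:]]+)',
                1) as first_series
FROM gen;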
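
The question asks for the series without the dash. Here is a rough sketch of a variant that does this, not the answerer's own code. It assumes Oracle 11g or later, where regexp_substr accepts a sixth argument that selects a capture group, and it keeps the same assumption as the query above: every type starts with 'A' and the filler word never contains an 'A' followed by letters or digits. The fourth argument picks the occurrence, so the first and second type each come with their own series.

```sql
WITH gen AS
  ( SELECT 'A123-700' x FROM dual
    UNION ALL
    SELECT 'A123-700 / WORD-8' x FROM dual
    UNION ALL
    SELECT 'A123 / A456' x FROM dual
    UNION ALL
    SELECT 'WORD-8 / A456-800' x FROM dual
  )
SELECT x,
  -- Arguments: source, pattern, start position, occurrence, match options, capture group.
  -- Group 1 is the type; group 3 is the digits of the optional "-series" suffix.
  regexp_substr(x, '(A[[:alnum:]]+)(-([[:digit:]]+))?', 1, 1, NULL, 1) AS first_type,
  regexp_substr(x, '(A[[:alnum:]]+)(-([[:digit:]]+))?', 1, 1, NULL, 3) AS first_series,
  regexp_substr(x, '(A[[:alnum:]]+)(-([[:digit:]]+))?', 1, 2, NULL, 1) AS second_type,
  regexp_substr(x, '(A[[:alnum:]]+)(-([[:digit:]]+))?', 1, 2, NULL, 3) AS second_series
FROM gen;
```

Because the series group is optional, a type with no series (as in 'A123 / A456') should return NULL for its series instead of borrowing the series of the next type.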
