Using Regular Expressions in Redshift to get the word prior to matched pattern

Posted 2023-03-31

[Posted]: 2019-10-27 14:33:22

[Question]:

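I am using regexp_substr() to find the word that comes before the first occurrence of the word "OF". The code below does not work, because Redshift does not seem to support non-greedy pattern matching.

Please help.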

select regexp_substr('SAFETY COUNCIL OF PALM BEACH COUNTY, INC. ','[[:print:]].*?\\sOF\\s')

Query execution failed

Reason:
SQL Error [XX000]: ERROR: Invalid preceding regular expression prior to repetition operator.  The error occurred while parsing the regular expression fragment: 'rint:]].*?>>>HERE>>>\sOF\s'.
  Detail: 
  -----------------------------------------------
  error:  Invalid preceding regular expression prior to repetition operator.  The error occurred while parsing the regular expression fragment: 'rint:]].*?>>>HERE>>>\sOF\s'.
  code:      8002
  context:   T_regexp_init
  query:     0
  location:  funcs_expr.cpp:189
  process:   padbmaster [pid=74292]
  -----------------------------------------------

  Where: SQL function "regexp_substr" statement 1

I am currently using this clumsy workaround, and I think there should be a better way:

select 'SAFETY OF COUNCIL OF PALM OF BEACH COUNTY, INC. ' as name, regexp_instr(name,'\\sOF\\s',1) as ind1,substr(name,1,ind1-1) as name_2,regexp_replace(name_2,regexp_substr(name_2,'.*\\s'),'')
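
One regex-only alternative to the workaround above is to capture the word as a subexpression instead of relying on a lazy quantifier. The following is a sketch, not a tested solution. It assumes a Redshift version whose regexp_substr accepts the fifth `parameters` argument, where 'e' returns the first parenthesized subexpression and 'p' interprets the pattern in the PCRE dialect. The default dialect is POSIX, which has no `*?` quantifier, and that is consistent with the error shown earlier. The column alias word_before_of is made up for illustration.

```sql
-- Sketch: with 'e', regexp_substr returns the first parenthesized
-- subexpression instead of the whole match.
-- [^ ]+ cannot cross a space, so the capture is a single word.
select regexp_substr(
         'SAFETY COUNCIL OF PALM BEACH COUNTY, INC. ',
         '([^ ]+) OF ',
         1,    -- start position
         1,    -- first occurrence
         'e'   -- extract the subexpression
       ) as word_before_of;

-- Variant in the PCRE dialect ('p'): the lookahead requires " OF " to follow
-- the word but keeps it out of the returned match.
select regexp_substr(
         'SAFETY OF COUNCIL OF PALM OF BEACH COUNTY, INC. ',
         '[^ ]+(?= OF )',
         1, 1, 'p'
       ) as word_before_of;
```

For the first sample string the capture is intended to be COUNCIL, and for the second one SAFETY. Both patterns use a literal space rather than \\s, which sidesteps backslash escaping in the string literal but will not match tabs or other whitespace.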

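[Comments]:

Possible duplicate of Redshift regexp_substr

@JohnRotenstein I don't think it is a duplicate of the other question; I have edited the title to best represent this question. Thanks.

FYI, that linked question includes a way to achieve the result by controlling greediness.

[Answer 1]: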

To do this, I usually use the split_part function. It works the same way in PostgreSQL.

select split_part('SAFETY COUNCIL OF PALM BEACH COUNTY, INC. ', 'OF',1)
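
Note that the call above returns everything before the first 'OF' ('SAFETY COUNCIL ', including the trailing space), not only the last word, and that a bare 'OF' delimiter would also split inside a word such as OFFICE. A possible refinement, as a sketch that combines split_part with regexp_substr (the alias word_before_of is made up for illustration):

```sql
-- Sketch: ' OF ' with surrounding spaces as the delimiter avoids splitting
-- inside longer words; '[^ ]+$' then keeps the last word of the prefix.
select regexp_substr(
         split_part('SAFETY COUNCIL OF PALM BEACH COUNTY, INC. ', ' OF ', 1),
         '[^ ]+$'
       ) as word_before_of;
```

If the string contains no ' OF ' at all, split_part returns the whole input, so the result is not meaningful in that case; a guard such as name like '% OF %' may be needed on real data.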
