How to use Regex To Filter Out Anything That Does Not Have A Comma?

Posted: 2021-03-24 01:36:22

Question:

I'm working on a Discord bot, and I want to write a regex that removes any word that is not followed by a comma. This is my code so far:

const { content } = message;
var argsc = content.split(/[,]+/);
argsc.shift();
console.log(argsc); //returns [ 'hello', 'sky', 'hi hello there' ]

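The original message is +template hi,hello,sky,hi hello there, and I have figured out how to remove the first word. Now I want hello there to be filtered out, so that the result is ['hi', 'hello', 'sky', 'hi']. I know it's complicated, but I have tried everything and I can't filter out hello there. Thanks!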

Comments:

Why not remove the second 'hi' as well, since it is not followed by a comma?

Answer 1:

Use

const s = "hi,hello,sky,hi hello there";
console.log(s.split(/(?:^|,)([^\s,]+)(?:\s+[^\s,]+)*/).filter(Boolean));

See the regex proof.

Explanation

--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    ^                        the beginning of the string
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    ,                        ','
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^\s,]+                  any character except: whitespace (\n,
                             \r, \t, \f, and " "), ',' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    [^\s,]+                  any character except: whitespace (\n,
                             \r, \t, \f, and " "), ',' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
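
As a rough sketch of how the pattern above could be applied to the full bot message from the question, the same expression can be run with matchAll after stripping the command word. The helper name firstWords and the way the "+template" prefix is removed are assumptions made for illustration, not part of the answer.

```javascript
// Hypothetical helper: the name and the prefix handling are assumptions.
function firstWords(message) {
  // Drop the leading command word (e.g. "+template") and the whitespace after it.
  const args = message.replace(/^\S+\s*/, "");
  // Same pattern as in the answer, with the g flag so matchAll can iterate.
  const pattern = /(?:^|,)([^\s,]+)(?:\s+[^\s,]+)*/g;
  // Group 1 holds the first word of each comma-separated item.
  return Array.from(args.matchAll(pattern), (m) => m[1]);
}

console.log(firstWords("+template hi,hello,sky,hi hello there"));
// [ 'hi', 'hello', 'sky', 'hi' ]
```

Collecting group 1 directly avoids the empty strings that split produces between matches, so no filter(Boolean) step is needed in this variant.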

Answer 2:

You could try a regex replacement to clean up the input first, followed by a simple split:

var input = "hi,hello,sky,hi hello there";
input = input.replace(/(\S+)(?: [^\s,]+)*(?=,|$)/g, "$1");
var parts = input.split(",");
console.log(parts);

Here is an explanation of the regex pattern:

(\S+)          match a "word" (a run of non-whitespace characters,
               which can also include commas) AND capture it in $1
(?: [^\s,]+)*  followed by zero or more space-separated "words," making
               sure that we don't hit a comma
(?=,|$)        lookahead: what follows must be either a comma or the
               end of the input

Then we replace each match with just the first captured word.
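
To make the replace-then-split idea easier to reuse, it could be wrapped in a small function along these lines. The name keepFirstWords and the second sample input are assumptions added for illustration; the pattern is the one from the answer.

```javascript
// Hypothetical wrapper around the replace-then-split approach; the name is an assumption.
function keepFirstWords(input) {
  // Collapse each match to its captured leading part, dropping the
  // space-separated words that trail it.
  const cleaned = input.replace(/(\S+)(?: [^\s,]+)*(?=,|$)/g, "$1");
  // After the cleanup, a plain split on commas is enough.
  return cleaned.split(",");
}

console.log(keepFirstWords("hi,hello,sky,hi hello there"));
// [ 'hi', 'hello', 'sky', 'hi' ]
console.log(keepFirstWords("a b,c d"));
// [ 'a', 'c' ]
```

Note that the pattern only skips words separated by a single space character, so inputs that use tabs or repeated spaces between words may need a different separator in the non-capturing group.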
