How to split the string on first vertical bar after first comma?

Posted 2023-03-16

【Title】: How to split the string on first vertical bar after first comma? 【Published】: 2021-11-16 19:24:50 【Question】:

I have a string like:

string1: http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer

I need to split the string on the first vertical bar after the first comma:

key: 156 string1: Lorem ipsum dolor? is it Lorem, nah? string2: Here is the answer

arr   = line.split(',')
key   = arr[0].split('/')[-1].to_i
title = arr[1]
desc  = arr[2]

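So far, I only know how to get the key.

【Comments】:

split takes an optional limit: line.split(',', 2) returns an array of 2 substrings: the part before the first comma and everything after it. You can then split the second substring again on | to get the title and the description. (You may also want to strip the whitespace.)

【Answer 1】: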

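You can use

/\A(?:[^,]*[^\d,])?(\d+),([^|]*)\|\s*(.+)/

See the Rubular demo. Details:

\A - start of string
(?:[^,]*[^\d,])? - an optional sequence of zero or more non-comma characters followed by one character that is neither a digit nor a comma
(\d+) - Group 1: one or more digits
, - a comma
([^|]*) - Group 2: zero or more characters other than a pipe
\| - a | character
\s* - zero or more whitespace characters
(.+) - Group 3: one or more characters other than line breaks, as many as possible.

See the Ruby demo: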

s = 'http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer'
/\A(?:[^,]*[^\d,])?(?<key>\d+),(?<title>[^|]*)\|\s*(?<desc>.+)/ =~ s
puts "Key: #{key}\nTitle: #{title}\nDescription: #{desc}"

Output:

Key: 156
Title:  Lorem ipsum dolor? is it Lorem, nah? 
Description: Here is the answer

【Comments】:

【Answer 2】:

I upvoted this answer: https://***.com/a/69299320/13841038 You can call .to_i and .strip when printing:

key.to_i
title.strip
desc.strip
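As a minimal sketch of that suggestion, the regex from the first answer can be wrapped in a small helper that converts and trims the captures in one place. The method name `parse_line` and the returned hash layout are assumptions made for illustration; they do not appear in any of the answers.

```ruby
# Hypothetical helper: applies the regex from the first answer, then cleans up the captures.
def parse_line(line)
  pattern = /\A(?:[^,]*[^\d,])?(?<key>\d+),(?<title>[^|]*)\|\s*(?<desc>.+)/
  m = pattern.match(line)
  return nil unless m                # the line does not have the expected shape

  {
    key:   m[:key].to_i,             # "156" -> 156
    title: m[:title].strip,          # drop the spaces around the title
    desc:  m[:desc].strip            # drop trailing whitespace, if any
  }
end

s = 'http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer'
result = parse_line(s)
puts "Key: #{result[:key]}", "Title: #{result[:title]}", "Description: #{result[:desc]}"
```

Returning nil for a non-matching line is a design choice of this sketch; raising an error would work just as well if malformed lines should not pass silently.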

If you would rather avoid regular expressions, the following snippet is a modified version of your solution:

str = "http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer"

pipe_arr   = str.split('|', 2)               # limit 2: the description may itself contain '|'
arr        = pipe_arr[0].split(',')

key        = arr[0].split('/')[-1].to_i
title      = arr[1..-1].join(',').strip      # rejoin every piece after the first comma
desc       = pipe_arr[1].strip
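For comparison, here is a sketch of the `limit` approach from the comment under the question, which never needs to rejoin pieces of the title. The method name `split_line` and the choice to return nil on malformed input are assumptions for illustration only.

```ruby
# Hypothetical helper, not part of the original answer.
def split_line(line)
  head, rest = line.split(',', 2)        # limit 2: split on the first comma only
  return nil if rest.nil?                # no comma in the line

  title, desc = rest.split('|', 2)       # limit 2: split on the first pipe after that comma
  return nil if desc.nil?                # no pipe after the first comma

  [head.split('/').last.to_i, title.strip, desc.strip]
end

s = 'http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer'
key, title, desc = split_line(s)
puts "Key: #{key}", "Title: #{title}", "Description: #{desc}"
```

Because both splits stop after the first separator, the title may contain any number of commas and the description may contain further pipes.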

【Comments】:

【Answer 3】:

Another way:

s = 'http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer'
answer = s.gsub(/.*,.*[|]/,'')
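Note that this pattern returns only the description, and because `.*` is greedy it removes everything up to the last pipe in the line rather than the first pipe after the first comma. The sketch below contrasts it with an anchored variant built from negated character classes; the second input string `t` is a made-up example, not taken from the question.

```ruby
s = 'http://localapp/lkasdjasd/answers/156, Lorem ipsum dolor? is it Lorem, nah? | Here is the answer'
t = 'http://localapp/answers/7, title | first part | second part'   # hypothetical input with two pipes

# Pattern from the answer: greedy, so it cuts at the LAST pipe.
greedy_s = s.gsub(/.*,.*[|]/, '')     # " Here is the answer" (leading space kept)
greedy_t = t.gsub(/.*,.*[|]/, '')     # " second part"

# Anchored variant: first comma, then the first pipe after it, then optional whitespace.
anchored_s = s.sub(/\A[^,]*,[^|]*\|\s*/, '')   # "Here is the answer"
anchored_t = t.sub(/\A[^,]*,[^|]*\|\s*/, '')   # "first part | second part"

puts greedy_t.inspect, anchored_t.inspect
```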

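【Comments】:

【Answer 4】:

The following answers what is asked in the title and in the third line of the question, both of which are unambiguous. It is not consistent with the desired result shown in the question, however.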

r = /\A([^,]*,[^|]*)\|(.*)/x

"Little Miss Muffett, sat on | a tuffet".scan(r).first
  #=> ["Little Miss Muffett, sat on ", " a tuffet"]

"Little | Miss, Muffett, sat on |, a | tuffet".scan(r).first
  #=> ["Little | Miss, Muffett, sat on ", ", a | tuffet"]

We can write the regular expression in free-spacing mode to make it self-documenting.

r = /
    \A       # match beginning of string
    (        # begin capture group 1
      [^,]*  # match 0 or more chars other than a comma
      ,      # match a comma
      [^|]*  # match 0 or more chars other than a pipe
    )        # end capture group 1
    \|       # match a pipe
    (.*)     # match 0 or more chars and save to capture group 2
    /x       # invoke free-spacing regex definition mode

See String#scan for an explanation of how that method treats regular expressions that contain capture groups.¹
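As a brief illustration of that behaviour, the sketch below uses a made-up string to show that `scan` returns whole matches when the pattern has no capture groups, and one array of group values per match when it has them.

```ruby
str = "a1 b2 c3"   # hypothetical input, unrelated to the question's data

whole   = str.scan(/[a-z]\d/)        # no capture groups: one string per match
grouped = str.scan(/([a-z])(\d)/)    # capture groups: one array of groups per match

p whole     # ["a1", "b2", "c3"]
p grouped   # [["a", "1"], ["b", "2"], ["c", "3"]]

# A pattern anchored with \A can match at most once, so .first picks the only
# array of groups; when nothing matches, scan returns [] and .first is nil.
none = "no separators".scan(/\A([^,]*,[^|]*)\|(.*)/).first
p none      # nil
```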

A second approach, which does not use a regular expression, relies on String#index:

def split_it(str)
  i = str.index(',')               # position of the first comma
  return nil if i.nil?
  j = str.index('|', i+1)          # position of the first pipe after that comma
  j.nil? ? nil : [str[0,j], str[j+1..-1]]   # everything before the pipe, everything after it
end

split_it("Little Miss Muffett, sat on | a tuffet")
  #=> ["Little Miss Muffett, sat on ", " a tuffet"]

split_it("Little | Miss, Muffett, sat on |, a | tuffet")
  #=> ["Little | Miss, Muffett, sat on ", ", a | tuffet"]

¹ I will give one alternative without explanation: "Little | Miss, Muffett, sat on |, a | tuffet".match(/\A[^,]*,[^|]*\K\|/) && [$`, $'] #=> ["Little | Miss, Muffett, sat on ", ", a | tuffet"].

【Comments】: