How to grab the text after a newline in a text file without stripping spaces and tabs [closed]

Posted 2023-02-18

[Original title]: how to grab text after newline in a text file no clean of spaces, tabs [closed]
[Asked]: 2022-01-18 09:52:41
[Question]:

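Assumption: the file has to be passed as an argument.

This is the only text I want to show. The rest of the text has more data [not shown], and that is the problem. The text is only semi-clean: it is full of spaces, tabs, and Unicode, and it has to stay that way [my requirement], so copying and pasting this exact text will not work [it has been reformatted by the markup]:

I have some text like this: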

*** *
more text with spaces and  tabs                                                             
*****
1
Something here and else, 2000 edf, 60 pop
    Usd324.32           2 Usd534.22
2
21st New tetx that will like to select with pattern, 334 pop
    Usd162.14

*** *
more text with spaces and tabs, unicode
*****

I am trying to get this exact text:

1 Something here and else, 2000 edf, 60 pop Usd324.32

Because of the newline and the whitespace, the following command grabs only the 1:

grep -E '1\s.+'

I have also tried combining patterns with alternation:

grep -E '1\s|[A-Z].+'

But that does not work: grep starts selecting similar patterns in other parts of the text. I have also already tried:

awk '$1=$11'   #done already
tr -s "\t\r\n\v" #done already
tr -d "\t\b\r"   #done already

How can I grab:

- the 1, up to the newline
- the entire second line after the 1, up to the newline
- the number Usd324.32, with the Usd removed
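
A note on why the grep attempts above return only the 1: grep tests its pattern against one line at a time, so `1\s.+` can never reach across the newline into the following line. A minimal sketch, assuming a grep that supports the -A (trailing context) option, as GNU and BSD grep do, and assuming the line holding the 1 contains nothing else (no trailing spaces or carriage return). The sample file and the show_record name are made up for illustration:

```shell
# Hypothetical sample file reproducing the relevant lines of the question's text.
sample=$(mktemp)
printf '%s\n' '*****' '1' \
  'Something here and else, 2000 edf, 60 pop' \
  '    Usd324.32           2 Usd534.22' '2' > "$sample"

# -x: the whole line must equal "1"; -A2: also print the two lines after it.
show_record() {
  grep -x -A2 '1' "$1"
}

show_record "$sample"
# prints:
# 1
# Something here and else, 2000 edf, 60 pop
#     Usd324.32           2 Usd534.22

rm -f "$sample"
```

This only displays the three lines of the record. Joining them into one line and stripping the Usd prefix still needs sed, awk, or the shell itself.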

[Comments on the question]:

[Answer 1]:

You can use this sed:

sed -En '/^1/ {N;N;s/[[:blank:]]*Usd([^[:blank:]]+)[^\n]*$/\1/; s/\n/ /gp;}' file

1 Something here and else, 2000 edf, 60 pop 324.32

Or this awk will work too:

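awk '$0 == 1 {          # the line that holds only "1"
   printf "%s", $0      # print it without a trailing newline
   getline              # read the next line (the description)
   printf " %s ", $0
   getline              # read the line that holds the amounts
   sub(/Usd/, "")       # strip the first "Usd"
   print $1             # print the first field, now 324.32
}' file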

1 Something here and else, 2000 edf, 60 pop 324.32
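
A variation on the awk answer, offered as a sketch rather than a replacement: the same three-line record can be collected with a small state counter instead of getline, which avoids having to think about what getline does at end of file. The names join_record_awk, step, first and second are illustrative, and the sample file only reproduces the relevant lines of the question's text:

```shell
# Hypothetical sample file with the relevant lines of the question's text.
sample=$(mktemp)
printf '%s\n' '*****' '1' \
  'Something here and else, 2000 edf, 60 pop' \
  '    Usd324.32           2 Usd534.22' '2' \
  '21st New tetx that will like to select with pattern, 334 pop' \
  '    Usd162.14' > "$sample"

join_record_awk() {
  awk '
    # Third line of the record: strip Usd from the first field and print all parts.
    step == 2 { sub(/^Usd/, "", $1); print first, second, $1; step = 0; next }
    # Second line of the record: keep it verbatim.
    step == 1 { second = $0; step = 2; next }
    # First line of the record: a line that is exactly "1".
    $0 == "1" { first = $0; step = 1 }
  ' "$1"
}

join_record_awk "$sample"
# prints: 1 Something here and else, 2000 edf, 60 pop 324.32

rm -f "$sample"
```

The rules are listed in reverse order so that a line is consumed by at most one of them. Unlike `$0 == 1`, the string comparison `$0 == "1"` does not match a line such as ` 1 ` or `1.0`.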

[Comments]:

@AlexPixel: Did it work?

[Answer 2]:

Pure bash:

#! /bin/bash

exec <<EOF
*** *
more text with spaces and  tabs                                                             
*****
1
Something here and else, 2000 edf, 60 pop
    Usd324.32           2 Usd534.22
2
21st New tetx that will like to select with pattern, 334 pop
    Usd162.14

*** *
more text with spaces and tabs, unicode
*****
EOF

while read -r line1; do
  if [[ $line1 =~ ^1$ ]]; then
    read -r line2
    read -r line3col1 dontcare
    printf '%s %s %s\n' "$line1" "$line2" "${line3col1#Usd}"
  fi
done
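
Since the question assumes the file is passed as an argument, the same loop can be wrapped in a function that reads from a file name instead of a here-document. This is only a sketch: join_record_sh and its variable names are illustrative, the sample file is made up from the question's text, and the loop stops at the first line that is exactly 1:

```shell
join_record_sh() {
  # $1 is the path of the file to scan.
  while IFS= read -r line1; do
    if [ "$line1" = 1 ]; then
      IFS= read -r line2      # second line, kept verbatim
      read -r amount rest     # first field of the third line; the rest is ignored
      printf '%s %s %s\n' "$line1" "$line2" "${amount#Usd}"
      return 0
    fi
  done < "$1"
  return 1                    # no line equal to 1 was found
}

# Hypothetical sample file with the relevant lines of the question's text.
sample=$(mktemp)
printf '%s\n' '*****' '1' \
  'Something here and else, 2000 edf, 60 pop' \
  '    Usd324.32           2 Usd534.22' '2' > "$sample"

join_record_sh "$sample"
# prints: 1 Something here and else, 2000 edf, 60 pop 324.32

rm -f "$sample"
```

The sketch sticks to constructs that bash shares with other POSIX shells, so its variables are global; in a bash-only script they could be declared with local.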

[Comments]:
