Replacing only the matched group in a file with multiple occurrences

Posted 2023-03-29

Title: Replacing only the matched group in a file with multiple occurrences (asked 2021-01-10 18:58:53)

Question:

Input: /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */

Output: /* ABCD X 1111 # [[reason for comment]] */

Regex used: regex = (?:[\/*]+\sPRQA[\s\w\,]*)(\*\/\s*\/\*\Comment[\w\,]+:)+(?:\s\[\[.*\/$)

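How can I use the regex above to replace the matched group with '#' when it occurs multiple times in a file?

I tried re.sub(regex, '#\1', file.read(), re.MULTILINE), but this prepends # to the matched group instead of replacing it.

Is there a direct way to do this, rather than iterating line by line and replacing?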

Answer 1:

You can use

re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)', r'\1#\2', file.read())

If you are sure these substrings only occur at the end of a line, add your $ anchor and pass flags=re.M:

re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$', r'\1#\2', file.read(), flags=re.M)
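Note that the flag is passed by keyword here. The fourth positional parameter of re.sub is count, not flags, so a call such as re.sub(regex, repl, text, re.MULTILINE) limits the number of replacements instead of enabling multiline mode. The sketch below illustrates the difference on a made-up ten-line string; the sample text and variable names are assumptions for illustration only.

```python
import re

# Hypothetical sample: ten lines, each ending in a digit.
text = "x1\n" * 10

# re.MULTILINE has the integer value 8. Passing it as the fourth
# positional argument is equivalent to count=8, so only the first
# eight matches are replaced and no flag is set.
as_count = re.sub(r"\d", "#", text, count=int(re.MULTILINE))

# Passing it by keyword enables multiline mode: $ matches at the end
# of every line and all occurrences are replaced.
as_flag = re.sub(r"\d$", "#", text, flags=re.MULTILINE)

print(as_count.count("#"))  # 8
print(as_flag.count("#"))   # 10
```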

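Pattern details:

(/\*\s*ABCD[^*/]*) - Group 1 (\1): /*, zero or more whitespace characters, ABCD, then zero or more characters other than * and /

\*/\s*/\*\s*Comment[^*:]+: - */, zero or more whitespace characters, /*, zero or more whitespace characters, Comment, one or more characters other than * and :, then a :

(\s*\[\[[^][]*]]\s*\*/) - Group 2 (\2): zero or more whitespace characters, [[, zero or more characters other than [ and ], then ]], zero or more whitespace characters, */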
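The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores unescaped whitespace in the pattern and allows # comments. This is only a sketch of one way to lay it out; the names RX and sample are chosen here for illustration and are not part of the original answer.

```python
import re

# Same pattern as above, split into its three parts with re.VERBOSE.
RX = re.compile(r"""
    (/\*\s*ABCD[^*/]*)           # group 1: /*, optional whitespace, ABCD, chars other than * and /
    \*/\s*/\*\s*Comment[^*:]+:   # */, /*, Comment, chars other than * and :, then a colon (dropped)
    (\s*\[\[[^][]*]]\s*\*/)      # group 2: [[ ... ]] and the closing */
""", re.VERBOSE)

sample = "/* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */"

# Keep both groups and put # where the unmatched middle part used to be.
print(RX.sub(r"\1#\2", sample))
```

Only the text between the two groups is discarded, which is why the replacement string is \1#\2 rather than #\1.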

Python demo:

import re
rx = r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$'
text = "Some text ... /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\nMore text here... Some text ... /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */"
print( re.sub(rx, r'\1#\2', text, flags=re.M) )

Output:

Some text ... /* ABCD X 1111 # [[reason for comment]] */
More text here... Some text ... /* ABCD XD 1222 # [[reason for comment 2]] */
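Since the question asks about a file rather than a string, here is a minimal sketch of applying the same substitution to a whole file in one pass, without iterating line by line. The helper name merge_comments, the file name demo.c, and the UTF-8 encoding are assumptions made for this example; re.subn is used in place of re.sub only so that the number of replacements is available.

```python
import re
import tempfile
from pathlib import Path

RX = re.compile(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)')

def merge_comments(path):
    """Rewrite the file at `path` in place and return the number of replacements."""
    source = Path(path)
    text = source.read_text(encoding="utf-8")   # read the whole file at once
    new_text, n = RX.subn(r'\1#\2', text)       # replace every occurrence
    if n:                                       # write back only if something changed
        source.write_text(new_text, encoding="utf-8")
    return n

# Demonstration on a temporary file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.c"
    demo.write_text(
        "int a; /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\n"
        "int b; /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */\n",
        encoding="utf-8",
    )
    replaced = merge_comments(demo)
    result = demo.read_text(encoding="utf-8")

print(replaced)
print(result)
```

Reading the whole file into memory is fine for typical source files; for very large inputs a streaming approach may be preferable.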
