Regular Expression Groups in C# (Repost)

Posted 2020-07-28 Jacklovely


http://www.cnblogs.com/kiant71/archive/2010/08/14/1799799.html

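When a single regular expression needs to extract several distinct parts (subexpressions) of a match, use capturing groups.

In the C# regular expression API, the types nest as shown below. Group is the class that represents one capturing group.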

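Regex –> MatchCollection (the set of all matches)

          –> Match (a single match)

                –> GroupCollection (the set of groups/subexpressions inside one match)

                      –> Group (the content of one group/subexpression)

                            –> CaptureCollection (every substring that the group captured during the match)

                                  –> Capture (a single captured substring)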
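
To make the last two levels of this hierarchy concrete, here is a minimal console sketch. The class name CaptureDemo and the method CapturesOfGroup1 are illustrative and do not come from the original article. It relies on the fact that a group placed inside a repetition can capture more than once in a single match: Group.Value holds only the last capture, while Group.Captures keeps all of them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns every substring captured by group 1
    // of the first match, or an empty list when there is no match.
    public static List<string> CapturesOfGroup1(string input, string pattern)
    {
        var result = new List<string>();
        Match match = Regex.Match(input, pattern);       // Regex -> Match
        if (!match.Success)
        {
            return result;
        }

        Group group = match.Groups[1];                    // Match -> GroupCollection -> Group
        foreach (Capture capture in group.Captures)       // Group -> CaptureCollection -> Capture
        {
            result.Add(capture.Value);
        }
        return result;
    }

    public static void Main()
    {
        // The group (\d+,?) is repeated by +, so it captures three times in one match.
        List<string> captures = CapturesOfGroup1("10,20,30", @"^(\d+,?)+$");
        Console.WriteLine(string.Join(" | ", captures));  // 10, | 20, | 30
    }
}
```

For groups that are not repeated, Captures contains a single item whose value equals Group.Value, which is why most code never needs to go below the Group level.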

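There are two ways to access a group:

1. Access by index

The regular expression ((\d+)([a-z]))\s+ contains four groups in total. They are numbered from left to right, in the order in which their opening parentheses appear:

Groups[0]    is the match itself, that is, the whole expression ((\d+)([a-z]))\s+

Groups[1]    is the subexpression ((\d+)([a-z]))

Groups[2]    is the subexpression (\d+)

Groups[3]    is the subexpression ([a-z])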

// Requires: using System.Text.RegularExpressions; (runs inside an ASP.NET page, hence Response.Write)
string text = "1A 2B 3C 4D 5E 6F 7G 8H 9I 10J 11Q 12J 13K 14L 15M 16N ffee80 #800080";
Response.Write(text + "<br/>");
 
// Verbatim string: single backslashes, so \d and \s reach the regex engine unchanged
string strPatten = @"((\d+)([a-z]))\s+";
Regex rex = new Regex(strPatten, RegexOptions.IgnoreCase);
MatchCollection matches = rex.Matches(text);
 
// Walk through every match
foreach (Match match in matches)
{
    GroupCollection groups = match.Groups;
    Response.Write(string.Format("<br/>{0} has {1} groups: {2}<br/>"
                                , match.Value, groups.Count, strPatten));
 
    // Walk through the groups inside this match
    for (int i = 0; i < groups.Count; i++)
    {
        Response.Write(
            string.Format("Group {0} is {1}, index {2}, length {3}<br/>"
                        , i
                        , groups[i].Value
                        , groups[i].Index
                        , groups[i].Length));
    }
}
 
/* 
 * Output:
 1A 2B 3C 4D 5E 6F 7G 8H 9I 10J 11Q 12J 13K 14L 15M 16N ffee80 #800080
 
1A has 4 groups: ((\d+)([a-z]))\s+
Group 0 is 1A , index 0, length 3
Group 1 is 1A, index 0, length 2
Group 2 is 1, index 0, length 1
Group 3 is A, index 1, length 1
  
 ....
  
 */
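
The snippet above depends on the ASP.NET Response object. For readers who want to try the same index-based access outside a web page, the following is a minimal console sketch under that assumption. The class IndexedGroupsDemo and the method DescribeFirstMatch are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IndexedGroupsDemo
{
    // Hypothetical helper: describes every group of the first match,
    // one line per group, in index order.
    public static List<string> DescribeFirstMatch(string input)
    {
        var lines = new List<string>();
        var regex = new Regex(@"((\d+)([a-z]))\s+", RegexOptions.IgnoreCase);
        Match match = regex.Match(input);
        if (!match.Success)
        {
            return lines;
        }

        for (int i = 0; i < match.Groups.Count; i++)
        {
            Group g = match.Groups[i];
            lines.Add(string.Format("Group {0}: '{1}', index {2}, length {3}",
                                    i, g.Value, g.Index, g.Length));
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in DescribeFirstMatch("10J 11Q"))
        {
            Console.WriteLine(line);
        }
        // Group 0: '10J ', index 0, length 4
        // Group 1: '10J', index 0, length 3
        // Group 2: '10', index 0, length 2
        // Group 3: 'J', index 2, length 1
    }
}
```

Note that group 0 includes the trailing whitespace consumed by \s+, while group 1 does not, because the \s+ sits outside the outer parentheses.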

2. Access by name

Use (?<xxx>subexpression) to give a group a name. The content of that group/subexpression can then be read with Groups["xxx"].

// Requires: using System.Text.RegularExpressions;
string text = "I've found this amazing URL at http://www.sohu.com, and then find ftp://ftp.sohu.comisbetter.";
Response.Write(text + "<br/>");
 
// Verbatim string: single backslashes, so \b and \S reach the regex engine unchanged
string pattern = @"\b(?<protocol>\S+)://(?<address>\S+)\b";
// HTML-escape the angle brackets so the pattern is displayed rather than parsed as a tag
Response.Write(pattern.Replace("<", "&lt;").Replace(">","&gt;") + "<br/><br/>");
 
MatchCollection matches = Regex.Matches(text, pattern);
foreach (Match match in matches)
{
    GroupCollection groups = match.Groups;
    Response.Write(string.Format(
                    "URL: {0}; Protocol: {1}; Address: {2} <br/>"
                    , match.Value
                    , groups["protocol"].Value 
                    , groups["address"].Value));
}
 
/* 
 * Output:
 I've found this amazing URL at http://www.sohu.com, and then find ftp://ftp.sohu.comisbetter.
    \b(?<protocol>\S+)://(?<address>\S+)\b
 
    URL: http://www.sohu.com; Protocol: http; Address: www.sohu.com 
    URL: ftp://ftp.sohu.comisbetter; Protocol: ftp; Address: ftp.sohu.comisbetter 
 
 */
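
Named groups still receive a number, so both access styles can be mixed. The console sketch below illustrates this with the same URL pattern. The class NamedGroupsDemo and the method TryParseUrl are hypothetical names introduced for illustration. Regex.GroupNumberFromName is part of the .NET API, and asking GroupCollection for a name that does not exist returns a group whose Success is false rather than throwing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupsDemo
{
    public static readonly Regex UrlRegex =
        new Regex(@"\b(?<protocol>\S+)://(?<address>\S+)\b");

    // Hypothetical helper: extracts protocol and address from the first URL found.
    public static bool TryParseUrl(string input, out string protocol, out string address)
    {
        protocol = string.Empty;
        address = string.Empty;

        Match match = UrlRegex.Match(input);
        if (!match.Success)
        {
            return false;
        }

        protocol = match.Groups["protocol"].Value;   // access by name
        address = match.Groups[2].Value;             // same data as Groups["address"], by number
        return true;
    }

    public static void Main()
    {
        string protocol, address;
        if (TryParseUrl("see http://www.sohu.com, ok", out protocol, out address))
        {
            Console.WriteLine(protocol + " / " + address);   // http / www.sohu.com
        }

        // Named groups are numbered left to right, after any unnamed groups.
        Console.WriteLine(UrlRegex.GroupNumberFromName("protocol"));   // 1
        Console.WriteLine(UrlRegex.GroupNumberFromName("address"));    // 2

        // A name that is not defined in the pattern yields an unsuccessful, empty group.
        Match m = UrlRegex.Match("ftp://example");
        Console.WriteLine(m.Groups["port"].Success);                   // False
    }
}
```

Access by name is usually the safer choice in longer patterns, because adding another pair of parentheses shifts the numbers of the groups after it but leaves the names untouched.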

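Content adapted from:

C# Regular Expression Programming (3): Using the Match and Group Classes http://blog.csdn.net/zhoufoxcn/archive/2010/03/09/5358644.aspx

Understanding the Match and Group Classes in C# Regular Expressions http://tech.ddvip.com/2008-10/122483707982616.html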
