How to count sub-string occurrences? [duplicate]

Posted: 2013-03-12 17:31:14

【Question】:

Suppose I have a string like this:

MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";

I want to know how many times the substring "OU=" occurs in this string. For a single character, maybe something like this would do:

int count = MyString.Split("OU=").Length - 1;

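But Split only seems to work with a char, not with a string.

Also, how do I find the position of the n-th occurrence? For example, the position of the second "OU=" in the string?

How can I solve this?

【Comments】:

String.Split has several overloads that allow you to split by string. See msdn.microsoft.com/en-us/library/tabh47cf.aspx

Split does not work only on chars... You can accomplish this using multiple delimiters. I would be more than happy to post a simple coding example for your future use.

And Split does work on a string. Remember that Split() returns an array, so in the case of string it would return string[]; you would need to create new string[] { "somestring", "someotherString" } etc.

【Answer 1】: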
Regex.Matches(input, "OU=").Count
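
As a hedged sketch of how the one-liner above could be wrapped for general use: the pattern argument is a regular expression, so a literal substring should be passed through Regex.Escape first, and the same MatchCollection also gives the position of the n-th occurrence. The class and method names (RegexCountSketch, CountLiteral, IndexOfNth) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCountSketch
{
    // Counts non-overlapping occurrences of a literal substring.
    // Regex.Escape makes characters such as '.' match literally.
    public static int CountLiteral(string input, string subString)
    {
        return Regex.Matches(input, Regex.Escape(subString)).Count;
    }

    // Returns the index of the n-th (1-based) occurrence,
    // or -1 if there are fewer than n occurrences.
    public static int IndexOfNth(string input, string subString, int n)
    {
        MatchCollection matches = Regex.Matches(input, Regex.Escape(subString));
        return (n >= 1 && n <= matches.Count) ? matches[n - 1].Index : -1;
    }

    public static void Main()
    {
        string dn = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
        Console.WriteLine(CountLiteral(dn, "OU="));              // 3
        Console.WriteLine(IndexOfNth(dn, "OU=", 2));             // 10
        Console.WriteLine(Regex.Matches("a.s,abs", ".s").Count); // 2 (unescaped '.' matches any char)
        Console.WriteLine(CountLiteral("a.s,abs", ".s"));        // 1
    }
}
```

Like the one-liner, this counts non-overlapping matches only.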

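【Comments】:

Thanks. Also, how do I find the position of the n-th occurrence? For example, the position of the second "OU=" in the string?

I poked around and found this question, which should help you: ***.com/questions/767767/…

In general, what if the substring to match contains regex characters? Regex.Matches(input, Regex.Escape(subString)).Count works for all possible substrings.

In general it is wise to use Regex.Escape(...), so the abstract version is: Regex.Matches(input, Regex.Escape("string_which_isnt_a_regex")).Count

Take the string aaaaaa and the substring aa as an example. This answer yields a count of 3, whereas the actual count of the substring aa is 5.

【Answer 2】:

You can use IndexOf to find all occurrences and their positions: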

string MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
string stringToFind = "OU=";

List<int> positions = new List<int>();
int pos = 0;
while ((pos < MyString.Length) && (pos = MyString.IndexOf(stringToFind, pos)) != -1)
{
    positions.Add(pos);
    pos += stringToFind.Length; // Length is a property, not a method
}

Console.WriteLine("{0} occurrences", positions.Count);
foreach (var p in positions)
{
    Console.WriteLine(p);
}


You can get the same result from a regular expression:

var matches = Regex.Matches(MyString, "OU=");
Console.WriteLine("{0} occurrences", matches.Count);
foreach (Match m in matches) // explicit Match type: with var, older frameworks infer object here
{
    Console.WriteLine(m.Index);
}

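Main differences:

- The Regex code is shorter.
- The Regex code allocates a collection and multiple strings.
- The IndexOf code could be written to output each position immediately, without creating a collection.
- The Regex code may be faster in isolation, but if it is used many times, the combined overhead of the string allocations could put a higher load on the garbage collector.

If I were writing this inline, as something that isn't used often, I would probably go with the regex solution. If I were putting it into a library as something used a lot, I would probably go with the IndexOf solution.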
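
The point about emitting positions without building a collection could be sketched with an iterator, roughly as below. This is only an illustration: the names IndexOfSketch and PositionsOf are hypothetical, and the use of StringComparison.Ordinal is an assumption (the loop above uses the default, culture-sensitive IndexOf).

```csharp
using System;
using System.Collections.Generic;

public static class IndexOfSketch
{
    // Lazily yields the start index of each non-overlapping occurrence,
    // so no list is allocated unless the caller builds one.
    public static IEnumerable<int> PositionsOf(string text, string value)
    {
        // An empty needle matches everywhere and would never advance.
        if (string.IsNullOrEmpty(value)) yield break;

        int pos = 0;
        while ((pos = text.IndexOf(value, pos, StringComparison.Ordinal)) != -1)
        {
            yield return pos;
            pos += value.Length; // skip past this match
        }
    }

    public static void Main()
    {
        string dn = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
        foreach (int p in PositionsOf(dn, "OU="))
        {
            Console.WriteLine(p); // 0, then 10, then 20
        }
    }
}
```

A caller that only needs the second occurrence can stop enumerating after two items, so the rest of the string is never scanned.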

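【Comments】:

Small typo? List<int> positions = new List<int>[]; should be List<int> positions = new List<int>();

I believe this is the better solution. If the string you are trying to match is something like .s, the regex will return an incorrect number. The while loop Jim proposed counts the number of .s correctly.

【Answer 3】:

(Clippy mode: ON)

It looks like you are parsing an LDAP query!

Would you like to parse it:

- Manually? Go to "SplittingAndParsing"
- Automatically via Win32 calls? Go to "Using Win32 via PInvoke"

(Clippy mode: OFF)

"SplittingAndParsing":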

var MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
var chunksAsKvps = MyString
    .Split(',')
    .Select(chunk =>
        {
            var bits = chunk.Split('=');
            return new KeyValuePair<string,string>(bits[0], bits[1]);
        });

var allOUs = chunksAsKvps
    .Where(kvp => kvp.Key.Equals("OU", StringComparison.OrdinalIgnoreCase));

"Using Win32 via PInvoke":

Usage:

var parsedDn = Win32LDAP.ParseDN(str);    
var allOUs2 = parsedDn
    .Where(dn => dn.Key.Equals("OU", StringComparison.OrdinalIgnoreCase));

Utility code:

using System;
using System.Collections.Generic;
using System.ComponentModel;           // Win32Exception
using System.Runtime.InteropServices;  // DllImport, Marshal
using System.Text;                     // UnicodeEncoding

// I don't remember where I got this from, honestly...I *think* it came
// from another SO user long ago, but those details I've lost to history...
public class Win32LDAP
{
   #region Constants
   public const int ERROR_SUCCESS = 0;
   public const int ERROR_BUFFER_OVERFLOW = 111;
   #endregion Constants

   #region DN Parsing
   [DllImport("ntdsapi.dll", CharSet = CharSet.Unicode)]
   protected static extern int DsGetRdnW(
       ref IntPtr ppDN, 
       ref int pcDN, 
       out IntPtr ppKey, 
       out int pcKey, 
       out IntPtr ppVal, 
       out int pcVal
   );

   public static KeyValuePair<string, string> GetName(string distinguishedName)
   {
       IntPtr pDistinguishedName = Marshal.StringToHGlobalUni(distinguishedName);
       try
       {
           IntPtr pDN = pDistinguishedName, pKey, pVal;
           int cDN = distinguishedName.Length, cKey, cVal;

           int lastError = DsGetRdnW(ref pDN, ref cDN, out pKey, out cKey, out pVal, out cVal);

           if(lastError == ERROR_SUCCESS)
           {
               string key, value;

               if(cKey < 1)
               {
                   key = string.Empty;
               }
               else
               {
                   key = Marshal.PtrToStringUni(pKey, cKey);
               }

               if(cVal < 1)
               {
                   value = string.Empty;
               }
               else
               {
                   value = Marshal.PtrToStringUni(pVal, cVal);
               }

               return new KeyValuePair<string, string>(key, value);
           }
           else
           {
               throw new Win32Exception(lastError);
           }
       }
       finally
       {
           Marshal.FreeHGlobal(pDistinguishedName);
       }
   }

   public static IEnumerable<KeyValuePair<string, string>> ParseDN(string distinguishedName)
   {
       List<KeyValuePair<string, string>> components = new List<KeyValuePair<string, string>>();
       IntPtr pDistinguishedName = Marshal.StringToHGlobalUni(distinguishedName);
       try
       {
           IntPtr pDN = pDistinguishedName, pKey, pVal;
           int cDN = distinguishedName.Length, cKey, cVal;

           do
           {
               int lastError = DsGetRdnW(ref pDN, ref cDN, out pKey, out cKey, out pVal, out cVal);

               if(lastError == ERROR_SUCCESS)
               {
                   string key, value;

                   if(cKey < 0)
                   {
                       key = null;
                   }
                   else if(cKey == 0)
                   {
                       key = string.Empty;
                   }
                   else
                   {
                       key = Marshal.PtrToStringUni(pKey, cKey);
                   }

                   if(cVal < 0)
                   {
                       value = null;
                   }
                   else if(cVal == 0)
                   {
                       value = string.Empty;
                   }
                   else
                   {
                       value = Marshal.PtrToStringUni(pVal, cVal);
                   }

                   components.Add(new KeyValuePair<string, string>(key, value));

                   pDN = (IntPtr)(pDN.ToInt64() + UnicodeEncoding.CharSize); //skip over comma
                   cDN--;
               }
               else
               {
                   throw new Win32Exception(lastError);
               }
           } while(cDN > 0);

           return components;
       }
       finally
       {
           Marshal.FreeHGlobal(pDistinguishedName);
       }
   }

   [DllImport("ntdsapi.dll", CharSet = CharSet.Unicode)]
   protected static extern int DsQuoteRdnValueW(
       int cUnquotedRdnValueLength,
       string psUnquotedRdnValue,
       ref int pcQuotedRdnValueLength,
       IntPtr psQuotedRdnValue
   );

   public static string QuoteRDN(string rdn)
   {
       if (rdn == null) return null;

       int initialLength = rdn.Length;
       int quotedLength = 0;
       IntPtr pQuotedRDN = IntPtr.Zero;

       // First call with a zero-length buffer to learn the required size.
       int lastError = DsQuoteRdnValueW(initialLength, rdn, ref quotedLength, pQuotedRDN);

       switch (lastError)
       {
           case ERROR_SUCCESS:
               {
                   return string.Empty;
               }
           case ERROR_BUFFER_OVERFLOW:
               {
                   break; //continue
               }
           default:
               {
                   throw new Win32Exception(lastError);
               }
       }

       pQuotedRDN = Marshal.AllocHGlobal(quotedLength * UnicodeEncoding.CharSize);

       try
       {
           lastError = DsQuoteRdnValueW(initialLength, rdn, ref quotedLength, pQuotedRDN);

           switch(lastError)
           {
               case ERROR_SUCCESS:
                   {
                       return Marshal.PtrToStringUni(pQuotedRDN, quotedLength);
                   }
               default:
                   {
                       throw new Win32Exception(lastError);
                   }
           }
       }
       finally
       {
           if(pQuotedRDN != IntPtr.Zero)
           {
               Marshal.FreeHGlobal(pQuotedRDN);
           }
       }
   }


   [DllImport("ntdsapi.dll", CharSet = CharSet.Unicode)]
   protected static extern int DsUnquoteRdnValueW(
       int cQuotedRdnValueLength,
       string psQuotedRdnValue,
       ref int pcUnquotedRdnValueLength,
       IntPtr psUnquotedRdnValue
   );

   public static string UnquoteRDN(string rdn)
   {
       if (rdn == null) return null;

       int initialLength = rdn.Length;
       int unquotedLength = 0;
       IntPtr pUnquotedRDN = IntPtr.Zero;

       // First call with a zero-length buffer to learn the required size.
       int lastError = DsUnquoteRdnValueW(initialLength, rdn, ref unquotedLength, pUnquotedRDN);

       switch (lastError)
       {
           case ERROR_SUCCESS:
               {
                   return string.Empty;
               }
           case ERROR_BUFFER_OVERFLOW:
               {
                   break; //continue
               }
           default:
               {
                   throw new Win32Exception(lastError);
               }
       }

       pUnquotedRDN = Marshal.AllocHGlobal(unquotedLength * UnicodeEncoding.CharSize);

       try
       {
           lastError = DsUnquoteRdnValueW(initialLength, rdn, ref unquotedLength, pUnquotedRDN);

           switch(lastError)
           {
               case ERROR_SUCCESS:
                   {
                       return Marshal.PtrToStringUni(pUnquotedRDN, unquotedLength);
                   }
               default:
                   {
                       throw new Win32Exception(lastError);
                   }
           }
       }
       finally
       {
           if(pUnquotedRDN != IntPtr.Zero)
           {
               Marshal.FreeHGlobal(pUnquotedRDN);
           }
       }
   }
   #endregion DN Parsing
}

public class DNComponent
{
   public string Type { get; protected set; }
   public string EscapedValue { get; protected set; }
   public string UnescapedValue { get; protected set; }
   public string WholeComponent { get; protected set; }

   public DNComponent(string component, bool isEscaped)
   {
       string[] tokens = component.Split(new char[] { '=' }, 2);
       setup(tokens[0], tokens[1], isEscaped);
   }

   public DNComponent(string key, string value, bool isEscaped)
   {
       setup(key, value, isEscaped);
   }

   private void setup(string key, string value, bool isEscaped)
   {
       Type = key;

       if(isEscaped)
       {
           EscapedValue = value;
           UnescapedValue = Win32LDAP.UnquoteRDN(value);
       }
       else
       {
           EscapedValue = Win32LDAP.QuoteRDN(value);
           UnescapedValue = value;
       }

       WholeComponent = Type + "=" + EscapedValue;
   }

   public override bool Equals(object obj)
   {
       if (obj is DNComponent)
       {
           DNComponent dnObj = (DNComponent)obj;
           return dnObj.WholeComponent.Equals(this.WholeComponent, StringComparison.CurrentCultureIgnoreCase);
       }
       return base.Equals(obj);
   }

   public override int GetHashCode()
   {
       return WholeComponent.GetHashCode();
   }
}

public class DistinguishedName
{
   public DNComponent[] Components
   {
       get
       {
           return components.ToArray();
       }
   }

   private List<DNComponent> components;
   private string cachedDN;

   public DistinguishedName(string distinguishedName)
   {
       cachedDN = distinguishedName;
       components = new List<DNComponent>();
       foreach (KeyValuePair<string, string> kvp in Win32LDAP.ParseDN(distinguishedName))
       {
           components.Add(new DNComponent(kvp.Key, kvp.Value, true));
       }
   }

   public DistinguishedName(IEnumerable<DNComponent> dnComponents)
   {
       components = new List<DNComponent>(dnComponents);
       cachedDN = GetWholePath(",");
   }

   public bool Contains(DNComponent dnComponent)
   {
       return components.Contains(dnComponent);
   }

   public string GetDNSDomainName()
   {
       List<string> dcs = new List<string>();
       foreach (DNComponent dnc in components)
       {
           if(dnc.Type.Equals("DC", StringComparison.CurrentCultureIgnoreCase))
           {
               dcs.Add(dnc.UnescapedValue);
           }
       }
       return string.Join(".", dcs.ToArray());
   }

   public string GetDomainDN()
   {
       List<string> dcs = new List<string>();
       foreach (DNComponent dnc in components)
       {
           if(dnc.Type.Equals("DC", StringComparison.CurrentCultureIgnoreCase))
           {
               dcs.Add(dnc.WholeComponent);
           }
       }
       return string.Join(",", dcs.ToArray());
   }

   public string GetWholePath()
   {
       return GetWholePath(",");
   }

   public string GetWholePath(string separator)
   {
       List<string> parts = new List<string>();
       foreach (DNComponent component in components)
       {
           parts.Add(component.WholeComponent);
       }
       return string.Join(separator, parts.ToArray());
   }

   public DistinguishedName GetParent()
   {
       if(components.Count == 1)
       {
           return null;
       }
       List<DNComponent> tempList = new List<DNComponent>(components);
       tempList.RemoveAt(0);
       return new DistinguishedName(tempList);
   }

   public override bool Equals(object obj)
   {
       if(obj is DistinguishedName)
       {
           DistinguishedName objDN = (DistinguishedName)obj;
           if (this.Components.Length == objDN.Components.Length)
           {
               for (int i = 0; i < this.Components.Length; i++)
               {
                   if (!this.Components[i].Equals(objDN.Components[i]))
                   {
                       return false;
                   }
               }
               return true;
           }
           return false;
       }
       return base.Equals(obj);
   }

   public override int GetHashCode()
   {
       return cachedDN.GetHashCode();
   }
}

【Comments】:

【Answer 4】:

This extension method needs fewer resources than a regular expression.

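// Extension methods must be declared inside a static, non-generic class.
public static int CountSubstring(this string text, string value)
{
    // Guard: an empty value matches at every index, so the loop below would never advance.
    if (string.IsNullOrEmpty(value)) return 0;

    int count = 0, minIndex = text.IndexOf(value, 0);
    while (minIndex != -1)
    {
        minIndex = text.IndexOf(value, minIndex + value.Length);
        count++;
    }
    return count;
}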

Usage:

MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
int count = MyString.CountSubstring("OU=");
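
The extension above skips past each match, so it counts non-overlapping occurrences: for "aaaaaa" and "aa" it yields 3. If overlapping occurrences should count as well (5 in that example), one possible variant advances by a single character instead. This is a sketch under assumptions: the names OverlapSketch and CountOverlapping are hypothetical, and ordinal comparison is chosen here for predictability.

```csharp
using System;

public static class OverlapSketch
{
    // Counts occurrences of value in text, including overlapping ones.
    public static int CountOverlapping(string text, string value)
    {
        if (string.IsNullOrEmpty(value)) return 0; // nothing sensible to count

        int count = 0;
        int index = text.IndexOf(value, StringComparison.Ordinal);
        while (index != -1)
        {
            count++;
            // Resume one character after the start of the previous match,
            // not after its end, so overlapping matches are found too.
            index = text.IndexOf(value, index + 1, StringComparison.Ordinal);
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountOverlapping("aaaaaa", "aa")); // 5
        Console.WriteLine(CountOverlapping(
            "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com", "OU=")); // 3
    }
}
```

For a needle like "OU=" that cannot overlap with itself, both variants return the same count.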

【Comments】:

【Answer 5】:

The following should work:

  MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
  int count = Regex.Matches(MyString, "OU=").Count;

【Comments】:

【Answer 6】:
int count = myString.Split(new[] { ',' })
                    .Count(item => item.StartsWith(
                        "OU=", StringComparison.OrdinalIgnoreCase));

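【Comments】:

Not to toot my own horn, but this depends on the comma separation. Sure, it works for this very specific scenario, but my regex solution is simpler and more dynamic.

Yes, I agree, I was just offering an alternative.

Of course, and I appreciate it. +1

【Answer 7】:

Here are two examples of how to get the result you want: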

var MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";

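With this one you get the values broken out into a list, but it still includes the DC entries; it is only meant to show that Split with a string does work.

var split = MyString.Split(new string[] { "OU=", "," }, StringSplitOptions.RemoveEmptyEntries);

This one splits and returns only the 3 matching items in a list, so that even without relying on a count you can visually verify that it returns the 3 levels of OU=.

var lstSplit = MyString.Split(new[] { ',' })
        .Where(splitItem => splitItem.StartsWith(
               "OU=", StringComparison.OrdinalIgnoreCase)).ToList();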

【Comments】:

【Answer 8】:
public static int CountOccurences(string needle, string haystack)
{
    return (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length;
}

I benchmarked it against the other answers here (the regex one and the IndexOf one), and it ran faster.
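
A hedged sketch of the same Replace-based idea with an input guard added. string.Replace throws an ArgumentException when the old value is empty, so the guard returns 0 in that case instead. The names ReplaceCountSketch and CountOccurrencesSafe are hypothetical, and no performance claim is made for this variant.

```csharp
using System;

public static class ReplaceCountSketch
{
    // Counts non-overlapping occurrences by measuring how much shorter
    // the string becomes once every occurrence of needle is removed.
    public static int CountOccurrencesSafe(string needle, string haystack)
    {
        if (string.IsNullOrEmpty(needle) || string.IsNullOrEmpty(haystack))
        {
            return 0; // avoids the ArgumentException from Replace on an empty needle
        }
        int removedChars = haystack.Length - haystack.Replace(needle, "").Length;
        return removedChars / needle.Length;
    }

    public static void Main()
    {
        string dn = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
        Console.WriteLine(CountOccurrencesSafe("OU=", dn));      // 3
        Console.WriteLine(CountOccurrencesSafe("aa", "aaaaaa")); // 3 (non-overlapping)
        Console.WriteLine(CountOccurrencesSafe("", dn));         // 0
    }
}
```

Note that Replace builds a complete copy of the haystack just to measure its length, and that this approach yields a count but no positions.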

【Comments】:

That is just a copy of ***.com/questions/541954/… At least add a reference.
