如何找到包含两个唯一重复字符的最长子字符串

Posted

技术标签:

【中文标题】如何找到包含两个唯一重复字符的最长子字符串【英文标题】:How to find the longest substring containing two unique repeating characters 【发布时间】:2013-02-06 11:34:04 【问题描述】:

任务是在给定字符串中找到由任意两个唯一重复字符组成的最长子字符串 前任。在输入字符串“aabadefghaabbaagad”中,最长的此类字符串是“aabbaa”

我想出了以下解决方案,但想看看是否有更有效的方法来做同样的事情

import java.util.*; 

public class SubString 
    public static void main(String[] args)  
    //String inStr="defghgadaaaaabaababbbbbbd";
    String inStr="aabadefghaabbaagad";
    //String inStr="aaaaaaaaaaaaaaaaaaaa";
    System.out.println("Input string is         "+inStr);

    StringBuilder sb = new StringBuilder(inStr.length());
    String subStr="";
    String interStr="";
    String maxStr="";
    int start=0,length=0, maxStart=0, maxlength=0, temp=0;

    while(start+2<inStr.length())   
        int i=0;
         temp=start;
         char x = inStr.charAt(start);
         char y = inStr.charAt(start+1);
         sb.append(x);
         sb.append(y);

         while( (x==y) && (start+2<inStr.length()) )
             start++;
              y = inStr.charAt(start+1);
              sb.append(y);
         

         subStr=inStr.substring(start+2);
         while(i<subStr.length())
             if(subStr.charAt(i)==x || subStr.charAt(i)==y )
                  sb.append(subStr.charAt(i));
                   i++;
              
              else
                   break;
         

         interStr= sb.toString();
         System.out.println("Intermediate string "+ interStr);
         length=interStr.length();
         if(maxlength<length)
             maxlength=length;
              length=0;
              maxStr = new String(interStr);
              maxStart=temp;
         

         start++;
         sb.setLength(0);
   

   System.out.println("");
   System.out.println("Longest string is "+maxStr.length()+" chars long "+maxStr);  


【问题讨论】:

你试过正则表达式吗? 使用HashMap就可以做到。 Findthemhere @Java_Alert 我该怎么做?你能解释一下吗? @Jayamohan 据我所知,这不是一个常见的子问题 【参考方案1】:

这里有一个提示,可能会引导您使用线性时间算法(我假设这是家庭作业,所以我不会给出完整的解决方案):在您找到一个既不等于 @ 的字符时987654321@也不是y,不用一路回到start + 1重新开始搜索。让我们使用字符串aabaaddaa。在您看到aabaa 并且下一个字符是d 的地方,在索引1 或2 处重新开始搜索是没有意义的,因为在这些情况下,您只会得到abaabaa在再次点击d 之前。事实上,您可以将start 直接移动到索引3(as 最后一组的第一个索引),因为您已经知道as 直到@ 987654333@,您可以将i 移动到索引5 并继续。

编辑:下面的伪代码。

// Find the first letter that is not equal to the first one, 
// or return the entire string if it consists of one type of characters
int start = 0;
int i = 1;
while (i < str.length() && str[i] == str[start])
    i++;
if (i == str.length())
    return str;

// The main algorithm
char[2] chars = str[start], str[i];
int lastGroupStart = 0;
while (i < str.length()) 
    if (str[i] == chars[0] || str[i] == chars[1]) 
        if (str[i] != str[i - 1])
            lastGroupStart = i;
    
    else 
        //TODO: str.substring(start, i) is a locally maximal string; 
        //      compare it to the longest one so far
        start = lastGroupStart;
        lastGroupStart = i;
        chars[0] = str[start];
        chars[1] = str[lastGroupStart];
    
    i++;

//TODO: After the loop, str.substring(start, str.length()) 
//      is also a potential solution.

【讨论】:

我在阿斯蒙德学校已经有一段时间了:)。这是一道面试题。 我确实已经考虑过你提到的优化。但这种方法在某些情况下会失效。以字符串“fghaabbaa”为例,我的代码会通过优化挑选出“fg”、“haa”和“bbaa”。我可以尝试根据最后一个中间字符串的模式来优化起始位置,但你可以看到它越来越难看。 @40mikemike:我的方法完全取决于将起始位置重置为最后一个相等字母序列的开头;那么,检测到的子串将是fgghhaaaabbaa。我非常有信心它可以用与您现在一样多的代码来完成,并且运行时间将是线性的而不是二次的。 @40mikemike:我无法抗拒编码挑战,所以请看上面的快速而肮脏的草稿。 我让这个人当讲师! +1【参考方案2】:

所以我的想法是分两步解决

扫描整个字符串以查找相同字母的连续流 循环提取的片段并压缩它们直到你得到一个间隙。

这样您还可以修改逻辑以扫描任何长度的最长子字符串,而不仅仅是 2。

class Program


    static void Main(string[] args)     
    
        //.         
        string input = "aabbccdddxxxxxxxxxxxxxxxxx";
        int max_chars = 2;

        //.
        int flip = 0;

        var scanned = new List<string>();

        while (flip > -1)
        
            scanned.Add(Scan(input, flip, ref flip));
        

        string found = string.Empty;
        for(int i=0;i<scanned.Count;i++)
        
            var s = Condense(scanned, i, max_chars);
            if (s.Length > found.Length)
             
                found = s;
            
        

        System.Console.WriteLine("Found:" + found);
        System.Console.ReadLine();


    

    /// <summary>
    /// 
    /// </summary>
    /// <param name="s"></param>
    /// <param name="start"></param>
    /// <returns></returns>
    private static string Scan(string s, int start, ref int flip)
     
        StringBuilder sb = new StringBuilder();

        flip = -1;

        sb.Append(s[start]);

        for (int i = start+1; i < s.Length; i++)
        
            if (s[i] == s[i - 1])  sb.Append(s[i]); continue;  else  flip=i; break;
        

        return sb.ToString();
    

    /// <summary>
    /// 
    /// </summary>
    /// <param name="list"></param>
    /// <param name="start"></param>
    /// <param name="repeat"></param>
    /// <param name="flip"></param>
    /// <returns></returns>
    private static string Condense(List<string> list, int start, int repeat)
    
        StringBuilder sb = new StringBuilder();

        List<char> domain = new List<char>()list[start][0];

        for (int i = start; i < list.Count; i++)
        
            bool gap = false;

            for (int j = 0; j < domain.Count; j++)
            
                if (list[i][0] == domain[j])
                 
                    sb.Append(list[i]);
                    break;
                
                else if (domain.Count < repeat)
                
                    domain.Add(list[i][0]);
                    sb.Append(list[i]);
                    break;
                
                else
                
                    gap=true;
                    break;
                
            

            if (gap)  break;
        

        return sb.ToString();
    



【讨论】:

【参考方案3】:

同样的问题,我写了这段代码

public int getLargest(char [] s)
    if(s.length<1) return s.length;
    char c1 = s[0],c2=' ';
    int start = 1,l=1, max=1;

    int i = 1;
    while(s[start]==c1)
        l++;
        start++;
        if(start==s.length) return start;
    

    c2 = s[start];
    l++;

    for(i = l; i<s.length;i++)
        if(s[i]==c1 || s[i]==c2)
            if(s[i]!=s[i-1])
                start = i;
            l++;
        
        else 
            l = i-start+1;  
            c1 = s[start];
            c2 = s[i];
            start = i;
        
        max = Math.max(l, max);
    
    return max;

【讨论】:

【参考方案4】:

一般解决方案:包含 K 个唯一字符的最长子串。

int longestKCharSubstring(string s, int k) 
    int i, max_len = 0, start = 0;
    // either unique char & its last pos
    unordered_map<char, int> ht;

    for (i = 0; i < s.size(); i++) 
        if (ht.size() < k || ht.find(s[i]) != ht.end()) 
            ht[s[i]] = i;
         else 
            // (k + 1)-th char
            max_len = max(max_len, i - start);
            // start points to the next of the earliest char
            char earliest_char;
            int earliest_char_pos = INT_MAX;
            for (auto key : ht)
                if (key.second < earliest_char_pos)
                    earliest_char = key.first;
            start = ht[earliest_char] + 1;
            // replace earliest_char
            ht.erase(earliest_char);
            ht[s[i]] = i;
        
    
    // special case: e.g., "aaaa" or "aaabb" when k = 2
    if (k == ht.size())
        max_len = max(max_len, i - start);

    return max_len;

【讨论】:

【参考方案5】:
import java.util.ArrayList;
import java.util.Collections; 
import java.util.HashMap; import java.util.Iterator; import java.util.List;
import java.util.Map;

public class PrintLLargestSubString 

    public static void main(String[] args)         String string =
            "abcdefghijklmnopqrstuvbcdefghijklmnopbcsdcelfabcdefghi";
    List<Integer> list = new ArrayList<Integer> ();         List<Integer>
    keyList = new ArrayList<Integer> ();        List<Integer> Indexlist = new
    ArrayList<Integer> ();      List<Integer> DifferenceList = new
    ArrayList<Integer> ();      Map<Integer, Integer> map = new
    HashMap<Integer, Integer>();        int index = 0;      int len = 1;        int
    j=1;        Indexlist.add(0);       for(int i = 0; i< string.length() ;i++) 
        if(j< string.length())
            if(string.charAt(i) < string.charAt(j))
                len++;
                list.add(len);
             else
                index= i+1;
                Indexlist.add(index); //                    System.out.println("\nindex" + index);              
                len=1;              
                                  j++;         //        System.out.println("\nlist" +list);         System.out.println("index List" +Indexlist); //     int n =
    Collections.max(list); //       int ind = Collections.max(Indexlist);
    //      System.out.println("Max number in IndexList  " +n);
    //      System.out.println("Index Max is " +ind);

    //Finding max difference in a list of elements      for(int diff = 0;
    diff< Indexlist.size()-1;diff++)           int difference =
            Indexlist.get(diff+1)-Indexlist.get(diff);
    map.put(Indexlist.get(diff), difference);
    DifferenceList.add(difference);         
    System.out.println("Difference between indexes" +DifferenceList); //        Iterator<Integer> keySetIterator = map.keySet().iterator(); //      while(keySetIterator.hasNext())
    //          Integer key = keySetIterator.next();
    //          System.out.println("index: " + key + "\tDifference  "
    +map.get(key)); // //        //        System.out.println("Diffferenece List" +DifferenceList);        int maxdiff = Collections.max(DifferenceList);      System.out.println("Max diff is " + maxdiff);       //////      Integer
    value = maxdiff;        int key = 0;        keyList.addAll(map.keySet());
    Collections.sort(keyList);      System.out.println("List of al keys"
            +keyList); //       System.out.println(map.entrySet());         for(Map.Entry entry: map.entrySet())           if(value.equals(entry.getValue()))
    key =  (int) entry.getKey();                          System.out.println("Key value of max difference starting element is " + key);   

    //Iterating key list and finding next key value         int next = 0 ;
    int KeyIndex = 0;       int b;      for(b= 0; b<keyList.size(); b++) 
        if(keyList.get(b)==key)
            KeyIndex = b;                                         System.out.println("index of key\t" +KeyIndex);         int nextIndex = KeyIndex+1;         System.out.println("next Index = " +nextIndex);         next = keyList.get(nextIndex);
            System.out.println("next Index value is = " +next);

            for( int z = KeyIndex; z < next ; z++) 
                System.out.print(string.charAt(z));            

            

【讨论】:

【参考方案6】:

问题可以在 O(n) 内解决。想法是维护一个窗口并向窗口添加元素,直到它包含小于或等于 2,如果需要,在这样做时更新我们的结果。如果唯一元素超出窗口中的要求,则开始从左侧删除元素。

#code
from collections import defaultdict

def solution(s, k):
  length = len(set(list(s)))
  count_dict = defaultdict(int)
  if  length < k:
    return "-1"

  res = []
  final = []
  maxi = -1

  for i in range(0, len(s)):

    res.append(s[i])
    if len(set(res)) <= k:
      if len(res) >= maxi and len(set(res)) <= k :
        maxi = len(res)
        final = res[:]
        count_dict[maxi] += 1

    else:
      while len(set(res)) != k:
        res = res[1:]
      if maxi <= len(res) and len(set(res)) <= k:
        maxi =  len(res)
        final = res[:]
        count_dict[maxi] += 1
    return len(final)

print(solution(s, k))

【讨论】:

【参考方案7】:

这里的想法是将每个字符的出现添加到哈希图中,当哈希图大小增加超过 k 时,删除不需要的字符。

private static int getMaxLength(String str, int k) 
        if (str.length() == k)
            return k;
        var hm = new HashMap<Character, Integer>();
        int maxLength = 0;
        int startCounter = 0;
        for (int i = 0; i < str.length(); i++) 
            char c = str.charAt(i);
            if (hm.get(c) != null) 
                hm.put(c, hm.get(c) + 1);
             else 
                hm.put(c, 1);
            
            //atmost K different characters
            if (hm.size() > k) 
                maxLength = Math.max(maxLength, i - startCounter);
                while (hm.size() > k) 
                    char t = str.charAt(startCounter);
                    int count = hm.get(t);
                    if (count > 1) 
                        hm.put(t, count - 1);
                     else 
                        hm.remove(t);
                    
                    startCounter++;
                
            
        
        return maxLength;
    

【讨论】:

以上是关于如何找到包含两个唯一重复字符的最长子字符串的主要内容,如果未能解决你的问题,请参考以下文章

数组篇在python中如何查找最长字符串子串

LeetCode159.至多包含两个不同字符的最长子串

两个字符串的所有公共最长子序列

请帮忙///如何计算两个 字符串的最长公共子串

给定一个字符串,找到最长子串的长度,而不重复字符。

输入一串字符,找到相同且最长的字符串