EditDistance

Posted 一棵球和一枝猪


Edit Distance

Category : Dynamic Programming

Description : Given two strings str1 and str2 and the operations below, which can be performed on str1, find the minimum number of edits (operations) required to convert "str1" into "str2".

  • Insert
  • Remove
  • Replace

Attention : All of the above operations have equal cost.

Examples

Input: str1 = "geek", str2 = "gesek"
Output: 1
We can convert str1 into str2 by inserting an 's'.

Input: str1 = "cat", str2 = "cut"
Output: 1
We can convert str1 into str2 by replacing 'a' with 'u'.

Input: str1 = "sunday", str2 = "saturday"
Output: 3
The first character and the last three characters are the same, so we only need to convert "un" into "atur". This can be done with the three operations below.
Replace 'n' with 'r', insert 't', insert 'a'

What are the subproblems in this case?

The idea is to process all characters one by one, starting from either the left or the right end of both strings.

Suppose we traverse from the right end. For every pair of characters being compared, there are two possibilities.

m: Length of str1
n: Length of str2
  1. If the last characters of the two strings are the same, there is nothing to do. Ignore the last characters and get the count for the remaining strings, so we recur for lengths m-1 and n-1.
  2. Otherwise, consider all three operations on the last character of the first string, recursively compute the minimum cost for each, and take the minimum of the three values plus 1 for the operation itself.
     • Insert : recur for m and n - 1
     • Remove : recur for m - 1 and n
     • Replace : recur for m - 1 and n - 1
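
The two cases above can be summarized as a recurrence. As a sketch, write D(i, j) (a symbol introduced here for illustration, not used in the original text) for the minimum number of edits needed to convert the first i characters of str1 into the first j characters of str2, with characters indexed from 1:

```latex
\[
D(i, j) =
\begin{cases}
j & \text{if } i = 0,\\
i & \text{if } j = 0,\\
D(i-1,\, j-1) & \text{if } \mathrm{str1}[i] = \mathrm{str2}[j],\\
1 + \min\bigl(D(i,\, j-1),\; D(i-1,\, j),\; D(i-1,\, j-1)\bigr) & \text{otherwise.}
\end{cases}
\]
```

The first two lines are the base cases for an empty prefix, and the three arguments of the min correspond to insert, remove, and replace, in that order. The answer to the whole problem is D(m, n).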

Naive recursive solution

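#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

// Utility function to find the minimum of three numbers.
// Named min3 so that it cannot be confused with std::min.
int min3(int x, int y, int z)
{
    return std::min(std::min(x, y), z);
}

// Returns the edit distance between the first m characters of str1
// and the first n characters of str2. The strings are passed by
// reference so that they are not copied on every recursive call.
int editDist(const string& str1, const string& str2, int m, int n)
{
    // If first string is empty, the only option is to 
    // insert all characters of second string into first
    if (m == 0) return n;
    
    // If second string is empty, the only option is to 
    // remove all characters of first string
    if (n == 0) return m;
    
    // If last characters of two strings are same, ignore them
    // and recur for the remaining strings
    if (str1[m-1] == str2[n-1])
        return editDist(str1, str2, m-1, n-1);
    
    // Otherwise try all three operations and take the cheapest
    return 1 + min3(editDist(str1, str2, m, n-1),       // insert
                    editDist(str1, str2, m-1, n),       // remove
                    editDist(str1, str2, m-1, n-1)      // replace
                    );
}

int main()
{
    string str1 = "sunday";
    string str2 = "saturday";
    
    cout << editDist(str1, str2, str1.length(), str2.length());
    return 0;
}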

// Output 
3

The time complexity of the above solution is exponential. The worst case happens when none of the characters of the two strings match: every call then branches three ways and the recursion can go up to m + n levels deep, so the number of calls is bounded by $O(3^{m+n})$. Below is a recursive call diagram for the worst case.

[Figure: recursive call diagram for the worst case]
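
To make the growth concrete, the sketch below instruments the same recursion with a call counter. The names editDistCounted and countCalls are hypothetical helpers introduced here for illustration; they are not part of the original solution.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: same recursion as the naive editDist,
// but increments `calls` once per invocation.
int editDistCounted(const std::string& a, const std::string& b,
                    int m, int n, long long& calls)
{
    ++calls;
    if (m == 0) return n;   // insert the remaining n characters
    if (n == 0) return m;   // remove the remaining m characters

    if (a[m - 1] == b[n - 1])
        return editDistCounted(a, b, m - 1, n - 1, calls);

    int ins = editDistCounted(a, b, m, n - 1, calls);      // insert
    int rem = editDistCounted(a, b, m - 1, n, calls);      // remove
    int rep = editDistCounted(a, b, m - 1, n - 1, calls);  // replace
    return 1 + std::min(ins, std::min(rem, rep));
}

// Hypothetical wrapper: returns the total number of recursive calls
// made while computing the edit distance between a and b.
long long countCalls(const std::string& a, const std::string& b)
{
    long long calls = 0;
    editDistCounted(a, b, static_cast<int>(a.size()),
                    static_cast<int>(b.size()), calls);
    return calls;
}
```

For fully mismatched inputs such as "a"/"b", "aa"/"bb" and "aaa"/"bbb", countCalls returns 4, 19 and 94, while two identical strings of length 3 need only 4 calls. The gap widens quickly as the strings get longer, which is what motivates storing subproblem results.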

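Like other typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array that stores the results of subproblems. The bottom-up version below fills an (m+1) x (n+1) table, so it runs in $O(mn)$ time and uses $O(mn)$ space.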

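#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Utility function to find the minimum of three numbers.
// Named min3 so that it cannot be confused with std::min.
int min3(int x, int y, int z)
{
    return std::min(std::min(x, y), z);
}

int editDistDP(const string& str1, const string& str2, int m, int n)
{
    // dp[i][j] = edit distance between the first i characters of str1
    // and the first j characters of str2. A vector is used because
    // variable-length arrays are not standard C++.
    vector<vector<int>> dp(m + 1, vector<int>(n + 1));

    // Fill dp[][] in bottom up manner
    for (int i = 0; i <= m; i++)
    {
        for (int j = 0; j <= n; j++)
        {
            if (i == 0)
                dp[i][j] = j;       // insert all j characters
            else if (j == 0)
                dp[i][j] = i;       // remove all i characters
            else if (str1[i-1] == str2[j-1])
                dp[i][j] = dp[i-1][j-1];
            else
                dp[i][j] = 1 + min3(dp[i][j-1],     // insert
                                    dp[i-1][j],     // remove
                                    dp[i-1][j-1]    // replace
                                    );
        }
    }

    return dp[m][n];
}

// Driver program
int main()
{
    string str1 = "sunday";
    string str2 = "saturday";
    
    cout << editDistDP(str1, str2, str1.length(), str2.length());
    
    return 0;
}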

//Output
3
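
Each row of the table depends only on the row directly above it, so the full two-dimensional array is not strictly required. The sketch below is one possible space-saving variant that keeps just two rows; the function name editDistTwoRows is made up for this example and does not appear in the original post.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant of the bottom-up solution that stores only
// the previous row and the current row of the DP table.
int editDistTwoRows(const std::string& str1, const std::string& str2)
{
    const std::size_t m = str1.size();
    const std::size_t n = str2.size();

    std::vector<int> prev(n + 1), curr(n + 1);

    // Row 0: converting an empty prefix of str1 takes j insertions.
    for (std::size_t j = 0; j <= n; ++j)
        prev[j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= m; ++i)
    {
        // Column 0: converting i characters to an empty string
        // takes i removals.
        curr[0] = static_cast<int>(i);

        for (std::size_t j = 1; j <= n; ++j)
        {
            if (str1[i - 1] == str2[j - 1])
                curr[j] = prev[j - 1];
            else
                curr[j] = 1 + std::min({curr[j - 1],    // insert
                                        prev[j],        // remove
                                        prev[j - 1]});  // replace
        }
        std::swap(prev, curr);  // current row becomes the previous row
    }

    // After the final swap, prev holds the last computed row.
    return prev[n];
}
```

Calling editDistTwoRows("sunday", "saturday") returns 3, the same as the full-table version. The running time is still proportional to m times n, but the extra memory drops to two rows of n + 1 integers.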
