Data Structures and Algorithms -- String Matching

Posted jiangwangxiang



1. Trie

public class TrieTree {
    private TrieNode root = new TrieNode('/'); // root holds a placeholder character with no meaning

    // Insert a string into the trie (the 26-slot child array assumes lowercase a-z only)
    public void insert(char[] text) {
        TrieNode p = root;
        for (int i = 0; i < text.length; i++) {
            int index = text[i] - 'a';
            if (p.children[index] == null) {
                TrieNode newNode = new TrieNode(text[i]);
                p.children[index] = newNode;
            }
            p = p.children[index];
        }
        p.isEndingChar = true;
    }

    // Look up a string in the trie
    public boolean find(char[] pattern) {
        TrieNode p = root;
        for (int i = 0; i < pattern.length; i++) {
            int index = pattern[i] - 'a';
            if (p.children[index] == null) {
                return false; // pattern is not in the trie
            }
            p = p.children[index];
        }
        if (p.isEndingChar) { // pattern matches a complete stored string
            return true;
        } else { // pattern is only a prefix of stored strings, not a full match
            return false;
        }
    }

    public class TrieNode {
        public char data;
        public TrieNode[] children = new TrieNode[26];
        public boolean isEndingChar = false;

        public TrieNode(char data) {
            this.data = data;
        }
    }

    public static void main(String[] args) {
        TrieTree trieTree = new TrieTree();
        trieTree.insert("hello".toCharArray());
        trieTree.insert("world".toCharArray());
        trieTree.insert("word".toCharArray());
        trieTree.insert("teacher".toCharArray());
        trieTree.insert("wild".toCharArray());
        String pattern = "word";
        System.out.println(trieTree.find(pattern.toCharArray()) ? "Found " + pattern : "No exact match for " + pattern);
        pattern = "wor";
        System.out.println(trieTree.find(pattern.toCharArray()) ? "Found " + pattern : "No exact match for " + pattern);
    }
}
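
The listing in this section indexes children with `ch - 'a'`, so it only accepts lowercase a-z, and it has no way to ask whether a string is a prefix of some stored word. As a rough sketch of one possible variation (not part of the original post), the children could be kept in a `HashMap` so that any character is accepted. The class name `MapTrie` and the helpers `walk` and `startsWith` are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of the trie above: children live in a map, so any char is accepted.
class MapTrie {
    private static class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isEndingChar = false;
    }

    private final Node root = new Node();

    // Insert a whole string, creating nodes on demand.
    void insert(String text) {
        Node p = root;
        for (char c : text.toCharArray()) {
            p = p.children.computeIfAbsent(c, k -> new Node());
        }
        p.isEndingChar = true;
    }

    // Walk down the trie; returns null if the path for s does not exist.
    private Node walk(String s) {
        Node p = root;
        for (char c : s.toCharArray()) {
            p = p.children.get(c);
            if (p == null) {
                return null;
            }
        }
        return p;
    }

    // True only if s was inserted as a complete string.
    boolean find(String s) {
        Node p = walk(s);
        return p != null && p.isEndingChar;
    }

    // True if at least one inserted string starts with s.
    boolean startsWith(String s) {
        return walk(s) != null;
    }

    public static void main(String[] args) {
        MapTrie trie = new MapTrie();
        trie.insert("word");
        trie.insert("Hello-World");
        System.out.println(trie.find("word"));        // true
        System.out.println(trie.find("wor"));         // false
        System.out.println(trie.startsWith("wor"));   // true
        System.out.println(trie.find("Hello-World")); // true
    }
}
```

The map costs more memory per node and a hash lookup per character, so the fixed 26-slot array is still a reasonable choice when the alphabet is known to be small.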
    

2. Using a Trie to implement search-engine keyword suggestions

import java.util.ArrayList;
import java.util.List;

public class TrieTree {
    private TrieNode root = new TrieNode('/'); // root holds a placeholder character with no meaning

    // Insert a string into the trie (the 26-slot child array assumes lowercase a-z only)
    public void insert(char[] text) {
        TrieNode p = root;
        for (int i = 0; i < text.length; i++) {
            int index = text[i] - 'a';
            if (p.children[index] == null) {
                TrieNode newNode = new TrieNode(text[i]);
                p.children[index] = newNode;
            }
            p = p.children[index];
        }
        p.isEndingChar = true;
    }

    // Collect every stored string that starts with pattern
    public List<String> find(char[] pattern) {
        if (pattern.length == 0) {
            return dfsResult; // nothing to suggest for an empty pattern
        }
        TrieNode p = root;
        for (int i = 0; i < pattern.length; i++) {
            int index = pattern[i] - 'a';
            if (p.children[index] == null) {
                return dfsResult; // no stored string starts with pattern
            }
            p = p.children[index];
        }
        // An exact match is itself a suggestion, and longer strings may still extend it,
        // so the DFS runs whether or not p ends a string.
        String startPath = new String(pattern);
        // The last character of pattern is stored in p itself, so it is dropped from the path passed in
        dfs(p, new StringBuffer(startPath.substring(0, startPath.length() - 1)));
        return dfsResult;
    }

    // Shared result list; callers clear it between lookups (see main)
    private List<String> dfsResult = new ArrayList<String>();

    // path holds the characters above p; p's own character is appended here
    private void dfs(TrieNode p, StringBuffer path) {
        StringBuffer current = new StringBuffer(path.toString()).append(p.data);
        if (p.isEndingChar) {
            dfsResult.add(current.toString());
        }
        // Keep descending after a hit so that longer strings sharing this prefix are found too
        for (int j = 0; j < 26; j++) {
            if (p.children[j] != null) {
                dfs(p.children[j], current);
            }
        }
    }

    public class TrieNode {
        public char data;
        public TrieNode[] children = new TrieNode[26];
        public boolean isEndingChar = false;

        public TrieNode(char data) {
            this.data = data;
        }
    }

    public static void main(String[] args) {
        TrieTree trieTree = new TrieTree();
        trieTree.insert("hello".toCharArray());
        trieTree.insert("world".toCharArray());
        trieTree.insert("word".toCharArray());
        trieTree.insert("teacher".toCharArray());
        trieTree.insert("wild".toCharArray());
        String pattern = "w";
        List<String> findResult = trieTree.find(pattern.toCharArray());
        for (String item : findResult) {
            System.out.println(item);
        }
        System.out.println("------------------");

        trieTree.dfsResult.clear();
        pattern = "wor";
        findResult = trieTree.find(pattern.toCharArray());
        for (String item : findResult) {
            System.out.println(item);
        }
        System.out.println("------------------");

        trieTree.dfsResult.clear();
        pattern = "word";
        findResult = trieTree.find(pattern.toCharArray());
        for (String item : findResult) {
            System.out.println(item);
        }
        System.out.println("------------------");
    }
}
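
The listing in this section keeps its results in a shared field that the caller has to clear between lookups. A possible alternative, sketched here under the assumption that a fresh list per call and a cap on the number of suggestions are wanted, passes the result list down the recursion instead. The names `SuggestTrie`, `suggest`, `collect`, and the `limit` parameter are hypothetical and do not appear in the original post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stateless variant of the suggestion lookup above.
class SuggestTrie {
    private static class Node {
        final Node[] children = new Node[26]; // assumes lowercase a-z only, as in the original
        boolean isEndingChar = false;
    }

    private final Node root = new Node();

    void insert(String text) {
        Node p = root;
        for (char c : text.toCharArray()) {
            int index = c - 'a';
            if (p.children[index] == null) {
                p.children[index] = new Node();
            }
            p = p.children[index];
        }
        p.isEndingChar = true;
    }

    // Return up to limit stored strings that start with prefix, in alphabetical order.
    List<String> suggest(String prefix, int limit) {
        List<String> result = new ArrayList<>();
        Node p = root;
        for (char c : prefix.toCharArray()) {
            p = p.children[c - 'a'];
            if (p == null) {
                return result; // nothing starts with prefix
            }
        }
        collect(p, new StringBuilder(prefix), result, limit);
        return result;
    }

    // Depth-first walk; path always spells the string that leads to node p.
    private void collect(Node p, StringBuilder path, List<String> result, int limit) {
        if (result.size() >= limit) {
            return;
        }
        if (p.isEndingChar) {
            result.add(path.toString());
        }
        for (int j = 0; j < 26; j++) {
            if (p.children[j] != null) {
                path.append((char) ('a' + j));
                collect(p.children[j], path, result, limit);
                path.deleteCharAt(path.length() - 1); // backtrack
            }
        }
    }

    public static void main(String[] args) {
        SuggestTrie trie = new SuggestTrie();
        for (String w : new String[] {"hello", "world", "word", "teacher", "wild"}) {
            trie.insert(w);
        }
        System.out.println(trie.suggest("w", 10));  // [wild, word, world]
        System.out.println(trie.suggest("wor", 1)); // [word]
        System.out.println(trie.suggest("x", 10));  // []
    }
}
```

Because each call builds its own list, no `clear()` is needed between lookups, and the `limit` check stops the walk early once enough suggestions have been gathered.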
    

 
