Sentence-level semantic matching in ws4j

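I am currently trying to semantically match two sentences in ws4j. I have implemented the idea at the word level, but I want to do the same at the sentence level and get the output as a matrix, as shown in the online demo. How can I write code that does this?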

import java.util.List;
import edu.cmu.lti.ws4j.impl.Lesk;
import edu.cmu.lti.jawjaw.pobj.POS;
import edu.cmu.lti.lexical_db.ILexicalDatabase;
import edu.cmu.lti.lexical_db.NictWordNet;
import edu.cmu.lti.lexical_db.data.Concept;
import edu.cmu.lti.ws4j.Relatedness;
import edu.cmu.lti.ws4j.RelatednessCalculator;

public class WordMatcher1 {
public static void main(String[] args)
{
    String word1="rifle";
    String word2="gun";

    ILexicalDatabase db = new NictWordNet();
    RelatednessCalculator lesk = new Lesk(db);

    List<POS[]> posPairs = lesk.getPOSPairs();
    double maxScore = -1D;

    for(POS[] posPair: posPairs) 
    {
        String p1 = null,p2 = null;
        List<Concept> synsets1 = (List<Concept>)db.getAllConcepts(word1, posPair[0].toString());
        List<Concept> synsets2 = (List<Concept>)db.getAllConcepts(word2, posPair[1].toString());

        for(Concept ss1: synsets1) 
        {
            for (Concept ss2: synsets2) 
            {
                p1 = ss1.getPos().toString();
                p2 = ss2.getPos().toString();
                Relatedness relatedness = lesk.calcRelatednessOfSynset(ss1, ss2);
                double score = relatedness.getScore();
                if (score > maxScore) 
                { 
                    maxScore = score;
                }

            }
        }

        if (maxScore == -1D) 
        {
            maxScore = 0.0;
        }
        System.out.println("sim('" + word1 +" "+ p1 +"', '" + p2 +" "+ word2+ "') =  " + maxScore);
    }
}
}
Answer

I had a similar problem, and this example works:

import java.util.List;
import edu.cmu.lti.jawjaw.pobj.POS;
import edu.cmu.lti.lexical_db.ILexicalDatabase;
import edu.cmu.lti.lexical_db.NictWordNet;
import edu.cmu.lti.lexical_db.data.Concept;
import edu.cmu.lti.ws4j.Relatedness;
import edu.cmu.lti.ws4j.RelatednessCalculator;
import edu.cmu.lti.ws4j.impl.Lesk;
import edu.cmu.lti.ws4j.util.WS4JConfiguration;

public class LeskSimilarity{

    public static void main(String[] args) {
        ILexicalDatabase db = new NictWordNet();
        RelatednessCalculator lesk = new Lesk(db);
        String word1 = "rifle";
        POS posWord1 = POS.n;
        String word2 = "gun";
        POS posWord2 = POS.n;
        // -1 marks "no synset pair scored yet"; it is mapped to 0.0 below.
        double maxScore = -1D;

        // Enable WS4J's most-frequent-sense (MFS) option.
        WS4JConfiguration.getInstance().setMFS(true);

        List<Concept> synsets1 = (List<Concept>) db.getAllConcepts(word1, posWord1.name());
        List<Concept> synsets2 = (List<Concept>) db.getAllConcepts(word2, posWord2.name());

        // Keep the best score over all synset pairs of the two words.
        for (Concept synset1 : synsets1) {
            for (Concept synset2 : synsets2) {
                Relatedness relatedness = lesk.calcRelatednessOfSynset(synset1, synset2);
                double score = relatedness.getScore();
                if (score > maxScore) {
                    maxScore = score;
                }
            }
        }

        if (maxScore == -1D) {
            maxScore = 0.0;
        }

        System.out.println("Similarity score of " + word1 + " & " + word2 + " : " + maxScore);
    }
}
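
The example above scores a single word pair. For the sentence-level matrix the question asks about, one possible approach is to tokenize both sentences and fill a matrix with one word-pair score per cell. The sketch below is only an illustration under stated assumptions: the class and method names (`SentenceMatrix`, `similarityMatrix`, `averageBestMatch`) are hypothetical, the whitespace tokenizer is deliberately naive, and the scorer used in `main` is a placeholder exact-match function so that the code runs without WS4J on the classpath. In a real setup you would pass in a WS4J-backed scorer instead, for example a lambda wrapping the synset loop shown above.

```java
import java.util.Arrays;
import java.util.function.ToDoubleBiFunction;

public class SentenceMatrix {

    // Builds a words1.length x words2.length matrix of pairwise word scores.
    // The scorer is pluggable; it is NOT part of WS4J.
    public static double[][] similarityMatrix(String[] words1, String[] words2,
                                              ToDoubleBiFunction<String, String> scorer) {
        double[][] matrix = new double[words1.length][words2.length];
        for (int i = 0; i < words1.length; i++) {
            for (int j = 0; j < words2.length; j++) {
                matrix[i][j] = scorer.applyAsDouble(words1[i], words2[j]);
            }
        }
        return matrix;
    }

    // One simple (assumed) way to collapse the matrix into a single number:
    // the average, over the words of sentence 1, of each word's best match.
    // Assumes scores are non-negative.
    public static double averageBestMatch(double[][] matrix) {
        if (matrix.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double[] row : matrix) {
            double best = 0.0;
            for (double value : row) {
                best = Math.max(best, value);
            }
            sum += best;
        }
        return sum / matrix.length;
    }

    public static void main(String[] args) {
        // Naive whitespace tokenization, for illustration only.
        String[] sentence1 = "the soldier fired a rifle".split("\\s+");
        String[] sentence2 = "a man shot the gun".split("\\s+");

        // Placeholder scorer: 1.0 for identical words, 0.0 otherwise.
        // Swap in a WS4J-based word scorer here.
        ToDoubleBiFunction<String, String> scorer =
                (a, b) -> a.equalsIgnoreCase(b) ? 1.0 : 0.0;

        double[][] matrix = similarityMatrix(sentence1, sentence2, scorer);
        for (int i = 0; i < matrix.length; i++) {
            System.out.println(sentence1[i] + "\t" + Arrays.toString(matrix[i]));
        }
        System.out.println("Average best match: " + averageBestMatch(matrix));
    }
}
```

Keeping the scorer as a parameter means the matrix logic stays independent of the relatedness measure, so Lesk can be replaced by another WS4J calculator without touching the loop. Averaging row maxima is just one way to reduce the matrix to a sentence score; it is asymmetric, so swapping the two sentences can give a different value.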
