(paper reading) Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions

Posted 2020-09-08

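Given a knowledge base containing a set of entities E, and a text collection in which a set of mentions M has already been identified, the goal of entity linking is to map each mention m ∈ M to its corresponding entity e ∈ E in the knowledge base. Mentions that have no corresponding entity in the knowledge base are called unlinkable mentions, and an entity linking system labels them with the special tag NIL.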

A typical entity linking system consists of three modules:

Candidate entity generation

For each mention m in M, the entity linking system has to find a set of candidate entities E_m in the knowledge base. The main approaches are:

- dictionary based techniques

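  Build a dictionary of names from Wikipedia features such as redirect pages, disambiguation pages, and hyperlink anchor text, then look each mention up in that dictionary.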

- surface form expansion from the local document

  Expand the mention m into its full name, aliases, and so on, using the document in which it appears.

  - Heuristic Based Methods
  - Supervised Learning Methods
- methods based on search engine

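  Some search engines have built-in support for finding entities with similar names, so some methods query a search engine directly to obtain candidates.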
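The first two approaches can be sketched together as follows. This is a minimal illustration, not the method of any particular system: `NAME_DICT`, `expand_acronym`, and `generate_candidates` are hypothetical names, the dictionary entries are toy data, and the acronym rule is just one simple heuristic for surface form expansion.

```python
# Hypothetical toy dictionary: lowercased surface form -> set of KB entity ids.
# A real dictionary would be built from Wikipedia titles, redirects,
# disambiguation pages, and anchor text.
NAME_DICT = {
    "michael jordan": {"Michael_Jordan_(basketball)", "Michael_I._Jordan"},
    "jordan": {"Michael_Jordan_(basketball)", "Jordan_(country)"},
    "international business machines": {"IBM"},
}


def expand_acronym(mention, document):
    """Heuristic expansion: find a run of capitalized words in the document
    whose initials spell the all-uppercase mention."""
    if not (mention.isupper() and len(mention) > 1):
        return mention
    words = document.split()
    n = len(mention)
    for i in range(len(words) - n + 1):
        window = [w.strip(".,;:()") for w in words[i:i + n]]
        if all(w and w[0] == c for w, c in zip(window, mention)):
            return " ".join(window)
    return mention  # no expansion found


def generate_candidates(mention, document=""):
    """Return the candidate set E_m using exact dictionary lookup on the
    mention and on its expanded form."""
    expanded = expand_acronym(mention, document)
    candidates = set()
    for form in (mention, expanded):
        candidates |= NAME_DICT.get(form.lower(), set())
    return candidates
```

With this toy data, `generate_candidates("IBM", doc)` only finds a candidate when `doc` contains the full name, which is the point of expanding the surface form before the lookup.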

Candidate entity ranking

Rank the candidate entities by some criterion and select the one most likely to be the target of the mention.

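The ranking criterion is built on features. Context-independent features include name string comparison, entity popularity, and entity type; they depend only on the mention itself and its candidate entities. Context-dependent features require analyzing the context in which the mention appears, and include textual context and coherence between mapping entities.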
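Two of the context-independent features can be sketched as below. This assumes one common formulation: name similarity as a string similarity ratio, and popularity as the fraction of links with a given surface form that point to the entity. `LINK_COUNTS` and both function names are hypothetical, and the counts are made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical anchor-text statistics: for each surface form, how many
# links with that text point to each entity. Numbers are invented.
LINK_COUNTS = {
    "jordan": {"Jordan_(country)": 60, "Michael_Jordan_(basketball)": 40},
}


def name_similarity(mention, entity_name):
    """Name string comparison: similarity ratio in [0, 1]."""
    return SequenceMatcher(None, mention.lower(), entity_name.lower()).ratio()


def popularity(mention, entity):
    """Entity popularity: share of links for this surface form that
    point to the given entity (0.0 if the surface form is unknown)."""
    counts = LINK_COUNTS.get(mention.lower(), {})
    total = sum(counts.values())
    return counts.get(entity, 0) / total if total else 0.0
```

Neither function looks at the surrounding text, which is what makes these features context-independent.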

The main methods for ranking the candidate set are:

- supervised ranking methods
  - binary classification methods
  - learning to rank methods
  - probabilistic methods
  - graph based approaches
  - model combination
  - training data generation
- unsupervised ranking methods
  - VSM based methods
  - information retrieval based methods
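As an illustration of the unsupervised, VSM based family, the sketch below ranks candidates by the cosine similarity between a bag-of-words vector of the mention's context and one of each candidate's description. The tokenization (lowercase, split on whitespace), the raw term-count weighting, and all names here are simplifying assumptions, not the setup of any specific system.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Very simple vectorization: lowercase whitespace tokens with counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_candidates(context, entity_descriptions):
    """Return (score, entity) pairs sorted from best to worst match."""
    ctx = bag_of_words(context)
    scored = [(cosine(ctx, bag_of_words(desc)), entity)
              for entity, desc in entity_descriptions.items()]
    return sorted(scored, reverse=True)


# Usage with made-up descriptions for two candidates of "Jordan".
descriptions = {
    "Michael_Jordan_(basketball)": "american basketball player chicago bulls",
    "Jordan_(country)": "country in the middle east with capital amman",
}
ranked = rank_candidates("jordan scored 30 points for chicago bulls", descriptions)
```

No labeled training data is involved, which is what separates this family from the supervised methods in the list.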

Unlinkable mention prediction

Check whether the top-ranked candidate entity is really the target entity of m; if not, m is labeled as an unlinkable mention (NIL).
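One simple way to make this decision is a score threshold, sketched below. The threshold value of 0.2 and the names `NIL` and `link_or_nil` are assumptions for illustration; in practice the threshold would be tuned on held-out data, and other options such as a dedicated classifier exist.

```python
NIL = "NIL"


def link_or_nil(ranked, threshold=0.2):
    """ranked: list of (score, entity) pairs sorted from best to worst.
    Returns the top entity if its score reaches the threshold, else NIL."""
    if not ranked:
        return NIL  # no candidates were generated at all
    score, entity = ranked[0]
    return entity if score >= threshold else NIL
```

For example, `link_or_nil([(0.05, "Jordan_(country)")])` returns `"NIL"` because the best candidate is still a poor match.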

The main applications of entity linking are:

Information Extraction
Information Retrieval
Content Analysis
Question Answering
Knowledge Base Population

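The authors suggest the following directions for future research:

1. Linking mentions from other types of data, not only those in text.

2. Computational complexity, efficiency, and scalability.

3. Domain-specific entity linking systems.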
