NLP: Conditional Language Models

Posted by duye



A language model estimates the probability that a sequence of words forms a sentence a human might say. As a by-product of training one, we also obtain embeddings for the whole vocabulary.

 

An unconditional language model simply assigns probabilities to sequences of words. That is, given the first n−1 words, it learns to predict the probability distribution of the next word.

Because of the chain rule of probability, we only need to train these per-step conditionals:

$$p(W) = \prod_{t=1}^{T} p(w_t \mid w_1, \dots, w_{t-1})$$
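As a minimal pure-Python illustration (the toy corpus and the bigram/Markov truncation are my own simplifying assumptions, not part of the original notes), scoring a sentence by the chain rule:

```python
import math
from collections import defaultdict

# Toy corpus and bigram counts (hypothetical data, for illustration only).
corpus = [["<s>", "the", "cat", "sat", "</s>"],
          ["<s>", "the", "dog", "sat", "</s>"]]
bigram = defaultdict(lambda: defaultdict(int))
context = defaultdict(int)
for sent in corpus:
    for prev, cur in zip(sent, sent[1:]):
        bigram[prev][cur] += 1
        context[prev] += 1

def sentence_log_prob(words):
    """log p(W) = sum_t log p(w_t | history); here the history is truncated
    to one word (a bigram model), but the chain rule itself is exact."""
    return sum(math.log(bigram[prev][cur] / context[prev])
               for prev, cur in zip(words, words[1:]))

print(sentence_log_prob(["<s>", "the", "cat", "sat", "</s>"]))  # ~ -0.693
```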

 

Conditional LMs

A conditional language model assigns probabilities to sequences of words, W = (w_1, w_2, …, w_T), given some conditioning context x.
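Written out, this is the same chain-rule factorization as before, with x added to every conditioning set:

$$p(W \mid x) = \prod_{t=1}^{T} p(w_t \mid x, w_1, \dots, w_{t-1})$$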

 

For example, in a translation task we are given a source sentence and its translation. The source sentence is the conditioning context; given it, we predict the target sentence.

 

Data for training conditional LMs:

To train conditional language models, we need paired samples (x, w), for example:

• Translation: x = source-language sentence, w = target-language sentence
• Summarisation: x = long document, w = summary
• Caption generation: x = image, w = caption
• Speech recognition: x = speech signal, w = transcription

 

How do we evaluate conditional LMs?

  • Traditional methods: cross-entropy or perplexity (hard to interpret, easy to implement; see the snippet after this list).
  • Task-specific evaluation: compare the model’s most likely output to a human-generated reference, using metrics such as BLEU, METEOR, or ROUGE (okay to interpret, easy to implement).
  • Human evaluation: hard to implement.
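Perplexity is simple enough to pin down in a few lines; a minimal sketch (the probabilities below are made-up numbers):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-likelihood per token,
    where each entry is log p(w_t | x, w_{<t}) under the conditional LM."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# A 4-token reference whose tokens received probabilities 0.5, 0.25, 0.5, 0.125:
print(perplexity([math.log(p) for p in (0.5, 0.25, 0.5, 0.125)]))  # ~3.36
```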

 

Algorithmic challenges:

Given the conditioning context x, we want to find the output sequence with the highest probability. Greedy search (picking the single most probable word at each step) may fail to produce a good, or even coherent, sentence.

We use beam search instead (a sketch appears in the decoding discussion below).

 

We draw attention to “encoder-decoder” models, which learn a function that maps x into a fixed-size vector and then use a language model to “decode” that vector into a sequence of words:

[Figure: encoder-decoder schematic]
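In code, the shape of such a model is just the composition of two modules; a minimal PyTorch-flavoured sketch (class and argument names are mine, purely illustrative):

```python
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Generic encoder-decoder shape: encode x into a fixed-size vector c,
    then let a conditional LM decode c into a sequence of words."""
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder  # any module mapping x -> c (fixed-size vector)
        self.decoder = decoder  # any conditional LM mapping (c, w_{<t}) -> next-word scores

    def forward(self, x, w_prev):
        c = self.encoder(x)
        return self.decoder(c, w_prev)
```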

 

Model: Kalchbrenner & Blunsom (2013)

[Figure: the Kalchbrenner & Blunsom (2013) model]

A simple encoder: just a sum (cumsum) of the source word embeddings — very easy; a sketch follows the figure.

[Figure: additive (sum) encoder]
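A hedged sketch of this additive encoder in PyTorch (dimensions and names are illustrative):

```python
import torch.nn as nn

class SumEncoder(nn.Module):
    """The simplest encoder: the context vector is just the sum of the
    source word embeddings, ignoring word order entirely."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)

    def forward(self, src_ids):                # src_ids: (batch, src_len)
        return self.embed(src_ids).sum(dim=1)  # context c: (batch, dim)
```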

Another encoder: the CSM encoder, which uses a CNN to encode the source; a sketch follows the figure.

[Figure: CSM (convolutional) encoder]
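A sketch of a CNN encoder in the same spirit (kernel size, depth, and pooling here are my own illustrative choices, not the exact CSM configuration):

```python
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    """CSM-style encoder: 1-D convolutions over the source embeddings,
    pooled down to a fixed-size context vector."""
    def __init__(self, vocab_size, dim, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.conv = nn.Conv1d(dim, dim, kernel_size=kernel, padding=kernel // 2)

    def forward(self, src_ids):                  # src_ids: (batch, src_len)
        e = self.embed(src_ids).transpose(1, 2)  # (batch, dim, src_len) for Conv1d
        h = torch.relu(self.conv(e))
        return h.max(dim=2).values               # max-pool over positions -> (batch, dim)
```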

The decoder: an RNN decoder

[Figure: the RNN decoder]

The computation graph:

[Figure: RNN decoder computation graph]
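A sketch of such a decoder (feeding the context vector in at every step is one common variant; the wiring and names here are illustrative):

```python
import torch
import torch.nn as nn

class RNNDecoder(nn.Module):
    """RNN decoder: the fixed-size context c conditions the recurrence, and
    each hidden state predicts a distribution over the next word."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim + dim, dim, batch_first=True)  # input: [word emb; c]
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, c, w_prev):  # c: (batch, dim), w_prev: (batch, tgt_len)
        e = self.embed(w_prev)                            # (batch, tgt_len, dim)
        c_rep = c.unsqueeze(1).expand(-1, e.size(1), -1)  # repeat c per step
        h, _ = self.rnn(torch.cat([e, c_rep], dim=2))
        return self.out(h)         # logits for p(w_t | x, w_{<t})
```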

 

Sutskever et al. model (2014):

- An important, classic model.

[Figure: the Sutskever et al. (2014) sequence-to-sequence model]

Computation graph:

[Figure: sequence-to-sequence computation graph]
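A minimal sketch of the model’s shape (single-layer and untied dimensions here; the real model used deep LSTMs):

```python
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Sutskever et al. (2014) shape: an LSTM reads the source; its final
    (h, c) state initializes an LSTM decoder, which is then trained as an
    ordinary next-word LM over the target."""
    def __init__(self, src_vocab, tgt_vocab, dim):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_ids, tgt_prev):
        _, state = self.encoder(self.src_embed(src_ids))  # the fixed-size "thought vector"
        h, _ = self.decoder(self.tgt_embed(tgt_prev), state)
        return self.out(h)                                # next-word logits
```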

 

Some tricks for the Sutskever et al. model:

  • Read the input sequence “backwards”: +4 BLEU (a snippet follows the figure).

  [Figure: reading the source sequence in reverse]
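The trick itself is purely a data-side change; no model change is needed:

```python
# Reverse the source sequence before feeding it to the encoder.
src = ["I", "like", "cats"]
src_reversed = src[::-1]  # ["cats", "like", "I"]
# Intuition (from the paper): reversing shortens the distance between the
# first source words and the first target words, easing optimization.
```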

  • Use an ensemble of m independently trained models at decoding time (a sketch follows the example):
  1. Ensemble of 2 models: +3 BLEU
  2. Ensemble of 5 models: +4.5 BLEU


    For example:

      [Figure: ensembling example]
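A hedged sketch of decode-time ensembling (it assumes each model exposes the (c, w_prev) -> logits interface sketched earlier; averaging in probability space is one common choice):

```python
import torch

def ensemble_next_word_log_probs(models, c, w_prev):
    """Average the next-word distributions of m independently trained models."""
    probs = [m(c, w_prev)[:, -1, :].softmax(dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0).log()  # (batch, vocab) log-probs
```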

  • We want to find the most probable (MAP) output given the input, i.e.

      $$w^{*} = \arg\max_{w} p(w \mid x)$$

  We use beam search: +1 BLEU.

    For example, with beam size 2:

      [Figure: beam search with beam size 2]
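A toy beam search in pure Python (the `step_log_probs` callback is a stand-in for the decoder; the conditioning on x is folded into the closure):

```python
def beam_search(step_log_probs, beam_size=2, max_len=10, bos=0, eos=1):
    """Keep the `beam_size` highest-scoring prefixes at each step.

    `step_log_probs(prefix)` is assumed to return a list of
    log p(next word | x, prefix) over the whole vocabulary.
    """
    beams = [([bos], 0.0)]  # (prefix, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == eos:            # finished hypotheses carry over
                candidates.append((prefix, score))
                continue
            for w, lp in enumerate(step_log_probs(prefix)):
                candidates.append((prefix + [w], score + lp))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
        if all(p[-1] == eos for p, _ in beams):
            break
    return beams[0]  # best (prefix, log-prob) found
```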

 

An example application: image caption generation

Encoder: a CNN.

Decoder: an RNN, or a conditional n-gram LM (different from the RNN, but useful).

[Figures: conditional n-gram LM caption decoder]
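A sketch of the encoder side (torchvision and the ResNet backbone are my own stand-ins; any CNN that produces a fixed-size feature vector works):

```python
import torch.nn as nn
import torchvision.models as models  # assumes torchvision is installed

class CaptionEncoder(nn.Module):
    """A CNN maps the image to the fixed-size conditioning vector that the
    caption decoder (RNN or conditional n-gram LM) consumes."""
    def __init__(self, dim):
        super().__init__()
        cnn = models.resnet18(weights=None)          # illustrative backbone
        cnn.fc = nn.Linear(cnn.fc.in_features, dim)  # replace the classifier head
        self.cnn = cnn

    def forward(self, images):   # images: (batch, 3, H, W)
        return self.cnn(images)  # context c: (batch, dim)
```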

 

We need paired image-caption datasets for this.

The model of Kiros et al. does exactly this.

 


