Chinese multi-class classification using BERT

Posted by wuxiangli

Steps:

  1. git clone https://github.com/google-research/bert
  2. Prepare the data and download the pre-trained Chinese model (chinese_L-12_H-768_A-12).
  3. Modify run_classifier.py:
    1. Add a new processor.

    2. Add the processor to the processors dictionary in the main function.
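
The two code changes in step 3 could look roughly like the sketch below. It is an illustration under stated assumptions, not the author's code: the class name MulticlassProcessor, the label set, and the two-column file layout (label, tab, text, no header row) are all hypothetical. InputExample is redefined here only so the sketch runs on its own; inside run_classifier.py you would use the existing InputExample and subclass the existing DataProcessor instead.

```python
import csv
import os


class InputExample(object):
    """Stand-in for run_classifier.InputExample (same constructor fields)."""

    def __init__(self, guid, text_a, text_b=None, label=None):
        self.guid = guid
        self.text_a = text_a
        self.text_b = text_b
        self.label = label


class MulticlassProcessor(object):
    """Hypothetical processor; in run_classifier.py, subclass DataProcessor."""

    # Assumed label set: replace with the classes that occur in your data.
    LABELS = ["0", "1", "2"]

    def get_train_examples(self, data_dir):
        return self._load(os.path.join(data_dir, "train.tsv"), "train")

    def get_dev_examples(self, data_dir):
        return self._load(os.path.join(data_dir, "dev.tsv"), "dev")

    def get_test_examples(self, data_dir):
        return self._load(os.path.join(data_dir, "test.tsv"), "test")

    def get_labels(self):
        return list(self.LABELS)

    def _load(self, path, set_type):
        # Assumed layout: one example per line, "label<TAB>text", no header.
        examples = []
        with open(path, "r", encoding="utf-8") as f:
            reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
            for i, row in enumerate(reader):
                if len(row) < 2:
                    continue  # skip blank or malformed lines
                examples.append(InputExample(
                    guid="%s-%d" % (set_type, i),
                    text_a=row[1],
                    label=row[0]))
        return examples


# Step 3.2: the entry to add to the processors dictionary in main().
processors = {
    "multiclass": MulticlassProcessor,
}
```

The dictionary key is lowercase because run_classifier.py lowercases the value of --task_name before looking it up, so it has to match --task_name=multiclass in the commands used for training and prediction.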

Train and predict

  1. train

    python run_classifier.py \
      --task_name=multiclass \
      --do_train=true \
      --do_eval=true \
      --data_dir=/home/wxl/bertProject/bertTextClassification/data \
      --vocab_file=/home/wxl/bertProject/chinese_L-12_H-768_A-12/vocab.txt \
      --bert_config_file=/home/wxl/bertProject/chinese_L-12_H-768_A-12/bert_config.json \
      --init_checkpoint=/home/wxl/bertProject/chinese_L-12_H-768_A-12/bert_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=16 \
      --learning_rate=2e-5 \
      --num_train_epochs=100.0 \
      --output_dir=/home/wxl/bertProject/bertTextClassification/outputThree/

    If the run succeeds, the script logs the evaluation results and writes them to eval_results.txt in the output directory.

  2. predict

    python run_classifier.py \
      --task_name=multiclass \
      --do_predict=true \
      --data_dir=/home/wxl/bertProject/bertTextClassification/data \
      --vocab_file=/home/wxl/bertProject/chinese_L-12_H-768_A-12/vocab.txt \
      --bert_config_file=/home/wxl/bertProject/chinese_L-12_H-768_A-12/bert_config.json \
      --init_checkpoint=/home/wxl/bertProject/bertTextClassification/outputThreeV1 \
      --max_seq_length=128 \
      --output_dir=/home/wxl/bertProject/bertTextClassification/mulitiPredictThreeV1/
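
With --do_predict=true, run_classifier.py writes a test_results.tsv file to the output directory, with one line per test example holding tab-separated class probabilities. The sketch below shows one way to turn those rows into predicted labels. The helper name read_predictions is hypothetical, and the labels argument is assumed to be the same list, in the same order, that the processor's get_labels() returns.

```python
import csv


def read_predictions(path, labels):
    """Return a (label, probability) pair for each row of class probabilities."""
    results = []
    with open(path, "r", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row:
                continue  # skip empty lines
            probs = [float(p) for p in row]
            # Index of the highest probability; columns follow the label order.
            best = max(range(len(probs)), key=probs.__getitem__)
            results.append((labels[best], probs[best]))
    return results
```

For example, read_predictions(os.path.join(output_dir, "test_results.tsv"), processor.get_labels()) would give the most likely class and its probability for every line of the test file, where output_dir and processor stand for your own prediction directory and processor instance.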
