Using CRF from Java: a summary of the problems encountered
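1. Put libCRFPP.so under the resources directory of the IDEA project so that it is packaged into the jar.
Utility class for loading the library from inside the jar: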
/*
* Class NativeUtils is published under the The MIT License:
*
* Copyright (c) 2012 Adam Heinrich <[email protected]>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
package cz.adamh.utils;

import java.io.*;

/**
 * A simple library class which helps with loading dynamic libraries stored in the
 * JAR archive. These libraries usually contain implementation of some methods in
 * native code (using JNI - Java Native Interface).
 *
 * @see http://adamheinrich.com/blog/2012/how-to-load-native-jni-library-from-jar
 * @see https://github.com/adamheinrich/native-utils
 *
 */
public class NativeUtils {

    /**
     * Private constructor - this class will never be instanced
     */
    private NativeUtils() {
    }

    /**
     * Loads library from current JAR archive
     *
     * The file from JAR is copied into system temporary directory and then loaded. The temporary file is deleted after exiting.
     * Method uses String as filename because the pathname is "abstract", not system-dependent.
     *
     * @param path The path of file inside JAR as absolute path (beginning with '/'), e.g. /package/File.ext
     * @throws IOException If temporary file creation or read/write operation fails
     * @throws IllegalArgumentException If source file (param path) does not exist
     * @throws IllegalArgumentException If the path is not absolute or if the filename is shorter than three characters (restriction of {@link File#createTempFile(java.lang.String, java.lang.String)}).
     */
    public static void loadLibraryFromJar(String path) throws IOException {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("The path has to be absolute (start with '/').");
        }

        // Obtain filename from path
        String[] parts = path.split("/");
        String filename = (parts.length > 1) ? parts[parts.length - 1] : null;

        // Split filename to prefix and suffix (extension)
        String prefix = "";
        String suffix = null;
        if (filename != null) {
            // "\\." is the regex for a literal dot; split at the first dot only
            parts = filename.split("\\.", 2);
            prefix = parts[0];
            suffix = (parts.length > 1) ? "." + parts[parts.length - 1] : null; // Thanks, davs! :-)
        }

        // Check if the filename is okay
        if (filename == null || prefix.length() < 3) {
            throw new IllegalArgumentException("The filename has to be at least 3 characters long.");
        }

        // Prepare temporary file
        File temp = File.createTempFile(prefix, suffix);
        temp.deleteOnExit();
        if (!temp.exists()) {
            throw new FileNotFoundException("File " + temp.getAbsolutePath() + " does not exist.");
        }

        // Prepare buffer for data copying
        byte[] buffer = new byte[1024];
        int readBytes;

        // Open and check input stream
        InputStream is = NativeUtils.class.getResourceAsStream(path);
        if (is == null) {
            throw new FileNotFoundException("File " + path + " was not found inside JAR.");
        }

        // Open output stream and copy data between source file in JAR and the temporary file
        OutputStream os = new FileOutputStream(temp);
        try {
            while ((readBytes = is.read(buffer)) != -1) {
                os.write(buffer, 0, readBytes);
            }
        } finally {
            // If read/write fails, close streams safely before throwing an exception
            os.close();
            is.close();
        }

        // Finally, load the library
        System.load(temp.getAbsolutePath());
    }
}
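For comparison, here is a more compact sketch of the same extract-then-load idea using java.nio.file.Files.copy and try-with-resources. It is not part of NativeUtils or CRF++: the class name JarLibraryExtractor and its methods are hypothetical, and the resource name /libCRFPP.so is assumed from item 1 (the library sitting at the root of resources).

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of NativeUtils or CRF++.
final class JarLibraryExtractor {

    private JarLibraryExtractor() {
    }

    /** Copies a classpath resource to a temporary file and returns that file's path. */
    static Path extract(String resourcePath) throws IOException {
        if (!resourcePath.startsWith("/")) {
            throw new IllegalArgumentException("The path has to be absolute (start with '/').");
        }
        String fileName = resourcePath.substring(resourcePath.lastIndexOf('/') + 1);
        if (fileName.isEmpty()) {
            throw new IllegalArgumentException("The path has to end with a file name.");
        }
        try (InputStream in = JarLibraryExtractor.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new FileNotFoundException("File " + resourcePath + " was not found inside JAR.");
            }
            // Keep the original file name as the suffix so the extension survives.
            Path temp = Files.createTempFile("native-", "-" + fileName);
            temp.toFile().deleteOnExit();
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
            return temp;
        }
    }

    /** Extracts the library and loads it with System.load, which needs an absolute path. */
    static void load(String resourcePath) throws IOException {
        System.load(extract(resourcePath).toAbsolutePath().toString());
    }
}
```

A typical call site would be a static initializer that runs JarLibraryExtractor.load("/libCRFPP.so") once, before any class that declares native methods is touched.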
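2. The CRF++ libraries need to be installed on the machine
Two approaches turned up online.
The problem occurs because library files such as libcrfpp.so.0 cannot be found. Solution 1 (reportedly this does not work for the root user):
- Edit /etc/ld.so.conf
- Add a line containing /usr/local/lib (the bare directory path; in this file the include keyword is meant for pulling in other configuration files)
- Run /sbin/ldconfig -v to refresh the library cache
Solution 2 is to create the following symbolic links: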
ln -s /usr/local/lib/libcrfpp.a /usr/lib/libcrfpp.a
ln -s /usr/local/lib/libcrfpp.so /usr/lib/libcrfpp.so
ln -s /usr/local/lib/libcrfpp.so.0 /usr/lib/libcrfpp.so.0
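Link: https://zxdcs.github.io/post/16/crf_java/
Link for Python users: http://midday.me/article/94d6bd4973264e1a801f8445904a810d
Our production environment runs in Docker containers, where solution 2 was not usable; in practice we used solution 1.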
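Before changing /etc/ld.so.conf or creating links, it can help to check from Java which directories actually contain the file. The following is only a rough diagnostic sketch: the class NativeLibraryLocator is hypothetical, it merely tests for file existence in java.library.path plus the two directories mentioned above, and it does not consult the ldconfig cache or LD_LIBRARY_PATH that the dynamic linker really uses.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; it only checks whether a file exists in each directory.
final class NativeLibraryLocator {

    private NativeLibraryLocator() {
    }

    /** Returns every directory/fileName combination that is an existing regular file (symlinks are followed). */
    static List<Path> findLibrary(String fileName, List<String> directories) {
        List<Path> hits = new ArrayList<>();
        for (String dir : directories) {
            if (dir == null || dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    /** Directories from java.library.path plus the two locations used in the solutions above. */
    static List<String> defaultDirectories() {
        List<String> dirs = new ArrayList<>();
        String javaLibraryPath = System.getProperty("java.library.path", "");
        for (String dir : javaLibraryPath.split(Pattern.quote(File.pathSeparator))) {
            dirs.add(dir);
        }
        dirs.add("/usr/local/lib");
        dirs.add("/usr/lib");
        return dirs;
    }

    public static void main(String[] args) {
        List<Path> hits = findLibrary("libcrfpp.so.0", defaultDirectories());
        if (hits.isEmpty()) {
            System.out.println("libcrfpp.so.0 was not found in any of the checked directories");
        } else {
            for (Path hit : hits) {
                System.out.println("found " + hit);
            }
        }
    }
}
```

If the file only shows up under /usr/local/lib, that matches the situation both solutions are trying to fix.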
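3. The last issue is the trained model file that the library loads. The examples online all use a relative path, which did not work inside the container; switching to an absolute path fixed it.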
Caused by: java.lang.RuntimeException: feature_index.cpp(193) [mmap_.open(model_filename)] mmap.h(153) [(fd = ::open(filename, flag | O_BINARY)) >= 0] open failed: model
at org.chasen.crfpp.CRFPPJNI.new_Tagger(Native Method)
at org.chasen.crfpp.Tagger.<init>(Tagger.java:183)
at com.jd.app.server.LoadCRFModel.<clinit>(LoadCRFModel.java:89)
... 63 more
This error is resolved by the fix described in item 3.
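One way to apply the absolute-path fix from item 3 is to resolve and check the model path before handing it to CRF++, so a wrong path fails with a readable message instead of the native "open failed" error. This is a minimal sketch under assumptions: the class CrfModelPath is hypothetical, the "-m" option string follows the CRF++ Java example, and the Tagger call is left as a comment so the snippet compiles without the CRF++ binding.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; it only prepares the argument string for the tagger.
final class CrfModelPath {

    private CrfModelPath() {
    }

    /** Turns a model path into an absolute one and builds the "-m <path>" argument. */
    static String taggerArgs(String modelPath) throws FileNotFoundException {
        Path absolute = Paths.get(modelPath).toAbsolutePath().normalize();
        if (!Files.isReadable(absolute)) {
            throw new FileNotFoundException("CRF model is not readable: " + absolute
                    + " (working directory: " + Paths.get("").toAbsolutePath() + ")");
        }
        return "-m " + absolute;
    }

    public static void main(String[] args) throws Exception {
        // The model location is passed on the command line; no default is assumed here.
        if (args.length != 1) {
            System.out.println("usage: CrfModelPath <model file>");
            return;
        }
        String taggerArgs = taggerArgs(args[0]);
        System.out.println(taggerArgs);
        // With the CRF++ binding on the classpath, the next step would be roughly:
        // org.chasen.crfpp.Tagger tagger = new org.chasen.crfpp.Tagger(taggerArgs);
    }
}
```

Printing the working directory in the error message makes it easier to see why a relative path that works locally resolves differently inside the container.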