HibernateSearch, JPA, H2 driver throws an exception during database indexing process

Title: HibernateSearch, JPA, H2 driver throws an exception during database indexing process
Posted: 2013-09-04 07:18:40
Question:

Using HibernateSearch, I want to index my H2 embedded database by calling this code:

EntityManager em = articleDao.getEntityManager();
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
try {
    // Rebuild the whole index and block until the mass indexer has finished.
    fullTextEntityManager.createIndexer().progressMonitor(new CustomMassIndexerProcessMonitor()).startAndWait();
} catch (InterruptedException e) {
    e.printStackTrace();
}

After a few minutes of indexing, it throws the following exception:

2013-09-04 09:01:41 ERROR LogErrorHandler.handleException():83 - HSEARCH000058: HSEARCH000116: Unexpected error during MassIndexer operation
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuffer.append(Unknown Source)
    at java.io.StringWriter.write(Unknown Source)
    at org.h2.util.IOUtils.copyAndCloseInput(IOUtils.java:201)
    at org.h2.util.IOUtils.readStringAndClose(IOUtils.java:301)
    at org.h2.value.ValueLobDb.getString(ValueLobDb.java:226)
    at org.h2.jdbc.JdbcResultSet.getString(JdbcResultSet.java:296)
    at org.hibernate.type.descriptor.sql.VarcharTypeDescriptor$2.doExtract(VarcharTypeDescriptor.java:66)
    at org.hibernate.type.descriptor.sql.BasicExtractor.extract(BasicExtractor.java:64)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:261)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:257)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:247)
    at org.hibernate.type.AbstractStandardBasicType.hydrate(AbstractStandardBasicType.java:332)
    at org.hibernate.persister.entity.AbstractEntityPersister.hydrate(AbstractEntityPersister.java:2912)
    at org.hibernate.loader.Loader.loadFromResultSet(Loader.java:1673)
    at org.hibernate.loader.Loader.instanceNotYetLoaded(Loader.java:1605)
    at org.hibernate.loader.Loader.getRow(Loader.java:1505)
    at org.hibernate.loader.Loader.getRowFromResultSet(Loader.java:713)
    at org.hibernate.loader.Loader.processResultSet(Loader.java:943)
    at org.hibernate.loader.Loader.doQuery(Loader.java:911)
    at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
    at org.hibernate.loader.Loader.doList(Loader.java:2526)
    at org.hibernate.loader.Loader.doList(Loader.java:2512)
    at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
    at org.hibernate.loader.Loader.list(Loader.java:2337)
    at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
    at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
    at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadList(IdentifierConsumerEntityProducer.java:151)
Hibernate Search: entityloader-2, CustomMassIndexerProcessMonitor entitiesLoaded(10)
Hibernate Search: collectionsloader-2, CustomMassIndexerProcessMonitor documentsAdded(1)
Hibernate Search: collectionsloader-2, CustomMassIndexerProcessMonitor documentsBuilt(1)
Hibernate Search: collectionsloader-3, CustomMassIndexerProcessMonitor documentsAdded(1)
Hibernate Search: collectionsloader-3, CustomMassIndexerProcessMonitor documentsBuilt(1)
2013-09-04 09:01:47 ERROR LogErrorHandler.handleException():83 - HSEARCH000058: HSEARCH000116: Unexpected error during MassIndexer operation
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Unknown Source)
    at java.lang.String.<init>(Unknown Source)
    at java.lang.StringBuffer.toString(Unknown Source)
    at java.io.StringWriter.toString(Unknown Source)
    at org.h2.util.IOUtils.readStringAndClose(IOUtils.java:302)
    at org.h2.value.ValueLobDb.getString(ValueLobDb.java:226)
    at org.h2.jdbc.JdbcResultSet.getString(JdbcResultSet.java:296)
    at org.hibernate.type.descriptor.sql.VarcharTypeDescriptor$2.doExtract(VarcharTypeDescriptor.java:66)
    at org.hibernate.type.descriptor.sql.BasicExtractor.extract(BasicExtractor.java:64)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:261)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:257)
    at org.hibernate.type.AbstractStandardBasicType.nullSafeGet(AbstractStandardBasicType.java:247)
    at org.hibernate.type.AbstractStandardBasicType.hydrate(AbstractStandardBasicType.java:332)
    at org.hibernate.persister.entity.AbstractEntityPersister.hydrate(AbstractEntityPersister.java:2912)
    at org.hibernate.loader.Loader.loadFromResultSet(Loader.java:1673)
    at org.hibernate.loader.Loader.instanceNotYetLoaded(Loader.java:1605)
    at org.hibernate.loader.Loader.getRow(Loader.java:1505)
    at org.hibernate.loader.Loader.getRowFromResultSet(Loader.java:713)
    at org.hibernate.loader.Loader.processResultSet(Loader.java:943)
    at org.hibernate.loader.Loader.doQuery(Loader.java:911)
    at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
    at org.hibernate.loader.Loader.doList(Loader.java:2526)
    at org.hibernate.loader.Loader.doList(Loader.java:2512)
    at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
    at org.hibernate.loader.Loader.list(Loader.java:2337)
    at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
    at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
    at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadList(IdentifierConsumerEntityProducer.java:151)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadAllFromQueue(IdentifierConsumerEntityProducer.java:117)
    at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.run(IdentifierConsumerEntityProducer.java:94)
    at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.run(OptionallyWrapInJTATransaction.java:132)

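It seems that one of the H2 util classes throws this exception while trying to read from the DB. I tried increasing the heap size with '-Xms1024m -Xmx2048m', but that did not help :(

The scenario is as follows. Every entry in my H2 database has a field of type CLOB. If I write a small amount of content to that field, everything works fine and no error is thrown. However, if these fields hold a large amount of content (900kb each), the error is thrown during indexing.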
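
Since raising the heap did not change the outcome, one quick sanity check is to confirm that the -Xmx value actually reached the JVM that runs the indexer. The sketch below is not from the original post; the class name HeapCheck is hypothetical, and it relies only on the standard java.lang.Runtime API.

```java
public class HeapCheck {
    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap currently occupied by objects (allocated minus free), in MiB.
        long usedMiB = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedMiB);
    }
}
```

If the reported maximum is far below 2048 MiB, the flags were probably passed to a different process (for example the build tool or IDE rather than the application itself).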

I am using the following jars: hibernate-entitymanager 4.2.4.Final, h2 1.3.173, hibernate-search 4.4.0.Alpha1

Here is my persistence unit configuration:

<persistence-unit name="hibernateSearchH2TestPersistenceUnit" transaction-type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>

    <mapping-file>META-INF/queriesForTest.xml</mapping-file>

    <class>com.kaidex.db.entity.DocStatus</class>
    <class>com.kaidex.db.entity.DocType</class>
    <class>com.kaidex.db.entity.Article</class>
    <class>com.kaidex.db.entity.Issuer</class>
    <class>com.kaidex.db.entity.PublishingSource</class>

    <properties>
        <property name="hibernate.connection.url" value="jdbc:h2:D:\\kaidextestdb;CIPHER=XTEA"/>

        <property name="hibernate.dialect" value="org.hibernate.dialect.H2Dialect"/>
        <property name="hibernate.connection.driver_class" value="org.h2.Driver"/>

        <property name="hibernate.connection.username" value="sa"/>
        <property name="hibernate.connection.password" value="filepass userpass"/>

        <property name="hibernate.format_sql" value="true"/>
        <property name="hibernate.show_sql" value="false" />
        <property name="hibernate.hbm2ddl.auto" value="update" />


        <property name="hibernate.search.default.directory_provider" value="filesystem"/> 
        <property name="hibernate.search.default.indexBase" value="D:\lucene"/>
        <property name="hibernate.search.lucene_version" value="LUCENE_36"/>
    </properties>
</persistence-unit>

Updated. Added the entity configuration:

@Entity(name="Article")
@Table(name="Article", schema="Kaidexdb")
@Indexed
public class Article {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;
    ...
    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(columnDefinition="CLOB")
    private String contentRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(columnDefinition="CLOB")
    private String contentRu;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="docType_id", nullable=false)
    private DocType docType;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="docStatus_id", nullable=false)
    private DocStatus docStatus;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="issuer_id", nullable=false)
    private Issuer issuer;

    @IndexedEmbedded
    @ManyToOne
    @JoinColumn(name="ps_id", nullable=false)
    private PublishingSource publishingSource;
    ...
}


@Entity(name="DocStatus")
@Table(name="DocStatus", schema="Kaidexdb")
@Indexed
public class DocStatus {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private Long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;
    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @OneToMany(mappedBy="docStatus", targetEntity=Article.class)
    private List<Article> articles;
    ...
}

@Entity(name="DocType")
@Table(name="DocType", schema="Kaidexdb")
@Indexed
public class DocType {

    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(unique=true)
    private String shortName;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @OneToMany(mappedBy="docType", targetEntity=Article.class)
    private List<Article> articles;
    ...
}


@Entity(name="Issuer")
@Table(name="Issuer", schema="Kaidexdb")
@Indexed
public class Issuer {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String shortNameRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    @Column(name="parent_id")
    private long parentId;

    @OneToMany(mappedBy="issuer", targetEntity=Article.class)
    private List<Article> articles;
    ...
}


@Entity
@Table(name="PublishingSource", schema="Kaidexdb")
@Indexed
public class PublishingSource {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private long id;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRo;

    @Field(index=Index.YES, analyze=Analyze.YES, store=Store.NO)
    private String longNameRu;

    @OneToMany(mappedBy="publishingSource", targetEntity=Article.class)
    private List<Article> articles;
}

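Can somebody help me solve this problem? Maybe I should apply some specific configuration to my H2 embedded database to tell H2 that I am using a large CLOB field?

Thank you in advance.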

Comments on the question:

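Could you also post how your entities are configured? Are you indexing the CLOB? Are there associations between entities that cause multiple associated entities to be loaded when a single entity is indexed?

Thanks for the reply, Hardy. I have added the entity configuration above.

Answer 1: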

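According to the stack trace, Hibernate Search tries to load the CLOB from the database as a String (using java.sql.ResultSet.getString). H2 therefore has to load the CLOB completely. In addition, Hibernate Search appears to keep a list of such large Strings entirely in memory: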

at org.h2.value.ValueLobDb.getString(ValueLobDb.java:226)
at org.h2.jdbc.JdbcResultSet.getString(JdbcResultSet.java:296)
at org.hibernate.type.descriptor.sql.VarcharTypeDescriptor$2.doExtract(VarcharTypeDescriptor.java:66)
...
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
at org.hibernate.search.batchindexing.impl.IdentifierConsumerEntityProducer.loadList(IdentifierConsumerEntityProducer.java:151)

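So it looks like an issue in Hibernate Search. Memory problems related to Hibernate Search have been reported before (I know that was an old version), but I would first try a different version of Hibernate Search, especially since you are using an Alpha version (4.4.0.Alpha1). It may be a known issue.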
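
To illustrate the difference between getString and a streaming read, here is a minimal sketch that uses only the JDK. It consumes a java.io.Reader in fixed-size chunks, so only one small buffer is held at a time; with plain JDBC such a Reader would come from ResultSet.getCharacterStream. The class and method names (ClobStreaming, countChars) are hypothetical, and this shows the general JDBC technique, not how Hibernate or Hibernate Search actually loads the field.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ClobStreaming {
    /**
     * Reads the whole stream in 8192-char chunks and returns the number of
     * characters seen. Only one chunk is buffered at any time.
     */
    static long countChars(Reader reader) throws IOException {
        char[] buffer = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a large CLOB value. With JDBC the Reader would come
        // from resultSet.getCharacterStream("contentRo") instead.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 900 * 1024; i++) {
            sb.append('x');
        }
        try (Reader reader = new StringReader(sb.toString())) {
            System.out.println("characters read: " + countChars(reader));
        }
    }
}
```

Because the entity fields in the question are mapped as String, the persistence layer still has to materialize the full value; the sketch only shows what a chunked read looks like at the JDBC level.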

Comments:

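Thanks, Thomas, for such a quick reply. I changed the Hibernate Search lib to 'hibernate-search 4.3.0.Final' and to 'hibernate-search 4.2.0.Final', but the same error was thrown :(

Just to update with some information: I have tested the same code with other database drivers (HSQLDB, Apache Derby) and the exception was not reproduced. I want to use the H2 database because it uses less disk space than other embedded databases and is also faster than them.

Answer 2: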

I have partially solved this problem. I reduced the number of objects that Hibernate Search loads per query:

EntityManager em = articleDao.getEntityManager();
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
fullTextEntityManager.createIndexer().batchSizeToLoadObjects(2).startAndWait();

It would be great if the H2 developers could fix this in the next release. I tested the same code with HSQLDB and Apache Derby, and those DB drivers did not throw any exception.
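
A rough estimate shows how the amount of CLOB text held at once scales with the batch size. The sketch below is an illustration under stated assumptions, not a measurement: each loader thread holds one full batch of entities, each entity carries two CLOB fields (contentRo and contentRu) of about 900 KB of text, and a Java String uses 2 bytes per character. The thread count of 2 and the comparison batch size of 10 are assumptions about the MassIndexer defaults. Transient copies made while reading (the StringWriter and StringBuffer visible in the stack trace) are ignored, so the result is only a lower bound. The class name is hypothetical.

```java
public class BatchHeapEstimate {
    /** Estimated bytes of String data held by all loader threads at once. */
    static long estimateBytes(int threads, int batchSize, int clobFields, long charsPerClob) {
        // Assumption: 2 bytes per char (UTF-16), as in the Java 6/7 JVMs of that era.
        final long bytesPerChar = 2;
        return (long) threads * batchSize * clobFields * charsPerClob * bytesPerChar;
    }

    public static void main(String[] args) {
        long charsPerClob = 900L * 1024L; // assumption: ~900 KB of text per field
        long withBatchOfTen = estimateBytes(2, 10, 2, charsPerClob);
        long withBatchOfTwo = estimateBytes(2, 2, 2, charsPerClob);
        System.out.println("batch size 10: " + withBatchOfTen + " bytes");
        System.out.println("batch size 2:  " + withBatchOfTwo + " bytes");
    }
}
```

Under these assumptions, going from a batch of 10 to a batch of 2 cuts the resident text by a factor of five, which is consistent with the workaround helping without removing the underlying problem.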
