How do I iterate over JPA query results without retaining them all in memory?

[Posted]: 2013-10-07 00:46:43

[Question]:

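I'm using Oracle, Spring, Hibernate, and JPA. I want to iterate over an arbitrarily large number of database results without worrying about running out of memory.

I'm trying to scroll through the query results without holding on to every object I get back.

First I create the native query: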

    Query q = getEm().createNativeQuery(sql,reportRowType);
    q.setHint("org.hibernate.fetchSize",1000);
    q.setHint("org.hibernate.cacheable",false);

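Then I call a method on an autowired object that executes the query and processes the results. For testing, I simply iterate over the results and ignore them entirely.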

    @Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
    public <T extends ResultRow> long run(EntityManager em, Query q) {
        // The original omits the return value; a count of visited rows is assumed here.
        long count = 0;
        ScrollableResults sr = q.unwrap(org.hibernate.Query.class)
            .setReadOnly(true)
            .setFetchSize(1000)
            .setCacheable(false)
            .setCacheMode(CacheMode.IGNORE)
            .scroll(ScrollMode.FORWARD_ONLY);
        try {
            while (sr.next()) {
                @SuppressWarnings("unchecked")
                T obj = (T) sr.get(0);
                em.detach(obj);

                // do something with the row here
                ++count;
            }
        } finally {
            sr.close();
        }
        return count;
    }

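With the code above I eventually run out of memory (my test has about 1.5 million results). Somehow the Query object is holding on to the objects.

Even when I run the query page by page (calling q.setFirstResult and q.setMaxResults on the same Query), it still retains the objects.

The only approach that works is to create a brand-new Query object for each page and use setFirstResult and setMaxResults to fetch results 1 to 10000, then 10001 to 20000, and so on.

I've read about Hibernate's StatelessSession, but getting it to work looks fairly complicated. Is there really no way to run a JPA query without retaining all of the results?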

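[Answer 1]:

I eventually figured out how to use a Hibernate stateless session. It isn't portable, but if you're using Hibernate, something like this will work.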

    import java.util.Map;

    import javax.persistence.EntityManager;

    import org.apache.log4j.Logger;
    import org.hibernate.*;
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Propagation;
    import org.springframework.transaction.annotation.Transactional;

    @Service
    public class StatelessQueryRunner {

        /** Executes the specified native SQL in a stateless session. The consumer is given each row as it's received. */
        @Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
        public <T> long run(EntityManager em, String sql, Class<T> resultRowType, Map<String,Object> params, int fetchSize, Integer firstResult, Integer maxResults, QueryResultConsumer<T> consumer) {
            long totalResults = 0;
            Session hibernateSession = em.unwrap(Session.class);
            StatelessSession statelessSession = hibernateSession.getSessionFactory().openStatelessSession();
            try {
                // Create the query on the stateless session and map each row to resultRowType.
                SQLQuery q = statelessSession.createSQLQuery(sql);
                q.addEntity(resultRowType);

                // JpaQueryWrapper handles enum and temporal parameters for us.
                JpaQueryWrapper qw = new JpaQueryWrapper(q);
                if (params != null) {
                    for (Map.Entry<String,Object> entry : params.entrySet()) {
                        qw.setParameter(entry.getKey(), entry.getValue());
                    }
                }

                if (firstResult != null)
                    q.setFirstResult(firstResult);
                if (maxResults != null)
                    q.setMaxResults(maxResults);

                ScrollableResults sr = q
                    .setReadOnly(true)
                    .setFetchSize(fetchSize)
                    .setCacheable(false)
                    .scroll(ScrollMode.FORWARD_ONLY);
                try {
                    while (sr.next()) {
                        T obj = resultRowType.cast(sr.get(0));
                        // The row came from the stateless session, so em never managed it;
                        // the detach is kept from the original as a precaution.
                        em.detach(obj);
                        consumer.consume(obj);

                        ++totalResults;
    //                    if(totalResults % 100000 == 0)
    //                        Logger.getLogger(getClass()).debug("totalResults="+ totalResults);
                    }
                } finally {
                    sr.close();
                }

                return totalResults;

            } finally {
                statelessSession.close();
            }
        }

        /** You can't use annotations to start transactions when inside a stateless session, so if you want to do anything you'll need to call this method to run code in a separate session which uses the Session object to start a transaction and save things. */
        public <T> void runInSession(EntityManager em, SessionRunnable<T> action) {
            Session hibernateSession = em.unwrap(Session.class);
            Session session = hibernateSession.getSessionFactory().openSession();
            try {
                action.run(session);
            } finally {
                session.close();
            }
        }

        public interface SessionRunnable<T> {
            public void run(Session session);
        }
    }

    import javax.persistence.*;
    import java.util.*;

    import org.hibernate.SQLQuery;

    /** Adapts a Hibernate SQLQuery to javax.persistence.Query. Only the setParameter overloads used by the runner are implemented. */
    public class JpaQueryWrapper
        implements javax.persistence.Query {

        private SQLQuery q;

        public JpaQueryWrapper(SQLQuery q) {
            this.q = q;
        }

        @Override
        public List getResultList() { throw new UnsupportedOperationException(); }
        @Override
        public Object getSingleResult() { throw new UnsupportedOperationException(); }
        @Override
        public int executeUpdate() { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setMaxResults(int maxResult) { throw new UnsupportedOperationException(); }
        @Override
        public int getMaxResults() { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setFirstResult(int startPosition) { throw new UnsupportedOperationException(); }
        @Override
        public int getFirstResult() { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setHint(String hintName, Object value) { throw new UnsupportedOperationException(); }
        @Override
        public Map<String, Object> getHints() { throw new UnsupportedOperationException(); }
        @Override
        public <T> javax.persistence.Query setParameter(Parameter<T> param, T value) { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setParameter(Parameter<Calendar> param, Calendar value, TemporalType temporalType) { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setParameter(Parameter<Date> param, Date value, TemporalType temporalType) { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setParameter(String name, Object value) {
            // Bind enums by name so they match string columns in native SQL.
            if (value instanceof Enum) {
                q.setParameter(name, ((Enum<?>) value).name());
            } else {
                q.setParameter(name, value);
            }
            return this;
        }
        @Override
        public javax.persistence.Query setParameter(String name, Calendar value, TemporalType temporalType) { return setParameter(name, (Date) (value == null ? null : value.getTime()), temporalType); }
        @Override
        public javax.persistence.Query setParameter(String name, Date value, TemporalType temporalType) {
            if (temporalType == TemporalType.DATE)
                q.setDate(name, value);
            else if (temporalType == TemporalType.TIME)
                q.setTime(name, value);
            else if (temporalType == TemporalType.TIMESTAMP)
                q.setTimestamp(name, value);
            else
                throw new UnsupportedOperationException();
            return this;
        }
        @Override
        public javax.persistence.Query setParameter(int position, Object value) { q.setParameter(position, value); return this; }
        @Override
        public javax.persistence.Query setParameter(int position, Calendar value, TemporalType temporalType) { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setParameter(int position, Date value, TemporalType temporalType) { throw new UnsupportedOperationException(); }
        @Override
        public Set<Parameter<?>> getParameters() { throw new UnsupportedOperationException(); }
        @Override
        public Parameter<?> getParameter(String name) { throw new UnsupportedOperationException(); }
        @Override
        public <T> Parameter<T> getParameter(String name, Class<T> type) { throw new UnsupportedOperationException(); }
        @Override
        public Parameter<?> getParameter(int position) { throw new UnsupportedOperationException(); }
        @Override
        public <T> Parameter<T> getParameter(int position, Class<T> type) { throw new UnsupportedOperationException(); }
        @Override
        public boolean isBound(Parameter<?> param) { throw new UnsupportedOperationException(); }
        @Override
        public <T> T getParameterValue(Parameter<T> param) { throw new UnsupportedOperationException(); }
        @Override
        public Object getParameterValue(String name) { throw new UnsupportedOperationException(); }
        @Override
        public Object getParameterValue(int position) { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setFlushMode(FlushModeType flushMode) { throw new UnsupportedOperationException(); }
        @Override
        public FlushModeType getFlushMode() { throw new UnsupportedOperationException(); }
        @Override
        public javax.persistence.Query setLockMode(LockModeType lockMode) { throw new UnsupportedOperationException(); }
        @Override
        public LockModeType getLockMode() { throw new UnsupportedOperationException(); }
        @Override
        public <T> T unwrap(Class<T> cls) { throw new UnsupportedOperationException(); }
    }

    /** Callback that receives each result row as it is read. */
    public interface QueryResultConsumer<T> {
        public void consume(T obj);
    }
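Because the runner hands each row to the consumer and then forgets it, memory stays flat only if the consumer does not accumulate rows itself. As a rough illustration, here is a minimal sketch of a consumer that buffers a bounded batch and then hands it off. `BatchingConsumer` and its `batchHandler` are hypothetical names that do not appear in the answer, and `QueryResultConsumer` is redeclared only so the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Same shape as the QueryResultConsumer interface in the answer. */
interface QueryResultConsumer<T> {
    void consume(T obj);
}

/** Hypothetical helper: holds at most batchSize rows, then hands them off and forgets them. */
final class BatchingConsumer<T> implements QueryResultConsumer<T> {
    private final int batchSize;
    private final Consumer<List<T>> batchHandler;
    private List<T> buffer = new ArrayList<>();

    BatchingConsumer(int batchSize, Consumer<List<T>> batchHandler) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        this.batchSize = batchSize;
        this.batchHandler = batchHandler;
    }

    @Override
    public void consume(T obj) {
        buffer.add(obj);
        if (buffer.size() >= batchSize) flush();
    }

    /** Call once after the run finishes to deliver the final partial batch. */
    void flush() {
        if (buffer.isEmpty()) return;
        List<T> batch = buffer;
        // Start a fresh buffer so the handled batch can be garbage-collected.
        buffer = new ArrayList<>();
        batchHandler.accept(batch);
    }
}
```

A consumer like this would be passed as the last argument of `run(...)`, with `flush()` called after `run` returns; the handler could write each batch to a file or pass it to `runInSession`.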

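[Answer 2]:

As far as I know, the only way to do this in plain JPA is with multiple queries, as you describe. On the other hand, you can at least be sure which technique is in use: either all rows are fetched at once, or the results are paged with one database query per page.

Of course, if you need this more often, you can write a special Iterator that takes a pageSize and a QueryProvider in its constructor.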
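The iterator idea could look roughly like the sketch below. It is written against plain Java so it runs without a database: `PageFetcher` is a hypothetical stand-in for the "QueryProvider" mentioned above, and both type names are assumptions rather than an existing API.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the "QueryProvider": loads one page of rows. */
interface PageFetcher<T> {
    /** Returns up to pageSize rows starting at offset firstResult; an empty list means no more rows. */
    List<T> fetch(int firstResult, int pageSize);
}

/** Iterates over all rows while holding at most one page in memory. */
final class PagedIterator<T> implements Iterator<T> {
    private final PageFetcher<T> fetcher;
    private final int pageSize;
    private Iterator<T> page = Collections.emptyIterator();
    private int nextOffset = 0;
    private boolean exhausted = false;

    PagedIterator(int pageSize, PageFetcher<T> fetcher) {
        if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
        this.pageSize = pageSize;
        this.fetcher = fetcher;
    }

    @Override
    public boolean hasNext() {
        // Load the next page only when the current one is used up.
        while (!page.hasNext() && !exhausted) {
            List<T> rows = fetcher.fetch(nextOffset, pageSize);
            nextOffset += rows.size();
            // A short (or empty) page means the query has no further rows.
            exhausted = rows.size() < pageSize;
            page = rows.iterator();
        }
        return page.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.next();
    }
}
```

With JPA, the fetcher would presumably create a fresh Query for each call, apply `setFirstResult(firstResult)` and `setMaxResults(pageSize)`, and return `getResultList()`, as the question describes. The query needs a stable ORDER BY so that pages do not overlap, and calling `em.clear()` between pages is one way to keep the persistence context from growing.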
