如何迭代 JPA 查询结果而不将它们全部保留在内存中?
Posted
技术标签:
【中文标题】如何迭代 JPA 查询结果而不将它们全部保留在内存中?【英文标题】:How do I iterate over JPA query results without retaining them all in memory? 【发布时间】:2013-10-07 00:46:43 【问题描述】:我正在使用 Oracle、Spring、Hibernate 和 JPA。我想迭代任意数量的数据库结果,而不必担心内存不足。
我试图滚动查询结果而不保留我得到的每个对象。
首先我创建原生查询:
Query q = getEm().createNativeQuery(sql,reportRowType);
q.setHint("org.hibernate.fetchSize",1000);
q.setHint("org.hibernate.cacheable",false);
然后我在自动装配对象上调用一个方法,该对象执行查询并对结果进行处理。对于测试,我完全忽略了结果并对其进行迭代。
@Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
public <T extends ResultRow> long run(EntityManager em, Query q)
ScrollableResults sr = q.unwrap(org.hibernate.Query.class)
.setReadOnly(true)
.setFetchSize(1000)
.setCacheable(false)
.setCacheMode(CacheMode.IGNORE)
.scroll(ScrollMode.FORWARD_ONLY);
try
while(sr.next())
T obj = (T)sr.get(0);
em.detach(obj);
// do something with the row here
finally
if(sr!=null)
sr.close();
我发现使用上面的代码我最终会耗尽内存(我的测试大约有 150 万个结果)。 Query 对象以某种方式保留了对象。
我发现即使我通过页面运行查询(使用 q.setFirstResult 和 q.setMaxResults)它实际上仍然保留了对象。
这样做的唯一方法是创建一个全新的 Query 对象,然后使用 setFirstResult 和 setMaxResults 获取结果 1 到 10000,然后是 10001 到 20000,等等。
我已经阅读过有关休眠 StatelessSession 的信息,但是让它工作起来看起来相当复杂。真的没有办法在不保留所有查询结果的情况下执行jpa查询吗?
【问题讨论】:
【参考方案1】:我最终想出了如何使用休眠无状态会话。这不会是可移植的,但如果你使用的是休眠,这样的东西会起作用。
import javax.persistence.EntityManager;
import java.util.Map;
import org.apache.log4j.Logger;
import org.hibernate.*;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;
@Service
public class StatelessQueryRunner
/** Executes specified native sql in a stateless session. The consumer is given each row as it's received. */
@Transactional(readOnly = true, propagation = Propagation.REQUIRES_NEW)
public <T> long run(EntityManager em, String sql, Class<T> resultRowType, Map<String,Object> params, int fetchSize, Integer firstResult, Integer maxResults, QueryResultConsumer<T> consumer)
int totalResults = 0;
Session hibernateSession = em.unwrap(Session.class);
StatelessSession statelessSession = hibernateSession.getSessionFactory().openStatelessSession();
try
// create the query for the stateless session.
SQLQuery q = statelessSession.createSQLQuery(sql);
q.addEntity(resultRowType);
q.setFetchSize(1000);
JpaQueryWrapper qw = new JpaQueryWrapper(q);
if(params!=null)
for(Map.Entry<String,Object> entry : params.entrySet())
qw.setParameter(entry.getKey(),entry.getValue());
if(firstResult!=null)
q.setFirstResult(firstResult);
if(maxResults!=null)
q.setMaxResults(maxResults);
ScrollableResults sr = q
.setReadOnly(true)
.setFetchSize(fetchSize)
.setCacheable(false)
.scroll(ScrollMode.FORWARD_ONLY);
try
while(sr.next())
T obj = (T)sr.get(0);
em.detach(obj);
consumer.consume(obj);
++totalResults;
// if(totalResults % 100000 == 0)
// Logger.getLogger(getClass()).debug("totalResults="+ totalResults);
finally
if(sr!=null)
sr.close();
return totalResults;
finally
statelessSession.close();
/** You can't use annotations to start transactions when inside a stateless session, so if you want to do anything you'll need to call this method to run code in a separate session which uses the Session object to start a transaction and save things. */
public <T> void runInSession(EntityManager em, SessionRunnable<T> action)
Session hibernateSession = em.unwrap(Session.class);
Session session = hibernateSession.getSessionFactory().openSession();
try
action.run(session);
finally
session.close();
public interface SessionRunnable<T>
public void run(Session session);
import javax.persistence.*;
import java.util.*;
import org.hibernate.SQLQuery;
public class JpaQueryWrapper
implements javax.persistence.Query
private SQLQuery q;
public JpaQueryWrapper(SQLQuery q)
this.q = q;
@Override
public List getResultList() throw new UnsupportedOperationException();
@Override
public Object getSingleResult() throw new UnsupportedOperationException();
@Override
public int executeUpdate() throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setMaxResults(int maxResult) throw new UnsupportedOperationException();
@Override
public int getMaxResults() throw new UnsupportedOperationException();
@Override
public Query setFirstResult(int startPosition) throw new UnsupportedOperationException();
@Override
public int getFirstResult() throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setHint(String hintName, Object value) throw new UnsupportedOperationException();
@Override
public Map<String, Object> getHints() throw new UnsupportedOperationException();
@Override
public <T> javax.persistence.Query setParameter(Parameter<T> param, T value) throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setParameter(Parameter<Calendar> param, Calendar value, TemporalType temporalType) throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setParameter(Parameter<Date> param, Date value, TemporalType temporalType) throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setParameter(String name, Object value)
if(value instanceof Enum)
q.setParameter(name,((Enum) value).name());
else
q.setParameter(name,value);
return this;
@Override
public javax.persistence.Query setParameter(String name, Calendar value, TemporalType temporalType) return setParameter(name,(Date)(value==null?null:value.getTime()),temporalType);
@Override
public javax.persistence.Query setParameter(String name, Date value, TemporalType temporalType)
if(temporalType==TemporalType.DATE)
q.setDate(name,value);
else if(temporalType==TemporalType.TIME)
q.setTime(name,value);
else if(temporalType==TemporalType.TIMESTAMP)
q.setTimestamp(name,value);
else
throw new UnsupportedOperationException();
return this;
@Override
public javax.persistence.Query setParameter(int position, Object value) q.setParameter(position,value); return this;
@Override
public javax.persistence.Query setParameter(int position, Calendar value, TemporalType temporalType) throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setParameter(int position, Date value, TemporalType temporalType) throw new UnsupportedOperationException();
@Override
public Set<Parameter<?>> getParameters() throw new UnsupportedOperationException();
@Override
public Parameter<?> getParameter(String name) throw new UnsupportedOperationException();
@Override
public <T> Parameter<T> getParameter(String name, Class<T> type) throw new UnsupportedOperationException();
@Override
public Parameter<?> getParameter(int position) throw new UnsupportedOperationException();
@Override
public <T> Parameter<T> getParameter(int position, Class<T> type) throw new UnsupportedOperationException();
@Override
public boolean isBound(Parameter<?> param) throw new UnsupportedOperationException();
@Override
public <T> T getParameterValue(Parameter<T> param) throw new UnsupportedOperationException();
@Override
public Object getParameterValue(String name) throw new UnsupportedOperationException();
@Override
public Object getParameterValue(int position) throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setFlushMode(FlushModeType flushMode) throw new UnsupportedOperationException();
@Override
public FlushModeType getFlushMode() throw new UnsupportedOperationException();
@Override
public javax.persistence.Query setLockMode(LockModeType lockMode) throw new UnsupportedOperationException();
@Override
public LockModeType getLockMode() throw new UnsupportedOperationException();
@Override
public <T> T unwrap(Class <T> cls) throw new UnsupportedOperationException();
public interface QueryResultConsumer<T>
public void consume(T obj);
【讨论】:
【参考方案2】:据我所知,在 JPA 中,唯一的方法是使用多个查询,正如您所描述的。另一方面,您可以肯定,这是唯一使用的技术(如果有的话)。我的意思是:要么获取所有行,要么分页,每页有一个数据库查询。
当然,如果您更频繁地需要此功能,您可以创建一个特殊的 Iterator,它需要在其构造函数中使用 pageSize 和 QueryProvider。
【讨论】:
以上是关于如何迭代 JPA 查询结果而不将它们全部保留在内存中?的主要内容,如果未能解决你的问题,请参考以下文章
如何有效地乘以重复行的火炬张量而不将所有行存储在内存中或迭代?