How to use Spring Batch to read a CSV file that contains multiple lines in one cell?
[Posted]: 2018-05-10 23:21:38
[Question]: The original CSV looks like this:
Header row: Name, StudentId, Comment
Data:
Name, StudentId, Comment
Jake, 12312, poor
Emma, 12324, good
Mary, 13214, need more work on programming
and math.
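The Comment cell of the last entry in the CSV data spans two lines. I want to treat it as a single row of data.
When I read the file with FlatFileItemReader, it throws an error along the lines of "expected 3 tokens but actual 1". My guess is that it treats the second line as a new record. Is there a way to have both lines treated as one record?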
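[Answer 1]: Have your reader return just the raw string for each line, without trying to split on the delimiter. Write a processor (it has to be stateful) to handle the parsing. The only tricky part is that you have to signal the processor somehow when you reach EOF, so that it doesn't keep waiting to see whether it should aggregate the next line. Something like this: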
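import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.util.Assert;

public class AggregatingItemProcessor<T> implements ItemProcessor<T, T>, InitializingBean {
    private BiPredicate<T, T> aggregatePredicate;
    private BiFunction<T, T, T> aggregator;
    // Record currently being accumulated. This state makes the processor non-thread-safe.
    private T cur;

    public void setAggregatePredicate(BiPredicate<T, T> aggregatePredicate) {
        this.aggregatePredicate = aggregatePredicate;
    }

    public void setAggregator(BiFunction<T, T, T> aggregator) {
        this.aggregator = aggregator;
    }

    @Override
    public T process(T item) throws Exception {
        if (cur == null) {
            cur = item;
            return null; // returning null filters the item; hold it until the next line is seen
        }
        if (aggregatePredicate.test(cur, item)) {
            cur = aggregator.apply(cur, item);
            return null; // continuation line merged into the held record, nothing to emit yet
        } else {
            T toRet = cur; // the held record is complete: emit it and start holding the new one
            cur = item;
            return toRet;
        }
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(aggregatePredicate, "Predicate to determine if records should be aggregated must not be null.");
        Assert.notNull(aggregator, "Function for aggregating items must not be null.");
    }
}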
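To see how the hold-one-record-back logic behaves on the sample data from the question, here is a minimal Spring-free sketch that runs the same three branches over an in-memory list. The class name `AggregationDemo`, the `END` sentinel (a stand-in for the EOF signal mentioned above), the `commas` helper, and the choice to join lines with a space are all assumptions made for illustration, not part of the answer's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

class AggregationDemo {
    // Hypothetical sentinel appended after the last real line so the held record gets flushed.
    static final String END = "\0";

    // Mirrors the three branches of process(): hold, merge, or emit-and-replace.
    static List<String> aggregate(List<String> lines,
                                  BiPredicate<String, String> shouldMerge,
                                  BiFunction<String, String, String> merge) {
        List<String> input = new ArrayList<>(lines);
        input.add(END);
        List<String> out = new ArrayList<>();
        String cur = null;
        for (String item : input) {
            if (cur == null) {
                cur = item;                      // first line: hold it
            } else if (shouldMerge.test(cur, item)) {
                cur = merge.apply(cur, item);    // continuation: fold into the held record
            } else {
                out.add(cur);                    // held record is complete: emit it
                cur = item;
            }
        }
        return out;                              // the sentinel itself is never emitted
    }

    // Counts commas in a line (plain-Java stand-in for a comma-counting utility).
    static long commas(String s) {
        return s.chars().filter(c -> c == ',').count();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Name, StudentId, Comment",
                "Jake, 12312, poor",
                "Emma, 12324, good",
                "Mary, 13214, need more work on programming",
                "and math.");
        List<String> records = aggregate(lines,
                (cur, next) -> !END.equals(next) && commas(next) < 2,
                (cur, next) -> cur + " " + next);
        records.forEach(System.out::println);
    }
}
```

On the sample data this yields four records (the header plus three students), with the two Mary lines merged into one. Note that the comma-count heuristic would misfire if a continuation line itself contained two or more commas.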
Then configure...
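import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.util.StringUtils;

// The members below belong inside a @Configuration class.
static final String EOF_MARKER = "\0";

@Bean
public FlatFileItemReader<String> reader() {
    final FlatFileItemReader<String> reader = new FlatFileItemReader<String>() {
        private boolean finished = false;

        @Override
        public String read() throws Exception, UnexpectedInputException, ParseException {
            if (finished) return null;
            String next = super.read();
            if (next == null) {
                // Emit one sentinel after the last real line so the processor flushes its held record.
                finished = true;
                return EOF_MARKER;
            }
            return next;
        }
    };
    // The input resource still has to be set on the reader; this snippet does not show it.
    reader.setLineMapper((s, i) -> s); // pass each raw line through unsplit
    return reader;
}

@Bean
public AggregatingItemProcessor<String> processor() {
    final AggregatingItemProcessor<String> processor = new AggregatingItemProcessor<>();
    // A line with fewer than two commas is treated as a continuation of the previous record.
    processor.setAggregatePredicate((s1, s2) -> !EOF_MARKER.equals(s2) && StringUtils.countOccurrencesOf(s2, ",") < 2);
    processor.setAggregator(String::concat); // joins the two lines with no separator between them
    return processor;
}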
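The processor above still emits raw strings, so each merged line has to be split into its three fields in a later step. As a rough sketch of that step, assuming a comment may itself contain commas and that surrounding whitespace should be trimmed, the line can be split with a limit of 3. The class `StudentLineParser` and its `parse` method are hypothetical names used only for this illustration; in a real job, Spring Batch's own tokenizers (for example `DelimitedLineTokenizer`) may be a better fit.

```java
class StudentLineParser {
    // Hypothetical helper: splits one merged line into name, studentId, comment.
    // The limit of 3 keeps any further commas inside the comment field.
    static String[] parse(String line) {
        String[] parts = line.split(",", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected 3 fields but found " + parts.length + ": " + line);
        }
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] fields = parse("Mary, 13214, need more work on programming and math.");
        System.out.println(fields[0] + " | " + fields[1] + " | " + fields[2]);
    }
}
```

A line with fewer than three fields raises an exception here instead of being silently accepted, which makes a failed merge easier to spot.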