Quoting xlsx cell data when converting to CSV with Apache POI

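I am using the following program to convert xlsx to csv. I want to wrap each cell's string in quote characters ("") if it contains a newline (\n) or the delimiter.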

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Iterator;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class XlsxtoCSV {

    static void xlsx(File inputFile, File outputFile) {
        // For storing data into CSV files
        StringBuffer data = new StringBuffer();

        try {
            FileOutputStream fos = new FileOutputStream(outputFile);
            // Get the workbook object for XLSX file
            XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
            // Get first sheet from the workbook
            XSSFSheet sheet = wBook.getSheetAt(0);
            Row row;
            Cell cell;
            // Iterate through each rows from first sheet
            Iterator<Row> rowIterator = sheet.iterator();

            while (rowIterator.hasNext()) {
                row = rowIterator.next();

                // For each row, iterate through each columns
                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {

                    cell = cellIterator.next();

                    switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_BOOLEAN:
                            data.append(cell.getBooleanCellValue() + ",");

                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            data.append(cell.getNumericCellValue() + ",");

                            break;
                        case Cell.CELL_TYPE_STRING:
                            data.append(cell.getStringCellValue() + ",");
                            break;

                        case Cell.CELL_TYPE_BLANK:
                            data.append("" + ",");
                            break;
                        default:
                            data.append(cell + ",");

                    }
                }
                // Terminate the record so consecutive rows do not run together
                data.append("\r\n");
            }

            fos.write(data.toString().getBytes());
            fos.close();

        } catch (Exception ioe) {
            ioe.printStackTrace();
        }
    }
    //testing the application 

    public static void main(String[] args) {
        //reading file from desktop
        File inputFile = new File("C:\\Users\\user69\\Desktop\\test.xlsx");
        //writing excel data to csv
        File outputFile = new File("C:\\Users\\user69\\Desktop\\test1.csv");
        xlsx(inputFile, outputFile);
    }
}

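According to the RFC 4180 CSV rules, fields containing line breaks (CRLF), double quotes, or commas should be enclosed in double quotes. So the cell data (numeric, string, or any other type) must be formatted before it is appended to the StringBuffer if it contains a line break or the delimiter (,). Please help me format the cell data according to the CSV rules.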

Answer

Use a library such as commons-csv:

final Appendable out = ...;  
final CSVPrinter printer = CSVFormat.DEFAULT.withHeader("H1", "H2").print(out);
...
while (rowIterator.hasNext()) {
    ...
    while (cellIterator.hasNext()) {
        ...
        printer.print(cell.getStringCellValue());
        ...
    }
    printer.println();
}

See also the short user guide.
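For readers who cannot add a dependency, the RFC 4180 quoting described in the question can also be written by hand. The following is only a minimal sketch using the JDK alone: `CsvFieldQuoter`, `quote`, and `joinRecord` are hypothetical names that do not come from POI or commons-csv, and the sketch assumes a comma delimiter and CRLF record terminators. A library remains the safer choice for anything beyond this basic rule.

```java
import java.util.List;

// Hypothetical helper, not part of POI or commons-csv.
final class CsvFieldQuoter {

    private CsvFieldQuoter() {
    }

    // Wraps the field in double quotes if it contains a comma, a double
    // quote, CR or LF; embedded double quotes are doubled (RFC 4180).
    static String quote(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.indexOf(',') >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins already-stringified cell values into one CSV record,
    // terminated by CRLF.
    static String joinRecord(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(fields.get(i)));
        }
        return sb.append("\r\n").toString();
    }
}
```

In the question's loop, each value would be converted to a string first (for example with `String.valueOf(...)`) and passed through `quote` before the comma is appended.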

Another answer

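Centic's answer is exactly right. To expand on what he wrote, here is my complete and tested method, which uses Commons CSV for the actual value printing. Unfortunately we still have to walk the Sheet ourselves, because XSSF has no automatic CSV output method, but I followed Centic's strategy for the Row / Cell iteration.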

This example writes to an OutputStream, but writing to a File is just as easy (pass a FileWriter to the CSVPrinter constructor instead).

// Convert an XSSFWorkbook to CSV and write to provided OutputStream
private void writeWorkbookAsCSVToOutputStream(XSSFWorkbook workbook, OutputStream out) {

    CSVPrinter csvPrinter = null;

    try {
        // Or change this to  File-based constructor, if File output is required
        csvPrinter = new CSVPrinter(new OutputStreamWriter(out), CSVFormat.DEFAULT);                

        if (workbook != null) {
            XSSFSheet sheet = workbook.getSheetAt(0); // Sheet #0
            Iterator<Row> rowIterator = sheet.rowIterator();
            while (rowIterator.hasNext()) {               
                Row row = rowIterator.next();
                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next();
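                    // Note: getStringCellValue() throws IllegalStateException for non-string
                    // cells such as numeric ones; DataFormatter.formatCellValue(cell) is one
                    // way to handle mixed cell types
                    csvPrinter.print(cell.getStringCellValue()); // Commons CSV prints here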
                }
                // Newline after each row
                csvPrinter.println();
            }

        }

    }
    catch (Exception e) {
        log.error("Failed to write CSV file to output stream", e);
    }
    finally {
        try {
            if (csvPrinter != null) {
                // Close CSVPrinter
                csvPrinter.flush();
                csvPrinter.close();
            }
        }
        catch (IOException ioe) {
            log.error("Error when closing CSV Printer", ioe);
        }           
    }
}   
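
For the File-based variant mentioned above, only the Writer needs to change. Below is a minimal sketch using just the JDK, under the assumption that UTF-8 output is wanted (the snippets above rely on the platform default charset). `CsvFileOutput`, `openCsvWriter`, and `writeLines` are hypothetical names introduced for illustration. The Writer returned by `openCsvWriter` could be handed to the CSVPrinter constructor in place of the OutputStreamWriter.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of POI or commons-csv.
final class CsvFileOutput {

    private CsvFileOutput() {
    }

    // Opens a Writer on the target file with an explicit charset instead
    // of relying on the platform default.
    static BufferedWriter openCsvWriter(Path target) throws IOException {
        return Files.newBufferedWriter(target, StandardCharsets.UTF_8);
    }

    // Writes pre-formatted CSV records, one per line, terminated by CRLF.
    // try-with-resources flushes and closes the Writer even on failure.
    static void writeLines(Path target, Iterable<String> lines) throws IOException {
        try (BufferedWriter writer = openCsvWriter(target)) {
            for (String line : lines) {
                writer.write(line);
                writer.write("\r\n");
            }
        }
    }
}
```

Using try-with-resources here also replaces the manual flush/close handling in a finally block.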
