How to parse a multi-line value from a CSV in batch

Posted: 2022-01-14 10:15:16

Question:

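I'm writing a batch script that needs to parse text from a .csv file, but I've hit a roadblock:

I have a for loop set up to get the data from each line (this works fine), but I ultimately need a value that is spread over multiple lines. For example (I've put what I want treated as a single entry in parentheses for context):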

(data I need,flag_for_which_process_to_run,dontcare,"data I need
data continued
data continued
this could continue for any number of lines",dontcare,dontcare,dontcare,dontcare)
(repeat)

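Is there a way to have the batch script parse this without breaking the for loop? If it helps, the data in %%d is enclosed in double quotes. The code is below; I'm referring to the second if in the for loop.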

SETLOCAL EnableDelayedExpansion

for /f "tokens=1,2,3,4 delims=," %%a in (sample.csv) do ( 
    REM Skip if %%b is not flag1
    if "%%b"=="flag1" (
        .
        .
        .
    )
    REM Skip if %%b is not otherflag
    if "%%b"=="otherflag" (


        REM Set the %%a variable
        set device=%%a
        echo "%%d"> output\tmp\temp.txt
        
    )
)
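
To see why the loop breaks, it can help to print exactly what `for /f` hands over on each iteration. The following is only a diagnostic sketch: the file name `sample.csv` and the counter `N` are assumptions, and because delayed expansion is enabled (as in the question's script), any `!` in the data would be dropped.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem Assumed file name, not taken from the asker's real data.
set "CSV=sample.csv"
set /a N=0

rem "delims=" disables splitting, so %%L holds one whole physical line.
for /f "usebackq delims=" %%L in ("%CSV%") do (
    set /a N+=1
    echo [!N!] %%L
)
```

Run against the sample record above, every physical line gets its own counter, which shows that `for /f` knows nothing about quoted fields: the continuation lines arrive as separate iterations. Note that `for /f` also skips blank lines and, by default, lines that begin with `;`.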

Comments:

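How do you want to process the multi-line value? As a multi-line value (as it comes)? Or do you want to join the several lines into one long value? Please post a real segment of the input file, so we can use it for testing, along with the desired output.

Unfortunately, it really is stored as separate lines. I think Excel reads it as a single cell because it is enclosed in double quotes. I can't post a real snippet of the input because it contains proprietary information. I think I have found a way through it by using some more if statements and a flag that indicates a multi-line value (all rows with "otherflag" are multi-line). Then I just echo each line to a text file until %%a ends with a double quote, and set the flag back to its default so the next flag can be processed. I'll post once I get it working. Open to other suggestions.

Are the three columns/tokens preceding the value of interest always unquoted?

Answer 1: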

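Given that the first three tokens/values are unquoted (so they cannot contain quotation marks or commas themselves) and that the whole CSV file does not contain escape or back-space characters, the following script, when the CSV file is provided as a command line argument, should extract the values you are interested in (it just echoes them):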

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "_FILE=%~1" & rem // (CSV file; `%~1` is first command line argument)
rem // Get carriage-return character:
for /F %%C in ('copy /Z "%~0" nul') do set "_CR=%%C"
rem // Get line-feed character:
(set ^"_LF=^
%= blank line =%
^")
rem // Get escape and back-space characters:
for /F "tokens=1,2" %%E in ('prompt $E$S$H ^& for %%Z in ^(.^) do rem/') do set "_ESC=%%E" & set "_BS=%%F"

set "CONT="
rem // Read CSV file line by line:
for /F usebackq^ delims^=^ eol^= %%L in ("%_FILE%") do (
    rem // Branch for normal lines:
    if not defined CONT (
        rem // Get relevant tokens/values:
        for /F "tokens=1-3* delims=, eol=," %%A in ("%%L") do (
            set "DEVICE=%%A" & set "FLAG=%%B" & set "LINE=%%D"
            if not "%%D"=="%%~D" (
                rem // Fourth token begins with a `"`, hence remove it and enter branch for continued lines then:
                for /F delims^=^ eol^= %%E in ("%%D"^") do set "LINE=%%~E"
                set "DATA=" & set "CONT=#"
            ) else (
                rem // Fourth token does not begin with a '"', hence it cannot be continued:
                for /F "delims=, eol=," %%E in ("%%D") do (
                    rem // Do something with the data, like echoing:
                    echo/
                    echo FLAG=%%B
                    echo DEVICE=%%A
                    echo DATA=%%E
                )
            )
        )
    ) else set "LINE=%%L"
    rem // Branch for continued lines:
    if defined CONT (
        setlocal EnableDelayedExpansion
        rem // Temporarily replace escaped (doubled) `"` with back-space character:
        set "LINE=!LINE:""=%_BS%!"
        rem // Collect continued data with line-breaks replaced by escape characters:
        for /F delims^=^"^ eol^=^" %%D in ("!DATA!%_ESC%!LINE!") do endlocal & set "DATA=%%D"
        setlocal EnableDelayedExpansion
        if not "!LINE!"=="!LINE:"=!^" (
            rem /* There is a single `"` (plus a `,`), which is taken as the end of the continued fourth token;
            rem    hence replacing back line-breaks and (unescaped) `"`: */
            set "DATA=!DATA:*%_ESC%=!" & set "DATA=!DATA:%_BS%="!^"
            for %%E in ("!_CR!!_LF!") do set "DATA=!DATA:%_ESC%=%%~E!"
            rem // Do something with the data, like echoing:
            echo/
            echo FLAG=!FLAG!
            echo DEVICE=!DEVICE!
            echo DATA=!DATA!
            endlocal
            set "CONT="
        ) else endlocal
    )
)

endlocal
exit /B
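
A possible way to call the script, assuming it was saved as `parse-multiline.bat` in the same folder as the CSV file. Both that script name and the file names `sample.csv` and `extracted.txt` are made up for illustration.

```batch
rem Print the extracted FLAG/DEVICE/DATA blocks to the console:
call parse-multiline.bat "sample.csv"

rem Or collect them in a file for later processing:
call parse-multiline.bat "sample.csv" > "extracted.txt"
```

The quotes around the argument matter only when the path contains spaces; the script removes them again through `%~1`.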

Comments:

This works! I had to massage it a bit to fit it into my current code, but that was to be expected. Thanks!
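
For comparison, the flag-based idea the asker outlined in the comments could look roughly like the sketch below. It is not the asker's final code, and it is far less robust than the answer's script. The names `sample.csv`, `field.txt`, `INFIELD` and `HEAD` are assumptions, and so are the restrictions listed in the leading comments.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem Assumed names, not from the posts: sample.csv, field.txt, INFIELD, HEAD.
rem Assumed data: the fourth column of otherflag rows is always quoted, has
rem text right after the opening quote, and contains no exclamation marks,
rem no carets, no doubled quotes and no blank lines.
set "INFIELD="

for /f "usebackq delims=" %%L in ("sample.csv") do (
    set "LINE=%%L"
    if not defined INFIELD (
        rem First physical line of a record: split off the first three columns.
        for /f "tokens=1-3* delims=," %%a in ("%%L") do (
            if "%%b"=="otherflag" (
                set "device=%%a"
                rem Keep the rest of the line without its opening quote.
                set "LINE=%%d"
                set "LINE=!LINE:~1!"
                set "INFIELD=1"
                if exist "field.txt" del "field.txt"
            )
        )
    )
    if defined INFIELD (
        rem Same test as in the answer: the line changes when quotes are removed.
        if not "!LINE!"=="!LINE:"=!^" (
            rem A quote closes the field: keep only the text in front of it.
            for /f delims^=^"^ eol^=^" %%q in ("#!LINE!") do set "HEAD=%%q"
            >>"field.txt" echo(!HEAD:~1!
            set "INFIELD="
            echo Field for !device! written to field.txt
        ) else (
            rem No quote on this line, so the field continues.
            >>"field.txt" echo(!LINE!
        )
    )
)
```

The `#` placed in front of `!LINE!` is only a guard so that a closing line starting with `"` still yields a token; `!HEAD:~1!` strips it again. Because delayed expansion stays enabled for the whole loop, this variant corrupts data containing `!`, which is the problem the answer avoids by toggling `setlocal`/`endlocal` around each step.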
