Batch Script To Extract Lines Between Specified Words

Posted: 2014-10-12 23:22:47

[Question]:

I have a log file like the one below.

[Tue Aug 19 10:45:28 2014]Local/PLPLAN/PL/giuraja@MSAD/2172/Info(1019025)

Reading Rules From Rule Object For Database [PL]

[Tue Aug 19 10:45:28 2014]Local/PLPLAN/PL/giuraja@MSAD/2172/Info(1013157)

Received Command [Import] from user [giuraja@MSAD] using [AIF0142.rul] with data file [SQL]

.

.

.

.

.

Clear Active on User [giuraja@MSAD] Instance [1]

.

.

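I want to extract the lines from the one starting with "[Tue Aug 19 10:" up to the one starting with "Clear Active on User", and write them to a file using a Windows batch script. I tried the code below. It only outputs the last line.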

@echo off & setlocal enabledelayedexpansion
set Month_Num=%date:~4,2%
if %Month_Num%==08 set Month_Name=Aug
set Day=%date:~0,3%
set Today_Date=%date:~7,2%
set Search_String=[%Day% %Month_Name% %Today_Date% 10:
for /f "tokens=1 delims=[]" %%a in ('find /n "%Search_String%"^
@(
more +%%a D:\Hyperion\ERPI_Actuals_Load\Logs\PLPLAN.LOG)>D:\Hyperion\ERPI_Actuals_Load\Logs\PLPLAN_Temp.txt
(for /f "tokens=*" %%a in (D:\Hyperion\ERPI_Actuals_Load\Logs\PLPLAN_Temp.txt) do (
set test=%%a
if "!test:~0,20!" equ "Clear Active on User" goto :eof
echo %%a
))>D:\Hyperion\ERPI_Actuals_Load\Logs\PLPLAN_Formatted.txt

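Regards, Raghav.

[Comments]:

[Answer 1]:

The batch file below is designed to process a large file as fast as possible; however, it removes empty lines from the result: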

@echo off
setlocal EnableDelayedExpansion

set "start=[Tue Aug 19 10:"
set "end=Clear Active on User"

for /F %%a in ("%start%") do set startWord=%%a
for /F %%a in ("%end%") do set endWord=%%a

set "startLine="
set "endLine="
for /F "tokens=1,2 delims=: " %%a in ('findstr /I /N /B /C:"%start%" /C:"%end%" logFile.txt') do (
   if not defined startLine if "%%b" equ "%startWord%" set startLine=%%a
   if not defined endLine if "%%b" equ "%endWord%" set "endLine=%%a" & goto continue0
)
:continue0
set /A skipLines=startLine-1, numLines=endLine-startLine+1
set "skip="
if %skipLines% gtr 0 set skip=skip=%skipLines%

(for /F "%skip% delims=" %%a in (logFile.txt) do (
   echo %%a
   set /A numLines-=1
   if !numLines! equ 0 goto continue1
)) > outFile1.txt
:continue1

rem outFile1.txt contains as many extra lines as there were empty lines removed, so we need to eliminate them

for /F "delims=:" %%a in ('findstr /I /N /B /C:"%end%" outFile1.txt') do set numLines=%%a
(for /F "delims=" %%a in (outFile1.txt) do (
   echo %%a
   set /A numLines-=1
   if !numLines! equ 0 goto continue2
)) > outFile2.txt
:continue2

del outFile1.txt
TYPE outFile2.txt

If you want to preserve empty lines, the process will be much slower.
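Outside of batch, keeping the empty lines costs nothing extra. Here is a minimal Python sketch of the same extraction, assuming an external tool is acceptable and that the log can be read as text. The function names `extract_block` and `extract_file` are made up for this illustration and are not part of the answer's script. It reads the file once, starts copying at the first line that begins with the start marker, and stops after the first later line that begins with the end marker.

```python
from typing import Iterable, Iterator


def extract_block(lines: Iterable[str], start: str, end: str) -> Iterator[str]:
    """Yield lines from the first one starting with `start` through the
    first later one starting with `end`. Empty lines are kept."""
    selecting = False
    for line in lines:
        if not selecting and line.startswith(start):
            selecting = True
        if selecting:
            yield line
            if line.startswith(end):
                return  # stop reading as soon as the end marker is copied


def extract_file(src_path: str, dst_path: str, start: str, end: str) -> None:
    """Stream the selected block from src_path into dst_path."""
    # errors="replace" is an assumption: the log's encoding is not stated.
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(extract_block(src, start, end))
```

A call such as `extract_file("logFile.txt", "outFile.txt", "[Tue Aug 19 10:", "Clear Active on User")` would mirror the batch file above. Because the file is streamed line by line, memory use should stay flat even for a very large log.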

[Comments]:

Thanks. It works. What should I do to preserve empty lines? I can split the file.

[Answer 2]:

This should work (tested):

@echo off
set "st_line=Tue Aug 19"
set "end_line=Clear Active on User"
for /f "delims=:" %%i in ('findstr /inc:"%st_line%" logfile.txt') do (set st_line_ln=%%i)
for /f "delims=:" %%j in ('findstr /inc:"%end_line%" logfile.txt') do (set end_line_ln=%%j)

findstr /in /c:[a-z] /c:[0-9] /rc:"^$" logfile.txt >logfile_ln.txt

set /a "st_line_ln_temp=%st_line_ln%-1"

for /f "skip=%st_line_ln_temp% tokens=1* delims=:" %%a in ('type logfile_ln.txt') do (
 if %%a leq %end_line_ln% (echo.%%b) 
)

del logfile_ln.txt

Sample output:

C:\test>type logfile.txt
Tue Aug 19 10:45:28 2014]Local/PLPLAN/PL/giuraja@MSAD/2172/Info(1019025)

Reading Rules From Rule Object For Database [PL]

[Tue Aug 19 10:45:28 2014]Local/PLPLAN/PL/giuraja@MSAD/2172/Info(1013157)

Received Command [Import] from user [giuraja@MSAD] using [AIF0142.rul] with data file [SQL]

test1
test2
test4
test546
Clear Active on User [giuraja@MSAD] Instance [1
test1212
test232
test67
dj

C:\test>draft.bat
[Tue Aug 19 10:45:28 2014]Local/PLPLAN/PL/giuraja@MSAD/2172/Info(1013157)

Received Command [Import] from user [giuraja@MSAD] using [AIF0142.rul] with data file [SQL]

test1
test2
test4
test546
Clear Active on User [giuraja@MSAD] Instance [1
C:\test>

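Cheers, G

[Comments]:

Thanks. I tried this. The file is about 400 MB, and execution hung. When I tried the same logfile.txt you used, it just added line numbers to each line and wrote them to logfile_ln.txt. I also got this error: tokens=1* delims=:" was unexpected.

The error you mention usually appears if you miss a quote (' or "). Could you check it again carefully?

[Answer 3]:
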
@ECHO OFF
SETLOCAL
:: If you don't want to preserve empty lines
SET "select="
(
 FOR /f "delims=" %%a IN (q25390541.txt) DO (
  ECHO %%a|FINDSTR /b /L /c:"[Tue Aug 19 10:" >NUL
  IF NOT ERRORLEVEL 1 SET select=y
  IF DEFINED select ECHO(%%a
  ECHO %%a|FINDSTR /b /L /c:"Clear Active on User" >NUL
  IF NOT ERRORLEVEL 1 GOTO done1
 )
)>newfile.txt

:done1

:: If you want to preserve empty lines
SET "select="
(
 FOR /f "tokens=1*delims=:" %%a IN ('findstr /n /r ".*" q25390541.txt') DO (
  IF "%%b"=="" (
   IF DEFINED select ECHO(
  ) ELSE (
   ECHO %%b|FINDSTR /b /L /c:"[Tue Aug 19 10:" >NUL
   IF NOT ERRORLEVEL 1 SET select=Y
   IF DEFINED select ECHO(%%b
   ECHO %%b|FINDSTR /b /L /c:"Clear Active on User" >NUL
   IF NOT ERRORLEVEL 1 GOTO done2
  )
 )
)>newfile2.txt

:done2

GOTO :EOF

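I used a file named q25390541.txt containing your data for my testing. This produces newfile.txt or newfile2.txt, depending on which is preferred.

[Comments]:

Thanks. It works. But there are more instances of the string "Clear Active on User". I want all lines from the one starting with "[Tue Aug 19 10" through the last instance of the string "Clear Active on User".

You've provided too little sample data. You said "the line starting with", which implies there is only one. If you meant "the last line starting with", then say so. That is quite different and very important. A month later, you said you wanted to preserve empty lines, in a comment on an answer rather than in the original question. We don't know your file structure or which part you need. You need to be very specific about these things.

Perhaps if you want to extract from the first occurrence of [Tue... to the first occurrence of [Wed... (or perhaps to a line that starts with [ but does not start with [Tue Aug 19 10), then that alone might meet your requirement. Without representative data, and especially a clear example of where the sequence of lines you want to extract ends, we are reduced to a guessing game.

OK. Thanks. I wasn't specific because I only learned the file's format later.

[Answer 4]:

Here is an inclusive script (it prints the start and end lines too) that should do the job. You can modify it to exclude those lines: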

@echo off
setlocal

set filename=text_file.txt
for /f "tokens=1 delims=:"  %%a in ('findstr /n /b /c:"[Tue Aug " %filename%') do (
    set /a f_line=%%a-1
)

for /f "tokens=1 delims=:"  %%a in ('findstr /n /b /c:"Clear Active on User" %filename%') do (
    set /a l_line=%%a
)
 echo %l_line% -- %f_line%

call :tail_head2 -file=%filename% -begin=%f_line%  -end=%l_line% 


exit /b 0




@echo off
:tail_head2
setlocal


rem ---------------------------
rem ------ arg parsing --------
rem ---------------------------

    if "%~1" equ "" goto :help
        for %%H in (/h -h /help -help) do (
                if /I "%~1" equ "%%H" goto :help
        )
        setlocal enableDelayedExpansion
            set "prev="
            for %%A in (%*) do (
                    if /I "!prev!" equ "-file" set file=%%~fsA
                    if /I "!prev!" equ "-begin" set begin=%%~A
                    if /I "!prev!" equ "-end" set end=%%A
                    set prev=%%~A
            )
        endlocal & (
                if "%file%" neq "" (set file=%file%)
                if "%begin%" neq "" (set /a begin=%begin%)
                if "%end%" neq "" (set /a end=%end%)
        )

rem -----------------------------
rem --- invalid cases check -----
rem -----------------------------

        if "%file%" EQU "" echo file not defined && exit /b 1
        if not exist "%file%"  echo file not exists && exit /b 2
        if not defined begin if not defined end echo neither BEGIN line nor END line are defined && exit /b 3

rem --------------------------
rem -- function selection ----
rem --------------------------

        if defined begin if %begin%0 LSS 0 for /F %%C in ('find /c /v "" ^<"%file%"')  do set /a lines_count=%%C
        if defined end if %end%0 LSS 0 if not defined lines_count for /F %%C in ('find /c /v "" ^<"%file%"')  do set lines_count=%%C

                rem -- end only
        if not defined begin if defined end if %end%0 GEQ 0 goto :end_only
        if not defined begin if defined end if %end%0 LSS 0 (
                        set /a end=%lines_count%%end%+1
                        goto :end_only
                )

                rem -- begin only
        if not defined end if defined begin if %begin%0 GEQ 0 goto :begin_only
        if not defined end if defined begin if %begin%0 LSS 0 (
                        set /a begin=%lines_count%%begin%+1
                        goto :begin_only
                )
                rem -- begin and end
        if %begin%0 LSS 0 if %end%0 LSS 0 (
                        set /a begin=%lines_count%%begin%+1
                        set /a end=%lines_count%%end%+1
                        goto :begin_end
                )
        if %begin%0 LSS 0 if %end%0 GEQ 0 (
                        set /a begin=%lines_count%%begin%+1
                        goto :begin_end
                )
        if %begin%0 GEQ 0 if %end%0 LSS 0 (
                        set /a end=%lines_count%%end%+1
                        goto :begin_end
                )
        if %begin%0 GEQ 0 if %end%0 GEQ 0 (
                        goto :begin_end
                )      
goto :eof

rem -------------------------
rem ------ functions --------
rem -------------------------

rem -----  single cases -----

:begin_only
        setlocal DisableDelayedExpansion
        for /F "delims=" %%L in ('findstr /R /N "^" "%file%"') do (
                set "line=%%L"
                for /F "delims=:" %%n in ("%%L") do (
                        if %%n GEQ %begin% (
                                setlocal EnableDelayedExpansion
                                set "text=!line:*:=!"
                                (echo(!text!)
                                endlocal
                        )
                )
        )
        endlocal
endlocal
goto :eof

:end_only
        setlocal disableDelayedExpansion
        for /F "delims=" %%L in ('findstr /R /N "^" "%file%"') do (
                set "line=%%L"
                for /F "delims=:" %%n in ("%%L") do (
                        IF %%n LEQ %end% (
                                setlocal EnableDelayedExpansion
                                set "text=!line:*:=!"
                                (echo(!text!)
                                endlocal
                        ) ELSE goto :break_eo
                )
        )
        :break_eo
        endlocal
endlocal
goto :eof

rem ---  end and begin case  -----

:begin_end
        setlocal disableDelayedExpansion
        if %begin% GTR %end% goto :break_be
        for /F "delims=" %%L in ('findstr /R /N "^" "%file%"') do (
                set "line=%%L"
                for /F "delims=:" %%n in ("%%L") do (
                    IF %%n GEQ %begin% IF %%n LEQ %end% (        
                        setlocal EnableDelayedExpansion
                        set "text=!line:*:=!"
                       (echo(!text!)
                        endlocal
                    ) ELSE goto :break_be                              
                )
        )
        :break_be
        endlocal
endlocal
goto :eof
rem ------------------
rem --- HELP ---------
rem ------------------
:help
    echo(
        echo %~n0 - displays the lines of a file defined by the -BEGIN and -END arguments passed to it
        echo(
        echo( USAGE:
        echo(
        echo %~n0  -file=file_to_process -begin=begin_line ^| -end=end_line 
        echo or
        echo %~n0  -file file_to_process -begin begin_line ^| -end end_line 
        echo(
        echo( if some of arguments BEGIN or END has a negative number it will start to count from the end of file
        echo(
        echo( http://ss64.org/viewtopic.php^?id^=1707
        echo(
goto :eof
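For comparison, the "find the line numbers, then copy the range" idea can be sketched in Python. This is only an illustration under stated assumptions: the range runs from the first line starting with the start marker to the last later line starting with the end marker (the reading requested in the comments under Answer 3), and the names `find_range` and `copy_range` are hypothetical. It makes two passes over the file and never holds more than one line in memory.

```python
from itertools import islice
from typing import Optional, Tuple


def find_range(path: str, start: str, end: str) -> Optional[Tuple[int, int]]:
    """Return 1-based (first start line, last end line after it), or None."""
    first_start = None
    last_end = None
    # errors="replace" is an assumption: the log's encoding is not stated.
    with open(path, encoding="utf-8", errors="replace") as src:
        for number, line in enumerate(src, start=1):
            if first_start is None:
                if line.startswith(start):
                    first_start = number
            elif line.startswith(end):
                last_end = number  # keep overwriting: the last match wins
    if first_start is None or last_end is None:
        return None
    return first_start, last_end


def copy_range(path: str, out_path: str, begin: int, end: int) -> int:
    """Copy lines begin..end (1-based, inclusive); return how many were written."""
    written = 0
    with open(path, encoding="utf-8", errors="replace") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in islice(src, begin - 1, end):
            dst.write(line)
            written += 1
    return written
```

Typical use would be to call `find_range("text_file.txt", "[Tue Aug ", "Clear Active on User")` and, if it returns a pair, pass both numbers to `copy_range`. End markers that appear before the first start line are ignored, and empty lines inside the range are preserved.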

Edit - last 100 lines:

@echo off
setlocal

set filename=text_file.txt
for /F %%C in ('find /c /v "" ^<"%filename%"')  do set /a lines_count=%%C
set /a last_100_lines=lines_count-100

type %filename% | more /e +%last_100_lines%
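If an external tool is an option, the same "last 100 lines" result can be had in one pass without counting the lines first. The sketch below is Python and rests on assumptions: the function name `tail_lines` is invented here, and the file is assumed to be readable as text. A bounded `collections.deque` keeps only the most recent `count` lines while the file streams through it.

```python
from collections import deque
from typing import List


def tail_lines(path: str, count: int = 100) -> List[str]:
    """Return the last `count` lines of the file, read in a single pass."""
    # errors="replace" is an assumption: the log's encoding is not stated.
    with open(path, encoding="utf-8", errors="replace") as src:
        # A deque with maxlen discards the oldest line whenever it is full.
        return list(deque(src, maxlen=count))
```

For example, `print("".join(tail_lines("text_file.txt")), end="")` would print the tail of the log. If the file has fewer than `count` lines, all of them are returned.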

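[Comments]:

Thanks. The script ran for about 6 minutes and then I terminated it. The file size can be as high as 400 MB.

@user1492218 Wow... this script is not the best approach for handling big files. Probably some external tools would help (tail or head ports, maybe). Did it at least print the line numbers (the echo with -- in the middle)?

Yes. It printed the line numbers. It would be helpful if I could extract the last 100 lines using a batch script.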
