How can I find non-ASCII characters in file content in a batch script?
【Posted】: 2022-01-10 08:22:42
【Question】: In a batch script, I want to examine the contents of a.txt, which holds many records. How can I check whether a record contains non-ASCII characters and, if so, write it to b.txt? I have some code that extracts a substring, but it fails too:
@echo off
setlocal enableDelayedExpansion
SETLOCAL
set _char= "123456789~abcdef0"
SET /A _startchar=1
SET /A _length=1
for /L %%a in (32,1,125) do (
cmd /c exit %%a
echo !=exitcodeAscii!
if "!=exitcodeAscii!" EQU "%_char%" echo -- %%a
CALL SET _substring=!!_char:!_startchar!,2!!
ECHO !_substring! --- !_startchar!
SET /A _startchar=!_startchar! + 1
)
【Comments】:
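The ASCII exit code will never equal the variable _char. What are you trying to accomplish with that line of code? The following line is also incorrect: CALL SET _substring=!!_char:!_startchar!,2!! . It should use doubled percent signs so that the variable expands to its value, and it is missing the tilde: CALL SET _substring=%%_char:~!_startchar!,2%%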
Hi @Squashman, thanks, but I tried for /L %%a in (32,1,125) do ( cmd /c exit %%a echo !=exitcodeAscii! if "!=exitcodeAscii!" EQU "%_char%" echo -- %%a CALL SET _substring=%%_char:~!_startchar!,2%% ECHO !_substring! --- !_startchar! SET /A _startchar=!_startchar! + 1 ) and it still fails: when I echo the result, it is a space. CALL SET _substring=%%_char:~!_startchar!,2%% ECHO !_substring! --- !_startchar!
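Code updates belong in your question. Please edit your question to include your new code. In any case, I was not trying to solve your problem; I was only pointing out some code errors I noticed. The code I gave you does fix the syntax problem you had. Once the _startchar variable exceeds the length of the string you are trying to parse, the code will of course echo a space. In this case the _char variable is only 20 characters long, so past that point the substring will display a space. This is very basic logic that you can work out yourself.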
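The substring syntax discussed in these comments can be sketched as follows. This is a minimal illustration rather than the asker's final code: the sample string leaves out the embedded quotes from the question, the upper bound 31 is an arbitrary assumption, and the guard on the loop is what prevents the empty output once the offset runs past the end of the string.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Sample string (assumption: the quotes from the question are left out).
set "_char=123456789~abcdef0"

rem 31 is an arbitrary upper bound. The IF guard skips offsets past the end
rem of the string, where the substring expands to nothing.
for /L %%i in (0,1,31) do if not "!_char:~%%i,1!"=="" (
    rem A FOR variable can be used directly as the offset.
    set "_substring=!_char:~%%i,1!"
    echo %%i: !_substring!
)

rem When the offset lives in an ordinary variable, pass it through a FOR
rem variable so that it is expanded before the substring operation.
set /a _startchar=3
for %%n in (!_startchar!) do set "_substring=!_char:~%%n,2!"
echo Offset !_startchar!, length 2: !_substring!
endlocal
```

Routing the offset through a FOR variable avoids the slower CALL SET form and sidesteps the extra round of percent expansion that CALL performs.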
【Answer 1】:
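The script below defines a variable holding the valid ASCII characters (excluding ", which is handled by substitution) for a character-by-character comparison.
Edit: changes made to improve performance and to ensure that any possible ASCII input is handled correctly.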
@Echo off
For /f "tokens=4 delims=: " %%G in ('CHCP')Do Set "Restore_Codepage=CHCP %%G > nul"
Set "Return[Len]=" & Set "Return[String]=" & Set "input="
Setlocal DISABLEDelayedExpansion
REM the label marker ":#" is used within this script to delimit help output.
:#
:# ========================= ASCII string filter v3.1 by T3RRY ======================
Rem - This script iterates over an input string character by character and tests
Rem each character against a whitelist of printable ASCII characters, with
Rem successful matches used to build a new string containing only printable
Rem ASCII characters.
Rem - Switch /R turns this script into a testing tool
Rem to check if a string contains any NonASCII or nonprintable ASCII characters.
Rem - Errorlevel 0 indicates the string contains only printable ASCII characters
Rem - A Positive errorlevel is returned containing the 1 indexed position of the
Rem first NonASCII or nonprintable ASCII character found.
Rem - Execution time increases as string length increases. Each character in the
Rem string is tested against a whitelist containing 95 printable ASCII characters.
:#
:# Usage: Filepath <"String"> [ /P | /R ] | [ -? | /? | -help ]
:#
:# Rem to use from another batch file:
:# For /f delims^= %%G in ('FilePath "string"')Do Echo(%%G
:#
:# Accepts input String via doublequoted argument - reads %* and trims " /P" or " /R"
:# switches if present
:# - No escaping of characters in the argument is required
:# - If unbalanced doublequotes exist in the string all doublequotes will be Removed.
:#
:# Use Switch /P to preserve original spaces
:# - Default behaviour is to Remove all double spaces from the string.
:#
:# Use Switch /R to reject input containing NonASCII characters
:# - If non ASCII character encountered, returns a positive errorlevel
:# ( the 1 indexed position of first non ASCII character encountered )
:#
Rem Version changes 09/Dec/2021 :
Rem - Changed input method to handle cases where quoted args contain
Rem standard delims within quotes IE: "string "substring=text""
Rem Version changes 08/Dec/2021 :
Rem - Added Help Switches -? /? and -help
Rem - Added switch: /R
Rem - Reject strings containing non ASCII characters. Default: Strip NonASCII
Rem characters from the string.
Rem Note: this switch does not define Return[Len] or Return[String]
Rem Version changes 07/Dec/2021 :
Rem - Rewritten for much faster performance - NOTE:
Rem - Added Switch: /P
Rem - Preserve all whitespace. Default: multiple spaces truncated to single.
Rem - Renamed variable for returning String : Return[String]
Rem - Added variable Return[Len] to return 0 indexed string length.
Rem - Corrected handling of completely non ASCII strings to return empty / 0 Len
Rem ** Utilize alternate data stream to store variable containing printable ASCII
Rem characters so the variable only needs to be generated on first execution.
Rem ** Requires this batch file to be run from an NTFS drive.
:# =================================================================================
Set "ASCII= !"
2> nul (
more < "%~f0:ASCII.dat" > nul || (
Setlocal EnableDelayedExpansion
For /l %%i in (34 1 126) Do (
Cmd /c Exit %%i
Set "ASCII=!ASCII!!=ExitCodeAscii!"
)
>"%~f0:ASCII.dat" (Echo(Set ^^"ASCII=!ASCII!")
ENDLOCAL
))
Set "ASCII="
For /f "delims=" %%G in ('More ^< "%~f0:ASCII.dat"')Do %%G
If not Defined ASCII (
2> nul (
Powershell.exe -c "Remove-item -path '%~nx0' -Stream '*'"
)
1>&2 Echo(An error has occurred. Ensure "%~nx0" is located on an NTFS drive.
Pause
ENDLOCAL
Exit /b 1
)
Rem Maximum stringlength to support. Modify here to propagate to RemoveChar loop and Return[Len]
REM maximum 1015 chars due to input reading method.
Set "SupportLength=1015"
Set "input="
::====================================================================================================
rem :: input capture method by Dave Benham : https://www.dostips.com/forum/viewtopic.php?t=4288#p23980
setlocal enableDelayedExpansion
>"%temp%\getArg.txt" <"%temp%\getArg.txt" (
setlocal disableExtensions
set prompt=#
echo on
for %%a in (%%a) do rem . %*.
echo off
endlocal
set /p "args="
set /p "args="
set "input=!args:~7,-2!"
set "count=!args:~7,-2!"
)
del "%temp%\getArg.txt"
::====================================================================================================
Rem the below line can be used to Remove the alternate data stream this file creates.
Rem Powershell -c "Remove-item -path '%~nx0' -Stream '*'"
CHCP 65001 > nul
If not defined input (
Echo(Demo:
Rem escaped for definition in DelayedExpansion environment
Set "input=this is a demo) * ^! & ☺ ^= ¶ | ^! <. ~ ^^ & %% ▒ ╔ § ♣ This"
Set input
)
REM handle help switches
Set input | %SystemRoot%\System32\Findstr.exe /Xli "input=\/? input=-? input=-help" > nul && (
Setlocal EnableDelayedExpansion
For /f "tokens=2* delims=#" %%G in ('%SystemRoot%\System32\Findstr.exe /blic:":# " "%~f0"')Do (
Set "Usage=%%G"
Echo(!Usage:Filepath=%~f0!
)
ENDLOCAL & ENDLOCAL
Exit /b 0
)
Set Div="is=#", "1/(is<<9)"
Set "DQ=1"
Set ^"count=!count:"=DQ!"
2> nul Set "null=%count:DQ=" & Set /A DQ+=1& set "null=%"
Set /A !Div:#=%DQ% %% 2! 2> nul || Set ^"input=!input:"=!"
REM handle nonhelp switches
Set "ASCIISwitch[R]="
Set "ASCIISwitch[P]="
If defined input (
Set input | %SystemRoot%\System32\findstr.exe /Elic:" /P" > nul && (
Set "input=!input:~0,-3!"
Set "ASCIISwitch[P]=true"
)
Set input | %SystemRoot%\System32\findstr.exe /Elic:" /R" > nul && (
Set "input=!input:~0,-3!"
Set "ASCIISwitch[R]=true"
))
Rem Remove outer doublequotes from input argument if not already removed due to unbalanced quoting.
If .^%input:~0,1%^%input:~-1%. == ."". Set "input=!input:~1,-1!"
Rem RemoveChar loop - iterate over input character by character; Compare against each character in whitelist
Rem Appends ASCII Whitelist characters to New string unless /R switch used, in which case NonASCII characters
Rem trigger an exit of the script with a positive errorlevel indicating the string is not ASCII.
Rem the return value is the 1 indexed position of the first non ascii character encountered.
Set "end=" & Set "New="
For /l %%i in (0 1 %SupportLength%)Do If not "!input:~%%i,1!"=="" (
Set "Char=!input:~%%i,1!"
Set "ISAscii="
For /l %%c in (0 1 94)Do If not "!ASCII:~%%c,1!" == "" (
Set "C_Char=!ASCII:~%%c,1!"
if "!Char!"=="!C_Char!" (
Set "New=!New!!Char!"
Set "ISAscii=true"
))
If Defined ASCIISwitch[R] (
If Not Defined ISAscii (
Endlocal & Endlocal & %Restore_Codepage%
For /f "delims=" %%G in ('Set /A %%i+1')Do Exit /b %%G
)))
Set "Input=!New!"
If not Defined ASCIISwitch[P] (
For /l %%i in (0 1 9)Do if defined Input Set "Input=!Input: = !"
)
If defined input (
Echo(!input!
For /l %%i in (0 1 %SupportLength%)Do If not defined Return[Len] If "!input:~%%i,1!"=="" Set "Return[Len]=%%i"
) Else (
Set "Return[Len]=0"
Set "Return[String]="
)
ENDLOCAL & ENDLOCAL & Set "Return[Len]=%Return[Len]%" & Set "Return[string]=%input%"
%Restore_Codepage%
Exit /B 0
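The core of the script above is the whitelist comparison. The sketch below condenses that idea only; it is not the answer's script. It assumes the string arrives as the first argument (a hypothetical interface), caps the length at 256 characters, and simply treats the double quote as rejected instead of handling it separately. None of the switch handling, code page handling, or alternate data stream caching is reproduced.

```batch
@echo off
setlocal DisableDelayedExpansion
rem Take the string to test from the first argument (hypothetical interface).
set "sample=%~1"
if not defined sample (
    echo Usage: pass the string to test as the first argument.
    exit /b 0
)
setlocal EnableDelayedExpansion

rem Whitelist: a space, then codes 33..126 except the double quote (34).
rem CMD /C EXIT n sets the dynamic variable =ExitCodeAscii to character n.
set "ASCII= "
for /L %%i in (33,1,126) do if not %%i==34 (
    cmd /c exit %%i
    set "ASCII=!ASCII!!=ExitCodeAscii!"
)

rem bad holds the 1-indexed position of the first character that is not
rem in the whitelist, or 0 when every character matched.
set "bad=0"
for /L %%i in (0,1,255) do if !bad!==0 if not "!sample:~%%i,1!"=="" (
    set "ch=!sample:~%%i,1!"
    set "hit="
    for /L %%c in (0,1,93) do if "!ch!"=="!ASCII:~%%c,1!" set "hit=1"
    if not defined hit set /a bad=%%i+1
)
echo First rejected position, 0 if none: !bad!
exit /b !bad!
```

The whitelist here holds 94 characters (indices 0 to 93). The exit code mirrors the convention of the /R switch described in the script's help text: 0 for a clean string, otherwise the position of the first offending character.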
【Comments】:
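Hi @T3RR0R, I tried it and it works, but I now have a new problem: when non-ASCII characters are found, I want to write to a log file. My problem is that I cannot reuse the parameter (the position of the non-ASCII character) inside a function, because I need to read every .txt file in a folder and, in each file, read every record (10 records). If a record contains non-ASCII characters, it should be written to the log file and, if possible, the non-ASCII characters replaced with spaces. Can you help me? Thanks. Based on your code: Call:IsASCII Demo Call:IsASCII Ascii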
I added my code, but it does not work: ` if %%i not equal 0 echo %Demo% >> log.txt` Can you help me? Thank you very much.
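The follow-up request (scan every .txt file in a folder and log the records that contain non-ASCII characters) could be approached roughly as below. This is a sketch under stated assumptions, not a tested solution: AsciiFilter.bat is a hypothetical file name for the answer's script saved next to this one, and input\*.txt and log.txt are placeholder paths. Note that `if %%i not equal 0` is not valid IF syntax; batch uses `if errorlevel 1` or a comparison such as `if !errorlevel! neq 0`.

```batch
@echo off
setlocal DisableDelayedExpansion

rem Hypothetical names: AsciiFilter.bat is the answer's script saved in the
rem same folder as this file; input\*.txt and log.txt are placeholders.
for %%F in ("input\*.txt") do (
    for /f "usebackq delims=" %%L in ("%%~fF") do (
        rem With /R the script returns a positive errorlevel when the
        rem record contains a character outside printable ASCII.
        call "%~dp0AsciiFilter.bat" "%%L" /R >nul
        if errorlevel 1 >>"log.txt" echo %%~nxF: %%L
    )
)
endlocal
```

Several limits apply. FOR /F skips empty lines and, by default, lines that begin with a semicolon. CALL re-parses its arguments, so records containing percent signs, carets, or double quotes may be altered before the script sees them. To obtain a cleaned copy of a record, the script's default mode (without /R) prints the string with the rejected characters stripped, which can be captured with the FOR /F pattern shown in its usage text; replacing them with spaces instead would require changing the script itself.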