How to cope with spaces in file names when iterating results from git diff --name-only

【Title】How to cope with spaces in file names when iterating results from git diff --name-only 【Posted】2015-03-22 11:16:00 【Question】:

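The script I'm working on needs to iterate over each file reported by git diff. However, I don't know how to handle spaces in file names: any file with a space in its name is split into "2 files". I know the names need to be wrapped in " ", but I don't know how to do that before they reach the $@ parameter.

How should I iterate over the files from

git diff --name-only  $1

when the file names contain spaces?

Here is a simple test that reproduces the error: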

copyfiles()
{
    echo "Copying added files"
    # The unquoted $@ is word-split here, which is what breaks names with spaces
    for file in $@; do

        new_file=$(echo ${file##*/})

        directory=$(echo ${file%/*})
        echo "Full path is  $file"
        echo "File is  $new_file"
        echo "Directory is  $directory"
        cp $file $COPY_TO
    done
}

COPY_TO="testDir"
DIFF_FILES=$( git diff --name-only  $1)
copyfiles $DIFF_FILES

The script is currently run like this:

test.sh <git commit id>

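【Question comments】:

【Answer 1】:

The output of --name-only goes through a certain amount of escaping. Unfortunately, that escaping is awkward to work with.

The git diff documentation explains the escaping (and the alternative) under the -z option: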

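-z

When --raw, --numstat, --name-only or --name-status has been given, do not munge pathnames and use NULs as output field terminators.

Without this option, each pathname output will have TAB, LF, double quotes, and backslash characters replaced with \t, \n, \", and \\, respectively, and the pathname will be enclosed in double quotes if any of those replacements occurred.

An example: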

$ git init ugh
$ cd ugh
$ touch 'spa ce' $'new\nline' $'t\tab'
$ ls # Unhelpful really
new?line  spa ce  t?ab
$ ls --quote # Minorly helpful but wrong (for shell usage)
"new\nline"  "spa ce"  "t\tab"
$ git add -A
$ git diff --cached --name-only
"new\nline"
spa ce
"t\tab"
$ git diff --cached --name-only -z # Doesn't copy and paste well and is a bit confusing to read this way
new
line^@spa ce^@t ab^@
$ printf %q\\n "$(git diff --cached --name-only -z )"
$'new\nlinespa cet\tab'

Anyway, the point is that the best approach is to use the -z output and read the file list with read.

while IFS= read -r -d '' file; do
    printf 'file = %q\n' "$file"
done < <(git diff --cached --name-only -z)

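You could also pipe the output of git diff into the while loop, but if you need variables set inside the loop after the loop has finished, you need this process substitution approach to avoid the subshell problem of the pipe approach.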
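To make the subshell point concrete, here is a minimal sketch that counts names both ways. It assumes bash with its default settings (the lastpipe option off). The list_names function is a hypothetical stand-in for git diff --cached --name-only -z, so that the snippet runs without a repository.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `git diff --cached --name-only -z`:
# emits three NUL-terminated names, including a space, a newline and a tab.
list_names() {
    printf '%s\0' 'spa ce' $'new\nline' $'t\tab'
}

# Pipe form: the while loop runs in a subshell, so its counter is lost.
piped_count=0
list_names | while IFS= read -r -d '' file; do
    piped_count=$((piped_count + 1))
done

# Process substitution form: the loop runs in the current shell.
subst_count=0
while IFS= read -r -d '' file; do
    subst_count=$((subst_count + 1))
done < <(list_names)

echo "after pipe: $piped_count"
echo "after process substitution: $subst_count"
```

Under those assumptions the first counter is still 0 after the loop and the second is 3, which is the difference the paragraph above describes.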

【讨论】:

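This answer was very helpful. Thank you. It allowed me to pass the output of git diff --name-only -z as input to git diff/git difftool. I demonstrate that here: ***.com/a/62853776/4561887

【Answer 2】:

Use -z to make git-diff terminate each name with a NUL. For example: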

export COPY_TO
git diff -z --name-only | xargs -0 sh -c 'for file; do
    new_file=${file##*/}
    directory=${file%/*}
    echo "Full Path is is  $file"
    echo "File is  $new_file"
    echo "Directory is  $directory"
    cp "$file" "$COPY_TO"
done' sh

Note that a more reasonable solution is to reject pull requests from people who create files with spaces in their names.
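If the per-file work already lives in a bash function, one way to keep it is to export the function and let xargs -0 hand it the names. The following is only a sketch under some assumptions: copy_one is a made-up function name, the scratch directory exists purely for the demo, and printf stands in for git diff -z --name-only. Note that export -f is bash-only and xargs -0 is an extension found in GNU and BSD xargs.

```shell
#!/usr/bin/env bash
# Hypothetical per-file function; the name copy_one is illustrative only.
copy_one() {
    local file
    for file in "$@"; do
        printf 'Copying %q\n' "$file"
        cp -- "$file" "$COPY_TO"
    done
}

# Demo setup in a scratch directory (not part of the technique itself).
work=$(mktemp -d)
cd "$work" || exit 1
COPY_TO="$work/testDir"
mkdir "$COPY_TO"
touch 'spa ce' 'plain'

export COPY_TO          # the child bash needs the destination
export -f copy_one      # bash-only: make the function visible to child bash

# printf stands in for: git diff -z --name-only
printf '%s\0' 'spa ce' 'plain' |
    xargs -0 bash -c 'copy_one "$@"' bash
```

The trailing bash fills $0 of the child shell, so every file name lands in "$@". Putting the logic in a separate script avoids exporting functions altogether.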

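【Comments】:

Love the reasonable solution. I don't see why source files would end up with whitespace in their names!

Can I ask why adding "-z" to my current script doesn't work?

I can't work out how to use this in a way that lets me run a function on each file..

If you use bash, you can export the function with export -f and then xargs -0 -I bash -c 'function_name '. That calls the function once per file instead of passing multiple file names. I strongly advise against it, though, because exporting functions is quirky. Put it in a shell script instead.

Though xargs -0 bash -c 'function_name "$@"' bash is easier. That calls the function with multiple arguments. The important part is export -f function_name

【Answer 3】: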
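git diff -z --name-only |
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# -d '' makes read stop at each NUL byte
while IFS= read -r -d '' file
do
    printf '%s\n' "$file"
done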

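【Comments】:

You have to be careful: read -d only works in bash, not in sh.

【Answer 4】:

Thanks @Etan Reisner for your answer. The example below shows how to use the output of git diff --name-only -z "$merge_base" $BACKUP_BRANCH as the input, containing escaped file names, that is sent to git diff/git difftool. It needs an extra --, so see the code below.

I was able to fix my git changes program with this, so it now handles file names in a git repo that contain spaces or special characters (such as '). The program now looks like this:

Usage:

Usage: git changes <common_base> <backup_branch> [any other args to pass to git difftool]

git-changes.sh:

Pay special attention to how the files_changed_escaped variable is populated, which I learned directly from @Etan Reisner's answer.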

COMMON_BASE_BRANCH="$1"
BACKUP_BRANCH="$2"
# Obtain all but the first args; see:
# https://***.com/questions/9057387/process-all-arguments-except-the-first-one-in-a-bash-script/9057392#9057392
ARGS_3_AND_LATER="${@:3}"

merge_base="$(git merge-base $BACKUP_BRANCH $COMMON_BASE_BRANCH)"
files_changed="$(git diff --name-only "$merge_base" $BACKUP_BRANCH)"

echo "Checking for changes against backup branch \"$BACKUP_BRANCH\""
echo "only in these files which were previously-modified by that backup branch:"
echo "--- files originally changed by the backup branch: ---"
echo "$files_changed"
echo "------------------------------------------------------"
echo "Checking only these files for differences between your backup branch and your current branch."

# Now, escape the filenames so that they can be used even if they have spaces or special characters,
# such as single quotes (') in their filenames!
# See: https://***.com/questions/28109520/how-to-cope-with-spaces-in-file-names-when-iterating-results-from-git-diff-nam/28109890#28109890
files_changed_escaped=""
while IFS= read -r -d '' file; do
    escaped_filename="$(printf "%q" "$file")"
    files_changed_escaped="$files_changed_escaped    $escaped_filename"
done < <(git diff --name-only -z "$merge_base" $BACKUP_BRANCH)

# DEBUG PRINTS. COMMENT OUT WHEN DONE DEBUGGING.
echo "$files_changed_escaped"
echo "----------"
# print withOUT quotes to see if that changes things; ans: indeed, it does: this removes extra 
# spaces and I think will replace each true newline char (\n) with a single space as well 
echo $files_changed_escaped 
echo "=========="

# NB: the `--` is REQUIRED before listing all of the files to search in, or else escaped files
# that have a dash (-) in their filename confuse the `git diff` parser and the parser thinks they
# are options! It will output this error:
#       fatal: option '-\' must come before non-option arguments
# Putting the list of all escaped filenames to check AFTER the `--` forces the parser to know
# they cannot be options, because the `--` with nothing after it signifies the end of all optional
# args.
git difftool $ARGS_3_AND_LATER $BACKUP_BRANCH -- $files_changed_escaped
echo "Done."
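One caveat about the unquoted $files_changed_escaped expansion above: the shell does not re-parse backslashes that come out of a variable expansion, so a %q-escaped name containing a space is still split on that space. A bash array avoids the issue. The sketch below is only an illustration, not the original program: count_args is a hypothetical helper and printf stands in for git diff --name-only -z "$merge_base" "$BACKUP_BRANCH".

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

files=()       # one array element per file name
escaped=""     # the %q-escaped string, built as in the script above
while IFS= read -r -d '' file; do
    files+=("$file")
    escaped="$escaped $(printf '%q' "$file")"
# printf stands in for: git diff --name-only -z "$merge_base" "$BACKUP_BRANCH"
done < <(printf '%s\0' 'spa ce' "it's" '-dash')

# Unquoted string: split on whitespace, the backslashes stay literal.
n_escaped=$(count_args $escaped)
# Quoted array: exactly one argument per file.
n_array=$(count_args "${files[@]}")

echo "escaped string -> $n_escaped args, array -> $n_array args"

# With an array, the final call could be written as:
#   git difftool "${@:3}" "$BACKUP_BRANCH" -- "${files[@]}"
```

Three names go in; the escaped string expands to four arguments while the array yields three. The -- separator is still needed so that a name starting with a dash is not taken for an option.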

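You can download the git changes program here, as part of my dotfiles project: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles.

It also contains things such as git diffn, which is git diff with line numbers.

【Comments】:

【Answer 5】:

I think your code needs this command: IFS=$'\n'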

echo "this command is important"

IFS=$'\n'
for file_change in `git diff --name-only $1`
do
    echo "Put $file_change ..."

    # File Name
    fileName=$(basename "$file_change")
    echo "$fileName"

    # Directory
    dir=$(dirname "$file_change")
    echo "$dir"
    

    # copy file
    cp "$file_change" "$REMOTE_DIR$file_change"
done
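Splitting on newlines copes with spaces, but a name that itself contains a newline, or one that git prints in its quoted form, would still come through wrong. If bash 4.4 or later is available (an assumption, since mapfile -d is not in older versions), a sketch of the same loop over NUL-separated names could look like this, with printf standing in for git diff --name-only -z "$1" and made-up paths:

```shell
#!/usr/bin/env bash
# Assumes bash 4.4+ for `mapfile -d`.
# printf stands in for: git diff --name-only -z "$1"
mapfile -d '' -t changed < <(
    printf '%s\0' 'dir one/spa ce.txt' $'dir two/new\nline.txt'
)

for file_change in "${changed[@]}"; do
    fileName=${file_change##*/}   # like basename, without a subprocess
    dir=${file_change%/*}         # like dirname for paths that contain a slash
    printf 'file: %q  dir: %q\n' "$fileName" "$dir"
done
```

Each array element holds exactly one path, so "$file_change" can be passed to cp as a single argument.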

【Comments】:
