x=$(find . -name "*.txt")
echo $x

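If I run the snippet above in a Bash shell, I get a single string containing several filenames separated by whitespace, not a list.

Of course, I could split the string on whitespace to get a list, but I am sure there is a better way to do it.

So, what is the best way to loop over the results of a find command?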


Current answer

TL;DR: If you're just here for the most correct answer, you probably want my personal preference (see the bottom of this post):

# execute `process` once for each file
find . -name '*.txt' -exec process {} \;

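If you have time, read through the rest to see several different approaches and the problems with most of them.


The full answer:

The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files: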

for i in $x; do # Not recommended, will break on whitespace
    process "$i"
done

Marginally better, cut out the temporary variable x:

for i in $(find . -name \*.txt); do # Not recommended, will break on whitespace
    process "$i"
done
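
To make the whitespace problem concrete, here is a minimal sketch. The temporary directory and the file name "my notes.txt" are made up for illustration, and it assumes the path returned by mktemp contains no whitespace itself.

```shell
# Hypothetical demo directory holding ONE file with a space in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/my notes.txt"

count=0
for i in $(find "$demo_dir" -name '*.txt'); do   # unquoted: word splitting applies
    count=$((count + 1))
    printf 'item %d: %s\n' "$count" "$i"
done
printf 'loop ran %d times for 1 file\n' "$count"

rm -rf "$demo_dir"
```

The single file shows up as two loop items, because the output of find is split at the space.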

It is much better to glob when you can. Whitespace-safe, for files in the current directory:

for i in *.txt; do # Whitespace-safe but not recursive.
    process "$i"
done

By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:

# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
    process "$i"
done
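
As a rough check of the recursive glob, the sketch below builds a throwaway tree and collects the matches in an array. The directory layout and file names are invented for the example, and globstar needs bash 4.0 or later.

```shell
shopt -s globstar                      # requires bash 4.0 or later
demo_dir=$(mktemp -d)                  # hypothetical demo tree
mkdir -p "$demo_dir/sub dir"
touch "$demo_dir/top.txt" "$demo_dir/sub dir/deep file.txt"

matches=()
for i in "$demo_dir"/**/*.txt; do      # ** also matches zero directories
    matches+=("$i")
done

printf 'matched %d files\n' "${#matches[@]}"
printf '  %s\n' "${matches[@]}"
rm -rf "$demo_dir"
```

Both the top-level file and the file inside the directory with a space in its name are matched, each as a single array element.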

On occasion, e.g. if the file names are already in a file, you may need to use read:

# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
    process "$line"
done < filename

read can be used safely in combination with find by setting the delimiter appropriately:

find . -name '*.txt' -print0 | 
    while IFS= read -r -d '' line; do 
        process "$line"
    done

For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:

# execute `process` once for each file
find . -name \*.txt -exec process {} \;

# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +

# using xargs*
find . -name \*.txt -print0 | xargs -0 process

# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument

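find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompting before running the command for each file) by using -ok instead of -exec (or -okdir instead of -execdir).

*: Technically, both find and xargs (by default) run the command with as many arguments as fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files, this won't matter; if you exceed the length limit but need all the files on the same command line, you are out of luck and will have to find a different way.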
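
The batching described in the footnote is hard to trigger with real limits, so this sketch forces it artificially with xargs -n 2, which caps each invocation at two arguments. The file names are placeholders, and echo process stands in for a real process command.

```shell
# Five NUL-terminated names, at most two arguments per invocation.
output=$(printf '%s\0' a.txt b.txt c.txt d.txt e.txt | xargs -0 -n 2 echo process)
printf '%s\n' "$output"
# process a.txt b.txt
# process c.txt d.txt
# process e.txt
```

Each output line corresponds to one run of the command, which is the same thing find -exec ... + and plain xargs do when the argument list grows too long for a single command line.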

Other answers

You can store your find output in an array if you wish to use the output later:

array=($(find . -name "*.txt"))

Now, to print each element on a new line, you can either iterate over all the elements of the array with a for loop, or use a printf statement.

for i in "${array[@]}"; do echo "$i"; done

or

printf '%s\n' "${array[@]}"

You can also use:

for file in "`find . -name "*.txt"`"; do echo "$file"; done

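This prints each filename on its own line. Note that because the command substitution is quoted, the loop body runs only once, with the entire output of find as a single string.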
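
A quick way to see what the quoted form actually iterates over is to count iterations. This is only a sketch with a made-up temporary directory and two placeholder files.

```shell
demo_dir=$(mktemp -d)                  # hypothetical demo directory
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

iterations=0
for file in "$(find "$demo_dir" -name '*.txt')"; do
    # $file holds BOTH paths here, separated by a newline
    iterations=$((iterations + 1))
done
printf 'files: 2, loop iterations: %d\n' "$iterations"

rm -rf "$demo_dir"
```

The loop runs once for two files, so this form is fine for printing but not for processing files one at a time.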

To print the find output only as a list, you can use either of the following:

find . -name "*.txt" -print 2>/dev/null

or

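find . -name "*.txt" -print 2>&1 | grep -v 'Permission denied'

This removes the error messages and outputs only the filenames, one per line.

If you want to do something with the filenames later, storing them in an array is a good idea; otherwise there is no need to use that space, and you can print directly from find.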
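
The array assignment above relies on word splitting, so a name containing whitespace ends up as several elements. One possible alternative, assuming bash 4.4 or later (for mapfile -d), is to read NUL-delimited output from find. The directory and file names below are invented for the sketch.

```shell
demo_dir=$(mktemp -d)                  # hypothetical demo directory
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt"

# -d '' splits on NUL bytes, matching find -print0; -t drops the delimiter.
mapfile -d '' -t array < <(find "$demo_dir" -name '*.txt' -print0)

printf 'found %d files\n' "${#array[@]}"
for f in "${array[@]}"; do
    printf '%s\n' "$f"
done

rm -rf "$demo_dir"
```

Each file stays a single array element regardless of spaces or newlines in its name.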

Filenames can include spaces and even control characters. Spaces are (by default) delimiters for shell expansion in bash, and as a result x=$(find . -name "*.txt") from the question is not recommended at all. If find returns a filename with spaces, e.g. "the file.txt", you will get two separate strings when you process x in a loop. You can improve this by changing the delimiter (the bash IFS variable), e.g. to \r\n, but filenames can include control characters, so this is not a (completely) safe method.

From my point of view, there are two recommended (and safe) patterns for processing files:

1. for loop and filename expansion:

for file in ./*.txt; do
    [[ ! -e $file ]] && continue  # continue, if file does not exist
    # single filename is in $file
    echo "$file"
    # your code here
done

2. find-read-while and process substitution:

while IFS= read -r -d '' file; do
    # single filename is in $file
    echo "$file"
    # your code here
done < <(find . -name "*.txt" -print0)

Remarks

Pattern 1:

- bash returns the search pattern ("*.txt") if no matching file is found, so the extra line "continue, if file does not exist" is needed. See Bash Manual, Filename Expansion.
- The shell option nullglob can be used to avoid this extra line.
- "If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from Bash Manual above)
- Shell option globstar: "If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match." See Bash Manual, Shopt Builtin.
- Other options for filename expansion: extglob, nocaseglob, dotglob and the shell variable GLOBIGNORE.
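
The effect of nullglob mentioned in the remarks can be seen by globbing in an empty directory. This sketch assumes neither nullglob nor failglob is already set when it starts; the temporary directory is just a stand-in.

```shell
demo_dir=$(mktemp -d)                  # empty on purpose: nothing matches *.txt

# Default behaviour: the unmatched pattern is passed through literally.
without_nullglob=0
for file in "$demo_dir"/*.txt; do
    without_nullglob=$((without_nullglob + 1))
done

# With nullglob the unmatched pattern expands to nothing.
shopt -s nullglob
with_nullglob=0
for file in "$demo_dir"/*.txt; do
    with_nullglob=$((with_nullglob + 1))
done
shopt -u nullglob

printf 'default: %d iteration(s), nullglob: %d iteration(s)\n' \
    "$without_nullglob" "$with_nullglob"
rm -rf "$demo_dir"
```

Without nullglob the loop body runs once on the literal pattern, which is why the [[ ! -e $file ]] guard is needed in that case.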

Pattern 2:

- Filenames can contain blanks, tabs, spaces, newlines, ... To process filenames in a safe way, find with -print0 is used: the filename is printed with all control characters and terminated with NUL. See also Gnu Findutils Manpage, Unsafe File Name Handling, safe File Name Handling, unusual characters in filenames. See David A. Wheeler below for detailed discussion of this topic.
- There are some possible patterns to process find results in a while loop. Others (kevin, David W.) have shown how to do this using pipes:

files_found=1
find . -name "*.txt" -print0 |
    while IFS= read -r -d '' file; do
        # single filename in $file
        echo "$file"
        files_found=0   # not working example
        # your code here
    done
[[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"

When you try this piece of code, you will see that it does not work: files_found is always "true" and the code will always echo "no files found". The reason is that each command of a pipeline is executed in a separate subshell, so the variable changed inside the loop (separate subshell) does not change the variable in the main shell script. This is why I recommend using process substitution as the "better", more useful, more general pattern. See I set variables in a loop that's in a pipeline. Why do they disappear... (from Greg's Bash FAQ) for a detailed discussion on this topic.
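
To compare the two patterns side by side, the sketch below counts files once through a pipeline and once through process substitution. The directory and file names are made up, and it assumes the lastpipe shell option is off, which is the bash default.

```shell
demo_dir=$(mktemp -d)                  # hypothetical demo directory
touch "$demo_dir/a.txt" "$demo_dir/b c.txt"

# Pipeline: the while loop runs in a subshell, so the counter update is lost.
piped=0
find "$demo_dir" -name '*.txt' -print0 |
    while IFS= read -r -d '' file; do
        piped=$((piped + 1))
    done

# Process substitution: the loop runs in the current shell.
substituted=0
while IFS= read -r -d '' file; do
    substituted=$((substituted + 1))
done < <(find "$demo_dir" -name '*.txt' -print0)

printf 'pipeline: %d, process substitution: %d\n' "$piped" "$substituted"
rm -rf "$demo_dir"
```

The pipeline counter stays at 0 while the process substitution counter reaches 2, which matches the explanation about subshells.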

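Additional references and sources:

- Gnu Bash Manual, Pattern Matching
- Filenames and Pathnames in Shell: How to do it Correctly, David A. Wheeler
- Greg's Wiki, Why you don't read lines with "for"
- Greg's Wiki, Why you shouldn't parse the output of ls(1)
- Gnu Bash Manual, Process Substitution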

# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
  process_one $x
done

or

# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one

I like to use find with its output first assigned to a variable and IFS switched to newline, as follows:

FilesFound=$(find . -name "*.txt")

IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
    echo "${counter}: ${file}"
    let counter++;
done
IFS="$IFSbkp"

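As @Konrad Rudolph commented, this will not work with newlines in file names. I still think it is handy, as it covers most of the cases where you need to loop over command output.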
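
For anyone curious what the newline limitation looks like in practice, here is a small sketch. The file name is deliberately unusual and invented for the demo, and it assumes the filesystem allows newlines in names, as typical Linux and macOS filesystems do.

```shell
demo_dir=$(mktemp -d)                          # hypothetical demo directory
touch "$demo_dir/first"$'\n'"second.txt"       # ONE file, newline in its name

FilesFound=$(find "$demo_dir" -name '*.txt')

IFSbkp="$IFS"
IFS=$'\n'
counter=0
for file in $FilesFound; do
    counter=$((counter + 1))
done
IFS="$IFSbkp"

printf 'files: 1, loop iterations: %d\n' "$counter"
rm -rf "$demo_dir"
```

The single file is seen as two items, so the approach is convenient only when such names can be ruled out.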

Based on other answers and the comment by @phk, using fd #3 (which still allows using stdin inside the loop):

while IFS= read -r f <&3; do
    echo "$f"

done 3< <(find . -iname "*filename*")
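
Here is one way to check that stdin really stays free inside such a loop. In this sketch a here-string plays the role of the user's input, and the directory and file name are placeholders.

```shell
demo_dir=$(mktemp -d)                  # hypothetical demo directory
touch "$demo_dir/one.txt"

answers=()
while IFS= read -r f <&3; do
    # File names arrive on fd 3, so this read still sees the loop's stdin.
    IFS= read -r reply
    answers+=("$f -> $reply")
done 3< <(find "$demo_dir" -name '*.txt') <<< 'yes'

printf '%s\n' "${answers[@]}"
rm -rf "$demo_dir"
```

If the file names were fed through stdin instead, the inner read would consume file names rather than the intended input.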