我想分别获取文件名(不带扩展名)和扩展名。

到目前为止,我找到的最佳解决方案是:

NAME=`echo "$FILE" | cut -d'.' -f1`
EXTENSION=`echo "$FILE" | cut -d'.' -f2`

这是错误的,因为如果文件名包含多个,它就不起作用。字符。假设我有a.b.js,它会考虑a和b.js而不是a.b和js。

它可以在Python中用

file, ext = os.path.splitext(path)

但如果可能的话,我不希望仅仅为此启动Python解释器。

有更好的主意吗?


当前回答

如果您还想允许空扩展,这是我能想到的最短的:

echo 'hello.txt' | sed -r 's/.+\.(.+)|.*/\1/' # EXTENSION
echo 'hello.txt' | sed -r 's/(.+)\..+|(.*)/\1\2/' # FILENAME

第一行解释道:它匹配PATH.EXT或ANYTHING,并将其替换为EXT。如果匹配了ANYTHNG,则不会捕获EXT组。

其他回答

如果文件没有扩展名或文件名,这似乎不起作用。这是我正在使用的;它只使用内置文件名,并处理更多(但不是所有)病态文件名。

#!/bin/bash
for fullpath in "$@"
do
    filename="${fullpath##*/}"                      # Strip longest match of */ from start
    dir="${fullpath:0:${#fullpath} - ${#filename}}" # Substring from 0 thru pos of filename
    base="${filename%.[^.]*}"                       # Strip shortest match of . plus at least one non-dot char from end
    ext="${filename:${#base} + 1}"                  # Substring from len of base thru end
    if [[ -z "$base" && -n "$ext" ]]; then          # If we have an extension and no base, it's really the base
        base=".$ext"
        ext=""
    fi

    echo -e "$fullpath:\n\tdir  = \"$dir\"\n\tbase = \"$base\"\n\text  = \"$ext\""
done

下面是一些测试用例:

$ basename-and-extension.sh / /home/me/ /home/me/file /home/me/file.tar /home/me/file.tar.gz /home/me/.hidden /home/me/.hidden.tar /home/me/.. .
/:
    dir  = "/"
    base = ""
    ext  = ""
/home/me/:
    dir  = "/home/me/"
    base = ""
    ext  = ""
/home/me/file:
    dir  = "/home/me/"
    base = "file"
    ext  = ""
/home/me/file.tar:
    dir  = "/home/me/"
    base = "file"
    ext  = "tar"
/home/me/file.tar.gz:
    dir  = "/home/me/"
    base = "file.tar"
    ext  = "gz"
/home/me/.hidden:
    dir  = "/home/me/"
    base = ".hidden"
    ext  = ""
/home/me/.hidden.tar:
    dir  = "/home/me/"
    base = ".hidden"
    ext  = "tar"
/home/me/..:
    dir  = "/home/me/"
    base = ".."
    ext  = ""
.:
    dir  = ""
    base = "."
    ext  = ""

您可以使用

sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-

获取文件名和

sed 's/^/./' | rev | cut -d. -f1  | rev

以获得扩展。

测试用例:

echo "filename.gz"     | sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-
echo "filename.gz"     | sed 's/^/./' | rev | cut -d. -f1  | rev
echo "filename"        | sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-
echo "filename"        | sed 's/^/./' | rev | cut -d. -f1  | rev
echo "filename.tar.gz" | sed 's/^/./' | rev | cut -d. -f2- | rev | cut -c2-
echo "filename.tar.gz" | sed 's/^/./' | rev | cut -d. -f1  | rev

公认的答案在典型情况下有效,但在边缘情况下无效,即:

对于没有扩展名的文件名(在这个答案的剩余部分中称为后缀),extension=${filename##*.}返回输入文件名,而不是空字符串。extension=${filename##*.}不包括首字母。,与惯例相反。盲目地准备。不适用于没有后缀的文件名。filename=“${filename%.*}”将是空字符串,如果输入文件名以开头。并且不包含进一步的。字符(例如.bash_profile)-与惯例相反。

---------

因此,覆盖所有边缘情况的鲁棒解决方案的复杂性需要一个函数——见下面的定义;它可以返回路径的所有组件。

示例调用:

splitPath '/etc/bash.bashrc' dir fname fnameroot suffix
# -> $dir == '/etc'
# -> $fname == 'bash.bashrc'
# -> $fnameroot == 'bash'
# -> $suffix == '.bashrc'

请注意,输入路径后面的参数是自由选择的位置变量名称。要跳过不感兴趣的变量,请指定_(使用扔掉变量$_)或“”;例如,要仅提取文件名根和扩展名,请使用splitPath“/etc/bash.bashrc”_ _ fnameroot扩展名。


# SYNOPSIS
#   splitPath path varDirname [varBasename [varBasenameRoot [varSuffix]]] 
# DESCRIPTION
#   Splits the specified input path into its components and returns them by assigning
#   them to variables with the specified *names*.
#   Specify '' or throw-away variable _ to skip earlier variables, if necessary.
#   The filename suffix, if any, always starts with '.' - only the *last*
#   '.'-prefixed token is reported as the suffix.
#   As with `dirname`, varDirname will report '.' (current dir) for input paths
#   that are mere filenames, and '/' for the root dir.
#   As with `dirname` and `basename`, a trailing '/' in the input path is ignored.
#   A '.' as the very first char. of a filename is NOT considered the beginning
#   of a filename suffix.
# EXAMPLE
#   splitPath '/home/jdoe/readme.txt' parentpath fname fnameroot suffix
#   echo "$parentpath" # -> '/home/jdoe'
#   echo "$fname" # -> 'readme.txt'
#   echo "$fnameroot" # -> 'readme'
#   echo "$suffix" # -> '.txt'
#   ---
#   splitPath '/home/jdoe/readme.txt' _ _ fnameroot
#   echo "$fnameroot" # -> 'readme'  
splitPath() {
  local _sp_dirname= _sp_basename= _sp_basename_root= _sp_suffix=
    # simple argument validation
  (( $# >= 2 )) || { echo "$FUNCNAME: ERROR: Specify an input path and at least 1 output variable name." >&2; exit 2; }
    # extract dirname (parent path) and basename (filename)
  _sp_dirname=$(dirname "$1")
  _sp_basename=$(basename "$1")
    # determine suffix, if any
  _sp_suffix=$([[ $_sp_basename = *.* ]] && printf %s ".${_sp_basename##*.}" || printf '')
    # determine basename root (filemane w/o suffix)
  if [[ "$_sp_basename" == "$_sp_suffix" ]]; then # does filename start with '.'?
      _sp_basename_root=$_sp_basename
      _sp_suffix=''
  else # strip suffix from filename
    _sp_basename_root=${_sp_basename%$_sp_suffix}
  fi
  # assign to output vars.
  [[ -n $2 ]] && printf -v "$2" "$_sp_dirname"
  [[ -n $3 ]] && printf -v "$3" "$_sp_basename"
  [[ -n $4 ]] && printf -v "$4" "$_sp_basename_root"
  [[ -n $5 ]] && printf -v "$5" "$_sp_suffix"
  return 0
}

test_paths=(
  '/etc/bash.bashrc'
  '/usr/bin/grep'
  '/Users/jdoe/.bash_profile'
  '/Library/Application Support/'
  'readme.new.txt'
)

for p in "${test_paths[@]}"; do
  echo ----- "$p"
  parentpath= fname= fnameroot= suffix=
  splitPath "$p" parentpath fname fnameroot suffix
  for n in parentpath fname fnameroot suffix; do
    echo "$n=${!n}"
  done
done

执行功能的测试代码:

test_paths=(
  '/etc/bash.bashrc'
  '/usr/bin/grep'
  '/Users/jdoe/.bash_profile'
  '/Library/Application Support/'
  'readme.new.txt'
)

for p in "${test_paths[@]}"; do
  echo ----- "$p"
  parentpath= fname= fnameroot= suffix=
  splitPath "$p" parentpath fname fnameroot suffix
  for n in parentpath fname fnameroot suffix; do
    echo "$n=${!n}"
  done
done

预期输出-注意边缘情况:

没有后缀的文件名以开头的文件名。(不考虑后缀的开头)以/结尾的输入路径(忽略尾随/)仅为文件名的输入路径(.作为父路径返回)超过的文件名-前缀标记(仅最后一个被视为后缀):

----- /etc/bash.bashrc
parentpath=/etc
fname=bash.bashrc
fnameroot=bash
suffix=.bashrc
----- /usr/bin/grep
parentpath=/usr/bin
fname=grep
fnameroot=grep
suffix=
----- /Users/jdoe/.bash_profile
parentpath=/Users/jdoe
fname=.bash_profile
fnameroot=.bash_profile
suffix=
----- /Library/Application Support/
parentpath=/Library
fname=Application Support
fnameroot=Application Support
suffix=
----- readme.new.txt
parentpath=.
fname=readme.new.txt
fnameroot=readme.new
suffix=.txt
pax> echo a.b.js | sed 's/\.[^.]*$//'
a.b
pax> echo a.b.js | sed 's/^.*\.//'
js

工作正常,因此您可以使用:

pax> FILE=a.b.js
pax> NAME=$(echo "$FILE" | sed 's/\.[^.]*$//')
pax> EXTENSION=$(echo "$FILE" | sed 's/^.*\.//')
pax> echo $NAME
a.b
pax> echo $EXTENSION
js

顺便说一下,这些命令的工作原理如下。

NAME命令将一个“.”字符后接任意数量的非“.”字,直到行尾,但不包含任何内容(即,从最后的“.”到行尾,包括首尾)。这基本上是一个使用正则表达式技巧的非贪婪替换。

EXTENSION命令将任意数量的字符替换为行开头的“.”字符,而不使用任何字符(即,它删除从行开头到最后一个点的所有内容,包括所有内容)。这是一个贪婪的替代,这是默认操作。

您可以使用POSIX参数扩展的魔力:

bash-3.2$ FILENAME=somefile.tar.gz
bash-3.2$ echo "${FILENAME%%.*}"
somefile
bash-3.2$ echo "${FILENAME%.*}"
somefile.tar

如果您的文件名是这样的格式,则需要注意/somefile.tar.gz然后echo${FILENAME%%.*}会贪婪地删除最长的匹配项。你会得到空字符串。

(您可以使用临时变量解决此问题:

FULL_FILENAME=$FILENAME
FILENAME=${FULL_FILENAME##*/}
echo ${FILENAME%%.*}

)


本网站提供了更多信息。

${variable%pattern}
  Trim the shortest match from the end
${variable##pattern}
  Trim the longest match from the beginning
${variable%%pattern}
  Trim the longest match from the end
${variable#pattern}
  Trim the shortest match from the beginning