你最喜欢在Bash中处理错误的方法是什么? 我在网上找到的处理错误的最好的例子是William Shotts, Jr在http://www.linuxcommand.org上写的。

他建议在Bash中使用以下函数进行错误处理:

#!/bin/bash

# A slicker error handling routine

# I put a variable in my scripts named PROGNAME which
# holds the name of the program being run.  You can get this
# value from the first item on the command line ($0).

# Reference: This was copied from <http://www.linuxcommand.org/wss0150.php>

PROGNAME=$(basename $0)

function error_exit
{

#   ----------------------------------------------------------------
#   Function for exit due to fatal program error
#       Accepts 1 argument:
#           string containing descriptive error message
#   ---------------------------------------------------------------- 

    echo "${PROGNAME}: ${1:-"Unknown Error"}" 1>&2
    exit 1
}

# Example call of the error_exit function.  Note the inclusion
# of the LINENO environment variable.  It contains the current
# line number.

echo "Example of error with line number and message"
error_exit "$LINENO: An error has occurred."

在Bash脚本中是否有更好的错误处理例程?


当前回答

我更喜欢简单的称呼。所以我用了一些看起来有点复杂,但很容易使用的东西。我通常只是复制并粘贴下面的代码到我的脚本。代码后面有解释。

#This function is used to cleanly exit any script. It does this displaying a
# given error message, and exiting with an error code.
function error_exit {
    echo
    echo "$@"
    exit 1
}
#Trap the killer signals so that we can exit with a good message.
trap "error_exit 'Received signal SIGHUP'" SIGHUP
trap "error_exit 'Received signal SIGINT'" SIGINT
trap "error_exit 'Received signal SIGTERM'" SIGTERM

#Alias the function so that it will print a message with the following format:
#prog-name(@line#): message
#We have to explicitly allow aliases, we do this because they make calling the
#function much easier (see example).
shopt -s expand_aliases
alias die='error_exit "Error ${0}(@`echo $(( $LINENO - 1 ))`):"'

I usually put a call to the cleanup function in side the error_exit function, but this varies from script to script so I left it out. The traps catch the common terminating signals and make sure everything gets cleaned up. The alias is what does the real magic. I like to check everything for failure. So in general I call programs in an "if !" type statement. By subtracting 1 from the line number the alias will tell me where the failure occurred. It is also dead simple to call, and pretty much idiot proof. Below is an example (just replace /bin/false with whatever you are going to call).

#This is an example useage, it will print out
#Error prog-name (@1): Who knew false is false.
if ! /bin/false ; then
    die "Who knew false is false."
fi

其他回答

使用陷阱并不总是一个选择。例如,如果您正在编写某种需要错误处理的可重用函数,并且可以从任何脚本调用(在使用helper函数源文件之后),则该函数不能假设外部脚本的退出时间,这使得使用trap非常困难。使用陷阱的另一个缺点是糟糕的可组合性,因为您可能会覆盖之前可能在调用者链中较早设置的陷阱。

有一个小技巧可以用来在没有陷阱的情况下进行正确的错误处理。正如你可能已经从其他答案中知道的,如果你在命令后使用||操作符,set -e在命令中不起作用,即使你在子shell中运行它们;例如,这行不通:

#!/bin/sh

# prints:
#
# --> outer
# --> inner
# ./so_1.sh: line 16: some_failed_command: command not found
# <-- inner
# <-- outer

set -e

outer() {
  echo '--> outer'
  (inner) || {
    exit_code=$?
    echo '--> cleanup'
    return $exit_code
  }
  echo '<-- outer'
}

inner() {
  set -e
  echo '--> inner'
  some_failed_command
  echo '<-- inner'
}

outer

但是需要||操作符来防止在清理之前从外部函数返回。诀窍是在后台运行内部命令,然后立即等待它。wait内置函数将返回内部命令的退出码,现在你在wait后使用||,而不是内部函数,所以set -e在后者中正常工作:

#!/bin/sh

# prints:
#
# --> outer
# --> inner
# ./so_2.sh: line 27: some_failed_command: command not found
# --> cleanup

set -e

outer() {
  echo '--> outer'
  inner &
  wait $! || {
    exit_code=$?
    echo '--> cleanup'
    return $exit_code
  }
  echo '<-- outer'
}

inner() {
  set -e
  echo '--> inner'
  some_failed_command
  echo '<-- inner'
}

outer

下面是基于这个思想的泛型函数。如果你删除本地关键字,它应该在所有posix兼容的shell中工作,即用x=y替换所有本地x=y:

# [CLEANUP=cleanup_cmd] run cmd [args...]
#
# `cmd` and `args...` A command to run and its arguments.
#
# `cleanup_cmd` A command that is called after cmd has exited,
# and gets passed the same arguments as cmd. Additionally, the
# following environment variables are available to that command:
#
# - `RUN_CMD` contains the `cmd` that was passed to `run`;
# - `RUN_EXIT_CODE` contains the exit code of the command.
#
# If `cleanup_cmd` is set, `run` will return the exit code of that
# command. Otherwise, it will return the exit code of `cmd`.
#
run() {
  local cmd="$1"; shift
  local exit_code=0

  local e_was_set=1; if ! is_shell_attribute_set e; then
    set -e
    e_was_set=0
  fi

  "$cmd" "$@" &

  wait $! || {
    exit_code=$?
  }

  if [ "$e_was_set" = 0 ] && is_shell_attribute_set e; then
    set +e
  fi

  if [ -n "$CLEANUP" ]; then
    RUN_CMD="$cmd" RUN_EXIT_CODE="$exit_code" "$CLEANUP" "$@"
    return $?
  fi

  return $exit_code
}


is_shell_attribute_set() { # attribute, like "x"
  case "$-" in
    *"$1"*) return 0 ;;
    *)    return 1 ;;
  esac
}

用法示例:

#!/bin/sh
set -e

# Source the file with the definition of `run` (previous code snippet).
# Alternatively, you may paste that code directly here and comment the next line.
. ./utils.sh


main() {
  echo "--> main: $@"
  CLEANUP=cleanup run inner "$@"
  echo "<-- main"
}


inner() {
  echo "--> inner: $@"
  sleep 0.5; if [ "$1" = 'fail' ]; then
    oh_my_god_look_at_this
  fi
  echo "<-- inner"
}


cleanup() {
  echo "--> cleanup: $@"
  echo "    RUN_CMD = '$RUN_CMD'"
  echo "    RUN_EXIT_CODE = $RUN_EXIT_CODE"
  sleep 0.3
  echo '<-- cleanup'
  return $RUN_EXIT_CODE
}

main "$@"

运行示例:

$ ./so_3 fail; echo "exit code: $?"

--> main: fail
--> inner: fail
./so_3: line 15: oh_my_god_look_at_this: command not found
--> cleanup: fail
    RUN_CMD = 'inner'
    RUN_EXIT_CODE = 127
<-- cleanup
exit code: 127

$ ./so_3 pass; echo "exit code: $?"

--> main: pass
--> inner: pass
<-- inner
--> cleanup: pass
    RUN_CMD = 'inner'
    RUN_EXIT_CODE = 0
<-- cleanup
<-- main
exit code: 0

在使用此方法时,您需要注意的唯一一件事是,您传递给运行的命令对Shell变量所做的所有修改都不会传播到调用函数,因为该命令运行在子Shell中。

不确定这是否对您有帮助,但我修改了这里的一些建议函数,以便在其中包括错误检查(先前命令的退出代码)。 在每次“检查”中,我还将错误的“消息”作为参数传递给日志记录。

#!/bin/bash

error_exit()
{
    if [ "$?" != "0" ]; then
        log.sh "$1"
        exit 1
    fi
}

现在要在同一个脚本中调用它(或者在另一个脚本中,如果我使用export -f error_exit),我只需写函数名并传递一个消息作为参数,如下所示:

#!/bin/bash

cd /home/myuser/afolder
error_exit "Unable to switch to folder"

rm *
error_exit "Unable to delete all files"

使用这个,我能够为一些自动化进程创建一个真正健壮的bash文件,它将在错误的情况下停止并通知我(log.sh将这样做)

另一个需要考虑的问题是返回的退出码。只有“1”是相当标准的,尽管bash本身使用了少数保留的退出码,并且同一页认为用户定义的代码应该在64-113范围内,以符合C/ c++标准。

你也可以考虑使用位向量方法,挂载使用它的退出码:

 0  success
 1  incorrect invocation or permissions
 2  system error (out of memory, cannot fork, no more loop devices)
 4  internal mount bug or missing nfs support in mount
 8  user interrupt
16  problems writing or locking /etc/mtab
32  mount failure
64  some mount succeeded

OR-ing代码可以让脚本显示多个同时发生的错误。

我使用以下陷阱代码,它还允许通过管道和“时间”命令跟踪错误

#!/bin/bash
set -o pipefail  # trace ERR through pipes
set -o errtrace  # trace ERR through 'time command' and other functions
function error() {
    JOB="$0"              # job name
    LASTLINE="$1"         # line of error occurrence
    LASTERR="$2"          # error code
    echo "ERROR in ${JOB} : line ${LASTLINE} with exit code ${LASTERR}"
    exit 1
}
trap 'error ${LINENO} ${?}' ERR

我更喜欢简单的称呼。所以我用了一些看起来有点复杂,但很容易使用的东西。我通常只是复制并粘贴下面的代码到我的脚本。代码后面有解释。

#This function is used to cleanly exit any script. It does this displaying a
# given error message, and exiting with an error code.
function error_exit {
    echo
    echo "$@"
    exit 1
}
#Trap the killer signals so that we can exit with a good message.
trap "error_exit 'Received signal SIGHUP'" SIGHUP
trap "error_exit 'Received signal SIGINT'" SIGINT
trap "error_exit 'Received signal SIGTERM'" SIGTERM

#Alias the function so that it will print a message with the following format:
#prog-name(@line#): message
#We have to explicitly allow aliases, we do this because they make calling the
#function much easier (see example).
shopt -s expand_aliases
alias die='error_exit "Error ${0}(@`echo $(( $LINENO - 1 ))`):"'

I usually put a call to the cleanup function in side the error_exit function, but this varies from script to script so I left it out. The traps catch the common terminating signals and make sure everything gets cleaned up. The alias is what does the real magic. I like to check everything for failure. So in general I call programs in an "if !" type statement. By subtracting 1 from the line number the alias will tell me where the failure occurred. It is also dead simple to call, and pretty much idiot proof. Below is an example (just replace /bin/false with whatever you are going to call).

#This is an example useage, it will print out
#Error prog-name (@1): Who knew false is false.
if ! /bin/false ; then
    die "Who knew false is false."
fi