有什么快速而简单的方法可以确保在给定时间内只有一个shell脚本实例在运行?


当前回答

测试“锁文件”是否存在的简单方法是有缺陷的。

为什么?因为它们不会检查文件是否存在,也不会在单个原子操作中创建文件。因为这个;有一个竞争条件会使你的互斥失败。

相反,您可以使用mkdir。Mkdir创建一个目录,如果它还不存在,如果它存在,它设置一个退出码。更重要的是,它在一个原子动作中完成了所有这些操作,这使得它非常适合这个场景。

if ! mkdir /tmp/myscript.lock 2>/dev/null; then
    echo "Myscript is already running." >&2
    exit 1
fi

有关所有细节,请参阅优秀的BashFAQ: http://mywiki.wooledge.org/BashFAQ/045

如果你想要处理陈旧的锁,fuser(1)会派上用场。唯一的缺点是这个操作大约需要一秒钟,所以它不是即时的。

下面是我曾经写过的一个函数,它使用fuser解决了这个问题:

#       mutex file
#
# Open a mutual exclusion lock on the file, unless another process already owns one.
#
# If the file is already locked by another process, the operation fails.
# This function defines a lock on a file as having a file descriptor open to the file.
# This function uses FD 9 to open a lock on the file.  To release the lock, close FD 9:
# exec 9>&-
#
mutex() {
    local file=$1 pid pids 

    exec 9>>"$file"
    { pids=$(fuser -f "$file"); } 2>&- 9>&- 
    for pid in $pids; do
        [[ $pid = $$ ]] && continue

        exec 9>&- 
        return 1 # Locked by a pid.
    done 
}

你可以像这样在脚本中使用它:

mutex /var/run/myscript.lock || { echo "Already running." >&2; exit 1; }

如果您不关心可移植性(这些解决方案应该适用于几乎任何UNIX机器),Linux的fuser(1)提供了一些额外的选项,还有flock(1)。

其他回答

我使用一种简单的方法来处理过期的锁文件。

注意,上面的一些解决方案存储pid,忽略了pid可以环绕的事实。因此,仅仅检查是否有一个有效的进程与存储的pid是不够的,特别是对于长时间运行的脚本。

我使用noclobber来确保一次只能打开一个脚本并写入锁文件。此外,我在锁文件中存储了足够的信息来惟一地标识一个进程。我定义了一组数据来唯一地标识一个进程为pid、ppid、lstart。

当一个新脚本启动时,如果它未能创建锁文件,那么它将验证创建锁文件的进程是否仍然存在。如果不是,我们假设原始进程不体面地死亡,并留下一个过时的锁文件。然后,新脚本获得锁文件的所有权,一切又恢复正常了。

应该与跨多个平台的多个shell一起工作。快速、便携、简单。

#!/usr/bin/env sh
# Author: rouble

LOCKFILE=/var/tmp/lockfile #customize this line

trap release INT TERM EXIT

# Creates a lockfile. Sets global variable $ACQUIRED to true on success.
# 
# Returns 0 if it is successfully able to create lockfile.
acquire () {
    set -C #Shell noclobber option. If file exists, > will fail.
    UUID=`ps -eo pid,ppid,lstart $$ | tail -1`
    if (echo "$UUID" > "$LOCKFILE") 2>/dev/null; then
        ACQUIRED="TRUE"
        return 0
    else
        if [ -e $LOCKFILE ]; then 
            # We may be dealing with a stale lock file.
            # Bring out the magnifying glass. 
            CURRENT_UUID_FROM_LOCKFILE=`cat $LOCKFILE`
            CURRENT_PID_FROM_LOCKFILE=`cat $LOCKFILE | cut -f 1 -d " "`
            CURRENT_UUID_FROM_PS=`ps -eo pid,ppid,lstart $CURRENT_PID_FROM_LOCKFILE | tail -1`
            if [ "$CURRENT_UUID_FROM_LOCKFILE" == "$CURRENT_UUID_FROM_PS" ]; then 
                echo "Script already running with following identification: $CURRENT_UUID_FROM_LOCKFILE" >&2
                return 1
            else
                # The process that created this lock file died an ungraceful death. 
                # Take ownership of the lock file.
                echo "The process $CURRENT_UUID_FROM_LOCKFILE is no longer around. Taking ownership of $LOCKFILE"
                release "FORCE"
                if (echo "$UUID" > "$LOCKFILE") 2>/dev/null; then
                    ACQUIRED="TRUE"
                    return 0
                else
                    echo "Cannot write to $LOCKFILE. Error." >&2
                    return 1
                fi
            fi
        else
            echo "Do you have write permissons to $LOCKFILE ?" >&2
            return 1
        fi
    fi
}

# Removes the lock file only if this script created it ($ACQUIRED is set), 
# OR, if we are removing a stale lock file (first parameter is "FORCE") 
release () {
    #Destroy lock file. Take no prisoners.
    if [ "$ACQUIRED" ] || [ "$1" == "FORCE" ]; then
        rm -f $LOCKFILE
    fi
}

# Test code
# int main( int argc, const char* argv[] )
echo "Acquring lock."
acquire
if [ $? -eq 0 ]; then 
    echo "Acquired lock."
    read -p "Press [Enter] key to release lock..."
    release
    echo "Released lock."
else
    echo "Unable to acquire lock."
fi

试试下面的方法,

ab=`ps -ef | grep -v grep | grep -wc processname`

然后使用if循环将变量与1匹配。

实际上,尽管bmdhacks的答案几乎很好,但在第一次检查锁文件之后和在写入它之前,第二个脚本有轻微的机会运行。它们都将写入锁文件,它们都将运行。以下是如何确保它有效的方法:

lockfile=/var/lock/myscript.lock

if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null ; then
  trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
else
  # or you can decide to skip the "else" part if you want
  echo "Another instance is already running!"
fi

如果文件已经存在,noclobber选项将确保重定向命令将失败。因此,重定向命令实际上是原子的—您用一个命令写入和检查文件。你不需要在文件的末尾删除锁文件-它会被陷阱删除。我希望这对以后读到它的人有帮助。

另外,我没有看到Mikel已经正确地回答了这个问题,尽管他没有包括trap命令,以减少使用Ctrl-C停止脚本后留下锁文件的机会。这就是完整的解

对于shell脚本,我倾向于使用mkdir而不是flock,因为它使锁更可移植。

不管怎样,使用set -e是不够的。它只在任何命令失败时退出脚本。你的锁还是会留下的。

为了正确的锁清理,你真的应该把你的陷阱设置成这样的伪代码(提取,简化和未经测试,但来自积极使用的脚本):

#=======================================================================
# Predefined Global Variables
#=======================================================================

TMPDIR=/tmp/myapp
[[ ! -d $TMP_DIR ]] \
    && mkdir -p $TMP_DIR \
    && chmod 700 $TMPDIR

LOCK_DIR=$TMP_DIR/lock

#=======================================================================
# Functions
#=======================================================================

function mklock {
    __lockdir="$LOCK_DIR/$(date +%s.%N).$$" # Private Global. Use Epoch.Nano.PID

    # If it can create $LOCK_DIR then no other instance is running
    if $(mkdir $LOCK_DIR)
    then
        mkdir $__lockdir  # create this instance's specific lock in queue
        LOCK_EXISTS=true  # Global
    else
        echo "FATAL: Lock already exists. Another copy is running or manually lock clean up required."
        exit 1001  # Or work out some sleep_while_execution_lock elsewhere
    fi
}

function rmlock {
    [[ ! -d $__lockdir ]] \
        && echo "WARNING: Lock is missing. $__lockdir does not exist" \
        || rmdir $__lockdir
}

#-----------------------------------------------------------------------
# Private Signal Traps Functions {{{2
#
# DANGER: SIGKILL cannot be trapped. So, try not to `kill -9 PID` or 
#         there will be *NO CLEAN UP*. You'll have to manually remove 
#         any locks in place.
#-----------------------------------------------------------------------
function __sig_exit {

    # Place your clean up logic here 

    # Remove the LOCK
    [[ -n $LOCK_EXISTS ]] && rmlock
}

function __sig_int {
    echo "WARNING: SIGINT caught"    
    exit 1002
}

function __sig_quit {
    echo "SIGQUIT caught"
    exit 1003
}

function __sig_term {
    echo "WARNING: SIGTERM caught"    
    exit 1015
}

#=======================================================================
# Main
#=======================================================================

# Set TRAPs
trap __sig_exit EXIT    # SIGEXIT
trap __sig_int INT      # SIGINT
trap __sig_quit QUIT    # SIGQUIT
trap __sig_term TERM    # SIGTERM

mklock

# CODE

exit # No need for cleanup code here being in the __sig_exit trap function

接下来会发生什么。所有陷阱都会产生一个出口,所以__sig_exit函数总是会发生(除非SIGKILL),它会清理你的锁。

注意:我的退出值不是低值。为什么?各种批处理系统生成或期望数字0到31。将它们设置为其他内容,我可以让我的脚本和批处理流对前一个批处理作业或脚本做出相应的反应。

如果flock的限制,这已经在这篇文章的其他地方描述过了,对你来说不是问题,那么这应该是有效的:

#!/bin/bash

{
    # exit if we are unable to obtain a lock; this would happen if 
    # the script is already running elsewhere
    # note: -x (exclusive) is the default
    flock -n 100 || exit

    # put commands to run here
    sleep 100
} 100>/tmp/myjob.lock