What is a quick and easy way to make sure that only one instance of a shell script is running at a given time?


Current answer

If you don't want to, or can't, use flock (for example, you are not using a shared filesystem), consider using an external service such as lockable.

It exposes advisory locking primitives, much like flock does. In particular, you can acquire a lock via:

https://lockable.dev/api/acquire/my-lock-name

and then release it via:

https://lockable.dev/api/release/my-lock-name

By wrapping your script's execution between lock acquisition and release, you can ensure that only one instance of the process is running at any given time.
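For illustration, a minimal wrapper around those two endpoints might look like the sketch below. The use of curl, the handling of the response body, and the trap-based release are assumptions made for the example; check lockable's documentation for the actual response format.

#!/bin/bash
# Hypothetical wrapper around the lockable.dev endpoints shown above.
# ASSUMPTION: the acquire endpoint signals success in its response body
# (checked here for the substring "true"); the real API contract may differ.

LOCK_NAME="my-lock-name"
BASE="https://lockable.dev/api"

response=$(curl -fsS "${BASE}/acquire/${LOCK_NAME}") || exit 1

case "$response" in
  *true*) ;;                       # assumed success marker
  *) echo "Lock '${LOCK_NAME}' is held by another instance, exiting" >&2
     exit 1 ;;
esac

# Release the lock on any exit, even if the work below fails.
trap 'curl -fsS "${BASE}/release/${LOCK_NAME}" > /dev/null' EXIT

# ... the actual work goes here ...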

Other answers

An example with flock(1) but without a subshell. The flock()ed file /tmp/foo is never removed, but that does not matter, because it gets flock()ed and un-flock()ed.

#!/bin/bash

exec 9<> /tmp/foo
flock -n 9
RET=$?
if [[ $RET -ne 0 ]] ; then
    echo "lock failed, exiting"
    exit 1
fi

#Now we are inside the "critical section"
echo "inside lock"
sleep 5
exec 9>&- #close fd 9, and release lock

#The part below is outside the critical section (the lock)
echo "lock released"
sleep 5

Another option is to use the shell's noclobber option by running set -C. Then > will fail if the file already exists.

In short:

set -C
lockfile="/tmp/locktest.lock"
if echo "$$" > "$lockfile"; then
    echo "Successfully acquired lock"
    # do work
    rm "$lockfile"    # XXX or via trap - see below
else
    echo "Cannot acquire lock - already locked by $(cat "$lockfile")"
fi

This causes the shell to call:

open(pathname, O_CREAT|O_EXCL)

which atomically creates the file and fails if the file already exists.


According to a comment on BashFAQ 045 this may fail in ksh88, but it works in all of my shells:

$ strace -e trace=creat,open -f /bin/bash /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 3

$ strace -e trace=creat,open -f /bin/zsh /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_LARGEFILE, 0666) = 3

$ strace -e trace=creat,open -f /bin/pdksh /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC|O_LARGEFILE, 0666) = 3

$ strace -e trace=creat,open -f /bin/dash /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 3

Interestingly, pdksh adds the O_TRUNC flag, but that is clearly redundant: either an empty file is created, or nothing happens at all.


How you do the rm depends on how you want unclean exits to be handled.

Removing on clean exit

New runs fail until whatever caused the unclean exit is resolved and the lock file is manually removed.

# acquire lock
# do work (code here may call exit, etc.)
rm "$lockfile"

Removing on any exit

New runs succeed as long as the script is not already running.

trap 'rm "$lockfile"' EXIT
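Putting the noclobber acquisition from above together with the EXIT trap, a minimal sketch (reusing the illustrative /tmp/locktest.lock path) could look like this:

#!/bin/bash

# Acquire the lock via noclobber, then clean up whenever the script exits.
set -C
lockfile="/tmp/locktest.lock"

# With noclobber set, the redirection fails if the lock file already exists
# (bash also prints its own error message to stderr in that case).
if echo "$$" > "$lockfile"; then
    # Remove the lock file when the script exits; add INT/TERM to the trap
    # list if you want explicit signal coverage, as the directory-lock
    # example further down does.
    trap 'rm -f "$lockfile"' EXIT
    echo "Successfully acquired lock"
    # do work (it may call exit; the trap still removes the lock file)
else
    echo "Cannot acquire lock - already locked by $(cat "$lockfile")" >&2
    exit 1
fi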

I use a one-liner at the very beginning of the script:

#!/bin/bash

if [[ $(pgrep -afc "$(basename "$0")") -gt "1" ]]; then echo "Another instance of $0 has already been started!" && exit; fi
.
the_beginning_of_actual_script

It only checks whether the process is present in memory (regardless of what state the process is in), but it does the job for me.

I have the following problems with the existing answers:

- Some answers try to clean up lock files and then have to deal with stale lock files caused by, e.g., a sudden crash or reboot. IMO that is unnecessarily complicated. Let lock files stay.
- Some answers use the script file itself ($0 or $BASH_SOURCE) for locking, often referring to examples from man flock. This fails when the script is replaced due to an update or edit, causing the next run to open and obtain a lock on the new script file even though another instance holding a lock on the removed file is still running.
- A few answers use a fixed file descriptor. This is not ideal. I do not want to rely on how that behaves when, e.g., opening the lock file fails but gets mishandled and the lock is attempted on an unrelated file descriptor inherited from the parent process. Another failure case is injecting a locking wrapper for a 3rd-party binary that does not handle locking itself, where fixed file descriptors can interfere with file-descriptor passing to child processes.
- I reject answers that use process lookup for an already running script name. There are several reasons for it, such as, but not limited to, reliability/atomicity, parsing output, and having a script that performs several related functions, only some of which require locking.

This answer:

- relies on flock, because it gets the kernel to provide locking ... provided the lock file is created atomically and not replaced.
- assumes and relies on the lock file being stored on a local filesystem, as opposed to NFS.
- changes lock file presence to NOT mean anything about a running instance. Its role is purely to prevent two concurrent instances from creating a file with the same name and replacing the other's copy. The lock file does not get deleted; it is left behind and can survive across reboots. Locking is indicated via flock, not via lock file presence.
- assumes the bash shell, as tagged by the question.

It is not a one-liner, but without comments and error messages it is small enough:

#!/bin/bash

LOCKFILE=/var/lock/TODO

set -o noclobber
exec {lockfd}<> "${LOCKFILE}" || exit 1
set +o noclobber # depends on what you need
flock --exclusive --nonblock ${lockfd} || exit 1

But I prefer the version with comments and error messages:

#!/bin/bash

# TODO Set a lock file name
LOCKFILE=/var/lock/myprogram.lock

# Set noclobber option to ensure lock file is not REPLACED.
set -o noclobber

# Open lock file for R+W on a new file descriptor
# and assign the new file descriptor to "lockfd" variable.
# This does NOT obtain a lock but ensures the file exists and opens it.
exec {lockfd}<> "${LOCKFILE}" || {
  echo "pid=$$ failed to open LOCKFILE='${LOCKFILE}'" 1>&2
  exit 1
}

# TODO!!!! undo/set the desired noclobber value for the remainder of the script
set +o noclobber

# Lock on the allocated file descriptor or fail
# Adjust flock options e.g. --nonblock as needed
flock --exclusive --nonblock ${lockfd} || {
  echo "pid=$$ failed to obtain lock fd='${lockfd}' LOCKFILE='${LOCKFILE}'" 1>&2
  exit 1
}

# DO work here
echo "pid=$$ obtained exclusive lock fd='${lockfd}' LOCKFILE='${LOCKFILE}'"

# Can unlock after critical section and do more work after unlocking
#flock -u ${lockfd};
# if unlocking then might as well close lockfd too
#exec {lockfd}<&-
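As a quick check (assuming the script above is saved as, say, myprogram.sh and that the "DO work here" part keeps the first copy busy for a while, e.g. with a sleep), a second copy started concurrently fails the non-blocking flock and exits:

./myprogram.sh &      # first instance opens the lock file and obtains the exclusive lock
./myprogram.sh        # second instance fails 'flock --exclusive --nonblock' and exits with status 1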

Here is an approach that combines atomic directory locking with a check for a stale lock via its PID, restarting if a stale lock is found. Furthermore, it does not rely on any bashisms:

#!/bin/dash

SCRIPTNAME=$(basename "$0")
LOCKDIR="/var/lock/${SCRIPTNAME}"
PIDFILE="${LOCKDIR}/pid"

if ! mkdir $LOCKDIR 2>/dev/null
then
    # lock failed, but check for stale one by checking if the PID is really existing
    PID=$(cat $PIDFILE)
    if ! kill -0 $PID 2>/dev/null
    then
       echo "Removing stale lock of nonexistent PID ${PID}" >&2
       rm -rf $LOCKDIR
       echo "Restarting myself (${SCRIPTNAME})" >&2
       exec "$0" "$@"
    fi
    echo "$SCRIPTNAME is already running, bailing out" >&2
    exit 1
else
    # lock successfully acquired, save PID
    echo $$ > $PIDFILE
fi

trap "rm -rf ${LOCKDIR}" QUIT INT TERM EXIT


echo hello

sleep 30s

echo bye