我想杀死整个进程树。使用通用脚本语言实现这一目标的最佳方法是什么?我在寻找一个简单的解决办法。


当前回答

下面的代码已经在FreeBSD, Linux和MacOS X上进行了测试,只依赖于pgrep和kill (ps -o版本在BSD下不工作)。第一个参数是父pid,子pid必须被终止。第二个参数是一个布尔值,用于确定父pid是否也必须终止。

KillChilds() {
        local pid="${1}"
        local self="${2:-false}"

        if children="$(pgrep -P "$pid")"; then
                for child in $children; do
                        KillChilds "$child" true
                done
        fi

        if [ "$self" == true ]; then
                kill -s SIGTERM "$pid" || (sleep 10 && kill -9 "$pid" &)
        fi
}

KillChilds $$ > /dev/null 2>&1

这将向shell脚本中的任何子/孙辈进程发送SIGTERM,如果SIGTERM没有成功,它将等待10秒,然后发送kill。


答:早些时候

下面的代码也可以工作,但是会在BSD上杀死shell本身。

KillSubTree() {
    local parent="${1}"
    for child in $(ps -o pid=$parent); do
            if [ $$ -ne $child ]; then (kill -s SIGTERM $child || (sleep 10 && kill -9 $child & )) > /dev/null 2>&1 ; fi
    done
}
# Example lanch from within script
KillSubTree $$ > /dev/null 2>&1

其他回答

使用进程组ID (PGID)杀死属于同一进程树的所有进程

使用默认信号(TERM = 15) 使用信号kill (9)

您可以从同一进程树的任何进程id (PID)中检索PGID

杀- - - $ (ps - o pgid = $ PID | grep - o '[0 - 9] * ')(信号) $(ps -o pgid= $PID | grep -o '[0-9]*')(信号kill)

特别感谢tanager和Speakus对$PID剩余空间和OSX兼容性的贡献。

解释

kill -9 -"$PGID" => Send signal 9 (KILL) to all child and grandchild... PGID=$(ps opgid= "$PID") => Retrieve the Process-Group-ID from any Process-ID of the tree, not only the Process-Parent-ID. A variation of ps opgid= $PID is ps -o pgid --no-headers $PID where pgid can be replaced by pgrp. But: ps inserts leading spaces when PID is less than five digits and right aligned as noticed by tanager. You can use: PGID=$(ps opgid= "$PID" | tr -d ' ') ps from OSX always print the header, therefore Speakus proposes: PGID="$( ps -o pgid "$PID" | grep [0-9] | tr -d ' ' )" grep -o [0-9]* prints successive digits only (does not print spaces or alphabetical headers).

更多命令行

PGID=$(ps -o pgid= $PID | grep -o [0-9]*)
kill -TERM -"$PGID"  # kill -15
kill -INT  -"$PGID"  # correspond to [CRTL+C] from keyboard
kill -QUIT -"$PGID"  # correspond to [CRTL+\] from keyboard
kill -CONT -"$PGID"  # restart a stopped process (above signals do not kill it)
sleep 2              # wait terminate process (more time if required)
kill -KILL -"$PGID"  # kill -9 if it does not intercept signals (or buggy)

限制

正如davide和Hubert Kario所注意到的,当kill被属于同一棵树的进程调用时,kill在终止整个树的kill之前可能会杀死自己。 因此,请确保使用具有不同process - group - id的进程运行该命令。


很长的故事

> cat run-many-processes.sh
#!/bin/sh
echo "ProcessID=$$ begins ($0)"
./child.sh background &
./child.sh foreground
echo "ProcessID=$$ ends ($0)"

> cat child.sh
#!/bin/sh
echo "ProcessID=$$ begins ($0)"
./grandchild.sh background &
./grandchild.sh foreground
echo "ProcessID=$$ ends ($0)"

> cat grandchild.sh
#!/bin/sh
echo "ProcessID=$$ begins ($0)"
sleep 9999
echo "ProcessID=$$ ends ($0)"

使用'&'在后台运行流程树

> ./run-many-processes.sh &    
ProcessID=28957 begins (./run-many-processes.sh)
ProcessID=28959 begins (./child.sh)
ProcessID=28958 begins (./child.sh)
ProcessID=28960 begins (./grandchild.sh)
ProcessID=28961 begins (./grandchild.sh)
ProcessID=28962 begins (./grandchild.sh)
ProcessID=28963 begins (./grandchild.sh)

> PID=$!                    # get the Parent Process ID
> PGID=$(ps opgid= "$PID")  # get the Process Group ID

> ps fj
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
28348 28349 28349 28349 pts/3    28969 Ss   33021   0:00 -bash
28349 28957 28957 28349 pts/3    28969 S    33021   0:00  \_ /bin/sh ./run-many-processes.sh
28957 28958 28957 28349 pts/3    28969 S    33021   0:00  |   \_ /bin/sh ./child.sh background
28958 28961 28957 28349 pts/3    28969 S    33021   0:00  |   |   \_ /bin/sh ./grandchild.sh background
28961 28965 28957 28349 pts/3    28969 S    33021   0:00  |   |   |   \_ sleep 9999
28958 28963 28957 28349 pts/3    28969 S    33021   0:00  |   |   \_ /bin/sh ./grandchild.sh foreground
28963 28967 28957 28349 pts/3    28969 S    33021   0:00  |   |       \_ sleep 9999
28957 28959 28957 28349 pts/3    28969 S    33021   0:00  |   \_ /bin/sh ./child.sh foreground
28959 28960 28957 28349 pts/3    28969 S    33021   0:00  |       \_ /bin/sh ./grandchild.sh background
28960 28964 28957 28349 pts/3    28969 S    33021   0:00  |       |   \_ sleep 9999
28959 28962 28957 28349 pts/3    28969 S    33021   0:00  |       \_ /bin/sh ./grandchild.sh foreground
28962 28966 28957 28349 pts/3    28969 S    33021   0:00  |           \_ sleep 9999
28349 28969 28969 28349 pts/3    28969 R+   33021   0:00  \_ ps fj

命令pkill -P $PID不杀死孙子:

> pkill -P "$PID"
./run-many-processes.sh: line 4: 28958 Terminated              ./child.sh background
./run-many-processes.sh: line 4: 28959 Terminated              ./child.sh foreground
ProcessID=28957 ends (./run-many-processes.sh)
[1]+  Done                    ./run-many-processes.sh

> ps fj
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
28348 28349 28349 28349 pts/3    28987 Ss   33021   0:00 -bash
28349 28987 28987 28349 pts/3    28987 R+   33021   0:00  \_ ps fj
    1 28963 28957 28349 pts/3    28987 S    33021   0:00 /bin/sh ./grandchild.sh foreground
28963 28967 28957 28349 pts/3    28987 S    33021   0:00  \_ sleep 9999
    1 28962 28957 28349 pts/3    28987 S    33021   0:00 /bin/sh ./grandchild.sh foreground
28962 28966 28957 28349 pts/3    28987 S    33021   0:00  \_ sleep 9999
    1 28961 28957 28349 pts/3    28987 S    33021   0:00 /bin/sh ./grandchild.sh background
28961 28965 28957 28349 pts/3    28987 S    33021   0:00  \_ sleep 9999
    1 28960 28957 28349 pts/3    28987 S    33021   0:00 /bin/sh ./grandchild.sh background
28960 28964 28957 28349 pts/3    28987 S    33021   0:00  \_ sleep 9999

命令kill——-$PGID杀死所有进程,包括孙子进程。

> kill --    -"$PGID"  # default signal is TERM (kill -15)
> kill -CONT -"$PGID"  # awake stopped processes
> kill -KILL -"$PGID"  # kill -9 to be sure

> ps fj
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
28348 28349 28349 28349 pts/3    29039 Ss   33021   0:00 -bash
28349 29039 29039 28349 pts/3    29039 R+   33021   0:00  \_ ps fj

结论

我注意到在这个例子中PID和PGID是相等的(28957)。 这就是为什么我一开始觉得杀了。-$PID就够了。但如果进程是在Makefile中生成的,则进程ID与组ID不同。

我认为kill——-$(ps -o pgid= $PID | grep -o[0-9]*)是当从不同的Group ID(另一个进程树)调用时杀死整个进程树的最好的简单技巧。

如果你的系统上有pstrree和perl,你可以尝试这样做:

perl -e 'kill 9, (`pstree -p PID` =~ m/\((\d+)\)/sg)'

如果你知道你想要杀死的东西的pid,你通常可以从会话id开始,和同一会话中的所有东西。我会仔细检查,但我使用这个脚本在循环中启动rsync,我想要死亡,而不是启动另一个(因为循环),因为如果我只是杀死所有的rsync。

kill $(ps -o pid= -s $(ps -o sess --no-heading --pid 21709))

如果你不知道pid,你仍然可以嵌套更多

kill $(ps -o pid= -s $(ps -o sess --no-heading --pid $(pgrep rsync )))

要递归地杀死进程树,请使用killtree():

#!/bin/bash

killtree() {
    local _pid=$1
    local _sig=${2:--TERM}
    kill -stop ${_pid} # needed to stop quickly forking parent from producing children between child killing and parent killing
    for _child in $(ps -o pid --no-headers --ppid ${_pid}); do
        killtree ${_child} ${_sig}
    done
    kill -${_sig} ${_pid}
}

if [ $# -eq 0 -o $# -gt 2 ]; then
    echo "Usage: $(basename $0) <pid> [signal]"
    exit 1
fi

killtree $@
ps -o pid= --ppid $PPID | xargs kill -9