I'm using a Python script as a driver for a hydrodynamics code. When it comes time to run the simulation, I use subprocess.Popen to run the code and collect the output from stdout and stderr into a subprocess.PIPE; then I can print (and save to a log-file) the output information, and check for any errors. The problem is, I have no idea how the code is progressing. If I run it directly from the command line, it gives me output about what iteration it's at, what time, what the next time-step is, etc.
Is there a way to both store the output (for logging and error checking), and also produce a live-streaming output?
The relevant section of my code:
ret_val = subprocess.Popen( run_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True )
output, errors = ret_val.communicate()
log_file.write(output)
print output
if( ret_val.returncode ):
    print "RUN failed\n\n%s\n\n" % (errors)
    success = False
if( errors ): log_file.write("\n\n%s\n\n" % errors)
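Originally I was piping run_command through tee so that a copy went directly to the log-file while the stream still went directly to the terminal, but that way I can't store any errors (to my knowledge).
My temporary solution so far: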
ret_val = subprocess.Popen( run_command, stdout=log_file, stderr=subprocess.PIPE, shell=True )
# poll() returns None while the process is running and the exit code (0 on success) afterwards
while ret_val.poll() is None:
    log_file.flush()
Then, in another terminal, I run tail -f log.txt (such that log_file = 'log.txt').
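I think that the subprocess.communicate method is a bit misleading: it actually fills the stdout and stderr that you specify in subprocess.Popen.
Yet, reading from the subprocess.PIPE that you can provide to subprocess.Popen's stdout and stderr parameters can eventually fill up the OS pipe buffers and deadlock your app (especially if you have multiple processes/threads that must use subprocess).
My proposed solution is to provide stdout and stderr with files, and to read the files' content instead of reading from the deadlocking PIPE. These files can be tempfile.NamedTemporaryFile(), which can also be opened for reading while subprocess.communicate is writing into them.
Below is a sample usage: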
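The deadlock risk described above comes from a pipe that nobody drains while the child keeps writing. As a point of comparison, here is a minimal sketch of a different technique from the one proposed in this answer: merging stderr into stdout so that there is only one pipe, and draining it line by line. The function name `stream_and_capture` and the child command are made up for illustration, and the sketch assumes Python 3.7+ (for `text=True`). It gives up the separation between stdout and stderr.

```python
import subprocess
import sys


def stream_and_capture(cmd):
    """Run cmd, echo each output line as it arrives, return (returncode, text)."""
    captured = []
    # stderr=subprocess.STDOUT merges both streams into the single pipe we
    # drain, so no second pipe can fill up while we block on the first one.
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    with proc.stdout:
        for line in proc.stdout:
            sys.stdout.write(line)
            sys.stdout.flush()
            captured.append(line)
    returncode = proc.wait()
    return returncode, "".join(captured)


if __name__ == "__main__":
    # Hypothetical child process that writes one line to each stream.
    child = [sys.executable, "-c",
             "import sys; print('step 1'); print('warning', file=sys.stderr)"]
    code, text = stream_and_capture(child)
    print("exit code:", code)
```

If errors have to be stored separately from regular output, this sketch is not enough, and something like the file-based approach below is needed.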
try:
    with ProcessRunner(
        ("python", "task.py"), env=os.environ.copy(), seconds_to_wait=0.01
    ) as process_runner:
        for out in process_runner:
            print(out)
except ProcessError as e:
    # ProcessError defines no extra attributes, so print the exception itself
    print(e)
    raise
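The contents of task.py are not shown here. To try the usage above, a stand-in could look like the hypothetical script below (the function name, step count and delay are all assumptions). The one detail that matters for near-realtime reading is the explicit flush: when stdout is redirected to a file or pipe, Python block-buffers it, so lines may otherwise only appear when the process exits.

```python
import sys
import time


def main(steps=3, delay=0.2):
    """Print one progress line per step, flushing so readers see it at once."""
    for step in range(1, steps + 1):
        print("iteration %d of %d" % (step, steps))
        # Without this flush, redirected output can sit in the buffer
        # until the process terminates.
        sys.stdout.flush()
        time.sleep(delay)
    return 0


if __name__ == "__main__":
    main()
```

Running the child with `python -u` or setting `PYTHONUNBUFFERED=1` should have a similar effect for Python children; other programs have their own buffering rules.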
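And this is the source code, ready to be used, with as many comments as I could provide to explain what it does:
If you're using Python 2, please make sure to first install the latest version of the subprocess32 package from PyPI.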
import os
import sys
import threading
import time
import tempfile
import logging

if os.name == 'posix' and sys.version_info[0] < 3:
    # Support python 2
    import subprocess32 as subprocess
else:
    # Get latest and greatest from python 3
    import subprocess

logger = logging.getLogger(__name__)
class ProcessError(Exception):
    """Base exception for errors related to running the process"""


class ProcessTimeout(ProcessError):
    """Error that will be raised when the process execution will exceed a timeout"""
class ProcessRunner(object):
    def __init__(self, args, env=None, timeout=None, bufsize=-1, seconds_to_wait=0.25, **kwargs):
        """
        Constructor facade to subprocess.Popen that receives parameters which are more specifically required for the
        Process Runner. This is a class that should be used as a context manager - and that provides an iterator
        for reading captured output from subprocess.communicate in near realtime.

        Example usage:

        try:
            with ProcessRunner(('python', task_file_path), env=os.environ.copy(), seconds_to_wait=0.01) as process_runner:
                for out in process_runner:
                    print(out)
        except ProcessError as e:
            print(e)
            raise

        :param args: same as subprocess.Popen
        :param env: same as subprocess.Popen
        :param timeout: same as subprocess.communicate
        :param bufsize: same as subprocess.Popen
        :param seconds_to_wait: time to wait between each readline from the temporary file
        :param kwargs: same as subprocess.Popen
        """
        self._seconds_to_wait = seconds_to_wait
        self._process_has_timed_out = False
        self._timeout = timeout
        self._process_done = False
        # stdout and stderr are both written to the same temporary file
        self._std_file_handle = tempfile.NamedTemporaryFile()
        self._process = subprocess.Popen(args, env=env, bufsize=bufsize,
                                         stdout=self._std_file_handle, stderr=self._std_file_handle, **kwargs)
        self._thread = threading.Thread(target=self._run_process)
        self._thread.daemon = True

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._thread.join()
        self._std_file_handle.close()

    def __iter__(self):
        # read all output from stdout file that subprocess.communicate fills
        with open(self._std_file_handle.name, 'r') as stdout:
            # while process is alive, keep reading data
            while not self._process_done:
                out = stdout.readline()
                out_without_trailing_whitespaces = out.rstrip()
                if out_without_trailing_whitespaces:
                    # yield stdout data without trailing \n
                    yield out_without_trailing_whitespaces
                else:
                    # if there is nothing to read, then please wait a tiny little bit
                    time.sleep(self._seconds_to_wait)

            # this is a hack: terraform seems to write to buffer after process has finished
            out = stdout.read()
            if out:
                yield out

        if self._process_has_timed_out:
            raise ProcessTimeout('Process has timed out')

        if self._process.returncode != 0:
            raise ProcessError('Process has failed')

    def _run_process(self):
        try:
            # Start gathering information (stdout and stderr) from the opened process
            self._process.communicate(timeout=self._timeout)
            # Graceful termination of the opened process
            self._process.terminate()
        except subprocess.TimeoutExpired:
            self._process_has_timed_out = True
            # Force termination of the opened process
            self._process.kill()
            # Reap the killed process so that returncode gets set
            self._process.wait()

        self._process_done = True

    @property
    def return_code(self):
        return self._process.returncode
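The way I found to read a subprocess's output in a streaming fashion in Python (while also capturing it in a variable, and for multiple output streams, i.e. both stdout and stderr) is to pass the subprocess a named temporary file to write to, and then open the same temporary file through a separate read handle.
Note: This is for Python 3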
import io
import subprocess
import sys
import tempfile
import time

stdout_write = tempfile.NamedTemporaryFile()
stdout_read = io.open(stdout_write.name, "r")
stderr_write = tempfile.NamedTemporaryFile()
stderr_read = io.open(stderr_write.name, "r")

stdout_captured = ""
stderr_captured = ""

proc = subprocess.Popen(["command"], stdout=stdout_write, stderr=stderr_write)

while True:
    # Check for exit before reading, so the last pass still drains both files
    proc_done: bool = proc.poll() is not None

    while True:
        content = stdout_read.read(1024)
        sys.stdout.write(content)
        stdout_captured += content
        if len(content) < 1024:
            break

    while True:
        content = stderr_read.read(1024)
        sys.stderr.write(content)
        stderr_captured += content
        if len(content) < 1024:
            break

    if proc_done:
        break

    time.sleep(0.1)

stdout_write.close()
stdout_read.close()
stderr_write.close()
stderr_read.close()
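For reuse, the same loop can be wrapped in a function that cleans up its files even if something raises. This is only a sketch under a few assumptions: the name `run_with_tempfiles` and the `poll_interval` parameter are made up here, the output is assumed to decode cleanly in the default encoding, and reopening a NamedTemporaryFile by name while it is still open is assumed to work (it does on POSIX systems, but generally not on Windows).

```python
import subprocess
import sys
import tempfile
import time


def run_with_tempfiles(cmd, poll_interval=0.1):
    """Run cmd, echo its output while it runs, return (returncode, out, err)."""
    captured = {"out": "", "err": ""}
    with tempfile.NamedTemporaryFile() as out_w, \
            tempfile.NamedTemporaryFile() as err_w:
        # Separate read handles on the same files the child writes to.
        with open(out_w.name, "r") as out_r, open(err_w.name, "r") as err_r:
            proc = subprocess.Popen(cmd, stdout=out_w, stderr=err_w)
            streams = (("out", out_r, sys.stdout), ("err", err_r, sys.stderr))
            while True:
                # Check for exit *before* reading, so the final pass still
                # picks up anything written just before the process ended.
                done = proc.poll() is not None
                for key, reader, echo in streams:
                    chunk = reader.read()
                    if chunk:
                        echo.write(chunk)
                        echo.flush()
                        captured[key] += chunk
                if done:
                    break
                time.sleep(poll_interval)
    return proc.returncode, captured["out"], captured["err"]


if __name__ == "__main__":
    # Hypothetical child process that writes to both streams.
    child = [sys.executable, "-c",
             "import sys; print('progress'); print('problem', file=sys.stderr)"]
    code, out, err = run_with_tempfiles(child)
    print("exit code:", code)
```

Because stdout and stderr go to separate files, they stay separate in the captured strings, at the cost of losing their relative ordering.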
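However, if you don't need to capture the output, then you can simply pass the sys.stdout and sys.stderr streams from your Python script to the called subprocess, as xaav suggested in his answer: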
subprocess.Popen(["command"], stdout=sys.stdout, stderr=sys.stderr)
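A closely related sketch: when stdout and stderr are left at their default of None, the child simply inherits the parent's streams, which gives the same live output without passing anything explicitly. This can be handy when sys.stdout has been replaced by an object with no real file descriptor, in which case passing it to Popen would likely fail. The helper name `run_inheriting_streams` is made up for illustration.

```python
import subprocess
import sys


def run_inheriting_streams(cmd):
    """Run cmd with the parent's stdout/stderr and return its exit code."""
    # Leaving stdout and stderr unset means no redirection takes place:
    # the child's output appears live in the terminal but is not captured.
    completed = subprocess.run(cmd)
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical child process that prints a single line.
    exit_code = run_inheriting_streams([sys.executable, "-c", "print('hi')"])
    print("exit code:", exit_code)
```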