How do I get the currently available GPUs in TensorFlow?

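I plan to use distributed TensorFlow, and I have seen that TensorFlow can use GPUs for training and testing. In a cluster environment, each machine may have zero, one, or several GPUs, and I want to run my TensorFlow graph on the GPUs of as many machines as possible.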

I found that when running tf.Session(), TensorFlow prints information about the GPUs in its log messages, like this:

I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)

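My question is: how do I get information about the currently available GPUs from TensorFlow? I can read the loaded GPU information from the log, but I want to do it in a more sophisticated, programmatic way. I may also restrict the GPUs on purpose with the CUDA_VISIBLE_DEVICES environment variable, so I am not looking for a way to get GPU information from the OS kernel.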

In short, I want a function like tf.get_available_gpu() that returns ['/gpu:0', '/gpu:1'] if two GPUs are available on the machine. How can I implement this?

Current answer

The following works in TensorFlow 2:

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    print("Name:", gpu.name, "  Type:", gpu.device_type)

Starting with 2.1, you can drop experimental:

    gpus = tf.config.list_physical_devices('GPU')

https://www.tensorflow.org/api_docs/python/tf/config/list_physical_devices
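To get closer to the tf.get_available_gpu() asked for in the question, the device list can be wrapped in a small helper. This is a minimal sketch, not an official API: get_available_gpus and short_gpu_names are hypothetical names, and it assumes TensorFlow 2.1+ where tf.config.list_logical_devices is available. It lists logical devices because those are the devices used for placement, and it derives the '/gpu:N' short form from the index at the end of each device name.

```python
def short_gpu_names(devices):
    """Map device objects to '/gpu:N' strings using the index at the end of each name."""
    names = []
    for device in devices:
        if device.device_type != "GPU":
            continue
        # Names look like '/device:GPU:0' (logical) or '/physical_device:GPU:0'.
        index = device.name.rsplit(":", 1)[-1]
        names.append("/gpu:" + index)
    return names


def get_available_gpus():
    """Hypothetical helper: list the GPUs visible to this TensorFlow process."""
    # Imported lazily so the pure helper above can be used without TensorFlow.
    import tensorflow as tf

    # Assumes TensorFlow 2.1+ (tf.config.list_logical_devices).
    return short_gpu_names(tf.config.list_logical_devices("GPU"))
```

On a machine without GPUs the list is empty, so something like get_available_gpus() or ['/cpu:0'] is one way to fall back to the CPU.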

2019-10-07 03:50:01

Other answers

You can use the following code snippet to show the device name, type, memory, and locality.

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
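The raw output is verbose, so it can help to print only a few fields per device. A minimal sketch, assuming each entry exposes the name, device_type, memory_limit, and physical_device_desc fields of the DeviceAttributes proto; describe_devices and print_local_devices are hypothetical helper names.

```python
def describe_devices(device_protos):
    """Return one summary line per device (hypothetical helper)."""
    lines = []
    for proto in device_protos:
        # memory_limit is reported in bytes; convert to MiB for readability.
        memory_mib = proto.memory_limit / (1024 * 1024)
        lines.append(
            "{} type={} memory={:.0f}MiB desc={!r}".format(
                proto.name, proto.device_type, memory_mib, proto.physical_device_desc
            )
        )
    return lines


def print_local_devices():
    """Print a summary of the devices TensorFlow sees in this process."""
    # Imported lazily so describe_devices() can be used without TensorFlow.
    from tensorflow.python.client import device_lib

    for line in describe_devices(device_lib.list_local_devices()):
        print(line)
```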

2023-01-13 06:50:12

Check all the components this way:

from __future__ import absolute_import, division, print_function, unicode_literals

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds


version = tf.__version__
executing_eagerly = tf.executing_eagerly()
hub_version = hub.__version__
available = tf.config.experimental.list_physical_devices("GPU")

print("Version: ", version)
print("Eager mode: ", executing_eagerly)
print("Hub Version: ", hub_version)
print("GPU is", "available" if available else "NOT AVAILABLE")

2020-01-16 09:16:48

Since TensorFlow 2.1, you can use tf.config.list_physical_devices('GPU'):

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    print("Name:", gpu.name, "  Type:", gpu.device_type)

If you have two GPUs installed, it outputs:

Name: /physical_device:GPU:0   Type: GPU
Name: /physical_device:GPU:1   Type: GPU

In TF 2.0, you have to add experimental:

gpus = tf.config.experimental.list_physical_devices('GPU')

See the guide pages and the current API documentation.

2019-06-03 01:35:55

You can check the list of all devices with the following code:

from tensorflow.python.client import device_lib

device_lib.list_local_devices()

2017-07-19 06:52:44

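There is an undocumented method called device_lib.list_local_devices() that lets you list the devices available in the local process. (Note: as an undocumented method, it is subject to backwards-incompatible changes.) The function returns a list of DeviceAttributes protocol buffer objects. You can extract a list of string device names for the GPU devices as follows: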

from tensorflow.python.client import device_lib

def get_available_gpus():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

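Note that (at least up to TensorFlow 1.4) calling device_lib.list_local_devices() runs some initialization code that, by default, allocates all of the GPU memory on all of the devices (GitHub issue). To avoid this, first create a session with an explicitly small per_process_gpu_memory_fraction, or with allow_growth=True, so that not all of the memory is allocated. See this question for more details.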
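The workaround described above could look like this on the TensorFlow 1.x API. This is a sketch under that assumption (in TensorFlow 2.x the same classes live under tf.compat.v1), and gpu_names and get_available_gpus_low_memory are hypothetical names. It relies on the session being created before the device listing, as the note above says.

```python
def gpu_names(device_protos):
    """Keep only the names of GPU entries."""
    return [d.name for d in device_protos if d.device_type == "GPU"]


def get_available_gpus_low_memory():
    """Hypothetical helper: list GPUs without reserving all GPU memory."""
    # Imported lazily; assumes the TensorFlow 1.x API (tf.ConfigProto, tf.Session).
    import tensorflow as tf
    from tensorflow.python.client import device_lib

    config = tf.ConfigProto()
    # Grow GPU memory on demand instead of reserving all of it up front.
    config.gpu_options.allow_growth = True
    # Alternative: config.gpu_options.per_process_gpu_memory_fraction = 0.05

    # Create the session first so the GPU options above are the ones applied.
    with tf.Session(config=config):
        return gpu_names(device_lib.list_local_devices())
```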

2016-07-26 02:34:21
