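How do I expand the output display to see more columns of a Pandas DataFrame?

Is there a way to widen the display of output in either interactive or script-execution mode?

Specifically, I am using the describe() function on a Pandas DataFrame. When the DataFrame is five columns (labels) wide, I get the descriptive statistics that I want. However, if the DataFrame has any more columns, the statistics are suppressed and something like this is returned: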

>> Index: 8 entries, count to max
>> Data columns:
>> x1          8  non-null values
>> x2          8  non-null values
>> x3          8  non-null values
>> x4          8  non-null values
>> x5          8  non-null values
>> x6          8  non-null values
>> x7          8  non-null values

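The "8" value is given whether there are 6 or 7 columns. What does the "8" refer to?

I have already tried dragging the IDLE window larger, as well as increasing the "Configure IDLE" width options, to no avail.

Current answer

The line below is enough to display all columns of a DataFrame.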

pd.set_option('display.max_columns', None)
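
As a minimal sketch of the effect (the 30-column frame and its column names are made up for illustration), the summary is rendered with every column once the limit is lifted:

```python
import pandas as pd

# Hypothetical wide frame: 30 numeric columns named x1..x30, 8 rows each.
df = pd.DataFrame({f"x{i}": range(8) for i in range(1, 31)})

pd.set_option("display.max_columns", None)  # None removes the column limit

summary_text = repr(df.describe())  # the text a console would print
print(summary_text)

pd.reset_option("display.max_columns")  # restore the default afterwards
```

Depending on display.width, the output may still wrap onto several blocks, but no column is replaced by "...".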

2019-11-05 06:31:11

Other answers

I used only these three statements:

pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', -1)

It works with Anaconda, Python 3.6.5, Pandas 0.23.0, and Visual Studio Code 1.26.
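
As far as I know, pandas 1.0 and later deprecate -1 for max_colwidth in favor of None. A hedged sketch of the same three settings, scoped with pd.option_context so that they revert automatically (the sample frame is invented for illustration):

```python
import pandas as pd

# Invented sample: 24 integer columns plus one column holding a long string.
data = {f"col{i}": [i, i + 1] for i in range(24)}
data["note"] = ["x" * 80, "short"]
df = pd.DataFrame(data)

with pd.option_context(
    "display.max_columns", None,         # show every column
    "display.expand_frame_repr", False,  # do not wrap across "pages"
    "display.max_colwidth", None,        # do not truncate cell text (pandas >= 1.0)
):
    full_text = repr(df)

print(full_text)
```

Outside the with block the previous option values apply again, which is convenient in scripts that print only one wide table.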

2018-07-26 14:10:51

You can adjust Pandas printing options with set_printoptions.

In [3]: df.describe()
Out[3]:
<class 'pandas.core.frame.DataFrame'>
Index: 8 entries, count to max
Data columns:
x1    8  non-null values
x2    8  non-null values
x3    8  non-null values
x4    8  non-null values
x5    8  non-null values
x6    8  non-null values
x7    8  non-null values
dtypes: float64(7)

In [4]: pd.set_printoptions(precision=2)

In [5]: df.describe()
Out[5]:
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
std       17.1     17.1     17.1     17.1     17.1     17.1     17.1
min    69000.0  69001.0  69002.0  69003.0  69004.0  69005.0  69006.0
25%    69012.2  69013.2  69014.2  69015.2  69016.2  69017.2  69018.2
50%    69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
75%    69036.8  69037.8  69038.8  69039.8  69040.8  69041.8  69042.8
max    69049.0  69050.0  69051.0  69052.0  69053.0  69054.0  69055.0

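However, this will not work in all cases, because Pandas detects your console width and only uses to_string if the output fits in the console (see the docstring of set_printoptions). In that case you can explicitly call to_string, as in BrenBarn's answer.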
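
A small sketch of the explicit to_string call, assuming a made-up frame that mirrors the x1..x7 columns from the question:

```python
import pandas as pd

# Made-up frame with seven columns, mirroring x1..x7 from the question.
df = pd.DataFrame({f"x{i}": range(69000 + i, 69008 + i) for i in range(1, 8)})

# With its default arguments, to_string() renders the whole frame
# without consulting the console width.
text = df.describe().to_string()
print(text)
```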

Update

With version 0.10, the way wide DataFrames are printed changed:

In [3]: df.describe()
Out[3]:
                 x1            x2            x3            x4            x5  \
count      8.000000      8.000000      8.000000      8.000000      8.000000
mean   59832.361578  27356.711336  49317.281222  51214.837838  51254.839690
std    22600.723536  26867.192716  28071.737509  21012.422793  33831.515761
min    31906.695474   1648.359160     56.378115  16278.322271     43.745574
25%    45264.625201  12799.540572  41429.628749  40374.273582  29789.643875
50%    56340.214856  18666.456293  51995.661512  54894.562656  47667.684422
75%    75587.003417  31375.610322  61069.190523  67811.893435  76014.884048
max    98136.474782  84544.484627  91743.983895  75154.587156  99012.695717

                 x6            x7
count      8.000000      8.000000
mean   41863.000717  33950.235126
std    38709.468281  29075.745673
min     3590.990740   1833.464154
25%    15145.759625   6879.523949
50%    22139.243042  33706.029946
75%    72038.983496  51449.893980
max    98601.190488  83309.051963

Furthermore, the API for setting Pandas options changed:

In [4]: pd.set_option('display.precision', 2)

In [5]: df.describe()
Out[5]:
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   59832.4  27356.7  49317.3  51214.8  51254.8  41863.0  33950.2
std    22600.7  26867.2  28071.7  21012.4  33831.5  38709.5  29075.7
min    31906.7   1648.4     56.4  16278.3     43.7   3591.0   1833.5
25%    45264.6  12799.5  41429.6  40374.3  29789.6  15145.8   6879.5
50%    56340.2  18666.5  51995.7  54894.6  47667.7  22139.2  33706.0
75%    75587.0  31375.6  61069.2  67811.9  76014.9  72039.0  51449.9
max    98136.5  84544.5  91744.0  75154.6  99012.7  98601.2  83309.1
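
In recent pandas versions, display.precision is the number of digits shown after the decimal point, so the rounding may differ from the older output above. A minimal sketch with a single made-up value:

```python
import pandas as pd

df = pd.DataFrame({"x1": [59832.361578]})  # single made-up value

with pd.option_context("display.precision", 2):
    short_text = repr(df)  # rendered with two decimal places

default_text = repr(df)  # back to the default precision of 6

print(short_text)
print(default_text)
```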

2012-07-29 10:56:01

This is not strictly an answer, but let's remember that we can use df.describe().transpose(), df.head(n).transpose(), or df.tail(n).transpose().

I also find it easier to read the headers as a column when they are structured:

header1_xxx,

header2_xxx,

header3_xxx,

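I think terminals and applications handle vertical scrolling more naturally, if scrolling is still necessary after transposing.

Headers are usually longer than their values, and putting them all in one column (the index) minimizes their impact on the total table width.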
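
A quick sketch of the transposed summary, using a hypothetical frame whose header names follow the header1_xxx pattern:

```python
import pandas as pd

# Hypothetical frame with twelve long, structured header names.
df = pd.DataFrame({f"header{i}_xxx": range(8) for i in range(1, 13)})

tall = df.describe().transpose()  # one row per original column
print(tall)
```

The result has twelve rows and only eight statistic columns, so it grows downward rather than sideways as more columns are added.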

Finally, other descriptions of the df can also be merged in. Here is one possible idea:

import pandas as pd

def df_overview(df: pd.DataFrame, max_colwidth=25, head=3, tail=3):
    # One row per column: summary statistics, dtype, then the first and last rows.
    return (
        df.describe([0.5]).transpose()
        .merge(df.dtypes.rename('dtypes'), left_index=True, right_index=True)
        .merge(df.head(head).transpose(), left_index=True, right_index=True)
        .merge(df.tail(tail).transpose(), left_index=True, right_index=True)
        .to_string(max_colwidth=max_colwidth, float_format=lambda x: "{:.4G}".format(x))
    )

2022-01-28 19:00:41

It seems that all the previous answers solve the problem. One more point: instead of pd.set_option('option_name'), you can use the (auto-complete-able) attribute form:

pd.options.display.width = None

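See the Pandas docs, Options and settings:

Options have a full "dotted-style", case-insensitive name (e.g. display.max_rows). You can get/set options directly as attributes of the top-level options attribute:

In [1]: import pandas as pd

In [2]: pd.options.display.max_rows
Out[2]: 15

In [3]: pd.options.display.max_rows = 999

In [4]: pd.options.display.max_rows
Out[4]: 999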

[…]

For the max_... parameters:

max_rows and max_columns are used in __repr__() methods to decide if to_string() or info() is used to render an object to a string. In case Python/IPython is running in a terminal this can be set to 0 and pandas will correctly auto-detect the width the terminal and swap to a smaller format in case all columns would not fit vertically. The IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to do correct auto-detection. ‘None’ value means unlimited. [emphasis not in original]

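For the width parameter:

Width of the display in characters. In case Python/IPython is running in a terminal this can be set to None and pandas will correctly auto-detect the width. Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to correctly detect the width.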
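
Since auto-detection is not available in IDLE, one option is to set an explicit width. A minimal sketch, where the value 250 and the twelve-column frame are arbitrary choices for illustration:

```python
import pandas as pd

# Twelve made-up integer columns named x1..x12.
df = pd.DataFrame({f"x{i}": range(i, i + 8) for i in range(1, 13)})

# An explicit width stands in for the auto-detection that IDLE cannot do.
with pd.option_context("display.max_columns", None, "display.width", 250):
    wide_text = repr(df.describe())

print(wide_text)
```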

2018-03-23 16:52:02

Try this:

pd.set_option('display.expand_frame_repr', False)

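From the documentation:

display.expand_frame_repr : boolean. Whether to print out the full DataFrame repr for wide DataFrames across multiple lines, max_columns is still respected, but the output will wrap-around across multiple "pages" if its width exceeds display.width. [default: True] [currently: True]

See: pandas.set_option.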
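
To see the difference, the sketch below renders the same frame with the option on and off. The frame is made up, and the width is pinned to 80 so the result does not depend on the terminal:

```python
import pandas as pd

# Twenty made-up columns, wide enough to exceed 80 characters in total.
df = pd.DataFrame({f"column_{i}": [float(i)] * 3 for i in range(1, 21)})

common = ("display.max_columns", None, "display.width", 80)

with pd.option_context(*common, "display.expand_frame_repr", True):
    wrapped = repr(df)  # split into "pages"; continued headers end with a backslash

with pd.option_context(*common, "display.expand_frame_repr", False):
    flat = repr(df)     # one long line per row, no wrapping

print(wrapped)
print(flat)
```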

2014-08-20 22:19:24
