Is there a way to widen the display of output in either interactive or script-execution mode?

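Specifically, I am using the describe() function on a Pandas DataFrame. When the DataFrame is five columns (labels) wide, I get the descriptive statistics that I want. However, if the DataFrame has any more columns, the statistics are suppressed and something like this is returned: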

>> Index: 8 entries, count to max
>> Data columns:
>> x1          8  non-null values
>> x2          8  non-null values
>> x3          8  non-null values
>> x4          8  non-null values
>> x5          8  non-null values
>> x6          8  non-null values
>> x7          8  non-null values

The "8" value is given whether there are 6 or 7 columns. What does the "8" refer to?

I have already tried dragging the IDLE window larger, as well as increasing the "Configure IDLE" width options, to no avail.


Current answer

import pandas as pd
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)

SentenceA = "William likes Piano and Piano likes William"
SentenceB = "Sara likes Guitar"
SentenceC = "Mamoosh likes Piano"
SentenceD = "William is a CS Student"
SentenceE = "Sara is kind"
SentenceF = "Mamoosh is kind"


bowA = SentenceA.split(" ")
bowB = SentenceB.split(" ")
bowC = SentenceC.split(" ")
bowD = SentenceD.split(" ")
bowE = SentenceE.split(" ")
bowF = SentenceF.split(" ")

# Creating a set consisting of all words

wordSet = set(bowA).union(set(bowB)).union(set(bowC)).union(set(bowD)).union(set(bowE)).union(set(bowF))
print("Set of all words is: ", wordSet)

# Initiating dictionary with 0 value for all BOWs

wordDictA = dict.fromkeys(wordSet, 0)
wordDictB = dict.fromkeys(wordSet, 0)
wordDictC = dict.fromkeys(wordSet, 0)
wordDictD = dict.fromkeys(wordSet, 0)
wordDictE = dict.fromkeys(wordSet, 0)
wordDictF = dict.fromkeys(wordSet, 0)

for word in bowA:
    wordDictA[word] += 1
for word in bowB:
    wordDictB[word] += 1
for word in bowC:
    wordDictC[word] += 1
for word in bowD:
    wordDictD[word] += 1
for word in bowE:
    wordDictE[word] += 1
for word in bowF:
    wordDictF[word] += 1

# Printing term frequency

print("SentenceA TF: ", wordDictA)
print("SentenceB TF: ", wordDictB)
print("SentenceC TF: ", wordDictC)
print("SentenceD TF: ", wordDictD)
print("SentenceE TF: ", wordDictE)
print("SentenceF TF: ", wordDictF)

# One row per sentence, one column per word
print(pd.DataFrame([wordDictA, wordDictB, wordDictC, wordDictD, wordDictE, wordDictF]))

Output:

   CS  Guitar  Mamoosh  Piano  Sara  Student  William  a  and  is  kind  likes
0   0       0        0      2     0        0        2  0    1   0     0      2
1   0       1        0      0     1        0        0  0    0   0     0      1
2   0       0        1      1     0        0        0  0    0   0     0      1
3   1       0        0      0     0        1        1  1    0   1     0      0
4   0       0        0      0     1        0        0  0    0   1     1      0
5   0       0        1      0     0        0        0  0    0   1     1      0

Other answers

Use the following to set the maximum width of the columns:

pd.set_option('max_colwidth', 800)

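This statement sets the maximum width of each column to 800 characters (the option counts characters, not pixels); longer cell contents are truncated with an ellipsis.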
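To see what the option does, here is a minimal sketch. The one-cell DataFrame is made up for illustration, and the full option name display.max_colwidth is used instead of the shorthand.

```python
import pandas as pd

# Hypothetical frame with a single 50-character cell
df = pd.DataFrame({"text": ["a" * 50]})

# Narrow limit: contents longer than 10 characters are cut and end in "..."
with pd.option_context("display.max_colwidth", 10):
    narrow = repr(df)

# Generous limit: the full 50-character string is shown
with pd.option_context("display.max_colwidth", 800):
    wide = repr(df)

print(narrow)
print(wide)
```

The limit applies per cell, so it does not help when the problem is too many columns rather than overly long values.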

Try this:

pd.set_option('display.expand_frame_repr', False)

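From the documentation:

display.expand_frame_repr : boolean. Whether to print out the full DataFrame repr for wide DataFrames across multiple lines. max_columns is still respected, but the output will wrap around across multiple "pages" if its width exceeds display.width. [default: True] [currently: True]

See: pandas.set_option.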
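The effect of this option can be compared side by side. The sketch below assumes a made-up frame of 30 columns and a display.width of 80; neither value comes from the question.

```python
import pandas as pd

# Hypothetical wide frame: 3 rows, 30 columns named col_0 ... col_29
df = pd.DataFrame({f"col_{i}": [i, i + 1, i + 2] for i in range(30)})

# Settings shared by both runs: 80-character width, no column truncation
common = ("display.width", 80, "display.max_columns", None)

with pd.option_context(*common, "display.expand_frame_repr", True):
    wrapped = repr(df)  # columns are split into several stacked blocks

with pd.option_context(*common, "display.expand_frame_repr", False):
    flat = repr(df)     # one long line per row, no wrapping

print(len(wrapped.splitlines()), len(flat.splitlines()))
```

With wrapping turned off, the output has one header line plus one line per row, so it relies on the terminal or editor to scroll horizontally.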

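This is not strictly an answer, but let's remember that we can use df.describe().transpose(), df.head(n).transpose(), or df.tail(n).transpose().

I also find the headers easier to read as a column when they are structured:

header1_xxx,

header2_xxx,

header3_xxx,

I think terminals and applications handle vertical scrolling more naturally, if scrolling is needed after transposing.

Headers are usually longer than their values, and putting them all in one column (the index) minimizes their impact on the total table width.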
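A minimal sketch of the transposed summary, using made-up random data with seven columns named x1 to x7 as in the question:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 rows, seven numeric columns x1 ... x7
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(20, 7)), columns=[f"x{i}" for i in range(1, 8)])

# Each original column becomes a row; the statistics become the columns
summary = df.describe().transpose()
print(summary)
```

The transposed table always has eight columns (count, mean, std, min, 25%, 50%, 75%, max) no matter how many columns the DataFrame has, so it grows downward instead of sideways. Those eight statistics are presumably also the "8" in the question's output, since describe() returns one row per statistic.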

Finally, other descriptions of the df can be merged in as well. Here is one possible idea:

def df_overview(df: pd.DataFrame, max_colwidth=25, head=3, tail=3):
    return(
        df.describe([0.5]).transpose()
        .merge(df.dtypes.rename('dtypes'), left_index=True, right_index=True)
        .merge(df.head(head).transpose(), left_index=True, right_index=True)
        .merge(df.tail(tail).transpose(), left_index=True, right_index=True)
        .to_string(max_colwidth=max_colwidth, float_format=lambda x: "{:.4G}".format(x))
    )

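According to the documentation for v0.18.0, if you are running in a terminal (i.e., not an IPython notebook, qtconsole, or IDLE), it takes two lines to have Pandas auto-detect your screen width and adjust the number of columns it shows accordingly: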

pd.set_option('display.large_repr', 'truncate')
pd.set_option('display.max_columns', 0)

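If you want to set options temporarily to display one large DataFrame, you can use option_context:

with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df)

The option values are restored automatically when you exit the with block.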
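A small sketch that checks the restore behaviour by reading the option before, inside, and after the block. The tiny DataFrame is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": range(3)})  # hypothetical example frame

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None, "display.max_columns", None):
    inside = pd.get_option("display.max_rows")  # None while the block is active
    print(df)

after = pd.get_option("display.max_rows")
print(before, inside, after)
```

Because the previous values come back on exit, this is a safer choice than pd.set_option when only one print call needs the wider display.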