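I have a few Pandas DataFrames sharing the same value scale, but having different columns and indices. When invoking df.plot(), I get separate plot images. What I really want is to have them all in the same figure as subplots, but I have not been able to come up with a solution and would highly appreciate some help.
You can manually create the subplots with matplotlib, and then plot the dataframes on a specific subplot using the ax keyword. For example, for 4 subplots (2x2):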
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=2, ncols=2)
df1.plot(ax=axes[0,0])
df2.plot(ax=axes[0,1])
...
Here axes is an array which holds the different subplot axes, and you can access one just by indexing axes. If you want a shared x-axis, you can pass sharex=True to plt.subplots.
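The approach above can be sketched end to end as follows. The four sample frames, their column names, and the figure size are made up for illustration; sharey=True is an assumption based on the question's statement that the frames share a value scale. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: four frames with different columns and index lengths
rng = np.random.default_rng(0)
frames = {
    "df1": pd.DataFrame(rng.random((10, 2)), columns=["a", "b"]),
    "df2": pd.DataFrame(rng.random((8, 3)), columns=["c", "d", "e"]),
    "df3": pd.DataFrame(rng.random((12, 1)), columns=["f"]),
    "df4": pd.DataFrame(rng.random((6, 2)), columns=["g", "h"]),
}

# 2x2 grid; shared axes so that all panels use the same scale
fig, axes = plt.subplots(nrows=2, ncols=2, sharex=True, sharey=True, figsize=(8, 6))

# axes.flat walks the 2D array row by row, so each frame gets its own panel
for ax, (name, frame) in zip(axes.flat, frames.items()):
    frame.plot(ax=ax, title=name)

fig.tight_layout()
```

Call plt.show() (or fig.savefig) afterwards to view the result.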
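You can see examples in the documentation that demonstrate joris' answer. Also from the documentation, you can set subplots=True and layout=(rows, columns) within the pandas plot function: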
df.plot(subplots=True, layout=(1,2))
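As a small runnable sketch of that call: note that subplots=True puts each column of a single dataframe in its own subplot, so it addresses a slightly different case than several separate dataframes. The sample data, figsize, and sharey=True are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit in a notebook or GUI session
import numpy as np
import pandas as pd

# Hypothetical sample data with two columns
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.random((10, 2)), columns=["A", "B"])

# One subplot per column, arranged as 1 row x 2 columns
axes = df.plot(subplots=True, layout=(1, 2), figsize=(8, 3), sharey=True)
```

The returned value is an array of matplotlib axes, so each panel can still be customised afterwards, for example with axes[0, 0].set_title(...).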
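You can also use fig.add_subplot(), which takes subplot grid parameters such as 221, 222, 223, 224, etc. Nice examples of plotting on pandas data frames, including subplots, can be seen in this ipython notebook.
You can use the familiar Matplotlib style of calling a figure and subplot, but you need to specify the current axis using plt.gca(). An example: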
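A minimal sketch of the fig.add_subplot() variant, assuming four made-up dataframes; each three-digit code means rows, columns, and panel index.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data
rng = np.random.default_rng(2)
frames = [pd.DataFrame(rng.random((10, 2)), columns=["A", "B"]) for _ in range(4)]

fig = plt.figure(figsize=(8, 6))
# 221 = 2 rows, 2 columns, first panel; 222 = second panel; and so on
for position, frame in zip((221, 222, 223, 224), frames):
    ax = fig.add_subplot(position)
    frame.plot(ax=ax)
```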
plt.figure(1)
plt.subplot(2,2,1)
df.A.plot() #no need to specify for first axis
plt.subplot(2,2,2)
df.B.plot(ax=plt.gca())
plt.subplot(2,2,3)
df.C.plot(ax=plt.gca())
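etc...
Building on @joris' response above, if you have already established a reference to the subplot, you can use that reference as well. For example,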
ax1 = plt.subplot2grid((50,100), (0, 0), colspan=20, rowspan=10)
...
df.plot.barh(ax=ax1, stacked=True)
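To make the idea concrete, here is a self-contained sketch that uses a much smaller grid than the (50, 100) one above. The 2x3 layout, the axes names, and the sample data are all illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: 5 rows, 2 columns
rng = np.random.default_rng(3)
df = pd.DataFrame(rng.random((5, 2)), columns=["A", "B"])

fig = plt.figure(figsize=(8, 4))
# A wide panel spanning both rows and the first two columns, plus two small panels
ax_wide = plt.subplot2grid((2, 3), (0, 0), colspan=2, rowspan=2, fig=fig)
ax_top = plt.subplot2grid((2, 3), (0, 2), fig=fig)
ax_bottom = plt.subplot2grid((2, 3), (1, 2), fig=fig)

# Reuse the stored references when plotting
df.plot.barh(ax=ax_wide, stacked=True)
df["A"].plot(ax=ax_top, title="A")
df["B"].plot(ax=ax_bottom, title="B")
fig.tight_layout()
```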
You can use this:
fig = plt.figure()
ax = fig.add_subplot(221)
ax.plot(x, y)
ax = fig.add_subplot(222)
ax.plot(x, z)
...
plt.show()
You may not need to use Pandas at all. Here's a matplotlib plot of cat frequencies:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2*np.pi, 400)
y = np.sin(x**2)
f, axes = plt.subplots(2, 1)
# draw the same curve on each subplot
for ax in axes:
    ax.plot(x, y)
    ax.set_title('cats')
plt.tight_layout()
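You can plot multiple subplots of multiple pandas dataframes with matplotlib by putting all the dataframes in a list and then using a for loop to plot the subplots.
Working code: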
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# dataframe sample data
df1 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df2 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df3 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df4 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df5 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df6 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
#define number of rows and columns for subplots
nrow=3
ncol=2
# make a list of all dataframes
df_list = [df1 ,df2, df3, df4, df5, df6]
fig, axes = plt.subplots(nrow, ncol)
# plot counter
count=0
for r in range(nrow):
    for c in range(ncol):
        df_list[count].plot(ax=axes[r,c])
        count+=1
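Using this code you can plot subplots in any configuration. You need to define the number of rows nrow and the number of columns ncol. You also need to make a list, df_list, of the dataframes you want to plot.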
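The nested loop above assumes the list holds exactly nrow * ncol dataframes. A possible variation, sketched here with five made-up dataframes, derives the row count from the list length and hides any unused panels. This is one way to handle the mismatch, not part of the original answer.

```python
import math

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: five dataframes, which do not fill a 3x2 grid
rng = np.random.default_rng(4)
df_list = [pd.DataFrame(rng.random((10, 2)) * 100, columns=["A", "B"]) for _ in range(5)]

ncol = 2
nrow = math.ceil(len(df_list) / ncol)  # enough rows to hold every dataframe

# squeeze=False keeps axes two-dimensional even when nrow or ncol is 1
fig, axes = plt.subplots(nrow, ncol, squeeze=False)

# zip stops at the shorter sequence, so no index counter is needed
for ax, frame in zip(axes.flat, df_list):
    frame.plot(ax=ax)

# hide the panels that received no dataframe
for ax in axes.flat[len(df_list):]:
    ax.set_visible(False)
```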
Option 1: Create subplots from a dictionary of dataframes with long (tidy) data
Assumptions:
- There is a dictionary of multiple dataframes of tidy data that are either:
  - Created by reading in from files
  - Created by separating a single dataframe into multiple dataframes
- The categories, cat, may be overlapping, but all dataframes don't necessarily contain all values of cat
- hue='cat'
This example uses a dict of dataframes, but a list of dataframes would be similar.
If the dataframes are wide, use pandas.DataFrame.melt to convert them to long form.
Because dataframes are being iterated through, there's no guarantee that colors will be mapped the same for each plot.
- A custom color map needs to be created from the unique 'cat' values for all the dataframes
- Since the colors will be the same, place one legend to the side of the plots, instead of a legend in every plot
Tested in python 3.10, pandas 1.4.3, matplotlib 3.5.1, seaborn 0.11.2
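For the wide-to-long conversion mentioned above, a minimal sketch might look like this. The wide frame and its values are invented for illustration; the column names x, cat, and y are chosen to match the test data used in this answer.

```python
import pandas as pd

# Hypothetical wide data: one column per category, plus a shared x column
wide = pd.DataFrame({
    "x": [0.1, 0.2, 0.3],
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 5.0, 6.0],
})

# Keep x as an identifier; stack the category columns into 'cat' / 'y' pairs
long = wide.melt(id_vars="x", var_name="cat", value_name="y")
```

The result has one row per (x, cat) pair, which is the shape seaborn expects for hue='cat'.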
Imports and test data
import pandas as pd
import numpy as np # used for random data
import matplotlib.pyplot as plt
from matplotlib.patches import Patch # for custom legend - square patches
from matplotlib.lines import Line2D # for custom legend - round markers
import seaborn as sns
import math # math.ceil determines the correct number of subplot rows
# synthetic data
df_dict = dict()
for i in range(1, 7):
    np.random.seed(i) # for repeatable sample data
    data_length = 100
    data = {'cat': np.random.choice(['A', 'B', 'C'], size=data_length),
            'x': np.random.rand(data_length), 'y': np.random.rand(data_length)}
    df_dict[i] = pd.DataFrame(data)
# display(df_dict[1].head())
cat x y
0 B 0.944595 0.606329
1 A 0.586555 0.568851
2 A 0.903402 0.317362
3 B 0.137475 0.988616
4 B 0.139276 0.579745
# display(df_dict[6].tail())
cat x y
95 B 0.881222 0.263168
96 A 0.193668 0.636758
97 A 0.824001 0.638832
98 C 0.323998 0.505060
99 C 0.693124 0.737582
Create color mappings and plot
# create color mapping based on all unique values of cat
unique_cat = {cat for v in df_dict.values() for cat in v.cat.unique()} # get unique cats
colors = sns.color_palette('tab10', n_colors=len(unique_cat)) # get a number of colors
cmap = dict(zip(unique_cat, colors)) # zip values to colors
col_nums = 3 # how many plots per row
row_nums = math.ceil(len(df_dict) / col_nums) # how many rows of plots
# create the figure and axes
fig, axes = plt.subplots(row_nums, col_nums, figsize=(9, 6), sharex=True, sharey=True)
# convert to 1D array for easy iteration
axes = axes.flat
# iterate through dictionary and plot
for ax, (k, v) in zip(axes, df_dict.items()):
    sns.scatterplot(data=v, x='x', y='y', hue='cat', palette=cmap, ax=ax)
    sns.despine(top=True, right=True)
    ax.legend_.remove() # remove the individual plot legends
    ax.set_title(f'dataset = {k}', fontsize=11)
fig.tight_layout()
# create legend from cmap
# patches = [Patch(color=v, label=k) for k, v in cmap.items()] # square patches
patches = [Line2D([0], [0], marker='o', color='w', markerfacecolor=v, label=k, markersize=8) for k, v in cmap.items()] # round markers
# place legend outside of plot; change the right bbox value to move the legend up or down
plt.legend(title='cat', handles=patches, bbox_to_anchor=(1.06, 1.2), loc='center left', borderaxespad=0, frameon=False)
plt.show()
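Option 2: Create subplots from a single dataframe containing multiple separate datasets
- The dataframes must be in a long form, with the same column names.
- This option uses pd.concat to combine multiple dataframes into a single dataframe, and .assign to add a new column.
  - See "Import multiple csv files into pandas and concatenate into one DataFrame" for creating a single dataframe from a list of files.
- This option is easier because it doesn't require manually mapping colors to 'cat'
Combine DataFrames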
# using df_dict, with dataframes as values, from the top
# combine all the dataframes in df_dict to a single dataframe with an identifier column
df = pd.concat((v.assign(dataset=k) for k, v in df_dict.items()), ignore_index=True)
# display(df.head())
cat x y dataset
0 B 0.944595 0.606329 1
1 A 0.586555 0.568851 1
2 A 0.903402 0.317362 1
3 B 0.137475 0.988616 1
4 B 0.139276 0.579745 1
# display(df.tail())
cat x y dataset
595 B 0.881222 0.263168 6
596 A 0.193668 0.636758 6
597 A 0.824001 0.638832 6
598 C 0.323998 0.505060 6
599 C 0.693124 0.737582 6
Plot a FacetGrid with seaborn.relplot
sns.relplot(kind='scatter', data=df, x='x', y='y', hue='cat', col='dataset', col_wrap=3, height=3)
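Both options create the same result; however, it is less complicated to combine all the dataframes and plot a figure-level plot with sns.relplot.
Here is a working pandas subplot example, where modes is the list of column names of the dataframe.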
dpi=200
figure_size=(20, 10)
fig, ax = plt.subplots(len(modes), 1, sharex="all", sharey="all", dpi=dpi)
for i in range(len(modes)):
    ax[i] = pivot_df.loc[:, modes[i]].plot.bar(figsize=(figure_size[0], figure_size[1]*len(modes)),
                                               ax=ax[i], title=modes[i], color=my_colors[i])
    ax[i].legend()
fig.suptitle(name)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2,2)
df = pd.DataFrame({'A':np.random.randint(1,100,10),
                   'B': np.random.randint(100,1000,10),
                   'C':np.random.randint(100,200,10)})
# plot the same dataframe on every subplot
for ax in axes.flatten():
    df.plot(ax=ax)