Generate a heatmap using a scatter data set

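I have a set of X,Y data points (about 10k) that are easy to plot as a scatter plot, but I would like to represent them as a heatmap.

I looked through the examples in Matplotlib, and they all seem to start from heatmap cell values that already exist in order to generate the image.

Is there a method that converts a bunch of x, y values, all different, into a heatmap (where zones with a higher frequency of x, y would be "warmer")?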

Current answer

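I'm afraid I'm a little late to the party, but I had a similar question a while ago. The accepted answer (by @ptomato) helped me out, but I also want to post this in case it's of use to someone.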


''' I wanted to create a heatmap resembling a football pitch which would show the different actions performed '''

import numpy as np
import matplotlib.pyplot as plt

#fixing random state for reproducibility
np.random.seed(1234324)

fig = plt.figure(12)
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)

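#Ratio of the pitch with respect to UEFA standards
#The scaled co-ordinates below reach 10 in x and 6 in y after int(), so the
#array needs 11 columns and 7 rows (a 6 x 10 array can raise IndexError)
hmap = np.zeros((7, 11), dtype=int)
#print(hmap)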

xlist = np.random.uniform(low=0.0, high=100.0, size=(20))
ylist = np.random.uniform(low=0.0, high =100.0, size =(20))

#UEFA Pitch Standards are 105m x 68m, scaled here by 1/10
xlist = (xlist/100)*10.5
ylist = (ylist/100)*6.8

ax1.scatter(xlist,ylist)

#int of the co-ordinates to populate the array
xlist_int = xlist.astype (int)
ylist_int = ylist.astype (int)

#print(xlist_int, ylist_int)

for i, j in zip(xlist_int, ylist_int):
    #this populates the array according to the x,y co-ordinate values it encounters 
    hmap[j, i] += 1

#Reversing the rows is necessary: imshow draws row 0 at the top, while the scatter plot has y increasing upward
hmap = hmap[::-1]

#print(hmap)
im = ax2.imshow(hmap)
plt.show()

This is the result:

2019-04-26 06:36:53

Other answers

Very similar to @Piti's answer, but using one call instead of two to generate the points:

import numpy as np
import matplotlib.pyplot as plt

pts = 1000000
mean = [0.0, 0.0]
cov = [[1.0,0.0],[0.0,1.0]]

x,y = np.random.multivariate_normal(mean, cov, pts).T
plt.hist2d(x, y, bins=50, cmap=plt.cm.jet)
plt.show()

Output:

2019-01-28 11:18:44

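In Matplotlib lexicon, I think you want a hexbin plot.

If you're not familiar with this type of plot, it's just a bivariate histogram in which the xy-plane is tessellated by a regular grid of hexagons.

So, as with a histogram, you discretize the plotting region into a set of windows, assign each point to one of these windows, and count the number of points falling in each hexagon; finally, you map the windows onto a color array, and you've got a hexbin diagram.

Though less commonly used than, e.g., circles or squares, hexagons are intuitively a better choice for the geometry of the binning container:

hexagons have nearest-neighbor symmetry (square bins don't: a square cell has four neighbors across its edges and four more distant ones across its corners, whereas all six neighbors of a hexagon are at the same center-to-center distance), and

the hexagon is the highest n-polygon that gives a regular plane tessellation (i.e., you can safely re-model your kitchen floor with hexagonal tiles because you won't have any void space between the tiles when you are finished, which is not true for regular polygons with n >= 7).

(Matplotlib uses the term hexbin plot; so do (AFAIK) all of the plotting libraries for R. I still don't know whether this is the generally accepted term for plots of this type, though I suspect it is, given that hexbin is short for hexagonal binning, which describes the essential step in preparing the data for display.)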
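To make the counting step concrete, here is a minimal sketch of a count-based hexbin (no `C` argument), assuming randomly generated test points. The seed, the point count and the `Agg` backend line are my own choices for illustration and are not part of the answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to view interactively
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical test data: 10,000 standard-normal points
rng = np.random.default_rng(0)
n = 10_000
x = rng.standard_normal(n)
y = rng.standard_normal(n)

fig, ax = plt.subplots()
# With C left at its default (None), each hexagon is colored by its point count
hb = ax.hexbin(x, y, gridsize=30, cmap="viridis")
fig.colorbar(hb, ax=ax, label="points per hexagon")

# The per-hexagon counts are available from the returned collection
counts = hb.get_array()
print(counts.max(), counts.sum())
```

Calling `fig.savefig(...)`, or `plt.show()` with an interactive backend, displays the plot; `gridsize` controls how many hexagons span the x direction.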

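from matplotlib import pyplot as PLT
from matplotlib import cm as CM
import numpy as NP

def bivariate_normal(X, Y, sigmax, sigmay, mux, muy):
    # uncorrelated bivariate Gaussian density; stands in for
    # matplotlib.mlab.bivariate_normal, which was removed in Matplotlib 3.1
    z = ((X - mux) / sigmax)**2 + ((Y - muy) / sigmay)**2
    return NP.exp(-z / 2) / (2 * NP.pi * sigmax * sigmay)

x = y = NP.linspace(-5, 5, 100)
X, Y = NP.meshgrid(x, y)
Z1 = bivariate_normal(X, Y, 2, 2, 0, 0)
Z2 = bivariate_normal(X, Y, 4, 1, 1, 1)
ZD = Z2 - Z1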
x = X.ravel()
y = Y.ravel()
z = ZD.ravel()
gridsize=30
PLT.subplot(111)

# if 'bins=None', then color of each hexagon corresponds directly to its count
# 'C' is optional--it maps values to x-y coordinates; if 'C' is None (default) then 
# the result is a pure 2D histogram 

PLT.hexbin(x, y, C=z, gridsize=gridsize, cmap=CM.jet, bins=None)
PLT.axis([x.min(), x.max(), y.min(), y.max()])

cb = PLT.colorbar()
cb.set_label('mean value')
PLT.show()

2010-03-03 13:55:43

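Here's one I made on a 1 million point set with 3 categories (colored red, green, and blue). Here's a link to the repository if you'd like to try the function: GitHub repo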

histplot(
    X,
    Y,
    labels,
    bins=2000,
    range=((-3,3),(-3,3)),
    normalize_each_label=True,
    colors = [
        [1,0,0],
        [0,1,0],
        [0,0,1]],
    gain=50)

2019-06-11 17:37:16

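Create a 2-dimensional array that corresponds to the cells in your final image, called say heatmap_cells, and instantiate it as all zeroes.

Choose two scaling factors that define the difference between each array element in real units, one for each dimension, say x_scale and y_scale. Choose these such that all your datapoints fall within the bounds of the heatmap array.

For each raw datapoint with x_value and y_value:

heatmap_cells[floor(x_value / x_scale), floor(y_value / y_scale)] += 1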
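The steps above could be sketched in NumPy roughly as follows. The function name `bin_points`, the sample points and the array shape are hypothetical, and the sketch assumes non-negative coordinates that stay inside the array after scaling.

```python
import numpy as np

def bin_points(x_values, y_values, x_scale, y_scale, shape):
    """Count points per cell. Assumes every coordinate is non-negative and
    that floor(value / scale) stays inside `shape`."""
    heatmap_cells = np.zeros(shape, dtype=int)
    ix = np.floor(np.asarray(x_values) / x_scale).astype(int)
    iy = np.floor(np.asarray(y_values) / y_scale).astype(int)
    # np.add.at accumulates correctly when several points share a cell;
    # a plain heatmap_cells[ix, iy] += 1 would count such a cell only once
    np.add.at(heatmap_cells, (ix, iy), 1)
    return heatmap_cells

# Hypothetical sample data: four points, two of them in the same cell
x = [0.5, 0.7, 2.1, 3.9]
y = [0.2, 0.4, 1.5, 1.9]
cells = bin_points(x, y, x_scale=1.0, y_scale=1.0, shape=(4, 2))
print(cells)
```

Because the first index is x here, the array would need to be transposed (and drawn with origin='lower') before passing it to imshow.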

2010-03-03 12:37:50

If you don't want hexagons, you can use numpy's histogram2d function:

import numpy as np
import numpy.random
import matplotlib.pyplot as plt

# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)

heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]

plt.clf()
plt.imshow(heatmap.T, extent=extent, origin='lower')
plt.show()

This makes a 50x50 heatmap. If you want, say, 512x384, you can put bins=(512, 384) in the call to histogram2d.

Example:
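As a quick check of the non-square case, here is a small sketch using the same kind of random test data (the seed is arbitrary). It shows that histogram2d puts x along the first axis, which is why the array is transposed before being handed to imshow.

```python
import numpy as np

# Hypothetical test data, same size as in the snippet above
rng = np.random.default_rng(0)
x = rng.standard_normal(8873)
y = rng.standard_normal(8873)

# bins=(nx, ny): 512 bins along x, 384 bins along y
heatmap, xedges, yedges = np.histogram2d(x, y, bins=(512, 384))

print(heatmap.shape)    # (512, 384): first axis is x
print(heatmap.T.shape)  # (384, 512): rows are y, as imshow expects
print(len(xedges), len(yedges))  # one more edge than bins on each axis
```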

2010-03-17 09:25:51
