How do I find the nearest value in a numpy array? Example:
np.find_nearest(array, value)
Current answer
If you don't want to use numpy, this will do it:
def find_nearest(array, value):
    n = [abs(i-value) for i in array]
    idx = n.index(min(n))
    return array[idx]
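For comparison, the same idea can be written with numpy. This is a minimal sketch, not part of the answer above; the name `find_nearest_np` and the sample data are made up for illustration.

```python
import numpy as np

def find_nearest_np(array, value):
    """Return the element of `array` closest to `value` (the first one on ties)."""
    array = np.asarray(array)
    idx = np.abs(array - value).argmin()  # index of the smallest absolute difference
    return array[idx]

data = [1.0, 4.0, 7.5, 10.0]
print(find_nearest_np(data, 6.0))  # 7.5
```

Both versions scan the whole array, so they do not need the input to be sorted.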
Other answers
Here is a scipy version of @Ari Onasafari's answer to "find the nearest vector in an array of vectors":
In [1]: from scipy import spatial
In [2]: import numpy as np
In [3]: A = np.random.random((10,2))*100
In [4]: A
Out[4]:
array([[ 68.83402637, 38.07632221],
[ 76.84704074, 24.9395109 ],
[ 16.26715795, 98.52763827],
[ 70.99411985, 67.31740151],
[ 71.72452181, 24.13516764],
[ 17.22707611, 20.65425362],
[ 43.85122458, 21.50624882],
[ 76.71987125, 44.95031274],
[ 63.77341073, 78.87417774],
[ 8.45828909, 30.18426696]])
In [5]: pt = [6, 30] # <-- the point to find
In [6]: A[spatial.KDTree(A).query(pt)[1]] # <-- the nearest point
Out[6]: array([ 8.45828909, 30.18426696])
#how it works!
In [7]: distance,index = spatial.KDTree(A).query(pt)
In [8]: distance # <-- The distances to the nearest neighbors
Out[8]: 2.4651855048258393
In [9]: index # <-- The locations of the neighbors
Out[9]: 9
#then
In [10]: A[index]
Out[10]: array([ 8.45828909, 30.18426696])
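If there are many points to look up, it is probably worth building the tree once and passing all query points in a single call. A rough sketch, assuming scipy is installed; the seed, the query points and the brute-force cross-check are my own additions for illustration:

```python
import numpy as np
from scipy import spatial

rng = np.random.default_rng(0)
A = rng.random((10, 2)) * 100
tree = spatial.KDTree(A)  # build once, reuse for every query

pts = np.array([[6, 30], [50, 50], [90, 10]])
distances, indices = tree.query(pts)  # one distance and one index per query point
nearest = A[indices]                  # shape (3, 2)

# cross-check against a brute-force search over all pairwise distances
brute = np.linalg.norm(A[None, :, :] - pts[:, None, :], axis=2).argmin(axis=1)
print(np.array_equal(indices, brute))
```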
Here is a vectorized version of unutbu's answer:
import numpy as np
import matplotlib.pyplot as plt

def find_nearest(array, values):
    array = np.asarray(array)
    # the last dim must be 1 to broadcast in (array - values) below.
    values = np.expand_dims(values, axis=-1)
    indices = np.abs(array - values).argmin(axis=-1)
    return array[indices]

image = plt.imread('example_3_band_image.jpg')
print(image.shape) # should be (nrows, ncols, 3)
quantiles = np.linspace(0, 255, num=2 ** 2, dtype=np.uint8)
# cast to a signed type first: uint8 - uint8 wraps around instead of going negative
quantiled_image = find_nearest(quantiles, image.astype(np.int16))
print(quantiled_image.shape) # should be (nrows, ncols, 3)
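To see what the broadcasting does without needing an image file, here is a small sketch on a 2×2 array. The `levels` and `pixels` values are invented for the example, and the function is repeated so the snippet runs on its own.

```python
import numpy as np

def find_nearest(array, values):
    array = np.asarray(array)
    values = np.expand_dims(values, axis=-1)          # shape (..., 1)
    indices = np.abs(array - values).argmin(axis=-1)  # compare every value to every level
    return array[indices]

levels = np.array([0, 85, 170, 255])        # default signed ints, so differences can go negative
pixels = np.array([[10, 100], [200, 250]])
snapped = find_nearest(levels, pixels)
print(snapped)
# [[  0  85]
#  [170 255]]
```

Note that the intermediate `array - values` has shape `values.shape + (len(array),)`, so memory use grows with both the number of values and the number of candidate levels.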
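All the answers are useful for gathering information to write efficient code. However, I have written a small Python script to optimize for various cases. The best case is when the provided array is already sorted. If you are looking up the index of the nearest point for a single value, the bisect module is the most time-efficient. If you are looking up the indices for a whole array of values, numpy searchsorted is the most efficient.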
import numpy as np
import bisect
xarr = np.random.rand(int(1e7))
srt_ind = xarr.argsort()
xar = xarr.copy()[srt_ind]
xlist = xar.tolist()
bisect.bisect_left(xlist, 0.3)
In [63]: %time bisect.bisect_left(xlist, 0.3)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 22.2 µs
np.searchsorted(xar, 0.3, side="left")
In [64]: %time np.searchsorted(xar, 0.3, side="left")
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 98.9 µs
randpts = np.random.rand(1000)
np.searchsorted(xar, randpts, side="left")
%time np.searchsorted(xar, randpts, side="left")
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.2 ms
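If the cost scaled linearly with the number of queries (1000 × 98.9 µs), numpy would need ~100 ms for the 1000 points. The single vectorized call took 1.2 ms, which is ~83 times faster.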
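Note that `bisect_left` and `searchsorted` return an insertion index, not the nearest element, so the two neighbours around that index still have to be compared. One way to do that is sketched below; `nearest_in_sorted` is a hypothetical helper of mine, and it assumes the array is sorted ascending and has at least two elements.

```python
import numpy as np

def nearest_in_sorted(sorted_arr, queries):
    """Indices of the elements of `sorted_arr` nearest to each query."""
    sorted_arr = np.asarray(sorted_arr)
    queries = np.asarray(queries)
    right = np.searchsorted(sorted_arr, queries, side="left")
    right = np.clip(right, 1, len(sorted_arr) - 1)  # keep both neighbours in range
    left = right - 1
    # pick whichever neighbour is closer; ties go to the left one
    use_right = (sorted_arr[right] - queries) < (queries - sorted_arr[left])
    return np.where(use_right, right, left)

xs = np.array([0.1, 0.25, 0.4, 0.9])
print(nearest_in_sorted(xs, [0.0, 0.3, 0.7, 1.5]))  # [0 1 3 3]
```

Queries below the first element or above the last one snap to the ends of the array.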
This class uses numpy searchsorted to handle any number of queries, so once the input arrays are sorted it is just as fast. It also works on regular grids in 2d, 3d ...:
#!/usr/bin/env python3
# keywords: nearest-neighbor regular-grid python numpy searchsorted Voronoi
import numpy as np
#...............................................................................
class Near_rgrid( object ):
    """ nearest neighbors on a Manhattan aka regular grid
    1d:
    near = Near_rgrid( x: sorted 1d array )
    nearix = near.query( q: 1d ) -> indices of the points x_i nearest each q_i
        x[nearix[0]] is the nearest to q[0]
        x[nearix[1]] is the nearest to q[1] ...
        nearpoints = x[nearix] is near q
    If A is an array of e.g. colors at x[0] x[1] ...,
    A[nearix] are the values near q[0] q[1] ...
    Query points < x[0] snap to x[0], similarly > x[-1].

    2d: on a Manhattan aka regular grid,
        streets running east-west at y_i, avenues north-south at x_j,
    near = Near_rgrid( y, x: sorted 1d arrays, e.g. latitude longitude )
    I, J = near.query( q: nq × 2 array, columns qy qx )
    -> nq × 2 indices of the gridpoints y_i x_j nearest each query point
        gridpoints = np.column_stack(( y[I], x[J] ))  # e.g. street corners
        diff = gridpoints - querypoints
        distances = norm( diff, axis=1, ord= )
    Values at an array A defined at the gridpoints y_i x_j nearest q: A[I,J]

    3d: Near_rgrid( z, y, x: 1d axis arrays ) .query( q: nq × 3 array )

    See Howitworks below, and the plot Voronoi-random-regular-grid.
    """

    def __init__( self, *axes: "1d arrays" ):
        axarrays = []
        for ax in axes:
            axarray = np.asarray( ax ).squeeze()
            assert axarray.ndim == 1, "each axis should be 1d, not %s " % (
                str( axarray.shape ))
            axarrays += [axarray]
        self.midpoints = [_midpoints( ax ) for ax in axarrays]
        self.axes = axarrays
        self.ndim = len(axes)

    def query( self, queries: "nq × dim points" ) -> "nq × dim indices":
        """ -> the indices of the nearest points in the grid """
        queries = np.asarray( queries ).squeeze()  # or list x y z ?
        if self.ndim == 1:
            assert queries.ndim <= 1, queries.shape
            return np.searchsorted( self.midpoints[0], queries )  # scalar, 0d ?
        queries = np.atleast_2d( queries )
        assert queries.shape[1] == self.ndim, [
            queries.shape, self.ndim]
        return [np.searchsorted( mid, q )  # parallel: k axes, k processors
                for mid, q in zip( self.midpoints, queries.T )]

    def snaptogrid( self, queries: "nq × dim points" ):
        """ -> the nearest points in the grid, 2d [[y_j x_i] ...] """
        ix = self.query( queries )
        if self.ndim == 1:
            return self.axes[0][ix]
        else:
            axix = [ax[j] for ax, j in zip( self.axes, ix )]
            return np.array( axix )


def _midpoints( points: "array-like 1d, *must be sorted*" ) -> "1d":
    points = np.asarray( points ).squeeze()
    assert points.ndim == 1, points.shape
    diffs = np.diff( points )
    assert np.nanmin( diffs ) > 0, "the input array must be sorted, not %s " % (
        points.round( 2 ))
    return (points[:-1] + points[1:]) / 2  # floats
#...............................................................................
Howitworks = \
"""
How Near_rgrid works in 1d:
Consider the midpoints halfway between fenceposts | | |
The interval [left midpoint .. | .. right midpoint] is what's nearest each post --

    |   |       |                     |   points
    | . |   .   |          .          |   midpoints
      ^^^^^^               .              nearest points[1]
            ^^^^^^^^^^^^^^^               nearest points[2]  etc.

2d:
    I, J = Near_rgrid( y, x ).query( q )
    I = nearest in `y`
    J = nearest in `x` independently / in parallel.
    The points nearest [yi xj] in a regular grid (its Voronoi cell)
    form a rectangle [left mid x .. right mid x] × [left mid y .. right mid y]
    (in any norm ?)
    See the plot Voronoi-random-regular-grid.

Notes
-----
If a query point is exactly halfway between two data points,
e.g. on a grid of ints, the lines (x + 1/2) U (y + 1/2),
which "nearest" you get is implementation-dependent, unpredictable.
"""
Murky = \
""" NaNs in points, in queries ?
"""
__version__ = "2021-10-25 oct denis-bz-py"
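The midpoint trick at the heart of the class can be tried on its own in a few lines. This is only a sketch with made-up axis values and query points, not a replacement for the class above:

```python
import numpy as np

# 1d: searchsorted on the midpoints gives the index of the nearest grid point
x = np.array([0.0, 1.0, 3.0, 7.0])           # sorted grid axis
mid_x = (x[:-1] + x[1:]) / 2                 # [0.5, 2.0, 5.0]
q = np.array([-2.0, 0.4, 1.9, 4.0, 100.0])
nearix = np.searchsorted(mid_x, q)
print(nearix)       # [0 0 1 2 3]
print(x[nearix])    # [0. 0. 1. 3. 7.]

# 2d: each axis is searched independently
y = np.array([10.0, 20.0, 40.0])
mid_y = (y[:-1] + y[1:]) / 2                 # [15.0, 30.0]
queries = np.array([[12.0, 2.5], [33.0, 6.5]])   # columns qy, qx
I = np.searchsorted(mid_y, queries[:, 0])
J = np.searchsorted(mid_x, queries[:, 1])
gridpoints = np.column_stack((y[I], x[J]))
print(gridpoints)   # rows [10, 3] and [40, 7]
```

Queries outside the grid snap to the first or last point on that axis, because they fall before the first midpoint or after the last one.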
Here is an extension that finds the nearest vector in an array of vectors.
import numpy as np
def find_nearest_vector(array, value):
    # Euclidean distance from every row to value; norm(x+y) would measure |dx+dy| instead
    idx = np.linalg.norm(np.asarray(array) - np.asarray(value), axis=1).argmin()
    return array[idx]
A = np.random.random((10,2))*100
""" A = array([[ 34.19762933, 43.14534123],
[ 48.79558706, 47.79243283],
[ 38.42774411, 84.87155478],
[ 63.64371943, 50.7722317 ],
[ 73.56362857, 27.87895698],
[ 96.67790593, 77.76150486],
[ 68.86202147, 21.38735169],
[ 5.21796467, 59.17051276],
[ 82.92389467, 99.90387851],
[ 6.76626539, 30.50661753]])"""
pt = [6, 30]
print(find_nearest_vector(A, pt))
# -> the row [6.76626539, 30.50661753] for the A shown above