如何在numpy数组中找到最近的值?例子:

np.find_nearest(array, value)

当前回答

对于大型数组,@Demitri给出的(优秀)答案比目前标记为最佳的答案快得多。我从以下两个方面调整了他的精确算法:

不管输入数组是否排序,下面的函数都有效。 下面的函数返回与最接近的值对应的输入数组的索引,这有点更一般。

请注意,下面的函数还处理了一个特定的边缘情况,这将导致@Demitri编写的原始函数中的错误。否则,我的算法和他的一样。

def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

其他回答

这个函数使用numpy searchsorted处理任意数量的查询,因此在对输入数组进行排序之后,它的速度也一样快。 它可以在2d, 3d的规则网格上工作…:

#!/usr/bin/env python3
# keywords: nearest-neighbor regular-grid python numpy searchsorted Voronoi

import numpy as np

#...............................................................................
class Near_rgrid( object ):
    """ nearest neighbors on a Manhattan aka regular grid
    1d:
    near = Near_rgrid( x: sorted 1d array )
    nearix = near.query( q: 1d ) -> indices of the points x_i nearest each q_i
        x[nearix[0]] is the nearest to q[0]
        x[nearix[1]] is the nearest to q[1] ...
        nearpoints = x[nearix] is near q
    If A is an array of e.g. colors at x[0] x[1] ...,
    A[nearix] are the values near q[0] q[1] ...
    Query points < x[0] snap to x[0], similarly > x[-1].

    2d: on a Manhattan aka regular grid,
        streets running east-west at y_i, avenues north-south at x_j,
    near = Near_rgrid( y, x: sorted 1d arrays, e.g. latitide longitude )
    I, J = near.query( q: nq × 2 array, columns qy qx )
    -> nq × 2 indices of the gridpoints y_i x_j nearest each query point
        gridpoints = np.column_stack(( y[I], x[J] ))  # e.g. street corners
        diff = gridpoints - querypoints
        distances = norm( diff, axis=1, ord= )
    Values at an array A definded at the gridpoints y_i x_j nearest q: A[I,J]

    3d: Near_rgrid( z, y, x: 1d axis arrays ) .query( q: nq × 3 array )

    See Howitworks below, and the plot Voronoi-random-regular-grid.
    """

    def __init__( self, *axes: "1d arrays" ):
        axarrays = []
        for ax in axes:
            axarray = np.asarray( ax ).squeeze()
            assert axarray.ndim == 1, "each axis should be 1d, not %s " % (
                    str( axarray.shape ))
            axarrays += [axarray]
        self.midpoints = [_midpoints( ax ) for ax in axarrays]
        self.axes = axarrays
        self.ndim = len(axes)

    def query( self, queries: "nq × dim points" ) -> "nq × dim indices":
        """ -> the indices of the nearest points in the grid """
        queries = np.asarray( queries ).squeeze()  # or list x y z ?
        if self.ndim == 1:
            assert queries.ndim <= 1, queries.shape
            return np.searchsorted( self.midpoints[0], queries )  # scalar, 0d ?
        queries = np.atleast_2d( queries )
        assert queries.shape[1] == self.ndim, [
                queries.shape, self.ndim]
        return [np.searchsorted( mid, q )  # parallel: k axes, k processors
                for mid, q in zip( self.midpoints, queries.T )]

    def snaptogrid( self, queries: "nq × dim points" ):
        """ -> the nearest points in the grid, 2d [[y_j x_i] ...] """
        ix = self.query( queries )
        if self.ndim == 1:
            return self.axes[0][ix]
        else:
            axix = [ax[j] for ax, j in zip( self.axes, ix )]
            return np.array( axix )


def _midpoints( points: "array-like 1d, *must be sorted*" ) -> "1d":
    points = np.asarray( points ).squeeze()
    assert points.ndim == 1, points.shape
    diffs = np.diff( points )
    assert np.nanmin( diffs ) > 0, "the input array must be sorted, not %s " % (
            points.round( 2 ))
    return (points[:-1] + points[1:]) / 2  # floats

#...............................................................................
Howitworks = \
"""
How Near_rgrid works in 1d:
Consider the midpoints halfway between fenceposts | | |
The interval [left midpoint .. | .. right midpoint] is what's nearest each post --

    |   |       |                     |   points
    | . |   .   |          .          |   midpoints
      ^^^^^^               .            nearest points[1]
            ^^^^^^^^^^^^^^^             nearest points[2]  etc.

2d:
    I, J = Near_rgrid( y, x ).query( q )
    I = nearest in `x`
    J = nearest in `y` independently / in parallel.
    The points nearest [yi xj] in a regular grid (its Voronoi cell)
    form a rectangle [left mid x .. right mid x] × [left mid y .. right mid y]
    (in any norm ?)
    See the plot Voronoi-random-regular-grid.

Notes
-----
If a query point is exactly halfway between two data points,
e.g. on a grid of ints, the lines (x + 1/2) U (y + 1/2),
which "nearest" you get is implementation-dependent, unpredictable.

"""

Murky = \
""" NaNs in points, in queries ?
"""

__version__ = "2021-10-25 oct  denis-bz-py"
import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

使用示例:

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

print(find_nearest(array, value=0.5))
# 0.568743859261

我认为最python的方式是:

 num = 65 # Input number
 array = np.random.random((10))*100 # Given array 
 nearest_idx = np.where(abs(array-num)==abs(array-num).min())[0] # If you want the index of the element of array (array) nearest to the the given number (num)
 nearest_val = array[abs(array-num)==abs(array-num).min()] # If you directly want the element of array (array) nearest to the given number (num)

这是基本代码。你可以把它作为一个函数来使用

所有的答案都有助于收集信息来编写高效的代码。但是,我已经编写了一个小的Python脚本来针对各种情况进行优化。如果提供的数组已排序,则将是最佳情况。如果搜索一个指定值的最近点的索引,那么对半模块是最省时的。当一个索引对应一个数组时,numpy searchsorted是最有效的。

import numpy as np
import bisect
xarr = np.random.rand(int(1e7))

srt_ind = xarr.argsort()
xar = xarr.copy()[srt_ind]
xlist = xar.tolist()
bisect.bisect_left(xlist, 0.3)

In[63]: %时间平分。bisect_left (xlist, 0.3) CPU次数:user 0ns, sys: 0ns, total: 0ns 壁时间:22.2µs

np.searchsorted(xar, 0.3, side="left")

In [64]: %time np。Searchsorted (xar, 0.3, side="left") CPU次数:user 0ns, sys: 0ns, total: 0ns 壁时间:98.9µs

randpts = np.random.rand(1000)
np.searchsorted(xar, randpts, side="left")

%的时间np。Searchsorted (xar, randpts, side="left") CPU次数:用户4ms, sys: 0ns, total: 4ms 壁时间:1.2 ms

如果我们遵循乘法规则,那么numpy应该花费~100 ms,这意味着快了~83倍。

对于大型数组,@Demitri给出的(优秀)答案比目前标记为最佳的答案快得多。我从以下两个方面调整了他的精确算法:

不管输入数组是否排序,下面的函数都有效。 下面的函数返回与最接近的值对应的输入数组的索引,这有点更一般。

请注意,下面的函数还处理了一个特定的边缘情况,这将导致@Demitri编写的原始函数中的错误。否则,我的算法和他的一样。

def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest