查找numpy数组中最近的值

如何在numpy数组中找到最近的值?例子:

np.find_nearest(array, value)

当前回答

我认为最python的方式是:

 num = 65 # Input number
 array = np.random.random((10))*100 # Given array 
 nearest_idx = np.where(abs(array-num)==abs(array-num).min())[0] # If you want the index of the element of array (array) nearest to the the given number (num)
 nearest_val = array[abs(array-num)==abs(array-num).min()] # If you directly want the element of array (array) nearest to the given number (num)

这是基本代码。你可以把它作为一个函数来使用

2017-01-31 05:43:29

其他回答

如果你有很多值需要搜索(值可以是多维数组)，下面是@Dimitri的快速向量化解决方案:

# `values` should be sorted
def get_closest(array, values):
    # make sure array is a numpy array
    array = np.array(array)

    # get insert positions
    idxs = np.searchsorted(array, values, side="left")
    
    # find indexes where previous index is closer
    prev_idx_is_less = ((idxs == len(array))|(np.fabs(values - array[np.maximum(idxs-1, 0)]) < np.fabs(values - array[np.minimum(idxs, len(array)-1)])))
    idxs[prev_idx_is_less] -= 1
    
    return array[idxs]

基准

>使用@Demitri的解决方案比使用for循环快100倍”

>>> %timeit ar=get_closest(np.linspace(1, 1000, 100), np.random.randint(0, 1050, (1000, 1000)))
139 ms ± 4.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit ar=[find_nearest(np.linspace(1, 1000, 100), value) for value in np.random.randint(0, 1050, 1000*1000)]
took 21.4 seconds

2017-09-12 20:13:17

对于大型数组，@Demitri给出的(优秀)答案比目前标记为最佳的答案快得多。我从以下两个方面调整了他的精确算法:

不管输入数组是否排序，下面的函数都有效。下面的函数返回与最接近的值对应的输入数组的索引，这有点更一般。

请注意，下面的函数还处理了一个特定的边缘情况，这将导致@Demitri编写的原始函数中的错误。否则，我的算法和他的一样。

def find_idx_nearest_val(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

2015-04-08 14:54:20

下面是一个使用2D数组的版本，如果用户拥有scipy的cdist函数，则使用它，如果用户没有，则使用更简单的距离计算。

默认情况下，输出是最接近输入值的索引，但您可以使用output关键字将其更改为'index'， 'value'或'both'之一，其中'value'输出数组[index]， 'both'输出索引，数组[index]。

对于非常大的数组，您可能需要使用kind='euclidean'，因为默认的scipy cdist函数可能会耗尽内存。

这可能不是绝对最快的解决方案，但已经很接近了。

def find_nearest_2d(array, value, kind='cdist', output='index'):
    # 'array' must be a 2D array
    # 'value' must be a 1D array with 2 elements
    # 'kind' defines what method to use to calculate the distances. Can choose one
    #    of 'cdist' (default) or 'euclidean'. Choose 'euclidean' for very large
    #    arrays. Otherwise, cdist is much faster.
    # 'output' defines what the output should be. Can be 'index' (default) to return
    #    the index of the array that is closest to the value, 'value' to return the
    #    value that is closest, or 'both' to return index,value
    import numpy as np
    if kind == 'cdist':
        try: from scipy.spatial.distance import cdist
        except ImportError:
            print("Warning (find_nearest_2d): Could not import cdist. Reverting to simpler distance calculation")
            kind = 'euclidean'
    index = np.where(array == value)[0] # Make sure the value isn't in the array
    if index.size == 0:
        if kind == 'cdist': index = np.argmin(cdist([value],array)[0])
        elif kind == 'euclidean': index = np.argmin(np.sum((np.array(array)-np.array(value))**2.,axis=1))
        else: raise ValueError("Keyword 'kind' must be one of 'cdist' or 'euclidean'")
    if output == 'index': return index
    elif output == 'value': return array[index]
    elif output == 'both': return index,array[index]
    else: raise ValueError("Keyword 'output' must be one of 'index', 'value', or 'both'")

2021-01-22 00:10:29

import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

使用示例:

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

print(find_nearest(array, value=0.5))
# 0.568743859261

2010-04-02 12:01:01

如果你的数组已经排序并且非常大，这是一个更快的解决方案:

def find_nearest(array,value):
    idx = np.searchsorted(array, value, side="left")
    if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
        return array[idx-1]
    else:
        return array[idx]

这可以扩展到非常大的阵列。如果不能假定数组已经排序，可以很容易地修改上面的内容以在方法中排序。对于小型数组来说，这是多余的，但一旦它们变大，这就快得多了。

2014-09-24 20:48:09

查找numpy数组中最近的值

推荐文章

最新文章

标签