这是我想做的:

我定期用网络摄像头拍照。就像时间流逝一样。然而,如果没有什么真正的改变,也就是说,图片看起来几乎相同,我不想存储最新的快照。

我想有某种方法可以量化这种差异,我必须根据经验确定一个阈值。

我追求的是简单而不是完美。 我用的是python。


当前回答

我特别要解决的问题是如何计算它们是否“足够不同”。我假设你能弄清楚如何一个一个地减去像素。

首先,我将取一堆没有任何变化的图像,并找出任何像素变化的最大量,仅仅是因为捕获的变化、成像系统中的噪声、JPEG压缩工件和照明的每时每刻的变化。也许你会发现,即使没有任何移动,1或2位的差异也是可以预期的。

对于“真实”测试,你需要一个这样的标准:

如果最多P个像素的差异不超过E,则相同。

所以,如果E = 0.02, P = 1000,这可能意味着(大约)如果任何单个像素改变超过5个单位(假设8位图像),或者如果超过1000个像素有任何错误,这将是“不同的”。

这主要是一种很好的“分类”技术,用于快速识别足够接近而不需要进一步检查的图像。“失败”的图像可能更多的是一种更复杂/昂贵的技术,例如,如果相机抖动,或者对光线变化更健壮,就不会产生假阳性。

I run an open source project, OpenImageIO, that contains a utility called "idiff" that compares differences with thresholds like this (even more elaborate, actually). Even if you don't want to use this software, you may want to look at the source to see how we did it. It's used commercially quite a bit and this thresholding technique was developed so that we could have a test suite for rendering and image processing software, with "reference images" that might have small differences from platform-to-platform or as we made minor tweaks to tha algorithms, so we wanted a "match within tolerance" operation.

其他回答

一个简单的解决方案:

将图像编码为jpeg格式,并寻找文件大小的实质性变化。

我曾经用视频缩略图实现过类似的东西,并且取得了很大的成功和可伸缩性。

下面是我写的一个函数,它以2个图像(文件路径)作为参数,并返回两个图像“像素”组件之间的平均差值。这对我确定视觉上“相等”的图像(当它们不==相等时)非常有效。

(我发现8个是判断图像本质上是否相同的一个很好的限制。)

(如果不添加预处理,图像必须具有相同的尺寸。)

from PIL import Image

def imagesDifference( imageA, imageB ):
    A = list(Image.open(imageA, r'r').convert(r'RGB').getdata())
    B = list(Image.open(imageB, r'r').convert(r'RGB').getdata())
    if (len(A) != len(B)): return -1
    diff = []
    for i in range(0, len(A)):
        diff += [abs(A[i][0] - B[i][0]), abs(A[i][1] - B[i][1]), abs(A[i][2] - B[i][2])]
    return (sum(diff) / len(diff))
import os
from PIL import Image
from PIL import ImageFile
import imagehash
  
#just use to the size diferent picture
def compare_image(img_file1, img_file2):
    if img_file1 == img_file2:
        return True
    fp1 = open(img_file1, 'rb')
    fp2 = open(img_file2, 'rb')

    img1 = Image.open(fp1)
    img2 = Image.open(fp2)

    ImageFile.LOAD_TRUNCATED_IMAGES = True
    b = img1 == img2

    fp1.close()
    fp2.close()

    return b





#through picturu hash to compare
def get_hash_dict(dir):
    hash_dict = {}
    image_quantity = 0
    for _, _, files in os.walk(dir):
        for i, fileName in enumerate(files):
            with open(dir + fileName, 'rb') as fp:
                hash_dict[dir + fileName] = imagehash.average_hash(Image.open(fp))
                image_quantity += 1

    return hash_dict, image_quantity

def compare_image_with_hash(image_file_name_1, image_file_name_2, max_dif=0):
    """
    max_dif: The maximum hash difference is allowed, the smaller and more accurate, the minimum is 0.
    recommend to use
    """
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    hash_1 = None
    hash_2 = None
    with open(image_file_name_1, 'rb') as fp:
        hash_1 = imagehash.average_hash(Image.open(fp))
    with open(image_file_name_2, 'rb') as fp:
        hash_2 = imagehash.average_hash(Image.open(fp))
    dif = hash_1 - hash_2
    if dif < 0:
        dif = -dif
    if dif <= max_dif:
        return True
    else:
        return False


def compare_image_dir_with_hash(dir_1, dir_2, max_dif=0):
    """
    max_dif: The maximum hash difference is allowed, the smaller and more accurate, the minimum is 0.

    """
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    hash_dict_1, image_quantity_1 = get_hash_dict(dir_1)
    hash_dict_2, image_quantity_2 = get_hash_dict(dir_2)

    if image_quantity_1 > image_quantity_2:
        tmp = image_quantity_1
        image_quantity_1 = image_quantity_2
        image_quantity_2 = tmp

        tmp = hash_dict_1
        hash_dict_1 = hash_dict_2
        hash_dict_2 = tmp

    result_dict = {}

    for k in hash_dict_1.keys():
        result_dict[k] = None

    for dif_i in range(0, max_dif + 1):
        have_none = False

        for k_1 in result_dict.keys():
            if result_dict.get(k_1) is None:
                have_none = True

        if not have_none:
            return result_dict

        for k_1, v_1 in hash_dict_1.items():
            for k_2, v_2 in hash_dict_2.items():
                sub = (v_1 - v_2)
                if sub < 0:
                    sub = -sub
                if sub == dif_i and result_dict.get(k_1) is None:
                    result_dict[k_1] = k_2
                    break
    return result_dict


def main():
    print(compare_image('image1\\815.jpg', 'image2\\5.jpg'))
    print(compare_image_with_hash('image1\\815.jpg', 'image2\\5.jpg', 7))
    r = compare_image_dir_with_hash('image1\\', 'image2\\', 10)
    for k in r.keys():
        print(k, r.get(k))


if __name__ == '__main__':
    main()

输出: 假 真正的 image2 jpg image1 5. \ \ 815. jpg image2 jpg image1 6. \ \ 819. jpg image2 jpg image1 7. \ \ 900. jpg image2 jpg image1 8. \ \ 998. jpg image2 jpg image1 9. \ \ 1012. jpg 示例图片: 815. jpg 5. jpg

一种更有原则的方法是使用全局描述符来比较图像,比如GIST或CENTRIST。这里描述的哈希函数也提供了类似的解决方案。

可以尝试的小事:

将两个图像重新采样为小的缩略图(例如64 x 64),并将缩略图与某个阈值逐像素进行比较。如果原始图像几乎相同,重新采样的缩略图将非常相似,甚至完全相同。这种方法可以处理特别是在低光场景中可能出现的噪音。如果你调成灰度,效果可能会更好。