Fastest hash for non-cryptographic uses?

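I'm essentially preparing phrases to be put into a database. They may be malformed, so I want to store a short hash of each one instead (I will only be checking whether they exist, so a hash is ideal).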

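I assume MD5 is fairly slow at 100,000+ requests, so I'd like to know the best way to hash the phrases. Would rolling my own hash function, or using hash('md4', '…'), end up being faster?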

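I know MySQL has MD5(), which would add a bit of speed on the query side, but maybe MySQL has an even faster hash function that I don't know about and that would work with PHP.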
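To make the use case concrete, here is a minimal sketch of the existence check described in the question. The helper name `phrase_key()` is hypothetical, and a plain array stands in for the database table. It assumes the 16-byte raw digest is what gets stored, since that is half the size of the 32-character hex form.

```php
<?php
// Hypothetical helper: map an arbitrary (possibly malformed) phrase to a fixed-size key.
function phrase_key(string $phrase): string
{
    // Passing true returns the raw 16-byte digest instead of 32 hex characters.
    return md5($phrase, true);
}

// An array stands in for the database table in this sketch.
$seen = [];
foreach (["ana are mere", "bad \xff\xfe bytes"] as $phrase) {
    $seen[phrase_key($phrase)] = true;
}

var_dump(isset($seen[phrase_key("ana are mere")]));   // a stored phrase
var_dump(isset($seen[phrase_key("something else")])); // a phrase never stored
```

In MySQL the same raw digest could be produced with UNHEX(MD5(col)), so keys computed on either side would be comparable.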

Current answer

+-------------------+---------+------+--------------+
|       NAME        |  LOOPS  | TIME |     OP/S     |
+-------------------+---------+------+--------------+
| sha1ShortString   | 1638400 | 2.85 | 574,877.19   |
| md5ShortString    | 2777680 | 4.11 | 675,834.55   |
| crc32ShortString  | 3847980 | 3.61 | 1,065,922.44 |
| sha1MediumString  | 602620  | 4.75 | 126,867.37   |
| md5MediumString   | 884860  | 4.69 | 188,669.51   |
| crc32MediumString | 819200  | 4.85 | 168,907.22   |
| sha1LongString    | 181800  | 4.95 | 36,727.27    |
| md5LongString     | 281680  | 4.93 | 57,135.90    |
| crc32LongString   | 226220  | 4.95 | 45,701.01    |
+-------------------+---------+------+--------------+

It seems crc32 is faster for small messages (26 characters in this case), while md5 is faster for longer messages (more than 852 characters in this case).
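The benchmark code behind this table is not shown in the answer. As a rough sketch of a comparable measurement, the following times the built-in `crc32()`, `md5()` and `sha1()` on two input lengths. The helper `ops_per_second()` is hypothetical, the inputs are filler strings rather than the answer's actual test data, and the numbers will differ by machine and PHP version.

```php
<?php
// Hypothetical helper: call $fn on $input for roughly $seconds and return calls per second.
function ops_per_second(callable $fn, string $input, float $seconds = 0.2): float
{
    $loops = 0;
    $start = microtime(true);
    do {
        // Batch the calls so the clock is not read on every iteration.
        for ($i = 0; $i < 1000; $i++) {
            $fn($input);
        }
        $loops += 1000;
        $elapsed = microtime(true) - $start;
    } while ($elapsed < $seconds);
    return $loops / $elapsed;
}

// Lengths taken from the answer's remark; the content is an assumption.
$inputs = [
    'short (26 chars)' => str_repeat('a', 26),
    'long (852 chars)' => str_repeat('a', 852),
];
foreach ($inputs as $label => $input) {
    foreach (['crc32', 'md5', 'sha1'] as $fn) {
        printf("%-6s %-17s %12.0f op/s\n", $fn, $label, ops_per_second($fn, $input));
    }
}
```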

2012-02-07 21:01:53

Other answers

Adler32 performed best on my machine, and md5() turned out to be faster than crc32().

2012-11-07 16:52:31

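Rather than assuming MD5 is "fairly slow", try it. On a basic PC (mine is a 2.4 GHz Core2, using a single core), a simple C-based MD5 implementation can hash 6 million small messages per second. A small message here means anything up to 55 bytes. For longer messages, MD5 hashing time is linear in the message size, i.e. it processes data at about 400 megabytes per second. You may note that this is four times the maximum speed of a good hard disk or a gigabit Ethernet card.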

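Since my PC has four cores, hashing data as fast as my hard disk can deliver or accept it would use at most about 6% of the available computing power. It takes a very special situation for hashing speed to become a bottleneck, or even to induce a noticeable cost on a PC.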
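The 6% figure follows from the numbers in the previous paragraph. A short worked calculation, assuming the disk rate is exactly one quarter of the 400 MB/s hashing rate (the "four times" in the answer):

```php
<?php
$hashRate = 400;           // MB/s hashed by one core (figure from the answer)
$diskRate = $hashRate / 4; // MB/s, assuming the disk is exactly four times slower
$cores    = 4;

// Share of total CPU needed to hash data as fast as the disk can move it.
$cpuShare = $diskRate / ($hashRate * $cores);
printf("%.2f%%\n", $cpuShare * 100); // prints 6.25%
```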

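On much smaller architectures, where hashing speed may become somewhat important, you may want to use MD4. MD4 is fine for non-cryptographic purposes (and for cryptographic purposes you should not be using MD5 anyway). MD4 has been reported to be even faster than CRC32 on ARM-based platforms.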
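In PHP, MD4 is reachable through the generic `hash()` function. A minimal sketch, assuming the build's hash extension lists `md4` (checked at runtime here rather than taken for granted):

```php
<?php
$phrase = "ana are mere";

if (in_array('md4', hash_algos(), true)) {
    $hex = hash('md4', $phrase);       // 32 hex characters
    $raw = hash('md4', $phrase, true); // the same digest as 16 raw bytes
    echo $hex, "\n", strlen($raw), "\n";
} else {
    echo "md4 is not available in this build\n";
}
```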

2010-09-10 14:05:07

The md5 implementation inside hash() is a little faster than md5(), so that may be an option. Otherwise, try every available algorithm:

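echo '<pre>';

$run = array();     // algorithm name => elapsed seconds
$sample = array();  // algorithm name => digest of the test string

// Time 100,000 hash() calls for one algorithm and record the result.
function test($algo)
{
 $tss = microtime(true);
 for($i=0; $i<100000; $i++){
  $x = hash($algo, "ana are mere");
 }
 $tse = microtime(true);

 // Key by algorithm name; keying by the rounded time would let two
 // algorithms with identical timings overwrite each other.
 $GLOBALS['run'][$algo] = round($tse-$tss, 5);
 $GLOBALS['sample'][$algo] = $x;
}
array_map('test', hash_algos());
asort($run); // fastest first
foreach($run as $algo => $time){
 echo "\nhash({$algo}): \t{$time} \t{$sample[$algo]}";
}
echo '</pre>';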

You can see the results at http://www.dozent.net/Tipps-Tricks/PHP/hash-performance

2013-12-09 12:47:13

fcn     time  generated hash
crc32:  0.03163  798740135
md5:    0.0731   0dbab6d0c841278d33be207f14eeab8b
sha1:   0.07331  417a9e5c9ac7c52e32727cfd25da99eca9339a80
xor:    0.65218  119
xor2:   0.29301  134217728
add:    0.57841  1105

The code that generated this is:

 $loops = 100000;
 $str = "ana are mere";

 echo "<pre>";

 $tss = microtime(true);
 for($i=0; $i<$loops; $i++){
  $x = crc32($str);
 }
 $tse = microtime(true);
 echo "\ncrc32: \t" . round($tse-$tss, 5) . " \t" . $x;

 $tss = microtime(true);
 for($i=0; $i<$loops; $i++){
  $x = md5($str);
 }
 $tse = microtime(true);
 echo "\nmd5: \t".round($tse-$tss, 5) . " \t" . $x;

 $tss = microtime(true);
 for($i=0; $i<$loops; $i++){
  $x = sha1($str);
 }
 $tse = microtime(true);
 echo "\nsha1: \t".round($tse-$tss, 5) . " \t" . $x;

 $tss = microtime(true);
 for($i=0; $i<$loops; $i++){
  $l = strlen($str);
  $x = 0x77;
  for($j=0;$j<$l;$j++){
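   // Note: `xor` is the logical operator and binds more loosely than `=`,
   // so this only evaluates `$x = $x` and $x stays 0x77 (the 119 shown above).
   // A bitwise XOR would be written `$x ^ ord($str[$j])`.
   $x = $x xor ord($str[$j]);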
  }
 }
 $tse = microtime(true);
 echo "\nxor: \t".round($tse-$tss, 5) . " \t" . $x;

 $tss = microtime(true);
 for($i=0; $i<$loops; $i++){
  $l = strlen($str);
  $x = 0x08;
  for($j=0;$j<$l;$j++){
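   // Note: logical `xor` binds more loosely than `=`, so only the shift is
   // assigned: 0x08 shifted left 2 bits for each of 12 characters = 134217728.
   // A bitwise version would be `($x << 2) ^ ord($str[$j])`.
   $x = ($x<<2) xor $str[$j];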
  }
 }
 $tse = microtime(true);
 echo "\nxor2: \t".round($tse-$tss, 5) . " \t" . $x;

 $tss = microtime(true);
 for($i=0; $i<$loops; $i++){
  $l = strlen($str);
  $x = 0;
  for($j=0;$j<$l;$j++){
   $x = $x + ord($str[$j]);
  }
 }
 $tse = microtime(true);
 echo "\nadd: \t".round($tse-$tss, 5) . " \t" . $x;

2010-09-08 07:34:58

The xxHash repository has a speed comparison. This is what it showed on January 12, 2021.

Hash Name	Width	Bandwidth (GB/s)	Small Data Velocity	Quality	Comment
XXH3 (SSE2)	64	31.5 GB/s	133.1	10
XXH128 (SSE2)	128	29.6 GB/s	118.1	10
RAM sequential read	N/A	28.0 GB/s	N/A	N/A	for reference
City64	64	22.0 GB/s	76.6	10
T1ha2	64	22.0 GB/s	99.0	9	Slightly worse [collisions]
City128	128	21.7 GB/s	57.7	10
XXH64	64	19.4 GB/s	71.0	10
SpookyHash	64	19.3 GB/s	53.2	10
Mum	64	18.0 GB/s	67.0	9	Slightly worse [collisions]
XXH32	32	9.7 GB/s	71.9	10
City32	32	9.1 GB/s	66.0	10
Murmur3	32	3.9 GB/s	56.1	10
SipHash	64	3.0 GB/s	43.2	10
FNV64	64	1.2 GB/s	62.7	5	Poor avalanche properties
Blake2	256	1.1 GB/s	5.1	10	Cryptographic
SHA1	160	0.8 GB/s	5.6	10	Cryptographic but broken
MD5	128	0.6 GB/s	7.8	10	Cryptographic but broken

So xxHash seems to be by far the fastest, while many of the others also beat older hashes such as CRC32, MD5 and SHA.
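For the PHP setting of the original question: to my knowledge, PHP 8.1 and later expose xxHash variants (`xxh32`, `xxh64`, `xxh3`, `xxh128`) through `hash()`, but older builds do not. The sketch below therefore checks `hash_algos()` at runtime and falls back to an older algorithm. The helper `pick_algo()` and the preference order are assumptions made for illustration.

```php
<?php
// Hypothetical helper: return the first algorithm in $preferred that this PHP build supports.
function pick_algo(array $preferred): string
{
    $available = hash_algos();
    foreach ($preferred as $algo) {
        if (in_array($algo, $available, true)) {
            return $algo;
        }
    }
    throw new RuntimeException('none of the preferred hash algorithms is available');
}

// Prefer xxHash where present, otherwise fall back to older algorithms.
$algo = pick_algo(['xxh3', 'xxh64', 'crc32b', 'md5']);
echo $algo, ': ', hash($algo, 'ana are mere'), "\n";
```

Note that a value stored with one algorithm cannot be compared against one computed with another, so the choice should be fixed once per dataset rather than re-evaluated on each run.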

2014-01-23 17:56:59
