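I need to store a multi-dimensional associative array of data in a flat file for caching purposes. I might occasionally need to convert it to JSON for use in my web app, but the vast majority of the time I will be using the array directly in PHP.

Would it be more efficient to store the array as JSON or as a PHP serialized array in this text file? I've looked around, and it seems that in the newest version of PHP (5.3), json_decode is actually faster than unserialize.

I'm currently leaning towards storing the array as JSON: I feel it's easier for a human to read if necessary, it can be used in both PHP and JavaScript, and from what I've read it might even be faster to decode (I'm not sure about encoding, though).

Does anyone know of any pitfalls? Does anyone have good benchmarks showing the performance benefits of either method?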


Current answer

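I know this is late, but the answers are pretty old, so I thought my benchmarks might help, as I have just tested this in PHP 7.4.

Serialize/unserialize is much faster than JSON, takes less memory and space, and wins outright in PHP 7.4, but I am not sure my test is the most efficient or the best.

I basically created a PHP file that returns an array, which I encoded, serialized, and then decoded and unserialized.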

$array = include __DIR__.'/../tests/data/dao/testfiles/testArray.php';

//JSON ENCODE
$json_encode_memory_start = memory_get_usage();
$json_encode_time_start = microtime(true);

for ($i=0; $i < 20000; $i++) { 
    $encoded = json_encode($array);
}

$json_encode_time_end = microtime(true);
$json_encode_memory_end = memory_get_usage();
$json_encode_time = $json_encode_time_end - $json_encode_time_start;
$json_encode_memory = $json_encode_memory_end - $json_encode_memory_start;


//SERIALIZE
$serialize_memory_start = memory_get_usage();
$serialize_time_start = microtime(true);

for ($i=0; $i < 20000; $i++) { 
    $serialized = serialize($array);
}

$serialize_time_end = microtime(true);
$serialize_memory_end = memory_get_usage();
$serialize_time = $serialize_time_end - $serialize_time_start;
$serialize_memory = $serialize_memory_end - $serialize_memory_start;


//Write to file time:
$fpc_memory_start = memory_get_usage();
$fpc_time_start = microtime(true);

for ($i=0; $i < 20000; $i++) { 
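    // Write a PHP file that returns the array; the statement ends with ';'
    // and needs no closing tag, so nothing is echoed when it is included.
    $fpc_bytes = file_put_contents(
        __DIR__.'/../tests/data/dao/testOneBigFile',
        '<?php return '.var_export($array, true).';'
    );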
}

$fpc_time_end = microtime(true);
$fpc_memory_end = memory_get_usage();
$fpc_time = $fpc_time_end - $fpc_time_start;
$fpc_memory = $fpc_memory_end - $fpc_memory_start;


//JSON DECODE
$json_decode_memory_start = memory_get_usage();
$json_decode_time_start = microtime(true);

for ($i=0; $i < 20000; $i++) { 
    // Decode to an associative array so the result has the same shape as $array.
    $decoded = json_decode($encoded, true);
}

$json_decode_time_end = microtime(true);
$json_decode_memory_end = memory_get_usage();
$json_decode_time = $json_decode_time_end - $json_decode_time_start;
$json_decode_memory = $json_decode_memory_end - $json_decode_memory_start;


//UNSERIALIZE
$unserialize_memory_start = memory_get_usage();
$unserialize_time_start = microtime(true);

for ($i=0; $i < 20000; $i++) { 
    $unserialized = unserialize($serialized);
}

$unserialize_time_end = microtime(true);
$unserialize_memory_end = memory_get_usage();
$unserialize_time = $unserialize_time_end - $unserialize_time_start;
$unserialize_memory = $unserialize_memory_end - $unserialize_memory_start;


//GET FROM VAR EXPORT:
$var_export_memory_start = memory_get_usage();
$var_export_time_start = microtime(true);

for ($i=0; $i < 20000; $i++) { 
    $array = include __DIR__.'/../tests/data/dao/testOneBigFile';
}

$var_export_time_end = microtime(true);
$var_export_memory_end = memory_get_usage();
$var_export_time = $var_export_time_end - $var_export_time_start;
$var_export_memory = $var_export_memory_end - $var_export_memory_start;

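Results:

Var export length: 11447
Serialized length: 11541
JSON encoded length: 11895
file_put_contents bytes: 11464

JSON encode time (s): 1.9197590351105
Serialize time (s): 0.160325050354
FPC time (s): 6.2793469429016

JSON encode memory (bytes): 12288
Serialize memory (bytes): 12288
FPC memory (bytes): 0

JSON decode time (s): 1.7493588924408
Unserialize time (s): 0.19309520721436
Var export and include time (s): 3.1974139213562

JSON decode memory (bytes): 16384
Unserialize memory (bytes): 14360
Var export and include memory (bytes): 192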
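
The var_export-and-include variant in this benchmark stores the array as PHP source and reads it back with include. A minimal sketch of that idea as a pair of helpers might look like the following. The function names cache_write_php and cache_read_php and the temp-file path are made up for this illustration and are not part of the benchmark; the type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helpers; the names are illustrative, not from the benchmark.

// Write $data as a PHP file that returns the array when included.
function cache_write_php(string $file, array $data): int
{
    $code  = '<?php return ' . var_export($data, true) . ';';
    $bytes = file_put_contents($file, $code, LOCK_EX);
    if ($bytes === false) {
        throw new RuntimeException("Could not write cache file: $file");
    }
    return $bytes;
}

// Read the array back by including the generated file.
function cache_read_php(string $file): array
{
    return include $file;
}

// Assumed location for the example; any writable path would do.
$file     = sys_get_temp_dir() . '/var_export_cache_example.php';
$original = ['id' => 7, 'tags' => ['a', 'b'], 'meta' => ['ok' => true, 'n' => null]];

cache_write_php($file, $original);
$restored = cache_read_php($file);
unlink($file);
```

Because the cache file is executable PHP, this approach is probably only appropriate when the file is written by trusted code and is not reachable by user uploads.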

Other answers

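I've tested this very thoroughly on a fairly complex, mildly nested multi-hash with all kinds of data in it (strings, NULL, integers), and serialize/unserialize ended up much faster than json_encode/json_decode.

The only advantage JSON had in my tests was its smaller "packed" size.

These tests were done under PHP 5.3.3; let me know if you want more details.

Here are the test results, followed by the code that produced them. I can't provide the test data, since it would reveal information that I can't make public.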

JSON encoded in 2.23700618744 seconds
PHP serialized in 1.3434419632 seconds
JSON decoded in 4.0405561924 seconds
PHP unserialized in 1.39393305779 seconds

serialized size : 14549
json_encode size : 11520
serialize() was roughly 66.51% faster than json_encode()
unserialize() was roughly 189.87% faster than json_decode()
json_encode() string was roughly 26.29% smaller than serialize()

//  Time json encoding
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    json_encode( $test );
}
$jsonTime = microtime( true ) - $start;
echo "JSON encoded in $jsonTime seconds<br>";

//  Time serialization
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    serialize( $test );
}
$serializeTime = microtime( true ) - $start;
echo "PHP serialized in $serializeTime seconds<br>";

//  Time json decoding
$test2 = json_encode( $test );
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    json_decode( $test2 );
}
$jsonDecodeTime = microtime( true ) - $start;
echo "JSON decoded in $jsonDecodeTime seconds<br>";

//  Time deserialization
$test2 = serialize( $test );
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    unserialize( $test2 );
}
$unserializeTime = microtime( true ) - $start;
echo "PHP unserialized in $unserializeTime seconds<br>";

$jsonSize = strlen(json_encode( $test ));
$phpSize = strlen(serialize( $test ));

echo "<p>serialized size : " . strlen(serialize( $test )) . "<br>";
echo "json_encode size : " . strlen(json_encode( $test )) . "<br></p>";

//  Compare them
if ( $jsonTime < $serializeTime )
{
    echo "json_encode() was roughly " . number_format( ($serializeTime / $jsonTime - 1 ) * 100, 2 ) . "% faster than serialize()";
}
else if ( $serializeTime < $jsonTime )
{
    echo "serialize() was roughly " . number_format( ($jsonTime / $serializeTime - 1 ) * 100, 2 ) . "% faster than json_encode()";
} else {
    echo 'Unpossible!';
}
    echo '<BR>';

//  Compare them
if ( $jsonDecodeTime < $unserializeTime )
{
    echo "json_decode() was roughly " . number_format( ($unserializeTime / $jsonDecodeTime - 1 ) * 100, 2 ) . "% faster than unserialize()";
}
else if ( $unserializeTime < $jsonDecodeTime )
{
    echo "unserialize() was roughly " . number_format( ($jsonDecodeTime / $unserializeTime - 1 ) * 100, 2 ) . "% faster than json_decode()";
} else {
    echo 'Unpossible!';
}
    echo '<BR>';
//  Compare them
if ( $jsonSize < $phpSize )
{
    echo "json_encode() string was roughly " . number_format( ($phpSize / $jsonSize - 1 ) * 100, 2 ) . "% smaller than serialize()";
}
else if ( $phpSize < $jsonSize )
{
    echo "serialize() string was roughly " . number_format( ($jsonSize / $phpSize - 1 ) * 100, 2 ) . "% smaller than json_encode()";
} else {
    echo 'Unpossible!';
}
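
The percentages printed by the comparison code above come from the expression ($larger / $smaller - 1) * 100, which measures how much bigger the larger value is relative to the smaller one. The sketch below reproduces the posted figures with a hypothetical helper, percent_faster (not part of the original script), and also computes the share of the serialized size that JSON saves, which is a different number from the "26.29% smaller" line in the output.

```php
<?php
// Hypothetical helper mirroring the ($slower / $faster - 1) * 100 expression
// used in the comparison code above.
function percent_faster(float $slower, float $faster): string
{
    return number_format(($slower / $faster - 1) * 100, 2);
}

// Timings and sizes copied from the results posted above.
$encode = percent_faster(2.23700618744, 1.3434419632);  // json_encode vs serialize
$decode = percent_faster(4.0405561924, 1.39393305779);  // json_decode vs unserialize
$size   = percent_faster(14549, 11520);                 // serialized vs JSON length

// Share of the serialized length that the JSON string saves.
$saved = number_format((1 - 11520 / 14549) * 100, 2);

echo "encode: $encode%, decode: $decode%, size ratio: $size%, saved: $saved%\n";
```

Read this way, the serialized string is about 26.29% larger than the JSON string, while the JSON string is about 20.82% smaller than the serialized one.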

I extended the test to include unserialization performance. Here are the numbers I got.

Serialize

JSON encoded in 2.5738489627838 seconds
PHP serialized in 5.2861361503601 seconds
Serialize: json_encode() was roughly 105.38% faster than serialize()


Unserialize

JSON decode in 10.915472984314 seconds
PHP unserialized in 7.6223039627075 seconds
Unserialize: unserialize() was roughly 43.20% faster than json_decode() 

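So JSON seems to be faster for encoding but slower for decoding. Which one is better could therefore depend on your application and on which operation you expect to do most.

It seems that serialize is the one I'm going to use, for two reasons:

1. Someone pointed out that unserialize is faster than json_decode, and a "read" case sounds more probable than a "write" case.
2. I've had trouble with json_encode when a string contains invalid UTF-8 characters. When that happens, the string ends up empty, causing loss of information.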
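
The invalid UTF-8 problem can be reproduced with a short script. This is a sketch under a few assumptions: it targets PHP 7.2 or later, where json_encode() returns false for the whole value instead of emptying the offending string (the return value on failure changed in PHP 5.5), and where the JSON_INVALID_UTF8_SUBSTITUTE flag exists. The sample data is invented for the illustration.

```php
<?php
// A string containing a byte that is not valid UTF-8 ("café" in ISO-8859-1).
$data = ['name' => "caf\xE9"];

// json_encode() refuses the input; capture the error before the next JSON call.
$json       = json_encode($data);
$jsonFailed = ($json === false);
$jsonError  = json_last_error();

// serialize() stores the raw bytes with a length prefix, so nothing is lost.
$restored = unserialize(serialize($data));

// PHP 7.2+: substitute invalid bytes with U+FFFD instead of failing.
$substituted = json_encode($data, JSON_INVALID_UTF8_SUBSTITUTE);
```

If JSON is still preferred for the cache, checking the return value of json_encode() (or passing JSON_THROW_ON_ERROR on PHP 7.3+) at least makes the failure visible instead of silently losing data.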

Check out the results here (sorry for the hack of putting the PHP code in the JS code box):

http://jsfiddle.net/newms87/h3b0a0ha/embedded/result/

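RESULT: serialize() and unserialize() are both significantly faster in PHP 5.4 on arrays of varying size.

I wrote a test script that uses real-world data to compare json_encode vs. serialize and json_decode vs. unserialize. The test was run on the caching system of an e-commerce site in production. It simply takes the data already in the cache, measures the time needed to encode/decode (or serialize/unserialize) all of it, and I put the results in an easy-to-read table.

I ran this on a PHP 5.4 shared hosting server.

The results were very conclusive: for these large-to-small data sets, serialize and unserialize were the clear winners. For my use case in particular, json_decode and unserialize matter most for the caching system, and there unserialize won almost everywhere. It was typically 2 to 4 times as fast as json_decode (sometimes 6 or 7 times).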
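
For a cache like the one described here, where reads dominate, a flat-file store built on serialize()/unserialize() could be sketched as follows. This is not the code of the cache system that was measured: the helper names cache_put and cache_get, the file path, and the sample data are all invented, and the sketch assumes PHP 7.1 or later (for the void and nullable return types and the allowed_classes option).

```php
<?php
// Hypothetical flat-file cache helpers; names and file layout are illustrative.

function cache_put(string $file, array $data): void
{
    // LOCK_EX avoids interleaved writes from concurrent requests.
    if (file_put_contents($file, serialize($data), LOCK_EX) === false) {
        throw new RuntimeException("Could not write cache file: $file");
    }
}

function cache_get(string $file): ?array
{
    if (!is_file($file)) {
        return null; // cache miss
    }
    $raw = file_get_contents($file);
    if ($raw === false) {
        return null;
    }
    // allowed_classes => false refuses to instantiate objects from the file;
    // @ suppresses the notice for corrupt data, which is treated as a miss.
    $data = @unserialize($raw, ['allowed_classes' => false]);
    return is_array($data) ? $data : null;
}

$file     = sys_get_temp_dir() . '/serialize_cache_example.cache';
$products = [
    'sku-1' => ['price' => 9.99, 'stock' => 3],
    'sku-2' => ['price' => 0.5, 'stock' => 0],
];

cache_put($file, $products);
$hit = cache_get($file);   // the array that was stored
unlink($file);
$miss = cache_get($file);  // null once the file is gone
```

Restricting allowed_classes is a design choice worth keeping even for a private cache, since unserialize() on untrusted input can otherwise instantiate arbitrary classes.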

It is interesting to note how these results differ from @peter-bailey's.

Here is the PHP code used to generate the results:

<?php

ini_set('display_errors', 1);
error_reporting(E_ALL);

function _count_depth($array)
{
    $count     = 0;
    $max_depth = 0;
    foreach ($array as $a) {
        if (is_array($a)) {
            list($cnt, $depth) = _count_depth($a);
            $count += $cnt;
            $max_depth = max($max_depth, $depth);
        } else {
            $count++;
        }
    }

    return array(
        $count,
        $max_depth + 1,
    );
}

function run_test($file)
{
    $memory     = memory_get_usage();
    $test_array = unserialize(file_get_contents($file));
    $memory     = round((memory_get_usage() - $memory) / 1024, 2);

    if (empty($test_array) || !is_array($test_array)) {
        return;
    }

    list($count, $depth) = _count_depth($test_array);

    //JSON encode test
    $start            = microtime(true);
    $json_encoded     = json_encode($test_array);
    $json_encode_time = microtime(true) - $start;

    //JSON decode test
    $start = microtime(true);
    json_decode($json_encoded);
    $json_decode_time = microtime(true) - $start;

    //serialize test
    $start          = microtime(true);
    $serialized     = serialize($test_array);
    $serialize_time = microtime(true) - $start;

    //unserialize test
    $start = microtime(true);
    unserialize($serialized);
    $unserialize_time = microtime(true) - $start;

    return array(
        'Name'                   => basename($file),
        'json_encode() Time (s)' => $json_encode_time,
        'json_decode() Time (s)' => $json_decode_time,
        'serialize() Time (s)'   => $serialize_time,
        'unserialize() Time (s)' => $unserialize_time,
        'Elements'               => $count,
        'Memory (KB)'            => $memory,
        'Max Depth'              => $depth,
        'json_encode() Win'      => ($json_encode_time > 0 && $json_encode_time < $serialize_time) ? number_format(($serialize_time / $json_encode_time - 1) * 100, 2) : '',
        'serialize() Win'        => ($serialize_time > 0 && $serialize_time < $json_encode_time) ? number_format(($json_encode_time / $serialize_time - 1) * 100, 2) : '',
        'json_decode() Win'      => ($json_decode_time > 0 && $json_decode_time < $unserialize_time) ? number_format(($unserialize_time / $json_decode_time - 1) * 100, 2) : '',
        'unserialize() Win'      => ($unserialize_time > 0 && $unserialize_time < $json_decode_time) ? number_format(($json_decode_time / $unserialize_time - 1) * 100, 2) : '',
    );
}

$files = glob(dirname(__FILE__) . '/system/cache/*');

$data = array();

foreach ($files as $file) {
    if (is_file($file)) {
        $result = run_test($file);

        if ($result) {
            $data[] = $result;
        }
    }
}

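// Sort by memory footprint, largest first; the comparator must return an integer.
uasort($data, function ($a, $b) {
    if ($a['Memory (KB)'] == $b['Memory (KB)']) {
        return 0;
    }
    return ($a['Memory (KB)'] < $b['Memory (KB)']) ? 1 : -1;
});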

$fields = array_keys($data[0]);
?>

<table>
    <thead>
    <tr>
        <?php foreach ($fields as $f) { ?>
            <td style="text-align: center; border:1px solid black;padding: 4px 8px;font-weight:bold;font-size:1.1em"><?= $f; ?></td>
        <?php } ?>
    </tr>
    </thead>

    <tbody>
    <?php foreach ($data as $d) { ?>
        <tr>
            <?php foreach ($d as $key => $value) { ?>
                <?php $is_win = strpos($key, 'Win'); ?>
                <?php $color = ($is_win && $value) ? 'color: green;font-weight:bold;' : ''; ?>
                <td style="text-align: center; vertical-align: middle; padding: 3px 6px; border: 1px solid gray; <?= $color; ?>"><?= $value . (($is_win && $value) ? '%' : ''); ?></td>
            <?php } ?>
        </tr>
    <?php } ?>
    </tbody>
</table>