How to make rounded percentages add up to 100%

Consider the four percentages below, represented as floating-point numbers:

    13.626332%
    47.989636%
     9.596008%
    28.788024%
   -----------
   100.000000%

I need to represent these percentages as integers. If I simply use Math.round(), I end up with a total of 101%.

14 + 48 + 10 + 29 = 101

If I use parseInt(), I end up with a total of 97%.

13 + 47 + 9 + 28 = 97
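
The two totals above are easy to reproduce. The sketch below uses Python rather than JavaScript, with the built-in round() and int() standing in for Math.round() and parseInt(). That substitution is an assumption that holds for these four values; the two languages differ in how they resolve exact .5 ties.

```python
# The four percentages from the question.
percents = [13.626332, 47.989636, 9.596008, 28.788024]

# Rounding each value independently overshoots the total.
rounded = [round(p) for p in percents]
print(rounded, sum(rounded))      # [14, 48, 10, 29] 101

# Truncating each value independently undershoots it.
truncated = [int(p) for p in percents]
print(truncated, sum(truncated))  # [13, 47, 9, 28] 97
```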

What is a good algorithm for representing any number of percentages as integers while still keeping the total at 100%?

Edit: After reading some of the comments and answers, there are clearly many ways to go about solving this.

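In my mind, to remain true to the numbers, the "right" result is the one that minimizes the overall error, defined by how much error rounding would introduce relative to the actual value: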

        value  rounded     error               decision
   ----------------------------------------------------
    13.626332       14      2.7%          round up (14)
    47.989636       48      0.0%          round up (48)
     9.596008       10      4.2%    don't round up  (9)
    28.788024       29      0.7%          round up (29)

In case of a tie (3.33, 3.33, 3.33) an arbitrary decision can be made (e.g. 3, 4, 3).

Current answer

I think the following will achieve what you are after

function func( orig, target ) {

    var i = orig.length, j = 0, total = 0, change, newVals = [], next, factor1, factor2, len = orig.length, marginOfErrors = [];

    // map original values to new array
    while( i-- ) {
        total += newVals[i] = Math.round( orig[i] );
    }

    change = total < target ? 1 : -1;

    while( total !== target ) {

        // Iterate through values and select the one that once changed will introduce
        // the least margin of error in terms of itself. e.g. Incrementing 10 by 1
        // would mean an error of 10% in relation to the value itself.
        for( i = 0; i < len; i++ ) {

            next = i === len - 1 ? 0 : i + 1;

            factor2 = errorFactor( orig[next], newVals[next] + change );
            factor1 = errorFactor( orig[i], newVals[i] + change );

            if(  factor1 > factor2 ) {
                j = next; 
            }
        }

        newVals[j] += change;
        total += change;
    }


    for( i = 0; i < len; i++ ) { marginOfErrors[i] = newVals[i] && Math.abs( orig[i] - newVals[i] ) / orig[i]; }

    // Math.round() causes some problems as it is difficult to know at the beginning
    // whether numbers should have been rounded up or down to reduce total margin of error. 
    // This section of code increments and decrements values by 1 to find the number
    // combination with least margin of error.
    for( i = 0; i < len; i++ ) {
        for( j = 0; j < len; j++ ) {
            if( j === i ) continue;

            var roundUpFactor = errorFactor( orig[i], newVals[i] + 1)  + errorFactor( orig[j], newVals[j] - 1 );
            var roundDownFactor = errorFactor( orig[i], newVals[i] - 1) + errorFactor( orig[j], newVals[j] + 1 );
            var sumMargin = marginOfErrors[i] + marginOfErrors[j];

            if( roundUpFactor < sumMargin) { 
                newVals[i] = newVals[i] + 1;
                newVals[j] = newVals[j] - 1;
                marginOfErrors[i] = newVals[i] && Math.abs( orig[i] - newVals[i] ) / orig[i];
                marginOfErrors[j] = newVals[j] && Math.abs( orig[j] - newVals[j] ) / orig[j];
            }

            if( roundDownFactor < sumMargin ) { 
                newVals[i] = newVals[i] - 1;
                newVals[j] = newVals[j] + 1;
                marginOfErrors[i] = newVals[i] && Math.abs( orig[i] - newVals[i] ) / orig[i];
                marginOfErrors[j] = newVals[j] && Math.abs( orig[j] - newVals[j] ) / orig[j];
            }

        }
    }

    function errorFactor( oldNum, newNum ) {
        return Math.abs( oldNum - newNum ) / oldNum;
    }

    return newVals;
}


func([16.666, 16.666, 16.666, 16.666, 16.666, 16.666], 100); // => [16, 16, 17, 17, 17, 17]
func([33.333, 33.333, 33.333], 100); // => [34, 33, 33]
func([33.3, 33.3, 33.3, 0.1], 100); // => [34, 33, 33, 0] 
func([13.25, 47.25, 11.25, 28.25], 100 ); // => [13, 48, 11, 28]
func( [25.5, 25.5, 25.5, 23.5], 100 ); // => [25, 25, 26, 24]

One last thing: I ran the function on the numbers originally given in the question, to compare against the desired output

func([13.626332, 47.989636, 9.596008, 28.788024], 100); // => [13, 48, 10, 29]

This is different from what the question wanted => [14, 48, 9, 29]. I couldn't understand this until I looked at the total margin of error

-------------------------------------------------
| original  | question | % diff | mine | % diff |
-------------------------------------------------
| 13.626332 | 14       | 2.74%  | 13   | 4.60%  |
| 47.989636 | 48       | 0.02%  | 48   | 0.02%  |
| 9.596008  | 9        | 6.21%  | 10   | 4.21%  |
| 28.788024 | 29       | 0.74%  | 29   | 0.74%  |
-------------------------------------------------
| Totals    | 100      | 9.71%  | 100  | 9.56%  |
-------------------------------------------------

Essentially, the result from my function actually introduces the least amount of error.
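
One way to check that claim independently is a brute-force search. The sketch below is not the function above; it is a hypothetical helper (`min_relative_error` is a name chosen here) that assumes every value is positive and may only be rounded to its floor or its ceiling, tries every combination that sums to the target, and keeps the one with the smallest summed relative error. It is exponential in the list length, so it only suits short lists.

```python
from itertools import product
from math import floor

def min_relative_error(percents, target=100):
    """Try every floor/ceil choice and keep the combination with the
    smallest summed relative error that still adds up to target."""
    best, best_error = None, float("inf")
    for ups in product((0, 1), repeat=len(percents)):
        candidate = [floor(p) + up for p, up in zip(percents, ups)]
        if sum(candidate) != target:
            continue  # this combination does not hit the required total
        error = sum(abs(p - c) / p for p, c in zip(percents, candidate))
        if error < best_error:
            best, best_error = candidate, error
    return best, best_error

result, total_error = min_relative_error([13.626332, 47.989636, 9.596008, 28.788024])
print(result)  # [13, 48, 10, 29]
```

For the question's numbers this picks the same assignment as the "mine" column of the table, listed in input order.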

Fiddle here

2012-11-21 00:11:47

Other answers

I wrote a C# rounding helper that uses the same algorithm as Varun Vohra's answer. Hope it helps.

public static List<decimal> GetPerfectRounding(List<decimal> original,
    decimal forceSum, int decimals)
{
    var rounded = original.Select(x => Math.Round(x, decimals)).ToList();
    Debug.Assert(Math.Round(forceSum, decimals) == forceSum);
    var delta = forceSum - rounded.Sum();
    if (delta == 0) return rounded;
    var deltaUnit = Convert.ToDecimal(Math.Pow(0.1, decimals)) * Math.Sign(delta);

    List<int> applyDeltaSequence; 
    if (delta < 0)
    {
        applyDeltaSequence = original
            .Zip(Enumerable.Range(0, int.MaxValue), (x, index) => new { x, index })
            .OrderBy(a => original[a.index] - rounded[a.index])
            .ThenByDescending(a => a.index)
            .Select(a => a.index).ToList();
    }
    else
    {
        applyDeltaSequence = original
            .Zip(Enumerable.Range(0, int.MaxValue), (x, index) => new { x, index })
            .OrderByDescending(a => original[a.index] - rounded[a.index])
            .Select(a => a.index).ToList();
    }

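    // Cycle through the ordered indices, nudging one value per step until the sum matches.
    // ToList() is needed because ForEach is defined on List<T>, not on IEnumerable<T>.
    Enumerable.Repeat(applyDeltaSequence, int.MaxValue)
        .SelectMany(x => x)
        .Take(Convert.ToInt32(delta/deltaUnit))
        .ToList()
        .ForEach(index => rounded[index] += deltaUnit);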

    return rounded;
}

It passes the following unit test:

[TestMethod]
public void TestPerfectRounding()
{
    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> {3.333m, 3.334m, 3.333m}, 10, 2),
        new List<decimal> {3.33m, 3.34m, 3.33m});

    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> {3.33m, 3.34m, 3.33m}, 10, 1),
        new List<decimal> {3.3m, 3.4m, 3.3m});

    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> {3.333m, 3.334m, 3.333m}, 10, 1),
        new List<decimal> {3.3m, 3.4m, 3.3m});


    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> { 13.626332m, 47.989636m, 9.596008m, 28.788024m }, 100, 0),
        new List<decimal> {14, 48, 9, 29});
    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> { 16.666m, 16.666m, 16.666m, 16.666m, 16.666m, 16.666m }, 100, 0),
        new List<decimal> { 17, 17, 17, 17, 16, 16 });
    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> { 33.333m, 33.333m, 33.333m }, 100, 0),
        new List<decimal> { 34, 33, 33 });
    CollectionAssert.AreEqual(Utils.GetPerfectRounding(
        new List<decimal> { 33.3m, 33.3m, 33.3m, 0.1m }, 100, 0),
        new List<decimal> { 34, 33, 33, 0 });
}

2016-01-18 00:55:47

Here is a Ruby gem that implements the largest remainder method: https://github.com/jethroo/lare_round

Usage:

a =  Array.new(3){ BigDecimal('0.3334') }
# => [#<BigDecimal:887b6c8,'0.3334E0',9(18)>, #<BigDecimal:887b600,'0.3334E0',9(18)>, #<BigDecimal:887b4c0,'0.3334E0',9(18)>]
a = LareRound.round(a,2)
# => [#<BigDecimal:8867330,'0.34E0',9(36)>, #<BigDecimal:8867290,'0.33E0',9(36)>, #<BigDecimal:88671f0,'0.33E0',9(36)>]
a.reduce(:+).to_f
# => 1.0

2020-12-31 03:24:09

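The goal of rounding is to generate the least amount of error. When you're rounding a single value, that process is simple and straightforward, and most people understand it easily. When you're rounding multiple numbers at the same time, the process gets trickier: you must define how the errors are going to combine, i.e. what exactly must be minimized.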

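Varun Vohra's answer minimizes the sum of the absolute errors, and it's very simple to implement. However, there are edge cases it does not handle: what should be the result of rounding 24.25, 23.25, 27.25, 25.25? One of those needs to be rounded up instead of down. You would probably just arbitrarily pick the first or last one in the list.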
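
For reference, here is a minimal sketch of the largest remainder method. Varun Vohra's answer is not reproduced on this page, so this is a common formulation of the technique rather than that answer's exact code, and the function name `largest_remainder` is chosen here. It assumes the inputs sum to roughly the target, so the shortfall after flooring is between 0 and the list length.

```python
from math import floor

def largest_remainder(percents, target=100):
    # Round everything down first, then hand out the missing units.
    floors = [floor(p) for p in percents]
    shortfall = target - sum(floors)
    # Indices ordered by fractional part, largest first.
    # sorted() is stable, so equal remainders keep their list order.
    order = sorted(range(len(percents)), key=lambda i: floors[i] - percents[i])
    for i in order[:shortfall]:
        floors[i] += 1
    return floors

print(largest_remainder([13.626332, 47.989636, 9.596008, 28.788024]))  # [14, 48, 9, 29]
print(largest_remainder([24.25, 23.25, 27.25, 25.25]))                 # [25, 23, 27, 25]
```

In the second call all four remainders are exactly 0.25, so the extra unit goes to the first entry simply because of list order, which is the arbitrary choice described above.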

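Perhaps it's better to use the relative error instead of the absolute error. Rounding 23.25 up to 24 changes it by 3.2%, while rounding 27.25 up to 28 only changes it by 2.8%. Now there's a clear winner.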
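
The percentages quoted above can be checked with a few lines of arithmetic. The helper name `relative_change` is chosen here for illustration.

```python
def relative_change(value, rounded):
    """Size of the rounding step as a fraction of the original value."""
    return abs(rounded - value) / value

# Cost of rounding each value in the edge case up to the next integer.
for value in (24.25, 23.25, 27.25, 25.25):
    up = int(value) + 1
    print(f"{value} -> {up}: {100 * relative_change(value, up):.1f}%")
# 24.25 -> 25: 3.1%
# 23.25 -> 24: 3.2%
# 27.25 -> 28: 2.8%
# 25.25 -> 26: 3.0%
```

The largest value has the smallest relative change, so it is the one to round up.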

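It's possible to tweak this even further. One common technique is to square each error, so that large errors count disproportionately more than small ones. I'd also use a non-linear divisor to get the relative error: it doesn't seem right that an error at 1% is 99 times more important than one at 99%. In the code below I've used the square root.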

The complete algorithm is as follows:

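1. Sum the percentages after rounding them all down, and subtract that sum from 100. This tells you how many of the percentages must be rounded up instead.
2. Generate two error scores for each percentage, one for rounding down and one for rounding up. Take the difference between the two.
3. Sort the error differences produced above.
4. For the number of percentages that need to be rounded up, take an item from the sorted list and increment its rounded-down percentage by 1.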

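You may still have more than one combination with the same error sum, for example 33.3333333, 33.3333333, 33.3333333. This is unavoidable, and the result will be completely arbitrary. The code given below prefers to round up the values on the left.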

Putting it all together in Python looks like this.

from math import isclose, sqrt

def error_gen(actual, rounded):
    divisor = sqrt(1.0 if actual < 1.0 else actual)
    return abs(rounded - actual) ** 2 / divisor

def round_to_100(percents):
    if not isclose(sum(percents), 100):
        raise ValueError
    n = len(percents)
    rounded = [int(x) for x in percents]
    up_count = 100 - sum(rounded)
    errors = [(error_gen(percents[i], rounded[i] + 1) - error_gen(percents[i], rounded[i]), i) for i in range(n)]
    rank = sorted(errors)
    for i in range(up_count):
        rounded[rank[i][1]] += 1
    return rounded

>>> round_to_100([13.626332, 47.989636, 9.596008, 28.788024])
[14, 48, 9, 29]
>>> round_to_100([33.3333333, 33.3333333, 33.3333333])
[34, 33, 33]
>>> round_to_100([24.25, 23.25, 27.25, 25.25])
[24, 23, 28, 25]
>>> round_to_100([1.25, 2.25, 3.25, 4.25, 89.0])
[1, 2, 3, 4, 90]

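As you can see in the last example, this algorithm is still capable of delivering non-intuitive results. Even though 89.0 needs no rounding whatsoever, one of the values in the list had to be rounded up; the lowest relative error results from rounding up the large value rather than one of the much smaller alternatives.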

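This answer originally advocated going through every possible combination of rounding up and rounding down, but as pointed out in the comments, a simpler method works better. The algorithm and code reflect that simplification.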

2016-01-23 05:30:01

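For those who have their percentages in a pandas Series, here is my implementation of the largest remainder method (as in Varun Vohra's answer), where you can even choose the number of decimals to round to.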

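import numpy as np

def largestRemainderMethod(pd_series, decimals=1):
    scale = 10**decimals

    # Floor each percentage at the requested precision.
    # The np.int alias was removed in NumPy 1.24, so the built-in int is used instead.
    floor_series = np.floor(scale * pd_series).astype(int)
    # Number of units (each worth 1/scale) still missing from the total of 100
    diff = 100 * scale - int(floor_series.sum())
    # Remainders lost by flooring, largest first
    series_decimals = pd_series - floor_series / scale
    series_sorted_by_decimals = series_decimals.sort_values(ascending=False)

    # Give one extra unit to the `diff` entries with the largest remainders
    for i in range(0, len(series_sorted_by_decimals)):
        if i < diff:
            series_sorted_by_decimals.iloc[[i]] = 1
        else:
            series_sorted_by_decimals.iloc[[i]] = 0

    # Index alignment puts each extra unit back on its original entry
    out_series = ((floor_series + series_sorted_by_decimals) / scale).sort_values(ascending=False)

    return out_series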

2020-01-14 16:16:56

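Probably the "best" way to do this (quoted because "best" is a subjective term) is to keep a running (non-integral) tally of where you are, and round that value.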

Then use that along with the history to work out what value should be used. For example, using the values you gave:

Value      CumulValue  CumulRounded  PrevBaseline  Need
---------  ----------  ------------  ------------  ----
                                  0
13.626332   13.626332            14             0    14 ( 14 -  0)
47.989636   61.615968            62            14    48 ( 62 - 14)
 9.596008   71.211976            71            62     9 ( 71 - 62)
28.788024  100.000000           100            71    29 (100 - 71)
                                                    ---
                                                    100

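At each stage, you don't round the number itself. Instead, you round the accumulated value and work out the integer that reaches that value from the previous baseline, where the baseline is the rounded cumulative value of the previous row.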

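This works because you're not losing information at each stage but rather using the information more intelligently. The "correct" rounded values are in the final column, and you can see that they sum to 100.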

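You can see the difference between this and blindly rounding each value in the third row above. While 9.596008 would normally round up to 10, the accumulated 71.211976 correctly rounds down to 71, which means that only 9 is needed on top of the previous baseline of 62.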

This also works for "problematic" sequences such as three values of roughly 1/3 each, where one of them should be rounded up:

Value      CumulValue  CumulRounded  PrevBaseline  Need
---------  ----------  ------------  ------------  ----
                                  0
33.333333   33.333333            33             0    33 ( 33 -  0)
33.333333   66.666666            67            33    34 ( 67 - 33)
33.333333   99.999999           100            67    33 (100 - 67)
                                                    ---
                                                    100
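
The two tables above can be reproduced with a short sketch of the running-tally approach. The function name `round_cumulative` is chosen here, and Python's built-in round() is assumed to be an acceptable rounding rule; it resolves exact .5 ties to the nearest even integer, which does not come into play for these inputs.

```python
def round_cumulative(values):
    result = []
    running = 0.0   # exact (non-integral) running total: the CumulValue column
    baseline = 0    # rounded running total of the previous row: PrevBaseline
    for v in values:
        running += v
        cumul_rounded = round(running)           # CumulRounded
        result.append(cumul_rounded - baseline)  # Need
        baseline = cumul_rounded
    return result

print(round_cumulative([13.626332, 47.989636, 9.596008, 28.788024]))  # [14, 48, 9, 29]
print(round_cumulative([33.333333, 33.333333, 33.333333]))            # [33, 34, 33]
```

Because only the running total is ever rounded, the results always sum to the rounded grand total, whatever the order of the inputs. The individual values can change if the inputs are reordered.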

2012-11-20 22:43:54
