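I've been poring through .NET disassemblies and the GCC source code, but I can't seem to find the actual implementation of sin() and other math functions anywhere... they always seem to be referencing something else.

Can anyone help me find them? I feel it's unlikely that all the hardware C runs on supports trig functions natively, so there must be a software algorithm somewhere, right?


I'm aware of several ways that functions can be calculated, and for fun I've written my own routines to compute them using Taylor series. I'm curious how production languages do it, since all of my implementations are always several orders of magnitude slower, even though I think my algorithms are pretty clever (obviously they're not).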


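Current answer

There's nothing like hitting the source and seeing how someone has actually done it in a library in common use; let's look at one C library implementation in particular. I chose uClibc.

Here's the sin function: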

http://git.uclibc.org/uClibc/tree/libm/s_sin.c

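It looks like it handles a few special cases, then carries out some argument reduction to map the input to the range [-pi/4, pi/4] (splitting the argument into two parts, a big part and a tail), before calling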

http://git.uclibc.org/uClibc/tree/libm/k_sin.c

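which then operates on those two parts. If there is no tail, an approximate answer is generated using a polynomial of degree 13. If there is a tail, you get a small corrective addition based on the principle that sin(x+y) = sin(x) + sin'(x')y for some x' between x and x+y.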

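Other answers

With Taylor series, try to find the relation between consecutive terms of the series so you don't calculate the same things again and again.

Here is an example for cosine: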

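#include <math.h>  /* fabs */

/* Sums the cosine series until the latest term is below prec relative to the sum. */
double cosinus(double x, double prec)
{
    double t, s;
    int p;
    p = 0;
    s = 1.0;   /* running sum, starts at the first term */
    t = 1.0;   /* current term */
    while (fabs(t / s) > prec)
    {
        p++;
        /* next term from the previous one: multiply by -x^2 / ((2p-1)(2p)) */
        t = (-t * x * x) / ((2 * p - 1) * (2 * p));
        s += t;
    }
    return s;
}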

Using this, we get each new term of the sum from the one already computed (we avoid recomputing the factorial and x^(2p)).
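Written out, the relation the loop relies on is the ratio of consecutive terms of the cosine series. Here t_p is just a name for the p-th term, matching the variable t in the code:

```latex
\cos x = \sum_{p=0}^{\infty} t_p, \qquad
t_p = \frac{(-1)^p \, x^{2p}}{(2p)!}, \qquad
\frac{t_p}{t_{p-1}} = -\frac{x^2}{(2p-1)(2p)}, \qquad t_0 = 1 .
```

So each step costs one multiplication by x^2 and one division, rather than a fresh power and a fresh factorial.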

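The essence of how it does this lies in this excerpt from Applied Numerical Analysis by Gerald Wheatley:

When your software program asks the computer to get a value of […] or […], have you wondered how it can get the values if the most powerful functions it can compute are polynomials? It doesn't look these up in tables and interpolate! Rather, the computer approximates every function other than polynomials from some polynomial that is tailored to give the values very accurately.

A few points on the above: some algorithms do in fact interpolate from a table, albeit only for the first few iterations. Also note that it says computers use approximating polynomials without specifying which type. As others in the thread have pointed out, Chebyshev polynomials are more efficient than Taylor polynomials in this case.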

Improved version of the code from Blindy's answer

#define PI 3.14159265358979323846
#define EPSILON .0000000000001
// this is smallest effective threshold, at least on my OS (WSL ubuntu 18)
// possibly because factorial part turns 0 at some point
// and it happens faster than series element turns 0;
// validation was made against sin() from <math.h>
double ft_sin(double x)
{
    int k = 2;
    double r;
    double acc = 1;
    double den = 1;
    double num;

//  precision drops rapidly when x is not close to 0
//  so move x to 0 as close as possible;
//  sin has period 2 * PI, so shift by whole periods only
    while (x > PI)
        x -= 2 * PI;
    while (x < -PI)
        x += 2 * PI;
    if (x > PI / 2)
        return (ft_sin(PI - x));
    if (x < -PI / 2)
        return (ft_sin(-PI - x));
//  start the series only after x has been reduced
    r = x;
    num = x;
//  not using fabs for performance reasons
    while (acc > EPSILON || acc < -EPSILON)
    {
        num *= -x * x;
        den *= k * (k + 1);
        acc = num / den;
        r += acc;
        k += 2;
    }
    return (r);
}

OK kiddies, time for the pros.... This is one of my biggest complaints with inexperienced software engineers. They come in calculating transcendental functions from scratch (using Taylor's series) as if nobody had ever done these calculations before in their lives. Not true. This is a well-defined problem that has been approached thousands of times by very clever software and hardware engineers, and it has a well-defined solution. Basically, most of the transcendental functions use Chebyshev polynomials to calculate them. Which polynomials are used depends on the circumstances.

First, the bible on this matter is a book called "Computer Approximations" by Hart and Cheney. In that book, you can decide if you have a hardware adder, multiplier, divider, etc., and decide which operations are fastest. E.g., if you had a really fast divider, the fastest way to calculate sine might be P1(x)/P2(x), where P1 and P2 are Chebyshev polynomials. Without the fast divider, it might be just P(x), where P has many more terms than P1 or P2, so it'd be slower. So, the first step is to determine your hardware and what it can do. Then you choose the appropriate combination of Chebyshev polynomials (it is usually of the form cos(ax) = aP(x) for cosine, for example, again where P is a Chebyshev polynomial).

Then you decide what decimal precision you want. E.g., if you want 7 digits of precision, you look that up in the appropriate table in the book I mentioned, and it will give you (for precision = 7.33) a number N = 4 and a polynomial number 3502. N is the order of the polynomial (so it's p4.x^4 + p3.x^3 + p2.x^2 + p1.x + p0), because N = 4. Then you look up the actual values of p4, p3, p2, p1, p0 in the back of the book under 3502 (they'll be in floating point). Then you implement your algorithm in software in the form: (((p4.x + p3).x + p2).x + p1).x + p0 ....and this is how you'd calculate cosine to 7 decimal places on that hardware.
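The nested form (((p4.x + p3).x + p2).x + p1).x + p0 is Horner's rule. A minimal sketch follows; the function name horner and the coefficient array are hypothetical, and the coefficients are only the degree-4 Taylor polynomial of cos used as placeholders. They are NOT the tabulated values for polynomial 3502, which would come from the book.

```c
#include <stddef.h>

/* Evaluate p[n]*x^n + ... + p[1]*x + p[0] by Horner's rule:
 * n multiplications and n additions, no explicit powers of x. */
static double horner(const double *p, size_t n, double x)
{
    double acc = p[n];
    for (size_t i = n; i-- > 0; )
        acc = acc * x + p[i];
    return acc;
}

/* Placeholder coefficients p0..p4 (assumption for illustration only):
 * 1 - x^2/2 + x^4/24, the degree-4 Taylor polynomial of cos. */
static const double placeholder_cos_coeffs[5] = {
    1.0, 0.0, -0.5, 0.0, 1.0 / 24.0
};
```

With N = 4 this is called as horner(placeholder_cos_coeffs, 4, x). Swapping in properly fitted coefficients changes the accuracy but not the evaluation scheme.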

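Note that most hardware implementations of transcendental operations in an FPU usually involve some microcode and operations like this (depending on the hardware). Chebyshev polynomials are used for most transcendentals, but not all. E.g., square root is faster using two iterations of the Newton-Raphson method, starting from a lookup table. Again, that book "Computer Approximations" will tell you that.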
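As a rough illustration of the square-root recipe (table lookup for the first guess, then two Newton-Raphson iterations), here is a sketch. The function name sqrt_sketch and the 6-entry seed table are assumptions made for this example, not taken from any particular FPU or from the book. With such a coarse table, two iterations give only about four to five significant digits; real implementations use a finer table or more bits per entry.

```c
#include <math.h>

/* Hypothetical sketch: table-seeded square root with two Newton-Raphson steps. */
double sqrt_sketch(double x)
{
    /* sqrt of the midpoint of each 0.25-wide bin covering [0.5, 2.0) */
    static const double seed[6] = {
        0.790569, 0.935414, 1.060660, 1.172604, 1.274755, 1.369306
    };
    int e;
    double m, y;

    if (x < 0.0)
        return NAN;
    if (x == 0.0 || !isfinite(x))
        return x;                       /* 0, +infinity and NaN pass through */

    m = frexp(x, &e);                   /* x = m * 2^e with m in [0.5, 1) */
    if (e % 2 != 0) {                   /* make the exponent even so it halves cleanly */
        m *= 2.0;
        e -= 1;
    }
    y = seed[(int)((m - 0.5) * 4.0)];   /* table lookup for the first guess */
    y = 0.5 * (y + m / y);              /* Newton-Raphson iteration 1 */
    y = 0.5 * (y + m / y);              /* Newton-Raphson iteration 2 */
    return ldexp(y, e / 2);             /* sqrt(m) * 2^(e/2) */
}
```

Each Newton-Raphson step roughly squares the relative error, which is why a decent starting guess matters more than extra iterations.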

If you plan on implementing these functions, I'd recommend to anyone that they get a copy of that book. It really is the bible for these kinds of algorithms. Note that there are bunches of alternative means for calculating these values, like CORDIC, etc., but these tend to be best for specific algorithms where you only need low precision. To guarantee the precision every time, the Chebyshev polynomials are the way to go. Like I said, well-defined problem. Has been solved for 50 years now.....and that's how it's done.

Now, that being said, there are techniques whereby the Chebyshev polynomials can be used to get a single precision result with a low degree polynomial (like the example for cosine above). Then, there are other techniques to interpolate between values to increase the accuracy without having to go to a much larger polynomial, such as "Gal's Accurate Tables Method". This latter technique is what the post referring to the ACM literature is referring to. But ultimately, the Chebyshev Polynomials are what are used to get 90% of the way there.

Enjoy.

if you want sin

 __asm__ __volatile__("fsin" : "=t"(vsin) : "0"(xrads));

if you want cos

 __asm__ __volatile__("fcos" : "=t"(vcos) : "0"(xrads));

if you want sqrt

 __asm__ __volatile__("fsqrt" : "=t"(vsqrt) : "0"(value));

so why use inaccurate code when the machine instructions will do?