我一直在研究。net反汇编和GCC源代码,但似乎在任何地方都找不到sin()和其他数学函数的实际实现…他们似乎总是在引用别的东西。

有人能帮我找到他们吗?我觉得不太可能运行C语言的所有硬件都支持三角函数,所以一定有软件算法,对吧?


我知道有几种方法可以计算函数,并且为了好玩,我写了自己的例程来使用泰勒级数计算函数。我很好奇生产语言是如何做到这一点的,因为我的所有实现总是慢几个数量级,尽管我认为我的算法非常聪明(显然不是)。


当前回答

我将尝试在一个C程序中回答sin()的情况,该程序用GCC的C编译器在当前的x86处理器(假设是Intel Core 2 Duo)上编译。

在C语言中,标准C库包含了一些常见的数学函数,而这些函数并不包含在语言本身中(例如pow, sin和cos分别表示幂,sin和cos)。它们的头文件包含在math.h中。

现在在GNU/Linux系统上,这些库函数是由glibc (GNU libc或GNU C库)提供的。但是GCC编译器希望您使用-lm编译器标志链接到数学库(libm.so),以启用这些数学函数的使用。我不确定为什么它不是标准C库的一部分。这些将是浮点函数的软件版本,或“软浮动”。

题外话:将数学函数分开的原因由来已久,据我所知,可能是在共享库可用之前,它仅仅是为了在非常古老的Unix系统中减少可执行程序的大小。

Now the compiler may optimize the standard C library function sin() (provided by libm.so) to be replaced with an call to a native instruction to your CPU/FPU's built-in sin() function, which exists as an FPU instruction (FSIN for x86/x87) on newer processors like the Core 2 series (this is correct pretty much as far back as the i486DX). This would depend on optimization flags passed to the gcc compiler. If the compiler was told to write code that would execute on any i386 or newer processor, it would not make such an optimization. The -mcpu=486 flag would inform the compiler that it was safe to make such an optimization.

现在,如果程序执行sin()函数的软件版本,它将基于CORDIC(坐标旋转数字计算机)或BKM算法,或者更可能是现在通常用于计算此类超越函数的表格或幂级数计算。(Src: http://en.wikipedia.org/wiki/Cordic应用程序)

任何最新的gcc版本(大约2.9倍以来)也提供了内置的sin版本__builtin_sin(),作为优化,它将用于取代对C库版本的标准调用。

我相信这是非常清楚的,但希望给你更多的信息比你期望的,和许多出发点,以了解更多自己。

其他回答

如果你想要一个软件实现,而不是硬件实现,可以在《数值公式》的第5章中找到这个问题的明确答案。我的副本在一个盒子里,所以我不能给出细节,但简短的版本(如果我没记错的话)是你把tan(theta/2)作为你的基本操作,然后从那里计算其他的。计算是用级数近似完成的,但它比泰勒级数收敛得快得多。

抱歉,我没拿到书就想不起来了。

计算正弦/余弦/正切其实很容易通过代码使用泰勒级数来实现。自己写一个只需5秒钟。

整个过程可以用这个方程来概括:

下面是我为C语言写的一些例程:

double _pow(double a, double b) {
    double c = 1;
    for (int i=0; i<b; i++)
        c *= a;
    return c;
}

double _fact(double x) {
    double ret = 1;
    for (int i=1; i<=x; i++) 
        ret *= i;
    return ret;
}

double _sin(double x) {
    double y = x;
    double s = -1;
    for (int i=3; i<=100; i+=2) {
        y+=s*(_pow(x,i)/_fact(i));
        s *= -1;
    }  
    return y;
}
double _cos(double x) {
    double y = 1;
    double s = -1;
    for (int i=2; i<=100; i+=2) {
        y+=s*(_pow(x,i)/_fact(i));
        s *= -1;
    }  
    return y;
}
double _tan(double x) {
     return (_sin(x)/_cos(x));  
}

对于罪恶,用泰勒展开可以得到

Sin (x) = x - x^3/3!+ x ^ 5/5 !- x ^ 7/7 !+……(1)

您将继续添加项,直到它们之间的差异低于可接受的容忍水平,或者只是有限的步数(更快,但不太精确)。举个例子:

float sin(float x)
{
  float res=0, pow=x, fact=1;
  for(int i=0; i<5; ++i)
  {
    res+=pow/fact;
    pow*=-1*x*x;
    fact*=(2*(i+1))*(2*(i+1)+1);
  }

  return res;
}

注:(1)适用于小角度的近似值sin(x)=x。对于更大的角度,你需要计算越来越多的项才能得到可接受的结果。 你可以使用while参数并继续,以达到一定的准确性:

double sin (double x){
    int i = 1;
    double cur = x;
    double acc = 1;
    double fact= 1;
    double pow = x;
    while (fabs(acc) > .00000001 &&   i < 100){
        fact *= ((2*i)*(2*i+1));
        pow *= -1 * x*x; 
        acc =  pow / fact;
        cur += acc;
        i++;
    }
    return cur;

}

这是一个复杂的问题。x86家族的类似intel的CPU有一个sin()函数的硬件实现,但它是x87 FPU的一部分,不再用于64位模式(使用SSE2寄存器代替)。在这种模式下,使用软件实现。

有几个这样的实现。一个在fdlibm中,在Java中使用。据我所知,glibc实现包含fdlibm的部分,以及IBM贡献的其他部分。

先验函数的软件实现,如sin(),通常使用多项式逼近,通常从泰勒级数获得。

OK kiddies, time for the pros.... This is one of my biggest complaints with inexperienced software engineers. They come in calculating transcendental functions from scratch (using Taylor's series) as if nobody had ever done these calculations before in their lives. Not true. This is a well defined problem and has been approached thousands of times by very clever software and hardware engineers and has a well defined solution. Basically, most of the transcendental functions use Chebyshev Polynomials to calculate them. As to which polynomials are used depends on the circumstances. First, the bible on this matter is a book called "Computer Approximations" by Hart and Cheney. In that book, you can decide if you have a hardware adder, multiplier, divider, etc, and decide which operations are fastest. e.g. If you had a really fast divider, the fastest way to calculate sine might be P1(x)/P2(x) where P1, P2 are Chebyshev polynomials. Without the fast divider, it might be just P(x), where P has much more terms than P1 or P2....so it'd be slower. So, first step is to determine your hardware and what it can do. Then you choose the appropriate combination of Chebyshev polynomials (is usually of the form cos(ax) = aP(x) for cosine for example, again where P is a Chebyshev polynomial). Then you decide what decimal precision you want. e.g. if you want 7 digits precision, you look that up in the appropriate table in the book I mentioned, and it will give you (for precision = 7.33) a number N = 4 and a polynomial number 3502. N is the order of the polynomial (so it's p4.x^4 + p3.x^3 + p2.x^2 + p1.x + p0), because N=4. Then you look up the actual value of the p4,p3,p2,p1,p0 values in the back of the book under 3502 (they'll be in floating point). Then you implement your algorithm in software in the form: (((p4.x + p3).x + p2).x + p1).x + p0 ....and this is how you'd calculate cosine to 7 decimal places on that hardware.

请注意,在FPU中大多数硬件实现的超越操作通常涉及一些微码和类似的操作(取决于硬件)。 切比雪夫多项式用于大多数先验多项式,但不是全部。例:使用Newton raphson方法的两次迭代,首先使用查询表,使用平方根更快。 同样,《计算机逼近》这本书会告诉你。

If you plan on implmementing these functions, I'd recommend to anyone that they get a copy of that book. It really is the bible for these kinds of algorithms. Note that there are bunches of alternative means for calculating these values like cordics, etc, but these tend to be best for specific algorithms where you only need low precision. To guarantee the precision every time, the chebyshev polynomials are the way to go. Like I said, well defined problem. Has been solved for 50 years now.....and thats how it's done.

Now, that being said, there are techniques whereby the Chebyshev polynomials can be used to get a single precision result with a low degree polynomial (like the example for cosine above). Then, there are other techniques to interpolate between values to increase the accuracy without having to go to a much larger polynomial, such as "Gal's Accurate Tables Method". This latter technique is what the post referring to the ACM literature is referring to. But ultimately, the Chebyshev Polynomials are what are used to get 90% of the way there.

享受。