我知道递归有时比循环要简洁得多,我不是在问什么时候应该用递归而不是迭代,我知道已经有很多关于这个的问题了。

我问的是,递归比循环快吗?在我看来,你总是能够细化一个循环,让它比递归函数执行得更快,因为循环是不存在的,不断地建立新的堆栈帧。

我特别在寻找在递归是正确处理数据的方法的应用程序中递归是否更快,例如在一些排序函数中,在二叉树中等等。


当前回答

这里有一个例子,在Java中递归比循环运行得快。这是一个对两个数组执行冒泡排序的程序。recBubbleSort(....)方法使用递归对数组arr排序,bbSort(....)方法仅使用循环对数组narr排序。两个数组中的数据是相同的。

public class BBSort_App {
    public static void main(String args[]) {
        int[] arr = {231,414235,23,543,245,6,324,-32552,-4};

        long time = System.nanoTime();
        recBubbleSort(arr, arr.length-1, 0);
        time = System.nanoTime() - time;

        System.out.println("Time Elapsed: "+time+"nanos");
        disp(arr);

        int[] narr = {231,414235,23,543,245,6,324,-32552,-4};
    
        time = System.nanoTime();
        bbSort(narr);
        time = System.nanoTime()-time;
        System.out.println("Time Elapsed: "+time+"nanos");
        disp(narr);
    }

    static void disp(int[] origin) {
        System.out.print("[");
        for(int b: origin)
            System.out.print(b+", ");
        System.out.println("\b\b \b]");
    }
    static void recBubbleSort(int[] origin, int i, int j) {
        if(i>0)
            if(j!=i) {
                if(origin[i]<origin[j]) {
                    int temp = origin[i];
                    origin[i] = origin[j];
                    origin[j] = temp;
                }
                recBubbleSort(origin, i, j+1);
            }
            else
                recBubbleSort(origin, i-1, 0);
    }

    static void bbSort(int[] origin) {
        for(int out=origin.length-1;out>0;out--)
            for(int in=0;in<out;in++)
                if(origin[out]<origin[in]) {
                    int temp = origin[out];
                    origin[out] = origin[in];
                    origin[in] = temp;
                }
    }
}

即使运行该测试50次,也会得到几乎相同的结果:

对这个问题的回答是令人满意的,但没有简单的例子。有人能给出为什么这个递归更快的原因吗?

其他回答

从理论上讲,两者是一样的。 具有相同O()复杂度的递归和循环将以相同的理论速度工作,但当然,实际速度取决于语言、编译器和处理器。 数幂例可用O(ln(n))迭代编码:

  int power(int t, int k) {
  int res = 1;
  while (k) {
    if (k & 1) res *= t;
    t *= t;
    k >>= 1;
  }
  return res;
  }

这里有一个例子,在Java中递归比循环运行得快。这是一个对两个数组执行冒泡排序的程序。recBubbleSort(....)方法使用递归对数组arr排序,bbSort(....)方法仅使用循环对数组narr排序。两个数组中的数据是相同的。

public class BBSort_App {
    public static void main(String args[]) {
        int[] arr = {231,414235,23,543,245,6,324,-32552,-4};

        long time = System.nanoTime();
        recBubbleSort(arr, arr.length-1, 0);
        time = System.nanoTime() - time;

        System.out.println("Time Elapsed: "+time+"nanos");
        disp(arr);

        int[] narr = {231,414235,23,543,245,6,324,-32552,-4};
    
        time = System.nanoTime();
        bbSort(narr);
        time = System.nanoTime()-time;
        System.out.println("Time Elapsed: "+time+"nanos");
        disp(narr);
    }

    static void disp(int[] origin) {
        System.out.print("[");
        for(int b: origin)
            System.out.print(b+", ");
        System.out.println("\b\b \b]");
    }
    static void recBubbleSort(int[] origin, int i, int j) {
        if(i>0)
            if(j!=i) {
                if(origin[i]<origin[j]) {
                    int temp = origin[i];
                    origin[i] = origin[j];
                    origin[j] = temp;
                }
                recBubbleSort(origin, i, j+1);
            }
            else
                recBubbleSort(origin, i-1, 0);
    }

    static void bbSort(int[] origin) {
        for(int out=origin.length-1;out>0;out--)
            for(int in=0;in<out;in++)
                if(origin[out]<origin[in]) {
                    int temp = origin[out];
                    origin[out] = origin[in];
                    origin[in] = temp;
                }
    }
}

即使运行该测试50次,也会得到几乎相同的结果:

对这个问题的回答是令人满意的,但没有简单的例子。有人能给出为什么这个递归更快的原因吗?

递归比循环快吗?

不,迭代总是比递归快。(冯·诺依曼架构)

解释:

如果你从头开始构建一个通用计算机的最小操作,“迭代”首先作为一个构建块,比“递归”更少的资源密集型,因此更快。

从头开始建造一台伪计算机:

问问你自己:你需要什么来计算一个值,即遵循一个算法并得到一个结果?

我们将建立一个概念层次结构,从头开始,首先定义基本的核心概念,然后用这些概念构建第二级概念,以此类推。

First Concept: Memory cells, storage, State. To do something you need places to store final and intermediate result values. Let’s assume we have an infinite array of "integer" cells, called Memory, M[0..Infinite]. Instructions: do something - transform a cell, change its value. alter state. Every interesting instruction performs a transformation. Basic instructions are: a) Set & move memory cells store a value into memory, e.g.: store 5 m[4] copy a value to another position: e.g.: store m[4] m[8] b) Logic and arithmetic and, or, xor, not add, sub, mul, div. e.g. add m[7] m[8] An Executing Agent: a core in a modern CPU. An "agent" is something that can execute instructions. An Agent can also be a person following the algorithm on paper. Order of steps: a sequence of instructions: i.e.: do this first, do this after, etc. An imperative sequence of instructions. Even one line expressions are "an imperative sequence of instructions". If you have an expression with a specific "order of evaluation" then you have steps. It means than even a single composed expression has implicit “steps” and also has an implicit local variable (let’s call it “result”). e.g.: 4 + 3 * 2 - 5 (- (+ (* 3 2) 4 ) 5) (sub (add (mul 3 2) 4 ) 5) The expression above implies 3 steps with an implicit "result" variable. // pseudocode 1. result = (mul 3 2) 2. result = (add 4 result) 3. result = (sub result 5) So even infix expressions, since you have a specific order of evaluation, are an imperative sequence of instructions. The expression implies a sequence of operations to be made in a specific order, and because there are steps, there is also an implicit "result" intermediate variable. Instruction Pointer: If you have a sequence of steps, you have also an implicit "instruction pointer". The instruction pointer marks the next instruction, and advances after the instruction is read but before the instruction is executed. In this pseudo-computing-machine, the Instruction Pointer is part of Memory. (Note: Normally the Instruction Pointer will be a “special register” in a CPU core, but here we will simplify the concepts and assume all data (registers included) are part of “Memory”) Jump - Once you have an ordered number of steps and an Instruction Pointer, you can apply the "store" instruction to alter the value of the Instruction Pointer itself. We will call this specific use of the store instruction with a new name: Jump. We use a new name because is easier to think about it as a new concept. By altering the instruction pointer we're instructing the agent to “go to step x“. Infinite Iteration: By jumping back, now you can make the agent "repeat" a certain number of steps. At this point we have infinite Iteration. 1. mov 1000 m[30] 2. sub m[30] 1 3. jmp-to 2 // infinite loop Conditional - Conditional execution of instructions. With the "conditional" clause, you can conditionally execute one of several instructions based on the current state (which can be set with a previous instruction). Proper Iteration: Now with the conditional clause, we can escape the infinite loop of the jump back instruction. We have now a conditional loop and then proper Iteration 1. mov 1000 m[30] 2. sub m[30] 1 3. (if not-zero) jump 2 // jump only if the previous // sub instruction did not result in 0 // this loop will be repeated 1000 times // here we have proper ***iteration***, a conditional loop. Naming: giving names to a specific memory location holding data or holding a step. This is just a "convenience" to have. We do not add any new instructions by having the capacity to define “names” for memory locations. “Naming” is not a instruction for the agent, it’s just a convenience to us. Naming makes code (at this point) easier to read and easier to change. #define counter m[30] // name a memory location mov 1000 counter loop: // name a instruction pointer location sub counter 1 (if not-zero) jmp-to loop One-level subroutine: Suppose there’s a series of steps you need to execute frequently. You can store the steps in a named position in memory and then jump to that position when you need to execute them (call). At the end of the sequence you'll need to return to the point of calling to continue execution. With this mechanism, you’re creating new instructions (subroutines) by composing core instructions. Implementation: (no new concepts required) Store the current Instruction Pointer in a predefined memory position jump to the subroutine at the end of the subroutine, you retrieve the Instruction Pointer from the predefined memory location, effectively jumping back to the following instruction of the original call Problem with the one-level implementation: You cannot call another subroutine from a subroutine. If you do, you'll overwrite the returning address (global variable), so you cannot nest calls. To have a better Implementation for subroutines: You need a STACK Stack: You define a memory space to work as a "stack", you can “push” values on the stack, and also “pop” the last “pushed” value. To implement a stack you'll need a Stack Pointer (similar to the Instruction Pointer) which points to the actual “head” of the stack. When you “push” a value, the stack pointer decrements and you store the value. When you “pop”, you get the value at the actual Stack Pointer and then the Stack Pointer is incremented. Subroutines Now that we have a stack we can implement proper subroutines allowing nested calls. The implementation is similar, but instead of storing the Instruction Pointer in a predefined memory position, we "push" the value of the IP in the stack. At the end of the subroutine, we just “pop” the value from the stack, effectively jumping back to the instruction after the original call. This implementation, having a “stack” allows calling a subroutine from another subroutine. With this implementation we can create several levels of abstraction when defining new instructions as subroutines, by using core instructions or other subroutines as building blocks. Recursion: What happens when a subroutine calls itself?. This is called "recursion". Problem: Overwriting the local intermediate results a subroutine can be storing in memory. Since you are calling/reusing the same steps, if the intermediate result are stored in predefined memory locations (global variables) they will be overwritten on the nested calls. Solution: To allow recursion, subroutines should store local intermediate results in the stack, therefore, on each recursive call (direct or indirect) the intermediate results are stored in different memory locations.

...

到达递归之后,我们就停在这里。

结论:

在冯·诺依曼架构中,“迭代”显然是一个比“递归”更简单/基本的概念。我们有一种形式的“迭代”在第7级,而“递归”在概念层次的第14级。

在机器代码中迭代总是更快,因为它意味着更少的指令,因此更少的CPU周期。

哪个“更好”?

在处理简单的、顺序的数据结构时,应该使用“迭代”,而在任何地方都可以使用“简单循环”。 当你需要处理递归数据结构(我喜欢称它们为“分形数据结构”)时,或者递归解决方案明显更“优雅”时,你应该使用“递归”。

建议:使用最好的工具,但要了解每个工具的内部工作原理,以便做出明智的选择。

最后,请注意,您有很多机会使用递归。你到处都有递归数据结构,你现在看到的是:DOM中支持你读取内容的部分是RDS, JSON表达式是RDS,计算机中的分层文件系统也是RDS,即:你有一个根目录,包含文件和目录,每个目录包含文件和目录,每个目录包含文件和目录……

考虑每个迭代和递归都必须做什么。

迭代:跳转到循环的开始 递归:跳转到被调用函数的开头

你看,这里没有多少分歧的余地。

(我假设递归是尾部调用,编译器知道这种优化)。

这取决于使用的语言。你写了'language-agnostic',所以我会举一些例子。

在Java、C和Python中,与迭代相比,递归是相当昂贵的(通常),因为它需要分配一个新的堆栈框架。在一些C编译器中,可以使用编译器标志来消除这种开销,它将某些类型的递归(实际上是某些类型的尾部调用)转换为跳转而不是函数调用。

In functional programming language implementations, sometimes, iteration can be very expensive and recursion can be very cheap. In many, recursion is transformed into a simple jump, but changing the loop variable (which is mutable) sometimes requires some relatively heavy operations, especially on implementations which support multiple threads of execution. Mutation is expensive in some of these environments because of the interaction between the mutator and the garbage collector, if both might be running at the same time.

我知道在某些Scheme实现中,递归通常比循环快。

简而言之,答案取决于代码和实现。你喜欢什么风格就用什么风格。如果使用函数式语言,递归可能更快。如果使用命令式语言,迭代可能更快。在某些环境中,这两种方法将生成相同的程序集(将其放入管道中并吸食)。

Addendum: In some environments, the best alternative is neither recursion nor iteration but instead higher order functions. These include "map", "filter", and "reduce" (which is also called "fold"). Not only are these the preferred style, not only are they often cleaner, but in some environments these functions are the first (or only) to get a boost from automatic parallelization — so they can be significantly faster than either iteration or recursion. Data Parallel Haskell is an example of such an environment.

列表推导式是另一种选择,但它们通常只是用于迭代、递归或更高阶函数的语法糖。