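Since your second question is more concrete, I'm going to address it first and then take up your first question with the context given by the second. I wanted to give a more evidence-based answer than what's already here.
Question #2: Do most compilers realize that the variable has already
been declared and just skip that portion, or does it actually create a
spot for it in memory each time?
You can answer this question for yourself by stopping your compiler before it runs the assembler and looking at the asm. (Use the -S flag if your compiler has a gcc-style interface, and -masm=intel if you want the syntax style I'm using here.)
In any case, modern compilers for x86-64 (gcc 10.2, clang 11.0) re-initialize the variable on each loop pass only if you disable optimizations. Consider the following C++ program. To keep the mapping to asm intuitive, I'm keeping things mostly C-style and using an integer instead of a string, although the same principles apply in the string case: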
#include <cstddef>   // std::size_t
#include <iostream>
static constexpr std::size_t LEN = 10;
void fill_arr(int a[LEN])
{
    /* *** */
    for (std::size_t i = 0; i < LEN; ++i) {
        const int t = 8;
        a[i] = t;
    }
    /* *** */
}
int main(void)
{
    int a[LEN];
    fill_arr(a);
    for (std::size_t i = 0; i < LEN; ++i) {
        std::cout << a[i] << " ";
    }
    std::cout << "\n";
    return 0;
}
We can compare this to a version with the following difference:
    /* *** */
    const int t = 8;
    for (std::size_t i = 0; i < LEN; ++i) {
        a[i] = t;
    }
    /* *** */
With optimization disabled, gcc 10.2 puts 8 on the stack on every pass of the loop for the declaration-in-loop version:
    mov QWORD PTR -8[rbp], 0
.L3:
    cmp QWORD PTR -8[rbp], 9
    ja .L4
    mov DWORD PTR -12[rbp], 8 ;✷
Whereas for the out-of-loop version, it does so only once:
    mov DWORD PTR -12[rbp], 8 ;✷
    mov QWORD PTR -8[rbp], 0
.L3:
    cmp QWORD PTR -8[rbp], 9
    ja .L4
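Does this make a performance impact? I didn't see an appreciable difference in runtime between them on my CPU (Intel i7-7700K) until I pushed the number of iterations into the billions, and even then the average difference was less than 0.01s. It's only a single extra operation in the loop, after all. (For a string, the difference in in-loop operations is obviously a bit greater, but not dramatically so.)
What's more, the question is largely academic, because at an optimization level of -O1 or higher gcc outputs identical asm for both source files, as does clang. So, at least for simple cases like this, it's unlikely to make any performance impact either way. Of course, in a real-world program, you should always profile rather than make assumptions.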
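If you want to repeat this kind of measurement yourself, here is a minimal sketch of a timing harness built on std::chrono. It is not the exact code I used for the numbers above. The names fill_inside, fill_outside, and average_seconds are made up for illustration, and I'm assuming a std::vector<int> instead of the fixed-size array so the iteration count can be varied at run time.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Variant 1: the constant is declared inside the loop body.
void fill_inside(std::vector<int>& a)
{
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int t = 8;
        a[i] = t;
    }
}

// Variant 2: the constant is declared once, before the loop.
void fill_outside(std::vector<int>& a)
{
    const int t = 8;
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = t;
    }
}

// Hypothetical helper: average wall-clock seconds per call of `fill`,
// measured over `reps` back-to-back calls on the same buffer.
template <typename Fill>
double average_seconds(Fill fill, std::vector<int>& a, int reps)
{
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (int r = 0; r < reps; ++r) {
        fill(a);
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / reps;
}
```

Call average_seconds(fill_inside, buf, reps) and average_seconds(fill_outside, buf, reps) on a large buffer and compare the two results. Keep in mind that in an optimized build the compiler is free to simplify or merge the repeated fills, so the numbers are mainly informative for unoptimized builds unless you take extra care to keep the work observable.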
Question #1: Is declaring a variable inside a loop a good practice or
bad practice?
As with practically every question like this, it depends. If the declaration is inside a very tight loop and you're compiling without optimizations, say for debugging purposes, it's theoretically possible that moving it outside the loop would improve performance enough to be handy during your debugging efforts. If so, it might be sensible, at least while you're debugging. And although I don't think it's likely to make any difference in an optimized build, if you do observe one, you/your pair/your team can make a judgement call as to whether it's worth it.
At the same time, you have to consider not only how the compiler reads your code, but also how it comes off to humans, yourself included. I think you'll agree that a variable declared in the smallest scope possible is easier to keep track of. If it's outside the loop, it implies that it's needed outside the loop, which is confusing if that's not actually the case. In a big codebase, little confusions like this add up over time and become fatiguing after hours of work, and can lead to silly bugs. That can be much more costly than what you reap from a slight performance improvement, depending on the use case.
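To make the "silly bugs" point concrete, here is a small sketch of the kind of mistake a wider scope invites. This is a contrived example of my own, not something from the question: labels_narrow and labels_wide_buggy are hypothetical functions that are both meant to turn each name into "item:<name>".

```cpp
#include <string>
#include <vector>

// Narrow scope: `label` is created fresh on every iteration,
// so nothing from an earlier pass can leak into a later one.
std::vector<std::string> labels_narrow(const std::vector<std::string>& names)
{
    std::vector<std::string> out;
    for (const std::string& name : names) {
        std::string label = "item:";
        label += name;
        out.push_back(label);
    }
    return out;
}

// Wide scope: `label` outlives each iteration. Because it is never reset,
// every result also carries the text appended in earlier iterations.
std::vector<std::string> labels_wide_buggy(const std::vector<std::string>& names)
{
    std::vector<std::string> out;
    std::string label = "item:";
    for (const std::string& name : names) {
        label += name;  // bug: state from the previous pass is still here
        out.push_back(label);
    }
    return out;
}
```

For the names "a" and "b", the narrow version yields "item:a" and "item:b", while the wide version yields "item:a" and "item:ab". The in-loop declaration makes that class of bug impossible to write in the first place.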