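Protecting an executable from reverse engineering?

I've been contemplating how to protect my C/C++ code from disassembly and reverse engineering. Normally I would never condone this behavior in my own code; however, the protocol I am currently working on must never be inspected or understood, for the security of various people.

This is a new subject to me, and the internet is not really a good resource for preventing reverse engineering; instead it offers tons of information on how to reverse engineer.

Some of the things I've thought of so far are: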

- Code injection (calling dummy functions before and after actual function calls)
- Code obfuscation (mangles the disassembly of the binary)
- Write my own startup routines (harder for debuggers to bind to)

      void startup();
      int _start() { startup(); exit(0); }
      void startup() { /* code here */ }

- Runtime check for debuggers (and force exit if detected)
- Function trampolines

      void trampoline(void (*fnptr)(), bool ping = false)
      {
          if (ping)
              fnptr();
          else
              trampoline(fnptr, true);
      }

- Pointless allocations and deallocations (stack changes a lot)
- Pointless dummy calls and trampolines (tons of jumping in disassembly output)
- Tons of casting (for obfuscated disassembly)
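For the "runtime check for debuggers" item, one commonly described Linux-only approach is to read the `TracerPid` field of `/proc/self/status`. The following is a minimal sketch under that assumption, not a robust defence: the helper names `parse_tracer_pid_line` and `tracer_pid` are made up for illustration, and an analyst can patch such a check out like any other.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `line` is the "TracerPid:" line of
   /proc/self/status, return the PID it carries; otherwise return -1. */
long parse_tracer_pid_line(const char *line)
{
    static const char key[] = "TracerPid:";

    if (strncmp(line, key, sizeof key - 1) != 0)
        return -1;
    /* strtol skips the tab that follows the colon. */
    return strtol(line + sizeof key - 1, NULL, 10);
}

/* Returns the PID of an attached tracer, 0 if there is none, or -1 if
   it cannot be determined (for example, when not running on Linux). */
long tracer_pid(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long pid = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        pid = parse_tracer_pid_line(line);
        if (pid >= 0)
            break;
    }
    fclose(f);
    return pid;
}
```

A caller could run `if (tracer_pid() > 0) exit(1);` early in startup to force an exit when a debugger is attached.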

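I mean, these are some of the things I've thought of, but they can all be worked around or figured out by code analysts given enough time. Are there any other options?

Current answer

I don't think any code is unhackable, but the reward needs to be great for someone to be willing to attempt it.

Having said that, there are things you should do, such as: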

- Use the highest optimization level possible (reverse engineering is not only about getting the assembly sequence, it is also about understanding the code and porting it into a higher-level language such as C). Highly optimized code can be a pain to follow.
- Make structures dense by not having larger data types than necessary. Rearrange structure members between official code releases. Rearranged bit fields in structures are also something you can use.
- You can check for the presence of certain values which shouldn't be changed (a copyright message is an example). If a byte vector contains "vwxyz" you can have another byte vector containing "abcde" and compare the differences. The function doing it should not be passed pointers to the vectors but use external pointers defined in other modules as (pseudo-C code) "char *p1=&string1[539];" and "char *p2=&string2[-11731];". That way there won't be any pointers pointing exactly at the two strings. In the comparison code you then compare for "*(p1-539+i)-*(p2+11731+i)==some value". The cracker will think it is safe to change string1 because no one appears to reference it. Bury the test in some unexpected place.
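The biased-pointer check from the last item could be sketched roughly as follows. This is an illustration under stated assumptions, not the answerer's exact code: forming `&string2[-11731]` directly is out of bounds in standard C, so the sketch keeps the biased addresses as `uintptr_t` values (which behaves as expected on mainstream platforms but is implementation-defined). The function names are hypothetical, the constant 21 assumes ASCII, and a single file stands in for the separate modules the answer recommends.

```c
#include <stddef.h>
#include <stdint.h>

/* Protected text and its reference vector. In ASCII every byte of
   string1 is exactly 21 greater than the matching byte of string2. */
char string1[] = "vwxyz";
const char string2[] = "abcde";

/* Biased addresses: neither value points at the start of a string.
   In the answer's scheme these would be defined in another module. */
static uintptr_t p1;
static uintptr_t p2;

void init_biased_pointers(void)
{
    p1 = (uintptr_t)string1 + 539u;
    p2 = (uintptr_t)string2 - 11731u;
}

/* Returns 1 if string1 still matches the reference, 0 otherwise. */
int strings_intact(void)
{
    size_t i;

    for (i = 0; i < sizeof string1 - 1; i++) {
        /* Undo the bias only at the moment of use. */
        const char *a = (const char *)(p1 - 539u + i);
        const char *b = (const char *)(p2 + 11731u + i);

        if (*a - *b != 21)
            return 0;
    }
    return 1;
}
```

After `init_biased_pointers()` has run, `strings_intact()` returns 1 until a byte of `string1` is altered. An optimizing compiler may fold the offsets back into plain addresses when everything lives in one file, which is one reason to split the pieces across modules.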

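Try to hack the assembly code yourself to see what is easy and what is difficult to do. Ideas should come up that you can experiment with to make the code harder to reverse engineer and harder to debug.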

2011-06-27 11:11:54

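Other answers

The best anti-disassembler tricks, particularly on variable-word-length instruction sets, are done in assembler/machine code, not in C. For example: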

CLC
BCC over
.byte 0x09
over:

The disassembler has to resolve the problem that a branch destination is the second byte of a multi-byte instruction. An instruction set simulator will have no problem, though. Branching to computed addresses, which you can cause from C, also makes disassembly difficult to impossible; again, an instruction set simulator will have no problem with it, and using a simulator to sort out branch destinations for you can aid the disassembly process. Compiled code is relatively clean and easy for a disassembler, so I think some assembly is required.
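As a rough illustration of "branching to computed addresses" from C, the sketch below calls through a function-pointer table whose index is only known at run time, so a static disassembler sees an indirect call rather than a fixed target. All names and the mask value are hypothetical, and a simulator or debugger still resolves the target trivially, as the answer points out.

```c
#include <stddef.h>

typedef int (*op_fn)(int);

int op_add_one(int x) { return x + 1; }
int op_double(int x)  { return x * 2; }
int op_negate(int x)  { return -x; }

/* Table of possible branch targets. */
static op_fn const table[] = { op_add_one, op_double, op_negate };

/* volatile discourages the compiler from folding the index into a
   direct call at compile time. */
static volatile unsigned mask = 0x5Au;

/* Decodes the index at run time and branches through the table. */
int dispatch(unsigned encoded, int arg)
{
    unsigned idx = (encoded ^ mask) % (unsigned)(sizeof table / sizeof table[0]);

    return table[idx](arg);
}
```

For example, `dispatch(0x5Au ^ 1u, 21)` decodes to index 1 and calls `op_double(21)`.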

I think it was near the beginning of Michael Abrash's Zen of Assembly Language where he showed a simple anti-disassembler and anti-debugger trick. The 8088/86 had a prefetch queue; what you did was have an instruction that modified the next instruction or one a couple ahead. If you were single stepping, you executed the modified instruction; if your instruction set simulator did not simulate the hardware completely, you also executed the modified instruction. On real hardware running normally, the real instruction would already be in the queue, and the modified memory location wouldn't cause any damage so long as you didn't execute that string of instructions again. You could probably still use a trick like this today, as pipelined processors fetch the next instruction. Or, if you know that the hardware has separate instruction and data caches, you can modify a number of bytes ahead: if you align this code in the cache line properly, the modified byte will not be written through the instruction cache but the data cache, and an instruction set simulator that did not have proper cache simulators would fail to execute properly. I think software-only solutions are not going to get you very far.

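The above are old and well known; I don't know enough about current tools to say whether they already work around such things. Self-modifying code can and will trip up the debugger, but a human can and will narrow in on the problem, see the self-modifying code, and work around it.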

It used to be that hackers would take about 18 months to work something out, DVDs for example. Now they average around 2 days to 2 weeks if motivated (Blu-ray, iPhones, etc.). To me that means that if I spend more than a few days on security, I am likely wasting my time. The only real security you will get is through hardware (for example, your instructions are encrypted and only the processor core, well inside the chip, decrypts them just before execution, in a way that cannot expose the decrypted instructions). That might buy you months instead of days.

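Also, read Kevin Mitnick's book The Art of Deception. A person like that could pick up a phone and get you or a coworker to hand over the secrets to the system, thinking the caller is a manager, another coworker, or a hardware engineer in another part of the company. And your security is blown. Security is not only about managing the technology; you have to manage the humans too.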

2011-06-26 05:25:20

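If someone wants to spend the time to reverse your binary, there is absolutely nothing you can do to stop them. You can make it moderately more difficult, but that's about it. If you really want to learn about this, get a copy of http://www.hex-rays.com/idapro/ and disassemble a few binaries.

The fact that the CPU needs to execute the code is your undoing. The CPU only executes machine code, and programmers can read machine code.

That being said, you likely have a different issue that can be solved another way. What are you trying to protect? Depending on your problem, you can likely use encryption to protect your product.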

2011-06-26 03:56:49

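To inform yourself, read the academic literature on code obfuscation. Christian Collberg of the University of Arizona is a reputable scholar in this field; Salil Vadhan of Harvard University has also done some good work.

I'm behind on this literature, but the essential idea I'm aware of is that you can't prevent an attacker from seeing the code that you will execute, but you can surround it with code that is not executed, and it costs an attacker exponential time (using the best known techniques) to discover which fragments of your code are executed and which are not.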
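A toy version of "surrounding real code with code that is never executed" is an opaque predicate: a condition whose outcome the author knows but which is not obvious from the code. The sketch below is a deliberately simple illustration with hypothetical function names, not one of the constructions from the literature mentioned above; an optimizing compiler or a capable analyst can see through a predicate this simple.

```c
/* The computation we actually care about. */
unsigned real_work(unsigned x)
{
    return x * 3u + 7u;
}

/* Decoy path: present in the binary but never taken. */
unsigned decoy_work(unsigned x)
{
    return (x << 3) ^ 0xDEADBEEFu;
}

/* seed * (seed + 1) is a product of two consecutive integers, so it is
   always even. Unsigned wraparound is modulo a power of two and so
   preserves parity, which means the first branch is always taken. */
unsigned guarded(unsigned x, unsigned seed)
{
    if ((seed * (seed + 1u)) % 2u == 0u)
        return real_work(x);
    return decoy_work(x);
}
```

Whatever `seed` is passed, `guarded(x, seed)` returns `real_work(x)`, while `decoy_work` stays in the program as dead weight for the analyst to rule out.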

2011-06-27 03:55:12

Take, for example, the AES algorithm. It's a very, very public algorithm, and it is VERY secure. Why? Two reasons: It's been reviewed by lots of smart people, and the "secret" part is not the algorithm itself - the secret part is the key which is one of the inputs to the algorithm. It's a much better approach to design your protocol with a generated "secret" that is outside your code, rather than to make the code itself secret. The code can always be interpreted no matter what you do, and (ideally) the generated secret can only be jeopardized by a massive brute force approach or through theft.

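I think an interesting question is "Why do you want to obfuscate your code?" Do you want to make it hard for attackers to crack your algorithms? To make it harder for them to find exploitable bugs in your code? You wouldn't need to obfuscate the code if it were uncrackable in the first place. The root of the problem is crackable software. Fix the root of the problem; don't just obfuscate it.

Also, the more confusing you make your code, the harder it will be for YOU to find security bugs. Yes, it will be hard for hackers, but you need to find bugs too. Code should be easy to maintain years from now, and even well-written, clear code can be difficult to maintain. Don't make it worse.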

2011-06-27 14:00:39

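If you are really serious about protecting your application, there is a recent paper called "Program obfuscation and one-time programs". The paper gets around the theoretical impossibility results by using simple, universal hardware.

If you can't afford to require extra hardware, there is another paper, "On best-possible obfuscation", which gives the theoretically best-possible obfuscation among all programs with the same functionality and the same size. However, the paper shows that information-theoretic best-possible obfuscation implies a collapse of the polynomial hierarchy.

If these results do not meet your needs, the papers should at least give you enough bibliographic leads to explore the related literature.

Update: a new notion of obfuscation, called indistinguishability obfuscation, can mitigate the impossibility result (paper)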

2011-06-26 20:25:51
