如果您无法实现自己的密码学,请考虑长期和困难
问题的丑陋真相是,如果你问这个问题,你可能无法设计和实现一个安全的系统。
Let me illustrate my point: Imagine you are building a web application and you need to store some session data. You could assign each user a session ID and store the session data on the server in a hash map mapping session ID to session data. But then you have to deal with this pesky state on the server and if at some point you need more than one server things will get messy. So instead you have the idea to store the session data in a cookie on the client side. You will encrypt it of course so the user cannot read and manipulate the data. So what mode should you use? Coming here you read the top answer (sorry for singling you out myforwik). The first one covered - ECB - is not for you, you want to encrypt more than one block, the next one - CBC - sounds good and you don't need the parallelism of CTR, you don't need random access, so no XTS and patents are a PITA, so no OCB. Using your crypto library you realize that you need some padding because you can only encrypt multiples of the block size. You choose PKCS7 because it was defined in some serious cryptography standards. After reading somewhere that CBC is provably secure if used with a random IV and a secure block cipher, you rest at ease even though you are storing your sensitive data on the client side.
多年以后,在您的服务确实发展到相当大的规模之后,一位IT安全专家以负责任的方式与您联系。她告诉您,她可以使用填充甲骨文攻击解密您的所有cookie,因为如果填充以某种方式被破坏,您的代码将生成一个错误页面。
这不是一个假想的场景:微软在ASP中就有这样的缺陷。NET直到几年前。
问题是关于密码学有很多陷阱,构建一个对外行来说看起来很安全,但对有知识的攻击者来说是微不足道的系统是非常容易的。
如果需要加密数据,该怎么做
For live connections use TLS (be sure to check the hostname of the certificate and the issuer chain). If you can't use TLS, look for the highest level API your system has to offer for your task and be sure you understand the guarantees it offers and more important what it does not guarantee. For the example above a framework like Play offers client side storage facilities, it does not invalidate the stored data after some time, though, and if you changed the client side state, an attacker can restore a previous state without you noticing.
If there is no high level abstraction available use a high level crypto library. A prominent example is NaCl and a portable implementation with many language bindings is Sodium. Using such a library you do not have to care about encryption modes etc. but you have to be even more careful about the usage details than with a higher level abstraction, like never using a nonce twice. For custom protocol building (say you want something like TLS, but not over TCP or UDP) there are frameworks like Noise and associated implementations that do most of the heavy lifting for you, but their flexibility also means there is a lot of room for error, if you don't understand in depth what all the components do.
如果由于某种原因,你不能使用高级加密库,例如,因为你需要以特定的方式与现有系统交互,那么就没有办法彻底自学。我推荐阅读Ferguson、Kohno和Schneier的《密码学工程》。请不要欺骗自己,相信没有必要的背景就可以构建一个安全的系统。密码学是非常微妙的,几乎不可能测试系统的安全性。
模式比较
只加密:
Modes that require padding:
Like in the example, padding can generally be dangerous because it opens up the possibility of padding oracle attacks. The easiest defense is to authenticate every message before decryption. See below.
ECB encrypts each block of data independently and the same plaintext block will result in the same ciphertext block. Take a look at the ECB encrypted Tux image on the ECB Wikipedia page to see why this is a serious problem. I don't know of any use case where ECB would be acceptable.
CBC has an IV and thus needs randomness every time a message is encrypted, changing a part of the message requires re-encrypting everything after the change, transmission errors in one ciphertext block completely destroy the plaintext and change the decryption of the next block, decryption can be parallelized / encryption can't, the plaintext is malleable to a certain degree - this can be a problem.
Stream cipher modes: These modes generate a pseudo random stream of data that may or may not depend the plaintext. Similarly to stream ciphers generally, the generated pseudo random stream is XORed with the plaintext to generate the ciphertext. As you can use as many bits of the random stream as you like you don't need padding at all. Disadvantage of this simplicity is that the encryption is completely malleable, meaning that the decryption can be changed by an attacker in any way he likes as for a plaintext p1, a ciphertext c1 and a pseudo random stream r and attacker can choose a difference d such that the decryption of a ciphertext c2=c1⊕d is p2 = p1⊕d, as p2 = c2⊕r = (c1 ⊕ d) ⊕ r = d ⊕ (c1 ⊕ r). Also the same pseudo random stream must never be used twice as for two ciphertexts c1=p1⊕r and c2=p2⊕r, an attacker can compute the xor of the two plaintexts as c1⊕c2=p1⊕r⊕p2⊕r=p1⊕p2. That also means that changing the message requires complete reencryption, if the original message could have been obtained by an attacker. All of the following steam cipher modes only need the encryption operation of the block cipher, so depending on the cipher this might save some (silicon or machine code) space in extremely constricted environments.
CTR is simple, it creates a pseudo random stream that is independent of the plaintext, different pseudo random streams are obtained by counting up from different nonces/IVs which are multiplied by a maximum message length so that overlap is prevented, using nonces message encryption is possible without per message randomness, decryption and encryption are completed parallelizable, transmission errors only effect the wrong bits and nothing more
OFB also creates a pseudo random stream independent of the plaintext, different pseudo random streams are obtained by starting with a different nonce or random IV for every message, neither encryption nor decryption is parallelizable, as with CTR using nonces message encryption is possible without per message randomness, as with CTR transmission errors only effect the wrong bits and nothing more
CFB's pseudo random stream depends on the plaintext, a different nonce or random IV is needed for every message, like with CTR and OFB using nonces message encryption is possible without per message randomness, decryption is parallelizable / encryption is not, transmission errors completely destroy the following block, but only effect the wrong bits in the current block
Disk encryption modes: These modes are specialized to encrypt data below the file system abstraction. For efficiency reasons changing some data on the disc must only require the rewrite of at most one disc block (512 bytes or 4kib). They are out of scope of this answer as they have vastly different usage scenarios than the other. Don't use them for anything except block level disc encryption. Some members: XEX, XTS, LRW.
认证加密:
To prevent padding oracle attacks and changes to the ciphertext, one can compute a message authentication code (MAC) on the ciphertext and only decrypt it if it has not been tampered with. This is called encrypt-then-mac and should be preferred to any other order. Except for very few use cases authenticity is as important as confidentiality (the latter of which is the aim of encryption). Authenticated encryption schemes (with associated data (AEAD)) combine the two part process of encryption and authentication into one block cipher mode that also produces an authentication tag in the process. In most cases this results in speed improvement.
CCM is a simple combination of CTR mode and a CBC-MAC. Using two block cipher encryptions per block it is very slow.
OCB is faster but encumbered by patents. For free (as in freedom) or non-military software the patent holder has granted a free license, though.
GCM is a very fast but arguably complex combination of CTR mode and GHASH, a MAC over the Galois field with 2^128 elements. Its wide use in important network standards like TLS 1.2 is reflected by a special instruction Intel has introduced to speed up the calculation of GHASH.
推荐:
考虑到身份验证的重要性,对于大多数用例(磁盘加密目的除外),我建议使用以下两种分组密码模式:如果数据通过非对称签名进行身份验证,则使用CBC,否则使用GCM。