Always Encrypted cryptography
Applies to: SQL Server Azure SQL Database Azure SQL Managed Instance
This document describes encryption algorithms and mechanisms to derive cryptographic material used in the Always Encrypted feature in SQL Server and Azure SQL Database.
Keys, key stores, and key encryption algorithms
Always Encrypted uses two types of keys: Column master keys and column encryption keys.
A column master key (CMK) is a key encrypting key (for example, a key that is used to encrypt other keys) that is always in a client's control, and is stored in an external key store. An Always Encrypted-enabled client driver interacts with the key store via a CMK store provider, which can be either part of the driver library (a Microsoft/system provider) or part of the client application (a custom provider). Client driver libraries currently include Microsoft key store providers for Windows Certificate Store and hardware security modules (HSMs). For the current list of providers, see CREATE COLUMN MASTER KEY (Transact-SQL). An application developer can supply a custom provider for an arbitrary store.
A column encryption key (CEK), is a content encryption key (for example, a key that is used to protect data) that is protected by a CMK.
All Microsoft CMK store providers encrypt CEKs by using RSA with Optimal Asymmetric Encryption Padding (RSA-OAEP). The key store provider that supports Microsoft Cryptography API: Next Generation (CNG) in .NET Framework (SqlColumnEncryptionCngProvider Class) uses the default parameters specified by RFC 8017 in Section A.2.1. Those default parameters are using a hash function of SHA-1 and a mask generation function of MGF1 with SHA-1. All other key store providers use SHA-256.
Always Encrypted internally uses FIPS 140-2 validated cryptographic modules.
Data Encryption Algorithm
Always Encrypted uses the AEAD_AES_256_CBC_HMAC_SHA_256 algorithm to encrypt data in the database.
AEAD_AES_256_CBC_HMAC_SHA_256 is derived from the specification draft at https://tools.ietf.org/html/draft-mcgrew-aead-aes-cbc-hmac-sha2-05. It uses an Authenticated Encryption scheme with Associated Data, following an Encrypt-then-MAC approach. That is, the plaintext is first encrypted, and the MAC is produced based on the resulting ciphertext.
In order to conceal patterns, AEAD_AES_256_CBC_HMAC_SHA_256 uses the Cipher Block Chaining (CBC) mode of operation, where an initial value is fed into the system named the initialization vector (IV). The full description of the CBC mode can be found at https://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf.
AEAD_AES_256_CBC_HMAC_SHA_256 computes a ciphertext value for a given plaintext value using the following steps.
Step 1: Generating the initialization vector (IV)
Always Encrypted supports two variations of AEAD_AES_256_CBC_HMAC_SHA_256:
Randomized
Deterministic
For randomized encryption, the IV is randomly generated. As a result, each time the same plaintext is encrypted, a different ciphertext is generated, which prevents any information disclosure.
When using randomized encryption: IV = Generate cryptographicaly random 128bits
If there's deterministic encryption, the IV isn't randomly generated, but instead it's derived from the plaintext value using the following algorithm:
When using deterministic encryption: IV = HMAC-SHA-256( iv_key, cell_data ) truncated to 128 bits.
Where iv_key is derived from the CEK in the following way:
iv_key = HMAC-SHA-256(CEK, "Microsoft SQL Server cell IV key" + algorithm + CEK_length)
The HMAC value truncation is performed to fit one block of data as needed for the IV. As a result, deterministic encryption always produces the same ciphertext for a given plaintext value, which enables inferring whether two plaintext values are equal by comparing their corresponding ciphertext values. This limited information disclosure allows the database system to support equality comparison on encrypted column values.
Deterministic encryption is more effective in concealing patterns, compared to alternatives, such as using a pre-defined IV value.
Step 2: Computing AES_256_CBC Ciphertext
After computing the IV, the AES_256_CBC ciphertext is generated:
aes_256_cbc_ciphertext = AES-CBC-256(enc_key, IV, cell_data) with PKCS7 padding.
Where the encryption key (enc_key) is derived from the CEK as follows.
enc_key = HMAC-SHA-256(CEK, "Microsoft SQL Server cell encryption key" + algorithm + CEK_length )
Step 3: Computing MAC
Subsequently, the MAC is computed using the following algorithm:
MAC = HMAC-SHA-256(mac_key, versionbyte + IV + Ciphertext + versionbyte_length)
Where:
versionbyte = 0x01 and versionbyte_length = 1
mac_key = HMAC-SHA-256(CEK, "Microsoft SQL Server cell MAC key" + algorithm + CEK_length)
Step 4: Concatenation
Finally, the encrypted value is produced by concatenating the algorithm version byte, the MAC, the IV, and the AES_256_CBC ciphertext:
aead_aes_256_cbc_hmac_sha_256 = versionbyte + MAC + IV + aes_256_cbc_ciphertext
Ciphertext Length
The lengths (in bytes) of particular components of AEAD_AES_256_CBC_HMAC_SHA_256 ciphertext are:
versionbyte: 1
MAC: 32
IV: 16
aes_256_cbc_ciphertext:
(FLOOR (DATALENGTH(cell_data)/ block_size) + 1)* block_size
, where:block_size is 16 bytes
cell_data is a plaintext value
Therefore, the minimal size of aes_256_cbc_ciphertext is 1 block, which is 16 bytes.
Thus, the length of ciphertext, resulting from encrypting a given plaintext values (cell_data), can be calculated using the following formula:
1 + 32 + 16 + (FLOOR(DATALENGTH(cell_data)/16) + 1) * 16
For example:
A 4-byte long int plaintext value becomes a 65-byte long binary value after encryption.
A 2,000-byte long nchar(1000) plaintext values becomes a 2,065-byte long binary value after encryption.
The following table contains a complete list of data types and the length of ciphertext for each type.
Data Type | Ciphertext Length [bytes] |
---|---|
bigint | 65 |
binary | Varies. Use the formula above. |
bit | 65 |
char | Varies. Use the formula above. |
date | 65 |
datetime | 65 |
datetime2 | 65 |
datetimeoffset | 65 |
decimal | 81 |
float | 65 |
geography | N/A (not supported) |
geometry | N/A (not supported) |
hierarchyid | N/A (not supported) |
image | N/A (not supported) |
int | 65 |
money | 65 |
nchar | Varies. Use the formula above. |
ntext | N/A (not supported) |
numeric | 81 |
nvarchar | Varies. Use the formula above. |
real | 65 |
smalldatetime | 65 |
smallint | 65 |
smallmoney | 65 |
sql_variant | N/A (not supported) |
sysname | N/A (not supported) |
text | N/A (not supported) |
time | 65 |
timestamp (rowversion) |
N/A (not supported) |
tinyint | 65 |
uniqueidentifier | 81 |
varbinary | Varies. Use the formula above. |
varchar | Varies. Use the formula above. |
xml | N/A (not supported) |
.NET Reference
For details about the algorithms, discussed in this document, see the SqlAeadAes256CbcHmac256Algorithm.cs, SqlColumnEncryptionCertificateStoreProvider.cs, and SqlColumnEncryptionCertificateStoreProvider.cs files in the .NET Reference.