Transparent Data Encryption - SQL Server 2008

SQL Server 2008 introduces a new encryption option known as transparent data encryption (TDE). TDE encrypts every page of your entire database and automatically decrypts each page as required during access. This feature allows you to secure your entire database without worrying about the minutiae of column-level encryption. It has the added benefit of allowing you to secure your database transparently with no changes to your front-end applications. TDE does not require extra storage space, and it can generate far more efficient query plans than queries on data encrypted at the column level, since TDE allows SQL Server to use proper indexes. The downside to TDE is that it incurs additional overhead, since SQL Server must decrypt pages of data with every query. We discuss the effects of encryption on search efficiency in the “Query Efficiency” section later in this chapter.

Enabling TDE

To enable TDE and encrypt a database, you must first create a DMK and server certificate in the master database:

USE master;
WITH SUBJECT = 'Certificate to Encrypt AdventureWorks DB',
EXPIRY_DATE = '2022-12-31';

Notice that the master key and certificate creation statements are the same as in any other instance, except that they are created in the master database.

The second step is to create a database encryption key and turn on encryption in the database you want to secure. The following example turns on TDE in the AdventureWorks database.

USE AdventureWorks;

The new CREATE DATABASE ENCRYPTION KEY statement creates the database encryption key for the AdventureWorks database. With this statement, we specify that the database encryption key is secured by the server certificate AdvWorksCert we previously created, and we specify use of the three-key Triple DES algorithm. The algorithms available to the CREATE DATABASE ENCRYPTION KEY statement are limited to TRIPLE_DES_3KEY, AES_128, AES_192, and AES_256.

In addition to CREATE DATABASE ENCRYPTION KEY, T-SQL provides equivalent DROP and ALTER statements to manage database encryption keys. There is no backup statement, and SQL Server provides no way to export a database encryption key from your database.

To properly secure your database, TDE must take some additional steps when it is turned on. One step that TDE takes is to zero out and encrypt your database transaction log. That way, a hacker with a log reader will not be able to reconstruct your sensitive data from the logs. TDE also encrypts the tempdb database. This prevents hackers from gaining access to sensitive data temporarily stored during processing. The fact that TDE encrypts the tempdb database can slow down processing in other databases on the same server, and you might gain performance improvements by placing encrypted databases on a SQL Server instance separate from your unencrypted databases.

Choosing Between TDE and Column-Level Encryption

TDE offers the advantage of being transparent to client applications, and it does a thorough job of securing your data. A disadvantage of TDE is that it incurs CPU and memory overhead during every database access, since whole pages of data need to be decrypted for any given database access. TDE also encrypts the tempdb database, which can affect the performance of every other database on the same SQL Server instance.

Column-level encryption has the advantages of providing pinpoint precision and additional flexibility when securing your data. The main cons of column-level encryption are that it requires additional programming, so you may need to change existing client applications, and it cannot effectively utilize indexes on encrypted data to optimize queries (see the “Query Efficiency” section later in this chapter).

So with these encryption options available to secure your SQL Server data, how do you decide which to use and when? Most real-world scenarios cover a wide range of possibilities. The following are some possible scenarios and the related considerations when deciding which method to use:

  • Only a relatively small portion of your data requires encryption. In this scenario, it makes sense to avoid the overhead of TDE in favor of the more targeted approach provided by column-level encryption. If most or all of your data needs to be encrypted, however, TDE is the method of choice.
  • You must search on encrypted columns. Keeping in mind that search efficiency and security are opposing goals, you should make every attempt to avoid designing systems that need to search on encrypted columns. If you absolutely must search on encrypted columns, TDE will provide far better efficiency, since column-level encryption will always result in a scan.
  • Regulatory requirements dictate the requirements. In some cases, government regulations at the local, state, and federal levels dictate how much of your data needs to be encrypted. Usually, these regulations deal with confidential consumer, credit, and health-care information. You may also have other industry-wide requirements that mandate additional protections on sensitive data. More often than not, however, the regulations will state that “confidential” data must be encrypted, and it falls on you to determine how much of your data is affected by these requirements.
  • Contractual obligations state the requirements. When dealing with credit card companies, credit bureaus, and credit issuers of any kind, security requirements that your company must implement are often spelled out to the nth degree in contract form. These contracts might specify several key items, such as which data must be encrypted, the algorithm that must be used to encrypt and decrypt the data, and logging requirements for data changes or queries.
  • You are supporting front-end applications that cannot be changed. It’s not uncommon for information technology (IT) managers and database administrators (DBAs) to support inherited legacy databases and client applications. If the source code for the application has been long lost, or there is no time or resources to modify the existing application, it makes sense to use TDE to secure the database, since the encryption it provides is transparent to client applications.
  • You need additional flexibility during encryption. In some instances, you may need additional flexibility, such as a hard requirement to encrypt your symmetric key with an asymmetric key instead of a certificate, or the need to specify an authenticator during encryption, or even the need to encrypt data by passphrase instead of by key. In these cases, you may need to use column-level encryption instead of TDE. This isn’t as convincing a scenario as most of the others, however, and you probably would be better off revisiting the specifications if this is the deciding factor.

As DBAs, developers, and especially IT managers learn about TDE, we expect that there will be a lot of people who will decide to take the easy path by “encrypting the whole database and forgetting about it.” That’s unfortunate, because in many situations, that plan will amount to serious overkill and will force the server to perform a lot of unnecessary encryption and decryption, tying up server resources that could be used elsewhere. We highly recommend that you thoroughly consider the pros and cons of TDE and column-level encryption before deciding on an encryption strategy for your databases.

Extensible Key Management

In addition to TDE, SQL Server 2008 includes a new feature known as Extensible Key Management (EKM). EKM allows you to use the Microsoft Cryptographic API (CryptoAPI) for encryption and key generation.

EKM support is designed to allow third-party vendors to offer encryption key-generation hardware and other tools, known as Hardware Security Modules (HSMs). HSM vendors can offer a lot of advantages over the standard built-in encryption functionality, including hardware-accelerated encryption and decryption, bulk encryption and decryption, and additional encryption key management. An HSM might be a smart card, a USB or flash device, or a specialized external module.

The T-SQL encryption Data Manipulation Language (DML) statements, like CREATE SYMMETRIC KEY, now include the FROM PROVIDER clause and a CREATE_DISPOSITION option to provide support for third-party EKM.

EKM/HSM functionality is vendor specific, and your HSM vendor will provide specific instructions for implementing its EKM/HSM solution in your environment.

Encryption Without Keys

In addition to using certificates, asymmetric keys, and symmetric keys, you can encrypt your data using passphrases. A passphrase is a string or binary value from which SQL Server can derive a symmetric key to encrypt your data.

The EncryptByPassPhrase and DecryptByPassPhrase functions allow you to use this type of encryption

DECLARE @plain_text nvarchar(1000),
@enc_text varbinary(2000);
SET @plain_text = N'Ask not what your country can do for you...';
SET @enc_text = EncryptByPassPhrase(N'E Pluribus Unum', @plain_text);
SELECT 'Original plain text = ', @plain_text;
SELECT 'Encrypted text = ', @enc_text;
SELECT 'Decrypted plain text = ',
CAST(DecryptByPassPhrase(N'E Pluribus Unum', @enc_text) AS nvarchar(1000));

EncryptByPassPhrase accepts the plain text that you want to encrypt. DecryptByPassPhrase, on the other hand, accepts the previously encrypted cipher text that will be decrypted. For both functions, you can add an authenticator value to further obfuscate your encrypted text, as follows:

SET @enc_text = EncryptByPassPhrase(N'E Pluribus Unum', @plain_text,
1, N'Authentic');

Both functions return a varbinary(8000) value. After you use DecryptByPassPhrase, you may need to cast your result back to another datatype, such as varchar or nvarchar, as in the preceding example.

Hashing and Signing Data

Prior to SQL Server 2005, T-SQL included a couple of very simple, very basic hash functions: CHECKSUM and BINARY_CHECKSUM. Neither of these hash functions is collision-free, and both return a 32-bit hash, which is well below the minimum length recommended by cryptographic experts for secure applications.

Introduced in SQL Server 2005, the HashBytes function accepts the name of a hash algorithm and an input string:

SELECT HashBytes ('SHA1', 'Now is the time for all good men...');

The hash algorithm used in the example is SHA-1. You can use MD2, MD4, MD5, SHA, or SHA-1 for this parameter. The former three are the Message Digest algorithms, which generate 128-bit hashes of the input. The latter two are the Secure Hash Algorithm, which generates a 160-bit digest of the input.

The input to the HashBytes function is a varchar, an nvarchar, or a varbinary value. The result of HashBytes is always a varbinary value with amaximum length of 8,000 bytes.

SQL Server also provides functions to sign data with certificates and asymmetric keys, and to verify those signatures. This is useful for protecting the integrity of sensitive data, since any small change in the data will affect the signature. The SignByCert and SignByAsymKey functions sign your data with a certificate or an asymmetric key and return the signature as a varbinary. The length of the signature depends on the length of the certificate or asymmetric key’s private key. A 2,048-bit private key generates a 256-byte signature; a 1,024-bit private key generates a 128-byte signature; and so on.

SignByCert ( certificate_ID, plaintext, password )
SignByAsymKey ( asym_key_ID, plaintext, password )

The SignByCert function accepts a certificate ID, which can be retrieved with the Cert_ID function. The SignByAsymKey function accepts the asymmetric key ID, which is retrieved with the AsymKey_ID function. The plaintext parameter in both functions is the plain text to be signed—a char, a varchar, an nchar, or an nvarchar value. The password is the password required to decrypt the certificate or asymmetric key, if it is protected by password.

You can verify previously signed data with the VerifySignedByCert and VerifySignedByAsymKey functions, which have the following format:

VerifySignedByCert ( certificate_ID, plaintext, signature )
VerifySignedByAsymKey ( asym_key_ID, plaintext, signature )

The VerifySignedByCert and VerifySignedByAsymKey functions accept a certificate ID and an asymmetric key ID, respectively. The plaintext parameter of both functions is the plain text that was previously signed, and the signature parameter is the varbinary signature that was generated. These two functions generate the signature for the plaintext value and compare the newly generated signature to the signature you pass in to the function. Both functions return a 1 if the data matches the signature, or a 0 if the data and signature do not match.

Security Catalog Views

SQL Server 2008 provides several security catalog views and a dynamic management view, all of which can be used to retrieve information about encryption functionality. The following views are available in SQL Server 2008:

  • sys.asymmetric_keys: This catalog view returns information about the asymmetric key pairs installed in the current database. The information returned by this view includes the name, asymmetric key ID, private key encryption type, encryption algorithm used, public key, and additional information about each installed asymmetric key pair.
  • sys.certificates: This catalog view returns information about the certificates installed in the current database. The information returned by this view is similar to that returned by the sys.asymmetric_keys view. It includes the name, certificate ID, private key encryption type, name of the certificate’s issuer, certificate serial number, and additional certificate-specific information (such as subject, start date, and expiration date).
  • sys.crypt_properties: This catalog view returns a row for each cryptographic property associated with a securable in the database. The information returned about each securable includes the class of the securable, ID of the securable, encryption type used, and SHA-1 hash of the certificate or asymmetric key used to encrypt the securable.
  • sys.dm_database_encryption_keys: This dynamic management view returns information about the encryption state of a database and the encryption keys used in the database. Some of the values returned in the encryption_state column of this view are 0 if no encryption is present, 1 if the database is unencrypted, 3 when the database is encrypted, or another value indicating a database encryption or decryption action is currently in progress.
  • sys.key_encryptions: This catalog view returns a row for every key encryption, as specified by the CREATE SYMMETRIC KEY statement’s ENCRYPTION BY clause. Information returned includes the ID of the encrypted key, encryption type, and thumbprint of the certificate or symmetric key used to encrypt the key. A thumbprint, in terms of SQL Server 2008 security catalog views, is an SHA-1 hash of a certificate or an asymmetric key, or a GUID for a symmetric key. Several of the security catalog views return a thumbprint of certificates, asymmetric keys, or symmetric keys.
  • sys.master_key_passwords: This catalog view returns a row for each DMK password added with the sp_control_dbmasterkey_password stored procedure. Each row returns an ID of the credential to which the password belongs and a GUID of the original database at creation time. The GUID is used by SQL Server to identify credentials that may contain passwords that protect the DMK in the event that automatic decryption fails. Passwords used to protect the DMKs are stored in the credential store.
  • sys.openkeys: This catalog view returns information about all open encryption keys in the current session. Information returned includes the ID and name of the database that contains the key; IDs, names, and GUIDs of each open key; and the date and time the key was opened.
  • sys.symmetric_keys: This catalog view returns a row for each symmetric key in the database. Information returned includes the name, ID, GUID, length, and algorithm of the symmetric key. Also returned are the ID of the principal who owns the key and the dates that the symmetric key was first created and last modified.

Query Efficiency

As we mentioned earlier in the chapter, SQL Server automatically generates a random IV to help prevent statistical analysis attacks on columns of data. The need to eliminate patterns from encrypted data is at odds with the need to index and quickly search the same data. Indexing takes advantage of these patterns to organize data for efficient search and retrieval.

A hacker who knows the relative frequency with which certain pieces of encrypted data occur in a given column could use that information to deduce even further information about it. For example, a corporate database containing employee information in a table encrypted without the use of random IVs might leak additional information from the patterns provided. Consider the HumanResources.Employee table in the AdventureWorks database. Most of the executive and managerial titles occur only once, while the lower-level positions may occur dozens of times. A hacker might be able to infer additional information from this pattern, including information about which employees are paid the most. The hacker might use knowledge like this to help focus his attack. SQL Server’s random IV generation helps to eliminate these patterns from encrypted data. This has two main implications for T-SQL developers:

  • The same IV used during encryption is required during decryption.
  • The encryption functions are nondeterministic, which means that encrypting the same plain text multiple times with the same key will not generate the same encrypted text.

The nondeterministic nature of the SQL 2008 encryption functions makes it useless to index an encrypted column directly. Searching encrypted columns requires decrypting every value in the column and comparing them one by one. This is very inefficient and can be a bottleneck in your applications if your tables are large.

Some methods have been suggested for increasing the efficiency of searching encrypted data. These methods generally include storing a hash of the encrypted data for indexing. The main problem with these methods is that they reintroduce the statistical patterns eliminated by the random IVs.

You can take several approaches to strike a balance between data security and search efficiency. The most important recommendation is to not encrypt columns you will use heavily in your query search criteria (WHERE clause), sort criteria (ORDER BY clause), or grouping (GROUP BY clause).

However, sometimes you might not have a choice—you may need to encrypt a column that is part of your WHERE clause or other query criteria. One thing you can do to make this more efficient is to narrow down your results using other criteria involving nonencrypted columns first.

You can create a “pseudo-index” of your data by adding an additional column to your table with a one-way hash code of your plain text, and creating an index on that column. The built-in SQL Server 2008 HashBytes function can be used to generate a one-way MD5, SHA-1, or other hash value of your plain text and store it in the new column. Indexing this new plain text hash value column can make equality searches (using the T-SQL = operator) much more efficient. Range searches (operators like <, >, BETWEEN, and so on), however, cannot be used on hashed or encrypted data.

One of the implications of pseudo-indexing with a hash value is that it once again opens the door for statistical analysis attacks using the hash values as a guide. Using a hash value as an index also makes dictionary attacks against the hashed values possible. A dictionary attack is one in which a hacker uses a large list of plain text values to try to guess the plain text of a hashed or an encrypted value by brute force.

Another method of pseudo-indexing encrypted data is a variation on the previous method, except that it uses a hashed message authentication code (HMAC) in place of the hash value. The HMAC basically takes a “secret” value, combines it with the plain text, and generates a hash value based on that data. Although the HMAC method provides protection against dictionary attacks, it doesn’t provide any additional protection against statistical analysis.

The main thing to consider when using SQL Server’s data-encryption facilities is that encryption and search efficiency are opposing goals. The purpose of encryption is data security, often at the expense of search efficiency. While you can use the methods suggested here to increase the efficiency of SQL queries on encrypted data, the hash value and HMAC index methods require more storage and can actually circumvent SQL Server’s protection against statistical analysis (via random IV).

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd Protection Status

SQL Server 2008 Topics