SNAPSHOT Isolation Level - SQL Server 2008

A common problem in database systems is that of blocking and concurrency. The system needs to ensure that a reader gets consistent data, so writes cannot take place during a read. Unfortunately, larger systems often fall victim to huge scalability bottlenecks due to blocking problems. DBAs must constantly do battle with queries, attempting to control lock granularities and transaction lengths in order to keep blocking to a minimum. But after a while, many give up and take an easier route, risking getting some inconsistent data from time to time by using “dirty reads,” the READ UNCOMMITTED transaction isolation level, or the NOLOCK table hint.

Those days have come to an end with the SNAPSHOT isolation level and database snapshots features within the SQL Server database engine. These features provide mechanisms for readers to get consistent, committed data, while allowing writers to work unabated. Simply put, this means no more blocking and no more inconsistent data.

Snapshots represent the best of both worlds, but they have a cost. DBAs will pay a disk I/O penalty when using these new features due to the overhead of maintaining previous versions of rows.

SNAPSHOT Isolation Level

The SNAPSHOT isolation level can best be described as a combination of the consistency of the REPEATABLE READ isolation level with the nonblocking characteristics of the READ UNCOMMITTED isolation level. Transactions in the SNAPSHOT isolation level will not create shared locks on rows being read. And repeated requests for the same data within a SNAPSHOT transaction guarantee the same results.

This nonblocking behavior is achieved by storing previous committed versions of rows in the tempdb database. When an update or a delete operation occurs, the previous version of the row is copied to tempdb and a pointer to the previous version is left with the current version. Readers that started transactions before the write and that have already read the previous version will continue to read that version. Meanwhile, the write can occur, and other transactions will see the new version.

This is a definite improvement over the behavior of either the REPEATABLE READ or READ UNCOMMITTED isolation levels. The REPEATABLE READ isolation level creates shared locks for the duration of the read transaction, thereby blocking any writers. And the READ UNCOMMITTED isolation level, while not creating locks, will also not return consistent, committed data if there are updates occurring at the same time that the transaction is reading the data.

Due to its being used as a repository for maintaining data changes, the tempdb database will see greatly increased activity when the SNAPSHOT isolation level is used for write-intensive databases. To avoid problems, the isolation level should not be enabled by DBAs arbitrarily. Specific behaviors that indicate that the isolation level may be helpful include performance issues due to blocking, deadlocked transactions, and previous use of the READ UNCOMMITTED isolation level to promote increased concurrency. Before enabling the isolation level in a production environment, test carefully to ensure that tempdb can handle the additional load.

In addition to the SNAPSHOT isolation level, the READ COMMITTED isolation level can be enhanced to behave like SNAPSHOT isolation for individual queries not within a transaction. The enhanced version is called READ COMMITTED SNAPSHOT.

Enabling SNAPSHOT Isolation for a Database

Use of the SNAPSHOT isolation level is not allowed by default in SQL Server databases. Enabling it for production databases should be done only after careful testing in a development or quality-assurance environment. The row-versioning feature that allows the isolation level to work requires stamping every row in the database with a 14-byte structure that includes a unique identifier and a pointer to the previous versions of the row in tempdb. The extra 14-byte overhead per row and the work required to maintain the previous versions can add up to quite a bit of extra disk I/O, which is why the feature is OFF by default (except in the master and msdb system databases, in which it is ON by default; these databases are small enough that the additional I/O will not cause problems). If you don’t actually need row-versioning capabilities, do not turn it on.

Two options are available for enabling row versioning in a database. One is for the SNAPSHOT isolation level itself. The second is for the READ COMMITTED SNAPSHOT isolation level. Both of these are options on ALTER DATABASE, and both are OFF by default. No users can be connected to the database when enabling or disabling row versioning. The READ_COMMITTED_SNAPSHOT option cannot be enabled in the master, tempdb, or msdb system databases.

For example, to allow the SNAPSHOT isolation level to be used for a database called Sales, the following T-SQL would be used:

ALTER DATABASE Sales
SET ALLOW_SNAPSHOT_ISOLATION ON

For the READ COMMITTED SNAPSHOT isolation level, the following T-SQL would be used:

ALTER DATABASE Sales
SET READ_COMMITTED_SNAPSHOT ON

Note that these options are independent of each other—either or both can be on for a database. However, since they use the same row-versioning mechanism behind the scenes, turning a second one on once the first is enabled will incur no additional overhead.

To disable either of these options, simply change the flag to OFF:

ALTER DATABASE Sales
SET ALLOW_SNAPSHOT_ISOLATION OFF

To find out which databases allow the SNAPSHOT isolation level or use the READ COMMITTED SNAPSHOT isolation level, you can query the sys.databases catalog view. The snapshot_isolation_state and is_read_committed_snapshot_on columns will contain 1 if either option is enabled or 0 otherwise. The view can be queried for the Sales database using the following T-SQL:

SELECT
name,
snapshot_isolation_state,
is_read_committed_snapshot_on
FROM sys.databases

Enabling SNAPSHOT Isolation for a Transaction

Once the SNAPSHOT isolation level is turned on for a database, it can be set for a transaction using SET TRANSACTION ISOLATION LEVEL SNAPSHOT. Its behavior as compared to other isolation levels is best illustrated with a hands-on example.

The following table is created in a database with row versioning enabled:

CREATE TABLE TestSnapshot
(
ColA INT,
ColB VARCHAR(20)
)
INSERT TestSnapshot (ColA, ColB)
VALUES (1, 'Original Value')

Now assume that two SQL Server Management Studio connections are open to the database. In the first, the following T-SQL is executed:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRANSACTION
SELECT ColB
FROM TestSnapshot
WHERE ColA = 1

This query returns the value 'Original Value' for ColB.

With the transaction still running, the following T-SQL is executed in the second connection:

UPDATE TestSnapshot
SET ColB = 'New Value'
WHERE ColA = 1

This update will execute successfully and will return the message '(1 row(s) affected)'. Had the REPEATABLE READ isolation level been used in the first connection, the update would have been blocked waiting for the transaction to finish.

Back in the first window, the SELECT can be run again. It will still return the value 'Original Value', even though the actual value has been updated. Had the transaction been using the READ UNCOMMITTED isolation level, results would not be consistent between reads and the value returned the second time would have been 'New Value'.

This is a very simple example to show the power of the SNAPSHOT isolation level to deliver consistent yet nonblocking results. It represents a very powerful addition to SQL Server’s arsenal.

Handling Concurrent Writes in the SNAPSHOT Isolation Level

Although SNAPSHOT provides consistent repeated reads like the REPEATED READ isolation level, it has a very different behavior when it comes to writing data. Should a SNAPSHOT isolated transaction read some data and then attempt to update it after another transaction has updated it, the entire SNAPSHOT transaction will be rolled back. This is similar to a deadlock and will need to be handled the same way in production code.

To illustrate this behavior, we’ll use the same TestSnapshot table from the previous example. In this case, however, suppose that the goal is to select some data into a temporary table, perform some very complex logic, and then update the table. First, the data is inserted into a temporary table:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRANSACTION
SELECT ColB
INTO #Temp
FROM TestSnapshot
WHERE ColA = 1

The temporary table, #Temp, now contains a row with the value 'Original Value'. As before, another transaction is operating on the TestSnapshot table in another connection with an update:

UPDATE TestSnapshot
SET ColB = 'New Value'
WHERE ColA = 1

After this, the first transaction has completed its complex logic and attempts to do an update of its own:

UPDATE TestSnapshot
SET ColB = 'Even Newer Value'
WHERE ColA = 1

Unfortunately, this results in the following error:

Msg 3960, Level 16, State 1, Line 1Cannot use snapshot isolation to access table 'TestSnapshot' in database 'Sales'.Snapshot transaction aborted due to update conflict. Retry transaction.

So what’s the moral of this story? Treat any SNAPSHOT transaction that performs data updates exactly like transactions that are susceptible to deadlocks. Put code in place around these transactions to ensure that when this error occurs, an appropriate course of action will be taken.

Using the READ COMMITTED SNAPSHOT Isolation Level

Similar to the SNAPSHOT isolation level is the READ COMMITTED SNAPSHOT isolation level. This isolation level is actually a modification of the default behavior of the READ COMMITTED isolation level. By turning this option on, any single read query within an implicit or explicit READ COMMITTED transaction will behave like a snapshot read—but only for the duration of the query. So repeatable reads do not occur, but consistency is guaranteed.

Again, this is best illustrated through an example using the TestSnapshot table. Assume the database has READ COMMITTED SNAPSHOT enabled. The following query is run on one connection:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED

SELECT ColB
FROM TestSnapshot
WHERE ColA = 1

This, of course, returns 'Original Value'. Now in a second connection, another transaction is started, but not committed:

BEGIN TRANSACTION

UPDATE TestSnapshot
SET ColB = 'New Value'
WHERE ColA = 1

Rerunning the select in the first connection will return 'Original Value' again, because the second transaction has not committed—no blocking occurs, as in the normal READ COMMITTED isolationlevel. However, as soon as the second transaction commits, the first connection will now see the updated value.

READ COMMITTED SNAPSHOT can be a good balance for databases that have a lot of read activity of data that is regularly updated, where consistency is important but repeatable results within a single transaction are not.

Database Snapshots

Like the SNAPSHOT isolation level, database snapshots give DBAs a way of presenting a consistent view of data at a certain time. However, whereas the SNAPSHOT isolation level provides this for only small amounts of data (the data involved in a given transaction), database snapshots provide a frozen replica of the database at the time the snapshot was created. This can be helpful for situations in which DBAs need to provide timestamped data for reporting or auditing purposes.

What differentiates database snapshots from other methods of providing this same functionality (such as taking a backup) is that database snapshots have no data when they’re first created, and they are therefore created almost instantaneously. This is made possible by a scheme similar to that which allows the SNAPSHOT isolation level to work.

Instead of copying all of the data from the source database, the database snapshot begins life as nothing more than a pointer to the original database. As data changes in that database, the older versions of rows are migrated into the snapshot database, but at any time, the snapshot database is only as large as the amount of data that has changed since it was created. Of course, this works the other way around as well. A database snapshot can grow to be as big as the original database was at the moment the snapshot was created, so ensure that enough disk space exists to provide room for growth. DBAs should also attempt to place snapshot databases on separate physical disks from production databases, to reduce contention due to additional write operations when changes are migrated into the snapshot.

Creating Database Snapshots

Database snapshots are created using CREATE DATABASE with the AS SNAPSHOT OF option. To create a snapshot, each logical filename from the original database must appear in the definition for thesnapshot, exactly as it was originally defined. The physical filename should be changed to have the .ss extension, but drives or paths can also be changed.

A recommended naming scheme for database snapshots is the same name as the database, followed by _Snapshot, optionally followed by the date and time the snapshot was created. This naming scheme should help users more quickly determine which snapshot they require for a task. It’s also recommended that the snapshot’s physical filenames be similarly timestamped, to make disk management easier.

As an example, assume that there is a database called Sales, with two filegroups, SalesPrimary and SalesPrimary_01. It’s September 1, 2008. The following T-SQL could be used to create the snapshot:

CREATE DATABASE Sales_Snapshot_20080901
ON
(NAME = SalesPrimary,
FILENAME = 'F:DataSalesPrimary_20070901.ss'),
(NAME = SalesPrimary_01,
FILENAME = 'F:DataSalesPrimary_01_20070901.ss')
AS SNAPSHOT OF Sales

Once a snapshot is created, it will appear to clients to be a read-only database and will persist until it is explicitly dropped using DROP DATABASE. The base database cannot be dropped until all referencingsnapshots are dropped. Any number of snapshots can be created for a database, allowing DBAs to keep a running tally of data states, as disk space allows. Remember that these databases will grow as data changes in the base database.

Reverting to a Database Snapshot

Along with providing a readily available view of the database at a specific point in time, snapshots can be used as a fail-safe in case of accidental data corruption. Please note that using database snapshots is no replacement for a solid backup plan. However, there are times when reverting to a snapshot could be very useful. For instance, imagine a scenario in which a development team is testing a data-upgrade script. These kinds of development tasks generally require the DBA to restore a database, run a version of the update script, regression test, and repeat the process iteratively as bugs are discovered. Using a database snapshot and reverting will decrease a lot of the downtime required for these kinds of tasks and generally make life easier for the DBA.

Reverting to a snapshot is similar to restoring from a backup. For instance, to revert the Sales database from a snapshot created on September 1, 2008, the following T-SQL could be used:

RESTORE DATABASE Sales
FROM
DATABASE_SNAPSHOT = Sales_Snapshot_20080901

A few restrictions apply. A restore from a snapshot can occur only if the database has just one snapshot. So if multiple snapshots have been created, those other than the one to be restored from will need to be dropped. The database can have no full-text indexes. Finally, during the restore process, both the database and the snapshot will be unavailable for use.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

SQL Server 2008 Topics