Planning for Upsizing - MS Access

In this section, you will look at how to plan for upsizing an Access database to SQL Server. This includes looking at how the available data types in SQL Server can be matched with those in Access, and the steps that you need to take to verify that your Access database can be correctly converted. This verification will save you a lot of time, regardless of what tool you choose to perform the conversion.

Text Data Types and UNICODE

The ASCII character encoding scheme has been in use for a long time, but is has an inherent restriction because it supports only 255 different characters in its code set. UNICODE, however, supports over 65,000 characters. The SQL Server data types for character data are CHAR, VARCHAR, and an older TEXT data type, which is equivalent to the Access Memo data type. CHAR is a fixed size and VARCHAR is varying size, so the Access Text data type matches the SQL Server VARCHAR data type.

The SQL Server UNICODE data types can be identified by the “N” prefix, such as NCHAR, NVARCHAR and NTEXT. If you choose to use UNICODE, this will double the required storage, which could have an impact on performance, but you gain a greater flexibility and support for international deployment of your application. SQL Server has an 8 K page—the maximum column size is 8,000 bytes, and the maximum row size is 8,060 bytes (1 row per page). But SQL Server 2008 R2 does allow the combined length of character and certain other data types to exceed this maximum row size.

When dealing with a Memo data type, you have the option of mapping this to an older TEXT or NTEXT data type, or, beginning with SQL Server 2005, you can use VARCHAR(MAX) or NVARCHAR(MAX) data type, which is a better choice. The VARCHAR(MAX) data type can hold up to two gigabytes of data, and is stored in-row, when possible, to improve performance.

Date and Time Data

The Access Date data type can hold dates from 1 January, 100 to 31 December, 9999. When no date is given and time-only data is provided, it uses the date 30 December, 1899 to indicate time-only data and hides the date information.

The corresponding data type in SQL Server is the DATETIME data type. This can hold dates from 1 January, 1753 to 31 December, 9999. When no date is provided, it defaults the date to 1 January, 1900 (which you might have spotted is different in Access). SQL Server also has a SMALLDATETIME data type.

Beginning with SQL Server 2008, there are some new date and time data types, they are: DATE, TIME, DATETIME2 and DATETIMEOFFSET. When you link Access to these data types, the DATE data type will appear as an Access Date data type, and can store dates in the range 1 January, 0000 to 31 December, 9999.

The most common choice of mapping is to map the Access Date data type to the DATETIME data type, but if you only had date data you could consider using the newer DATE data type.

If you use a map to the DATETIME data type, then because it is possible for your Access tables to contain badly entered dates for example in the year 100 before you convert your database, you should verify that your date data is no earlier than 1 January, 1753 (this is one of the most common problems that applications can have and can cause data conversions to fail). If you find bad dates, and if the field is not a required field, you can set the value to NULL. If it is a required date, set the field to a safe value. The following VBA code in the Northwind_ProblemsAndFixes.accdb can be used to search through all the dates in your database, looking for bad dates and making corrections:

With TIME-only data, you need to be aware that as long as Access is used to continue entering this date, the default 30 December 1899 will be applied. But if you decide to write T-SQL in SQL Server, then ensure that you explicitly supply this date, because the default in SQL Server is 1 January 1900.

Boolean Data

An Access Yes/No data type corresponds to the SQL Server BIT data type. In Access TRUE is –1 and FALSE is 0; in SQL Server, TRUE is 1 and FALSE is 0.

In Access, if you don’t provide a default value for a Yes/No field, it will be always be displayed as FALSE. In SQL Server, if you don’t specify a default, the default value is NULL. This means that unlike Access, a SQL Server BIT (Yes/No) field can be NULL, TRUE or FALSE. Access, when linked to this data type, displays NULL as FALSE. The best advice is to check all your Yes/No fields to verify that they have a default value so that once converted to SQL Server, they have defaults when new data is entered.

And in future, when creating new BIT fields in SQL Server, always give them a default value. The reason that this is very important is because SQL Server does not treat NULL as FALSE; it treats NULL as NULL, and testing for FALSE does not identify any rows containing NULL values. If you ignore this point, then at some point when you convert an Access query to a SQL Server View, if that query tests for FALSE, you will find that it no longer works as expected (during conversion you need to watch that FALSE becomes 0 and that –1 or TRUE becomes 1 or <>0). The following code sample could be used to add missing defaults to Yes/No fields prior to converting your database:

Integer Numbers

SQL Server integers range as follows (the Access equivalent types are shown in parentheses): TINYINT (Byte), SMALLINT (Integer), INT (Long Integer), BIGINT (no equivalent). If you define a BIGINT in SQL Server and use a linked table to Access, the field appears as a Text field.

Real Numbers, Decimals, and Floating-Point Numbers

SQL Server can hold very large or very small numbers by using the FLOAT data type. The type REAL (Access equivalent, Single) is equivalent to a FLOAT(24), 24-byte floating-point number. The default FLOAT is a FLOAT(53), which is equivalent to an Access Double. SQL Server also supports the DECIMAL data type, which aligns with the Access Decimal data type.

Hyperlinks

SQL Server has no equivalent to the Access Hyperlink data type, which you would translate to a VARCHAR or NVARCHAR data type. You would then require some code in Access to make it behave as a hyperlink, using an unbound hyperlink control. Access stores the hyperlink in a memo type column with the Address, Text, and ScreenTip information.

IMAGE, VARBINARY(Max), and OLE Data

SQL Server supports two data types into which you can transfer Access OLE Data. These data types are, IMAGE (an older data type) and VARBINARY(MAX). The recommended choice is VARBINARY(MAX), but if you encounter any problems when migrating or working with the translated OLE Data, then it is worth trying the alternative data type mapping using the IMAGE data type.

Memo Data

SQL Server supports two data types into which Memo data can be transferred: TEXT, (which is the older of the two) and VARCHAR(MAX) (and their UNICODE equivalents). In many respects the better choice is VARCHAR(MAX); if you intend to bind .NET controls such as a DetailsView control, then updates with TEXT are not supported, but this will work with the VARCHAR(MAX) data type.

INSIDE OUT Driver limitations with VARCHAR(MAX)

You should be aware that there is a serious problem when working with VARCHAR(MAX) and either the Native Client Driver or Native Client 10.0 Driver. If you try pasting in a large amount of text, you get the error [Microsoft][SQL Native Client] String Data, right truncation (#0). In this case, you either need to stick with the older SQL Server Driver or revert to using the TEXT data type, which works with all three drivers. The choice of different drivers is discussed in Topic, “Linking Access Tables.”

Currency SQL Server supports two currency types, MONEY and SMALLMONEY. The MONEY data type maps to the Access Currency data type.

Attachments and Multi-Value Data

Unlike Access and Microsoft SharePoint, SQL Server does not support multi-value fields (Access and SharePoint provide full support for these), This means that you need to normalize your data by creating a new table to hold the multi-value or Attachment field data. For example, if you have a table called tblCustomer with a primary key CustomerID and an Attachment field called Documents, you need to create a new table called tblCustomer Documents with an AutoNumber primary key, a foreign key to CustomerID, and an OLE Data field called Documents. You then need to create new records in the new table to hold the documents for a customer. Unfortunately, there is no easy way to transfer Attachment data to an OLE Data field.

One strategy to convert your Attachment data is to first use the SaveToFile VBA method of an Attachment object to scan through your attachments and save them as files to a folder (this was described in Chapter, “Understanding the Data Access Object Model”). The next step is to create a form bound to your new OLEData Type, and then using the following VBA code for a control called for example TheOLEObject, you can load the file into the bound control and save the record:

Required Fields

In Access, an existing field can be changed to a Required field, but you can skip checking existing data. These fields are created in SQL Server with a NOT NULL attribute; during the process of migrating data, this can cause problems. The following code can be used to check if you have NULL values in any required field. To fix this problem, you can either go through the data, adding a safe default value for the field, or set the Required field attribute to FALSE, as shown in the following:

Cycles and Multiple Cascade Paths

It is possible to create relationships in Access that cannot be converted into relationships in SQL Server, although this rarely happens. If it does happen, you get an error message similar to the following: may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.

The problem is that your relationship includes cascade update or cascade delete actions, which appear to SQL Server to be causing a feedback loop that can lead to problems. The solution is to remove one or more of the cascade actions. If you still need to implement the cascade operation, you can either write a trigger or create a stored procedure to replace it.

INSIDE OUT Partially-completed foreign keys

In Access, if you have a multi-part foreign key, if you start to enter values for one part of the key, you can only save the record once you have valid values in all parts of the key. In SQL Server, when you have a multi-part foreign key, you can save the record even if not all parts of the key are complete (some fields in the key can contain NULL values). In fact, you can even have illegal values in parts of the key, because it is only when the key is complete that the values are checked.

This is not a common problem, but it’s worthy of a note. The SQL Standards show a MATCH option that applies to defining relationships (or references); Access uses the FULL match option, and SQL Server uses the SIMPLE match option.

Mismatched Fields in Relationships

With Access, you can construct relationships between columns of different data types and sizes; for example, in a Customer table, you can have a foreign key field to a Country table that is Text(20), and in the Country table, the corresponding field could be a Text(30). SQL Server does not allow this, and you need to modify your design to ensure that both fields are the same size and of the same data type. For the previous example, you can set the foreign key to a Text(30) field. You could use code based on the following example to detect and possibly autocorrect these problems:

Replicated Databases and Random Autonumbers

If your database was at one time replicated, then the tables will have picked up a number of special replication fields such as s_Generation, s_Lineage, s_GUID, or s_ColLineage. It is a good idea to remove these now unwanted fields prior to upsizing, and possibly convert any ReplicationID fields, if used as a primary key, to standard autonumbers.

A general feature of Access that is associated with replication is the option to have a random Autonumber sometimes used as a primary key. If, for example, in an Access database you have a table called Customer2 with a field NewId that was a random Autonumber, then if you look in CurrentDB.TableDefs (“Customers2”).Fields (“NewId”).Default Value, you can spot a function called GenUniqueID(); this is the internal function that Access uses to generate the random Autonumbers.

One nice feature of the Upsizing Wizard is that it will automatically write the following trigger code (usp_GenUniqueID.sql) to generate the random Autonumber in SQL Server, and sets a default of 0 for the field NewId, which is the primary key:

The preceding approach, however, has a logical flaw: if an insert is from data in another table with multiple rows or using multiple values, then, as demonstrated in the code that follows, it generates duplicate primary key values of 0 and fails (remember the trigger fires after the insert). The good news is that this is very easily fixed, and the trigger code gives you all the information to write a SQL Server function to resolve this.

This code is interesting because you want to create a generic function that uses the RAND function, but it turns out that you are not allowed to use the RAND system function inside a user-defined function. So, you must employ a cunning trick and wrap the RAND function inside a View, and then you are allowed to do this. After allowing the Upsizing Wizard to convert the sample table Customer2, you can cause the previously described trigger to fail by using the following code:

Next, to resolve this problem, you create a View and then a function to use the View. In the following, notice that the first definition of the function will not work; however, the code starting from creating the View will work:

To complete this, you need to drop the existing trigger (by using DROP), drop the existing default, and then add your new default:

In the preceding code, we have exchanged using a trigger for using a function. The subtle difference is that the function is called once for each individual row that is changed, but the trigger would be called once to process changes to multiple rows.

INSIDE OUT Allowing for changes to multiple rows in trigger code

The previous example demonstrates how trigger code should always be written to allow for multiple rows to be inserted, updated, or deleted; otherwise, you run the risk that at some point you perform an operation involving multiple rows and the application then generates errors. In some situations (although not in this case), if you cannot write the trigger for multiple rows, then you can consider using the following T-SQL to temporarily disable and then enable a trigger after completing the operation; this would only be recommended for use in maintenance and not while users are interacting with the system:

This example also shows the benefit in having constraints with easily managed names for triggers and other objects in the database.

Unique Index and Ignore Nulls

Access allows you to create a Unique Index that ignores NULL values; we have added an example to the Employees EmploymentCode column in the sample database to demonstrate this problem.In SQL Server, NULL is a unique value, and you cannot create a unique index to ignore NULL values. You can add a non-unique index and then write the following trigger code to prevent duplicates but allow multiple NULL values. After you have converted the sample database to SQL Server, you can take a look at how the code (UniqueIndexIgnoreNull.sql) that follows can be used to resolve this problem in the converted SQL Server database:

It’s also worthwhile to consider using the following code to identify potential issues with these indexes

Timestamps and Row Versioning

When you use a software tool to convert an Access database to SQL Server, you are normally provided with an option to add a TIMESTAMP column to your tables (it cannot be over-emphasized how important this option is when working with Access linked to SQL Server), and when you create new tables in SQL Server, you should remember to add a TIMESTAMP column to each table. Most of the current documentation from Microsoft no longer talks about a TIMESTAMP data type but instead discusses a ROWVERSION data type. However, Management Studio only displays a TIMESTAMP data type and not a ROW VERSION data type; you can regard both as meaning the same thing.

When you edit data over a linked table to SQL Server, Access maintains in memory a “Before” image of your data, and once you are ready to save the record, it needs to write the data back to SQL Server by using the Before image to detect any conflict in the data caused by another user simultaneously editing and saving changes to the same record. This standard product feature opens a write-conflict dialog box when a conflict is detected. Access can use one of two strategies for detecting a conflict during the write-back. If you ignore TIMESTAMPs, Access checks that the values in the Before image for every field matches the current data held in the row in SQL Server. It does this by using a WHERE clause on the Update command, as shown in the following:

WHERE fld1 = fld1 (old value) and fld2 = fld2 (old value)

The following example illustrates how you can see the trace output from the SQL Server Profiler (discussed beginning on page 586) when you perform an update on the Customers table linked to Northwind:

The problem with this strategy is that if you have 100 fields, then the WHERE clause includes 100 comparisons, so this update involves a lot of comparisons. The other danger is when working with floating-point numbers; for example, if there were any difference in the precision with which the numbers were held in the two products, then the update could fail to locate the target record, causing the update to incorrectly register a conflict. The second strategy is to use a TIMESTAMP as soon as you add this column to a table, and then ensure that the linked table is refreshed. Access switches automatically to this second strategy, which involves verifying that the primary key is matched and the TIMESTAMP column is unchanged, as shown here:

WHERE pkey1 = pkey1 (old value) and TS = TS (old value).

If you now add a TIMESTAMP column called TSCustomers to the Customers table, and then refresh the linked table and repeat the update, you obtain the following output in the Profiler:

This second method has the advantage of fewer comparisons and no dependencies on the form of internal storage of data in the products (with the exception of the primary key). The TIMESTAMP column data is automatically generated by SQL Server and changed whenever a record is changed. If you want to see this in greater detail, you can use the SQL Server Profiler to monitor what happens under the hood.

Schemas and Synonyms

After you create a SQL Server database, click Security | Schemas. There you will find a list of existing schemas. The dbo schema is typically the one that contains all your design objects, and it is where new objects that you create are stored, unless your login has been altered to use a different default schema for the database.

In earlier versions of SQL Server, objects were owned by users, so you could have a table owned by User1.Customers and User2.Customers. However, this caused problems if you wanted to remove User1 from the system but keep an object such as User1.Customers. As a result, in SQL Server 2005 a new interpretation of schemas was introduced that was independent of users.

If you have a large Access database (for example, 200 tables), you can use schemas to split up the tables, Views, and so on, into logical groupings to better manage the database design. You can also use schemas to flexibly and easily configure security; for example, restricting a group of users to having only read permissions on a particular set of tables. The following example (UsingSchemas.sql) illustrates how to create a Schema, create a table in the Schema, and then create a Synonym to refer to the table without specifying the full Schema prefix:

The Synonym is very important, because if you already had a table on the dbo schema and moved it to the Companies Schema, any SQL that referred to the table would no longer work, but once you add the Synonym, that problem is resolved. If you don’t specify an explicit schema when creating objects, they will be placed on your default schema, which, as previously mentioned, is normally dbo. The following example shows an object being moved into a Schema and creating an appropriate Synonym for the object:

If you want to try the previous code and then move the Customers table back to the dbo Schema, you need to drop the synonym before you do this by using DROP SYNONYM Customers. You could then use the following line to move the table back to the dbo schema:

ALTER SCHEMA dbo TRANSFER Companies.Customers

After you have converted your database to SQL Server, you can consider making a list of all your tables, and then create a script to move all your tables into appropriate schemas and simultaneously create appropriate synonyms. You can use sp_tables to get a list of all your existing tables and the TABLE_OWNER property, which is the schema.

INSIDE OUT Using schemas and database roles to manage security

Often in an application you will want to provide users with different permissions on groups of objects in the system. SQL Server has a sophisticated security system with which you can establish fine control of security. One of the simplest approaches to security is to create database roles, and when mapping users, windows groups, or SQL Server logins into your system, assign them an appropriate database role, as demonstrated in the following:

The Execute and Select permissions will allow the ReadOnlyUsers to both view data andexecute procedures on the Schema.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

MS Access Topics