Normalization is the formal process for deciding which attributes should be grouped together in a relation. Common sense can suggest which fields or attributes belong together, but normalization provides a systematic process for doing so. Before proceeding with the physical design of a database, we need to validate the logical design; normalization serves as a tool for validating and improving the logical design so that it satisfies certain constraints and avoids unnecessary duplication of data. In the normalization process we analyze and decompose complex relations and transform them into smaller, simpler, well-structured relations.
The process of normalization was first developed by E.F. Codd. Normalization is often performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form. Three normal forms were initially proposed, namely first, second and third normal forms. Subsequently, a stronger definition of the third normal form was introduced by R. Boyce and E.F. Codd, referred to as Boyce-Codd Normal Form (BCNF). All of these normal forms are based on functional dependencies among the attributes of a table.
Higher normal forms that go beyond BCNF, such as the fourth and fifth normal forms, were introduced later. However, these later normal forms deal with situations that are very rare in practice.
Data normalization is a methodology for devising a collection of tables that represent real-world business records in a database while avoiding any data duplication, which mattered when storage was expensive. Avoiding data duplication also means that update anomalies cannot occur. Data normalization is very well and widely documented. It starts with a single large table that represents all the properties of a real-world business record together with its main identifier (a key), then removes hierarchies (repeating groups) to simplify querying with a language such as SQL. Any duplicate data and functional dependencies in the resulting tables must then also be removed.
To achieve normalization, the single table with all the required properties is split into multiple tables that are linked through primary and foreign keys. As a result, a single business record can end up represented in tens or sometimes hundreds of tables. Many artificial keys (and associated indexes) are introduced that do not exist in the real world but are needed to reconstruct the real-world business record. Storing multiple versions of a business record, for example an order and any revisions made to it, requires versioning all the tables involved, which makes querying and maintaining the tables complex. An alternative approach, to conserve storage, is to store only deltas instead of cascading full versions through the tables, which introduces still more complexity for programmers.
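The split-and-reconstruct cycle described above can be sketched in a few lines. This is a minimal illustration using SQLite through Python's sqlite3 module; the table and column names (orders_flat, customers, orders) are invented for the example, not taken from any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized: customer details are repeated on every order row.
cur.execute("""
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_city TEXT,
        product       TEXT
    )
""")

# Normalized: customer data is stored once and referenced by a foreign key.
cur.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        city        TEXT
    )
""")
cur.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        product     TEXT
    )
""")

cur.execute("INSERT INTO customers VALUES (1, 'Acme Ltd', 'Pune')")
cur.execute("INSERT INTO orders VALUES (100, 1, 'Widget')")

# Reconstructing the original business record now requires a join.
row = cur.execute("""
    SELECT o.order_id, c.name, c.city, o.product
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
""").fetchone()
print(row)  # (100, 'Acme Ltd', 'Pune', 'Widget')
```

Even in this two-table case, reading back one order takes a join; a record spread across tens of tables requires correspondingly more joins, which is the reconstruction cost the paragraph above refers to.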
In 1980, the cost of two MB of storage was roughly equivalent to one week's pay for a computer programmer in the US. By 2010, even one GB of storage represented only a tiny fraction, not even minutes, of a programmer's time, and the storage price continues to drop. Moreover, memory is becoming plentiful, and the cost (latency) of I/O operations continues to decrease as new kinds of storage, such as solid-state disks, are introduced. With the notable exception of relational databases, storage media are typically used to store non-normalized artifacts, for example in file servers, web servers, content repositories, application servers, and so on.
Contrast relational storage with the tablets, tally sticks, and paper records that were used for record keeping before computer systems were introduced: these were always stored "as is". For several reasons, they were not broken up or converted to a different format for storage purposes. First, storage space was usually plentiful and did not have to be conserved. Second, any conversion (and reconstruction) of these artifacts would have been very expensive. And third, storing these records in their original form made them easy to use and understand when they were retrieved from storage. The same reasons apply today for storing real-world digital business records in non-normalized form, as discussed later in this article.
As the use of paper records increased rapidly in the 19th and 20th centuries, storage space became a problem for some libraries and archives. This triggered the invention of microfilm and microfiche, which reduced the required storage space to between 0.25% and 3% of the original material. However, this is merely a form of compression; it does not represent the information in a conceptually different way. Similarly, digital compression can be applied today to reduce the storage consumption of non-normalized business records.
Normalization, as we have seen, is accomplished in steps or stages, each of which corresponds to a normal form. A normal form is a state of a table that results from applying simple rules regarding functional dependencies (relationships between attributes) to that table. The following sections give an overview of the normal forms and then discuss each one in detail.
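A functional-dependency rule can be made concrete with a small sketch. Assuming an invented employees table in which dept_location depends on dept rather than directly on the key (a transitive dependency that violates third normal form), the decomposition below removes the redundancy; all names here are illustrative, and SQLite is used only as a convenient engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Violates 3NF: dept_location depends on dept, not on emp_id, so each
# department's location would be repeated for every employee in it.
cur.execute("""CREATE TABLE employees_flat (
    emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, dept_location TEXT)""")

# 3NF decomposition: every non-key attribute depends only on its table's key.
cur.execute("""CREATE TABLE departments (
    dept TEXT PRIMARY KEY, dept_location TEXT)""")
cur.execute("""CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY, name TEXT,
    dept TEXT REFERENCES departments(dept))""")

cur.execute("INSERT INTO departments VALUES ('Sales', 'Mumbai')")
cur.execute("INSERT INTO employees VALUES (1, 'Asha', 'Sales')")
cur.execute("INSERT INTO employees VALUES (2, 'Ravi', 'Sales')")

# Relocating the department now updates one row, not one row per employee,
# which is exactly the update anomaly the decomposition avoids.
cur.execute("UPDATE departments SET dept_location = 'Delhi' WHERE dept = 'Sales'")
rows = cur.execute("""SELECT e.name, d.dept_location
    FROM employees e JOIN departments d ON e.dept = d.dept""").fetchall()
print(rows)  # both employees now see the new location
```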
Before discussing the normal forms, we need two concepts that are required by the normal forms and are integral to data modeling: keys and relationships.
A key uniquely identifies a row in a table. There are two types of keys: intelligent keys and non-intelligent keys. An intelligent key is based upon data values, such as a date, a last name, or a combination of values. A non-intelligent key is completely arbitrary, having no function or meaning other than identification of the row.
Intelligent keys are problematic because the key value must be changed to reflect changing data values. An updateable key presents its own problems, because all related tables must be updated every time the key changes. As a general rule, intelligent keys should be avoided, although this is not always possible.
A primary key is a column (or combination of columns) in a table whose purpose is to uniquely identify the rows of that table. A foreign key is a column in a table whose values refer to the primary key of a different table. Primary and foreign keys therefore have a symbiotic relationship, in which the foreign key stores values of another table's primary key to describe the relationship between data in different tables.
There are three possible relationships between related entities or tables:
- One-to-one
- One-to-many
- Many-to-many
Relational databases are designed to work most effectively with one-to-many relationships between tables, expressed using primary and foreign keys. Defining a column as a primary key ensures that no duplicate values are contained in that column; in other words, a primary key is unique.
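The uniqueness guarantee is enforced by the database itself, not by application code. A minimal sketch, again using SQLite with invented names: inserting a second row with an existing primary key value is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")

# A duplicate primary key value violates the uniqueness constraint.
try:
    cur.execute("INSERT INTO customers VALUES (1, 'Duplicate Ltd')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```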
One-to-one relationships are rare, because data elements related in this way are normally placed in the same table. When a one-to-one relationship is desired, it can be expressed simply as a one-to-many relationship using primary and foreign keys.
One-to-many relationships are the most common. The primary key is on the 'one' side of the relationship, and the foreign key is on the 'many' side. A primary key value is unique within its own table, but any number of foreign key rows, in any number of tables, may carry that same value. In other words, each pair of primary and foreign key columns forms a one-to-many relationship.

The data values stored in these columns determine which rows from the table containing the foreign key column are associated with an individual row in the table containing the primary key column. A foreign key value refers to one, and only one, value in the table containing the primary key, while multiple foreign key rows may reference that primary key value. This is how one-to-many relationships are expressed and maintained by the database.

The database can perform special operations on primary and foreign key relationships between tables, including the ability to delete all rows in a table containing a foreign key value when the associated primary key row is deleted, or to restrict deletion of a row containing a primary key that is referenced in another table. Defining primary and foreign key columns in the database promotes data integrity.
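The cascading-delete behavior mentioned above can be demonstrated with a referential action on the foreign key. This sketch uses SQLite, which enforces foreign keys only after `PRAGMA foreign_keys = ON`; other databases enforce them by default, and the table names are again illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id) ON DELETE CASCADE
    )
""")
cur.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
cur.execute("INSERT INTO orders VALUES (100, 1)")
cur.execute("INSERT INTO orders VALUES (101, 1)")

# Deleting the 'one' side removes all dependent rows on the 'many' side.
cur.execute("DELETE FROM customers WHERE customer_id = 1")
count = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)  # 0
```

Declaring `ON DELETE RESTRICT` instead would make the database refuse the delete while dependent order rows exist, which is the other behavior described in the text.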
Many-to-many relationships are problematic and cannot be adequately expressed directly in a relational database. Many-to-many relationships are usually expressed using intersection tables. An intersection table contains two or more foreign keys, relating the primary key values of two or more tables to each other. The role of an intersection table is to convert the many-to-many relationship into two one-to-many relationships that can be easily handled by the database.
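An intersection table can be sketched as follows, with invented students and courses tables related through an enrollments table whose composite primary key is the pair of foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT)")

# The intersection table: two foreign keys, jointly forming the primary key,
# turning one many-to-many into two one-to-many relationships.
cur.execute("""
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id  INTEGER REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    )
""")
cur.executemany("INSERT INTO students VALUES (?, ?)",
                [(1, 'Asha'), (2, 'Ravi')])
cur.executemany("INSERT INTO courses VALUES (?, ?)",
                [(10, 'SQL'), (20, 'COBOL')])

# Each student may take many courses, and each course may have many students.
cur.executemany("INSERT INTO enrollments VALUES (?, ?)",
                [(1, 10), (1, 20), (2, 10)])

rows = cur.execute("""
    SELECT s.name, c.title
    FROM enrollments e
    JOIN students s ON e.student_id = s.student_id
    JOIN courses  c ON e.course_id  = c.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(rows)  # [('Asha', 'COBOL'), ('Asha', 'SQL'), ('Ravi', 'SQL')]
```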