Integrity and Security - Database system concepts

Integrity constraints ensure that changes made to the database by authorized users do not result in a loss of data consistency. Thus, integrity constraints guard against accidental damage to the database. We have already seen two forms of integrity constraints for the E-R model in Entity Relationship Model:

  • Key declarations the stipulation that certain attributes form a candidate key for a given entity set.
  • Form of a relationship many to many, one to many, one to one.

In general, an integrity constraint can be an arbitrary predicate pertaining to the database. However, arbitrary predicates may be costly to test. Thus, we concentrate on integrity constraints that can be tested with minimal overhead. we study triggers, which are statements that are executed automatically by the system as a side effect of amodification to the database. Triggers are used to ensure some types of integrity.

In addition to protecting against accidental introduction of inconsistency, the data stored in the database need to be protected from unauthorized access and malicious destruction or alteration. we examine ways in which data may be misused or intentionally made inconsistent, and present security mechanisms to guard against such occurrences.

Domain Constraints

We have seen that a domain of possible values must be associated with every attribute. In SQL, we saw a number of standard domain types, such as integer types, character types, and date/time types defined in SQL. Declaring an attribute to be of a particular domain acts as a constraint on the values that it can take. Domain constraints are the most elementary form of integrity constraint. They are tested easily by the system whenever a new data item is entered into the database.

It is possible for several attributes to have the same domain. For example, the attributes customer-name and employee-name might have the same domain: the set of all person names. However, the domains of balance and branch-name certainly ought to be distinct. It is perhaps less clear whether customer-name and branch-name should have the same domain. At the implementation level, both customer names and branchnames are character strings. However, we would normally not consider the query “Find all customers who have the same name as a branch” to be a meaningful query.

Thus, if we view the database at the conceptual, rather than the physical, level, customer-name and branch-name should have distinct domains. From the above discussion, we can see that a proper definition of domain constraints not only allows us to test values inserted in the database, but also permits us to test queries to ensure that the comparisons made make sense. The principle behind attribute domains is similar to that behind typing of variables in programming languages. Strongly typed programming languages allow the compiler to check the program in greater detail.

The create domain clause can be used to define new domains. For example, the statements:

create domain Dollars numeric(12,2)
create domain Pounds numeric(12,2)

define the domains Dollars and Pounds to be decimal numbers with a total of 12 digits, two of which are placed after the decimal point. An attempt to assign a value of type Dollars to a variable of type Pounds would result in a syntax error, although both are of the same numeric type. Such an assignment is likely to be due to a programmer error, where the programmer forgot about the differences in currency. Declaring differentdomains for different currencies helps catch such errors.Values of one domain can be cast (that is, converted) to another domain. If the attribute A or relation r is of type Dollars, we can convert it to Pounds by writing

cast r.A as Pounds

In a real application we would of course multiply r.A by a currency conversion factor before casting it to pounds. SQL also provides drop domain and alter domain clauses to drop or modify domains that have been created earlier.

The check clause in SQL permits domains to be restricted in powerful ways that most programming language type systems do not permit. Specifically, the check clause permits the schema designer to specify a predicate that must be satisfied by any value assigned to a variable whose type is the domain. For instance, a check clause can ensure that an hourly wage domain allows only values greater than a specified value (such as the minimum wage):

create domain HourlyWage numeric(5,2)
constraint wage-value-test check(value >= 4.00)

The domain HourlyWage has a constraint that ensures that the hourly wage is greater than 4.00. The clause constraint wage-value-test is optional, and is used to give the name wage-value-test to the constraint. The name is used to indicate which constraint an update violated.

The check clause can also be used to restrict a domain to not contain any null values:

create domain AccountNumber char(10)
constraint account-number-null-test check(value not null )

As another example, the domain can be restricted to contain only a specified set of values by using the in clause:

create domain AccountType char(10)
constraint account-type-test
check(value in (’Checking’, ’Saving’))

The preceding check conditions can be tested quite easily, when a tuple is inserted or modified. However, in general, the check conditions can be more complex (and harder to check), since subqueries that refer to other relations are permitted in the check condition. For example, this constraint could be specified on the relation deposit:

check (branch-name in (select branch-name from branch))

The check condition verifies that the branch-name in each tuple in the deposit relation is actually the name of a branch in the branch relation. Thus, the condition has to be checked not only when a tuple is inserted or modified in deposit, but also when the relation branch changes (in this case, when a tuple is deleted or modified in relation branch).

The preceding constraint is actually an example of a class of constraints called referential-integrity constraints.We discuss such constraints, along with a simpler way of specifying them in SQL. Complex check conditions can be useful when we want to ensure integrity of data, but we should use them with care, since they may be costly to test.

Referential Integrity

Often, we wish to ensure that a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation. This condition is called referential integrity.

Basic Concepts

Consider a pair of relations r(R) and s(S), and the natural join r ⋈ s. Theremay be a tuple tr in r that does not join with any tuple in s. That is, there is no ts in s such that tr[R ∩ S] = ts[R ∩ S]. Such tuples are called dangling tuples. Depending on the entity set or relationship set being modeled, dangling tuples may or may not be acceptable.

we considered a modified form of join the outer join to operate on relations containing dangling tuples. Here, our concern is not with queries, but rather with when we should permit dangling tuples to exist in the database. Suppose there is a tuple t1 in the account relation with t1[branch-name] = “Lunartown,” but there is no tuple in the branch relation for the Lunartown branch. This situationwould be undesirable.We expect the branch relation to list all bank branches.

Therefore, tuple t1 would refer to an account at a branch that does not exist. Clearly, we would like to have an integrity constraint that prohibits dangling tuples of this sort. Not all instances of dangling tuples are undesirable, however. Assume that there is a tuple t2 in the branch relation with t2[branch-name] =“Mokan,” but there is no tuple in the account relation for the Mokan branch. In this case, a branch exists thathas no accounts. Although this situation is not common, it may arise when a branch is opened or is about to close. Thus, we do not want to prohibit this situation.

The distinction between these two examples arises from two facts:

  • The attribute branch-name in Account-schema is a foreign key referencing the primary key of Branch-schema.
  • The attribute branch-name in Branch-schema is not a foreign key.

In the Lunartown example, tuple t1 in account has a value on the foreign key branch-name that does not appear in branch. In the Mokan-branch example, tuple t2 in branch has a value on branch-name that does not appear in account, but branch-name is not a foreign key. Thus, the distinction between our two examples of dangling tuples is the presence of a foreign key.

Let r1(R1) and r2(R2) be relations with primary keys K1 and K2, respectively.We say that a subset α of R2 is a foreign key referencing K1 in relation r1 if it is required that, for every t2 in r2, there must be a tuple t1 in r1 such that t1[K1] = t2[α]. Requirements of this form are called referential integrity constraints, or subset dependencies.

The latter term arises because the preceding referential-integrity constraint can bewritten as Πα (r2) ⊆ ΠK1 (r1). Note that, for a referential-integrity constraint to make sense, either α must be equal to K1, or α and K1 must be compatible sets of attributes.

Referential Integrity and the E-R Model

Referential-integrity constraints arise frequently. If we derive our relational-database schema by constructing tables from E-R diagrams, as we did in Entity Relashionship Model, then every relation arising from a relationship set has referential-integrity constraints. Figure shows an n-ary relationship set R, relating entity sets E1, E2, . . .,En. Let Ki denote the primary key of Ei. The attributes of the relation schema for relationship set R include K1 ∪ K2 ∪ · · · ∪ Kn. The following referential integrity constraints are then present: For each i, Ki in the schema for R is a foreign key referencing Ki in the relation schema generated from entity set Ei.

Referential Integrity and the E-R Model

Another source of referential-integrity constraints is weak entity sets. Recall from Entity Relationship Model that the relation schema for a weak entity set must include the primary key of the entity set on which the weak entity set depends. Thus, the relation schema for each weak entity set includes a foreign key that leads to a referential-integrity constraint.

Database Modification

Database modifications can cause violations of referential integrity. We list here the test that we must make for each type of database modification to preserve the following referential-integrity constraint:

Πα (r2) ⊆ ΠK (r1)

  • Insert. If a tuple t2 is inserted into r2, the system must ensure that there is a tuple t1 in r1 such that t1[K] = t2[α]. That is, t2[α] ∈ ΠK (r1)
  • Delete. If a tuple t1 is deleted from r1, the system must compute the set of tuples in r2 that reference t1: σα = t1[K] (r2)

If this set is not empty, either the delete command is rejected as an error, or the tuples that reference t1 must themselves be deleted. The latter solution may lead to cascading deletions, since tuples may reference tuples that reference t1, and so on.

  • Update. We must consider two cases for update: updates to the referencing relation (r2), and updates to the referenced relation (r1).
  • If a tuple t2 is updated in relation r2, and the update modifies values for the foreign key α, then a test similar to the insert case is made. Let t2' denote the new value of tuple t2. The system must ensure that
  • t2' [α] ∈ ΠK (r1)
  • If a tuple t1 is updated in r1, and the update modifies values for the primary key (K), then a test similar to the delete case is made. The system must compute
  • σα = t1[K] (r2) using the old value of t1 (the value before the update is applied). If this set is not empty, the update is rejected as an error, or the update is cascaded in a manner similar to delete.

Referential Integrity in SQL

Foreign keys can be specified as part of the SQL create table statement by using the foreign key clause.We illustrate foreign-key declarations by using the SQL DDL definition of part of our bank database, shown in Figure . By default, a foreign key references the primary key attributes of the referenced table. SQL also supports a version of the references clause where a list of attributes of the referenced relation can be specified explicitly. The specified list of attributes must be declared as a candidate key of the referenced relation.

We can use the following short form as part of an attribute definition to declare that the attribute forms a foreign key:

branch-name char(15) references branch

When a referential-integrity constraint is violated, the normal procedure is to reject the action that caused the violation. However, a foreign key clause can specify that if a delete or update action on the referenced relation violates the constraint, then, instead of rejecting the action, the system must take steps to change the tuple in the referencing relation to restore the constraint. Consider this definition of an integrity constraint on the relation account:

Because of the clause on delete cascade associated with the foreign-key declaration, if a delete of a tuple in branch results in this referential-integrity constraint being violated, the system does not reject the delete. Instead, the delete “cascades” to the account relation, deleting the tuple that refers to the branch that was deleted. Similarly, the system does not reject an update to a field referenced by the constraint if itviolates the constraint; instead, the system updates the field branch-name in the referencing tuples in account to the new value as well. SQL also allows the foreign key clause to specify actions other than cascade, if the constraint is violated: The referencing field (here, branch-name) can be set to null (by using set null in place of cascade), or to the default value for the domain (by using set default).

SQL data definition for part of the bank database.

If there is a chain of foreign-key dependencies across multiple relations, a deletion or update at one end of the chain can propagate across the entire chain. An interesting case where the foreign key constraint on a relation references the same relation appears in Exercise . If a cascading update or delete causes a constraint violation that cannot be handled by a further cascading operation, the system aborts the transaction.As a result, all the changes caused by the transaction and its cascading actions are undone.

Null values complicate the semantics of referential integrity constraints in SQL. Attributes of foreign keys are allowed to be null, provided that they have not other wise been declared to be non-null. If all the columns of a foreign key are non-null in a given tuple, the usual definition of foreign-key constraints is used for that tuple. If any of the foreign-key columns is null, the tuple is defined automatically to satisfy the constraint.

This definition may not always be the right choice, so SQL also provides constructs that allow you to change the behavior with null values; we do not discuss the constructs here. To avoid such complexity, it is best to ensure that all columns of a foreign key specification are declared to be non-null.

Transactions may consist of several steps, and integrity constraints may be violated temporarily after one step, but a later step may remove the violation. For instance, suppose we have a relation marriedperson with primary key name, and an attribute spouse, and suppose that spouse is a foreign key on marriedperson. That is, the constraint says that the spouse attributemust contain a name that is present in the person table.

Suppose we wish to note the fact that John and Mary are married to each other by inserting two tuples, one for John and one for Mary, in the above relation. The insertion of the first tuple would violate the foreign key constraint, regardless of which of the two tuples is inserted first. After the second tuple is inserted the foreign key constraint would hold again.

To handle such situations, integrity constraints are checked at the end of a transaction, and not at intermediate steps.1

Assertions

An assertion is a predicate expressing a condition that we wish the database always to satisfy. Domain constraints and referential-integrity constraints are special forms of assertions. We have paid substantial attention to these forms of assertion because they are easily tested and apply to a wide range of database applications. However, there are many constraints that we cannot express by using only these special forms.

Two examples of such constraints are:

  • The sum of all loan amounts for each branch must be less than the sum of all account balances at the branch.
  • Every loan has at least one customer who maintains an account with a minimum balance of $1000.00.

An assertion in SQL takes the form

create assertion <assertion-name> check <predicate>

Here is how the two examples of constraints can be written. Since SQL does not provide a “for all X, P(X)” construct (where P is a predicate), we are forced to implement the construct by the equivalent “not exists X such that not P(X)” construct, which can be written in SQL. We write

When an assertion is created, the system tests it for validity. If the assertion is valid, then any future modification to the database is allowed only if it does not cause that assertion to be violated. This testing may introduce a significant amount of overhead if complex assertions have been made. Hence, assertions should be used with great care. The high overhead of testing and maintaining assertions has led some systemdevelopers to omit support for general assertions, or to provide specialized forms of assertions that are easier to test.

Triggers

A trigger is a statement that the system executes automatically as a side effect of a modification to the database. To design a trigger mechanism, we must meet two requirements:

  1. Specify when a trigger is to be executed. This is broken up into an event that causes the trigger to be checked and a condition that must be satisfied for trigger execution to proceed.
  2. Specify the actions to be taken when the trigger executes. The above model of triggers is referred to as the event-condition-action model for triggers.

The database stores triggers just as if they were regular data, so that they are persistent and are accessible to all database operations. Once we enter a trigger into the database, the database system takes on the responsibility of executing it whenever the specified event occurs and the corresponding condition is satisfied.

Need for Triggers

Triggers are useful mechanisms for alerting humans or for starting certain tasks automatically when certain conditions are met. As an illustration, suppose that, instead of allowing negative account balances, the bank deals with overdrafts by setting the account balance to zero, and creating a loan in the amount of the overdraft. The bank gives this loan a loan number identical to the account number of the overdrawn account.

For this example, the condition for executing the trigger is an update to the account relation that results in a negative balance value. Suppose that Jones’ withdrawal of some money from an account made the account balance negative. Let t denote the account tuple with a negative balance value. The actions to be taken are:

  • Insert a new tuple s in the loan relation with
  • s[loan-number] = t[account-number]
    s[branch-name] = t[branch-name]
    s[amount] = −t[balance]
    (Note that, since t[balance] is negative, we negate t[balance] to get the loan
    amount—a positive number.)

  • Insert a new tuple u in the borrower relation with
  • u[customer-name] = “Jones”
    u[loan-number] = t[account-number]
  • Set t[balance] to 0.

As another example of the use of triggers, suppose a warehouse wishes to maintain a minimum inventory of each item; when the inventory level of an item falls below the minimum level, an order should be placed automatically. This is how the business rule can be implemented by triggers: On an update of the inventory level of an item, the trigger should compare the level with the minimum inventory level for the item, and if the level is at or below the minimum, a new order is added to an orders relation.

Note that trigger systems cannot usually perform updates outside the database, and hence in the inventory replenishment example, we cannot use a trigger to directly place an order in the external world. Instead, we add an order to the orders relation as in the inventory example. We must create a separate permanently running system process that periodically scans the orders relation and places orders. This system process would also note which tuples in the orders relation have been processed and when each order was placed. The process would also track deliveries of orders, and alert managers in case of exceptional conditions such as delays in deliveries.

Triggers in SQL

SQL-based database systems use triggers widely, although before SQL:1999 they were not part of the SQL standard. Unfortunately, each database system implemented its own syntax for triggers, leading to incompatibilities. We outline in Figure the SQL:1999 syntax for triggers (which is similar to the syntax in the IBM DB2 and Oracle database systems).


Example of SQL:1999 syntax for triggers.

This trigger definition specifies that the trigger is initiated after any update of the relation account is executed. An SQL update statement could update multiple tuples of the relation, and the for each row clause in the trigger code would then explicitly iterate over each updated row. The referencing new row as clause creates a variable nrow (called a transition variable), which stores the value of an updated row after the update.

The when statement specifies a condition, namely nrow.balance < 0. The system executes the rest of the trigger body only for tuples that satisfy the condition. The begin atomic . . . end clause serves to collect multiple SQL statements into a single compound statement. The two insert statements with the begin . . . end structure carry out the specific tasks of creating new tuples in the borrower and loan relations to represent the new loan. The update statement serves to set the account balance back to 0 from its earlier negative value.

The triggering event and actions can take many forms:

  • The triggering event can be insert or delete, instead of update.

For example, the action on delete of an account could be to check if the holders of the account have any remaining accounts, and if they do not, to delete them from the depositor relation. You can define this trigger as an exercise . As another example, if a new depositor is inserted, the triggered action could be to send a welcome letter to the depositor. Obviously a trigger cannot directly cause such an action outside the database, but could instead add a tuple to a relation storing addresses to which welcome letters need to be sent. A separate process would go over this table, and print out letters to be sent.

  • For updates, the trigger can specify columns whose update causes the trigger to execute. For instance if the first line of the overdraft trigger were replaced by create trigger overdraft-trigger after update of balance on account then the trigger would be executed only on updates to balance; updates to other attributes would not cause it to be executed.
  • The referencing old row as clause can be used to create a variable storing the old value of an updated or deleted row. The referencing new row as clause can be used with inserts in addition to updates.
  • Triggers can be activated before the event (insert/delete/update) instead of after the event.

Such triggers can serve as extra constraints that can prevent invalid updates. For instance, if we wish not to permit overdrafts, we can create a before trigger that rolls back the transaction if the new balance is negative.As another example, suppose the value in a phone number field of an inserted tuple is blank, which indicates absence of a phone number. We can define a trigger that replaces the value by the null value. The set statement can be used to carry out such modifications.

create trigger setnull-trigger before update on r
referencing new row as nrow
for each row
when nrow.phone-number = ’ ’
set nrow.phone-number = null

  • Instead of carrying out an action for each affected row, we can carry out a single action for the entire SQL statement that caused the insert/delete/update. To do so, we use the for each statement clause instead of the for each row The clauses referencing old table as or referencing new table as can then be used to refer to temporary tables (called transition tables) containing all the affected rows. Transition tables cannot be used with before triggers, but can be used with after triggers, regardless of whether they are statement triggers or row triggers.

A single SQL statement can then be used to carry out multiple actions on the basis of the transition tables. Returning to our warehouse inventory example, suppose we have the following relations:

• inventory(item, level), which notes the current amount (number/weight/volume) of the item in the warehouse

Example of trigger for reordering an item.

  • minlevel(item, level), which notes the minimum amount of the item to be maintained
  • reorder(item, amount), which notes the amount of the item to be ordered when its level falls below the minimum
  • orders(item, amount), which notes the amount of the item to be ordered.

We can then use the trigger shown in Figure for reordering the item. Note that we have been careful to place an order only when the amount falls from above the minimum level to below the minimum level. If we only check that the new value after an update is below the minimum level, we may place an order erroneously when the item has already been reordered.

Many database systems provide nonstandard trigger implementations, or implement only some of the trigger features. For instance, many database systems do not implement the before clause, and the keyword on is used instead of after. Theymay not implement the referencing clause. Instead, they may specify transition tables by using the keywords inserted or deleted. Figure 6.5 illustrates how the overdraft trigger would be written in MS-SQLServer. Read the user manual for the database system you use for more information about the trigger features it supports.

When Not to Use Triggers

There are many good uses for triggers, such as those we have just seen but some uses are best handled by alternative techniques. For example, in the past, system designers used triggers to maintain summary data. For instance, they used triggers on insert/delete/update of a employee relation containing salary and dept attributes to maintain the total salary of each department. However, many database systems today support materialized views , which provide a much easier way to maintain summary data. Designers also used triggers extensively for replicating databases; they used triggers on insert/delete/update of each relation torecord the changes in relations called change or delta relations. A separate process copied over the changes to the replica (copy) of the database, and the system executed the changes on the replica. Modern database systems, however, provide built-in facilities for database replication, making triggers unnecessary for replication in most cases.

Example of trigger in MS-SQL server syntax.

In fact, many trigger applications, including our example overdraft trigger, can be substituted by “encapsulation” features being introduced in SQL:1999. Encapsulation can be used to ensure that updates to the balance attribute of account are done only through a special procedure. That procedure would in turn check for negative balance, and carry out the actions of the overdraft trigger. Encapsulations can replace the reorder trigger in a similar manner.

Triggers should be written with great care, since a trigger error detected at run time causes the failure of the insert/delete/update statement that set off the trigger. Furthermore, the action of one trigger can set off another trigger. In the worst case, this could even lead to an infinite chain of triggering. For example, suppose an insert trigger on a relation has an action that causes another (new) insert on the same relation.

The insert action then triggers yet another insert action, and so on ad infinitum. Database systems typically limit the length of such chains of triggers (for example to 16 or 32), and consider longer chains of triggering an error. Triggers are occasionally called rules, or active rules, but should not be confused with Datalog rules , which are really view definitions.

Security and Authorization

The data stored in the database need protection from unauthorized access and malicious destruction or alteration, in addition to the protection against accidental introduction of inconsistency that integrity constraints provide. In this section, we examine the ways in which data may be misused or intentionally made inconsistent. We then present mechanisms to guard against such occurrences.

Security Violations

Among the forms of malicious access are:

  • Unauthorized reading of data (theft of information)
  • Unauthorized modification of data
  • Unauthorized destruction of data
    Database security refers to protection from malicious access. Absolute protection of the database from malicious abuse is not possible, but the cost to the perpetrator can be made high enough to deter most if not all attempts to access the database without proper authority. To protect the database, we must take security measures at several levels:
  • Database system. Some database-system users may be authorized to access only a limited portion of the database. Other users may be allowed to issue queries, but may be forbidden to modify the data. It is the responsibility of the database system to ensure that these authorization restrictions are not violated.
  • Operating system. No matter how secure the database system is, weakness in operating-system security may serve as a means of unauthorized access to the database.
  • Network. Since almost all database systems allow remote access through terminals or networks, software-level security within the network software is as important as physical security, both on the Internet and in private networks.
  • Physical. Sites with computer systems must be physically secured against armed or surreptitious entry by intruders.
  • Human. Users must be authorized carefully to reduce the chance of any user giving access to an intruder in exchange for a bribe or other favors. Security at all these levels must be maintained if database security is to be ensured. A weakness at a low level of security (physical or human) allows circumvention of strict high-level (database) security measures.

In the remainder of this section, we shall address security at the database-system level. Security at the physical and human levels, although important, is beyond the scope of this text.

Security within the operating system is implemented at several levels, ranging from passwords for access to the system to the isolation of concurrent processes running within the system. The file system also provides some degree of protection. The bibliographical notes reference coverage of these topics in operating-system texts.

Finally, network-level security has gained widespread recognition as the Internet has evolved from an academic research platform to the basis of international electronic commerce. The bibliographic notes list textbook coverage of the basic principles of network security. We shall present our discussion of security in terms of the relational-data model, although the concepts of this chapter are equally applicable to all data models.

Authorization

We may assign a user several forms of authorization on parts of the database. For example,

  • Read authorization allows reading, but not modification, of data.
  • Insert authorization allows insertion of new data, but not modification of existing data.
  • Update authorization allows modification, but not deletion, of data.
  • Delete authorization allows deletion of data.
  • We may assign the user all, none, or a combination of these types of authorization. In addition to these forms of authorization for access to data, we may grant a user authorization to modify the database schema:
  • Index authorization allows the creation and deletion of indices.
  • Resource authorization allows the creation of new relations.
  • Alteration authorization allows the addition or deletion of attributes in a relation.
  • Drop authorization allows the deletion of relations.

The drop and delete authorization differ in that delete authorization allows deletion of tuples only. If a user deletes all tuples of a relation, the relation still exists, but it is empty. If a relation is dropped, it no longer exists.

We regulate the ability to create new relations through resource authorization. A user with resource authorization who creates a new relation is given all privileges on that relation automatically. Index authorization may appear unnecessary, since the creation or deletion of an index does not alter data in relations. Rather, indices are a structure for performance enhancements.However, indices also consume space, and all database modifications are required to update indices. If index authorization were granted to all users, those who performed updates would be tempted to delete indices, whereas those who issued queries would be tempted to create numerous indices. To allow the database administrator to regulate the use of system resources, it is necessary to treat index creation as a privilege.

The ultimate form of authority is that given to the database administrator. The database administrator may authorize new users, restructure the database, and so on. This form of authorization is analogous to that of a superuser or operator for an operating system.

Authorization and Views

In Relational Model, we introduced the concept of views as a means of providing a user with a personalized model of the database. A view can hide data that a user does not need to see. The ability of views to hide data serves both to simplify usage of the system and to enhance security. Views simplify system usage because they restrict the user’s attention to the data of interest. Although a user may be denied direct access to a relation, that user may be allowed to access part of that relation through a view. Thus, a combination of relational-level security and view-level security limits a user’s access to precisely the data that the user needs.

In our banking example, consider a clerk who needs to know the names of all customers who have a loan at each branch. This clerk is not authorized to see information regarding specific loans that the customer may have. Thus, the clerk must be denied direct access to the loan relation. But, if she is to have access to the information needed, the clerk must be granted access to the view cust-loan, which consists of only the names of customers and the branches at which they have a loan. This view can be defined in SQL as follows:

create view cust-loan as

(select branch-name, customer-namefrom borrower, loan
where borrower.loan-number = loan.loan-number)

Suppose that the clerk issues the following SQL query:

select *from cust-loan

Clearly, the clerk is authorized to see the result of this query. However, when the query processor translates it into a query on the actual relations in the database, it produces a query on borrower and loan. Thus, the system must check authorization on the clerk’s query before it begins query processing.

Creation of a view does not require resource authorization. A user who creates a view does not necessarily receive all privileges on that view. She receives only those privileges that provide no additional authorization beyond those that she already had. For example, a user cannot be given update authorization on a view without having update authorization on the relations used to define the view. If a user creates a view on which no authorization can be granted, the system will deny the view creation request. In our cust-loan view example, the creator of the view must have read authorization on both the borrower and loan relations.

Granting of Privileges

A user who has been granted some form of authorization may be allowed to pass on this authorization to other users. However, we must be careful how authorization may be passed among users, to ensure that such authorization can be revoked at some future time.

Consider, as an example, the granting of update authorization on the loan relation of the bank database. Assume that, initially, the database administrator grants update authorization on loan to users U1, U2, and U3, who may in turn pass on this authorization to other users. The passing of authorization from one user to another can be represented by an authorization graph. The nodes of this graph are the users.

The graph includes an edge Ui → Uj if user Ui grants update authorization on loan to Uj . The root of the graph is the database administrator. In the sample graph in Figure , observe that user U5 is granted authorization by both U1 and U2; U4 is granted authorization by only U1.

A user has an authorization if and only if there is a path from the root of the authorization graph (namely, the node representing the database administrator) down to the node representing the user.

Suppose that the database administrator decides to revoke the authorization of user U1. Since U4 has authorization from U1, that authorization should be revoked as well. However, U5 was granted authorization by both U1 and U2. Since the database administrator did not revoke update authorization on loan from U2, U5 retains update authorization on loan. If U2 eventually revokes authorization from U5, then U5 loses the authorization.

A pair of devious users might attempt to defeat the rules for revocation of authorization by granting authorization to each other, as shown in Figure a. If the database administrator revokes authorization from U2, U2 retains authorization through U3, as in Figure b. If authorization is revoked subsequently from U3, U3 appears to retain authorization through U2, as in Figure c. However, when the database administrator revokes authorization from U3, the edges fromU3 to U2 and from U2 to U3 are no longer part of a path starting with the database administrator.

Authorization-grant graph.

Authorization-grant graph.

Attempt to defeat authorization revocation.

Attempt to defeat authorization revocation.

We require that all edges in an authorization graph be part of some path originating with the database administrator. The edges between U2 and U3 are deleted, and the resulting authorization graph is as in Figure .

Notion of Roles

Consider a bank where there are many tellers. Each teller must have the same types of authorizations to the same set of relations. Whenever a new teller is appointed, she will have to be given all these authorizations individually.

A better scheme would be to specify the authorizations that every teller is to be given, and to separately identify which database users are tellers. The system can use these two pieces of information to determine the authorizations of each person who is a teller. When a new person is hired as a teller, a user identifier must be allocated to him, and he must be identified as a teller. Individual permissions given to tellers need not be specified again.

The notion of roles captures this scheme. A set of roles is created in the database. Authorizations can be granted to roles, in exactly the same fashion as they are granted to individual users. Each database user is granted a set of roles (which may be empty) that he or she is authorized to perform.

Authorization graph.

Figure: Authorization graph.

In our bank database, examples of roles could include teller, branch-manager, auditor, and system-administrator. A less preferable alternative would be to create a teller userid, and permit each teller to connect to the database using the teller userid. The problem with this scheme is that it would not be possible to identify exactly which teller carried out a transaction, leading to security risks. The use of roles has the benefit of requiring users to connect to the database with their own userid.

Any authorization that can be granted to a user can be granted to a role. Roles are granted to users just as authorizations are. And like other authorizations, a user may also be granted authorization to grant a particular role to others. Thus, branch managers may be granted authorization to grant the teller role.

Audit Trails

Many secure database applications require an audit trail be maintained. An audit trail is a log of all changes (inserts/deletes/updates) to the database, along with information such as which user performed the change and when the change was performed. The audit trail aids security in several ways. For instance, if the balance on an account is found to be incorrect, the bank may wish to trace all the updates performed on the account, to find out incorrect (or fraudulent) updates, as well as the persons who carried out the updates. The bank could then also use the audit trail to trace all the updates performed by these persons, in order to find other incorrect or fraudulent updates.

It is possible to create an audit trail by defining appropriate triggers on relation updates (using system-defined variables that identify the user name and time). However, many database systems provide built-inmechanisms to create audit trails,which are much more convenient to use. Details of how to create audit trails vary across database systems, and you should refer the database system manuals for details.

Authorization in SQL

The SQL language offers a fairly powerful mechanism for defining authorizations. We describe these mechanisms, as well as their limitations, in this section.

Privileges in SQL

The SQL standard includes the privileges delete, insert, select, andupdate. The select privilege corresponds to the read privilege. SQL also includes a references privilege that permits a user/role to declare foreign keys when creating relations. If the relation to be created includes a foreign key that references attributes of another relation, the user/role must have been granted references privilege on those attributes. The reason that the references privilege is a useful feature is somewhat subtle; we explain the reason later in this section.

The SQL data-definition language includes commands to grant and revoke privileges. The grant statement is used to confer authorization. The basic form of this statement is:

grant <privilege list> on <relation name or view name> to <user/role list>

The privilege list allows the granting of several privileges in one command. The following grant statement grants users U1, U2, and U3 select authorization on the account relation: grant select on account to U1, U2, U3The update authorization may be given either on all attributes of the relation or on only some. If update authorization is included in a grant statement, the list of attributes on which update authorization is to be granted optionally appears in parentheses immediately after the update keyword. If the list of attributes is omitted, the update privilege will be granted on all attributes of the relation.

This grant statement gives users U1, U2, andU3 update authorization on the amount attribute of the loan relation:

grant update (amount) on loan to U1, U2, U3

The insert privilege may also specify a list of attributes; any inserts to the relation must specify only these attributes, and the system either gives each of the remaining attributes default values (if a default is defined for the attribute) or sets them to null. The SQL references privilege is granted on specific attributes in a manner like that for the update privilege. The following grant statement allows user U1 to create relations that reference the key branch-name of the branch relation as a foreign key:

grant references (branch-name) on branch to U1

Initially, it may appear that there is no reason ever to prevent users from creating foreign keys referencing another relation. However, that foreign key constraints restrict deletion and update operations on the eferenced relation.

In the preceding example, if U1 creates a foreign key in a relation r referencing the branch-name attribute of the branch relation, and then inserts a tuple into r pertaining to the Perryridge branch, it is no longer possible to delete the Perryridge branch from the branch relation without also modifying relation r. Thus, the definition of a foreign key by U1 restricts future activity by other users; therefore, there is a need for the references privilege.

The privilege all privileges can be used as a short form for all the allowable privileges. Similarly, the user name public refers to all current and future users of the system. SQL also includes a usage privilege that authorizes a user to use a specified domain (recall that a domain corresponds to the programming-language notion of a type, and may be user defined).

Roles

Roles can be created in SQL:1999 as follows

create role teller

Roles can then be granted privileges just as the users can, as illustrated in this statement:

grant select on account to teller

Roles can be asigned to the users, as well as to some other roles, as these statements show.

grant teller to john
create role manager
grant teller to manager
grant manager to mary

Thus the privileges of a user or a role consist of

  • All privileges dire

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Database system concepts Topics