registration
Sap Bo

Sap Bo

This course contains the basics of Sap Bo

Course introduction
Test Your Caliber
Interview Questions
Pragnya Meter Exam

Sap Bo

Resolving join problems in a schema 4

This describes the types of problems that can arise as you create joins between the tables in your schema. It explains how you can detect and resolve these join problems to ensure that the join paths taken by queries run on the universe return correct results

Here the variable $INSTALLDIR is the install root path for the data access files used by Designer and Web Intelligence. This is the Business Objects installation path with the operating system sub directory that contains the Designer executable and the data access drivers.

Under Windows$INSTALLDIR = ... Business Objects Business Objects Enterprise 12.0 < win32_x86. For example C: Program Files Business Objects Business Objects Enterprise 12.0 win32_x86.

What is a join path problem?

A join path is a series of joins that a query can use to access data in the tables linked by the joins.

Join path problems can arise from the limited way that lookup and fact tables are related in a relational database. The three major join path problems that you encounter when designing a schema are the following:

  • loops
  • chasm traps
  • fan traps

You can solve all these problems by creating aliases (a copy of a base table), contexts (a defined join path), and using features available in Designer to separate queries on measures or contexts.

This briefly defines lookup and fact tables, and describes the types of join path problems that you can encounter using these tables. It explains how you can use aliases, contexts, and other Designer features to resolve join path problems in your universe schema.

In Designer, you typically create joins between lookup tables and fact tables.

What is a Lookup Table

A lookup (or dimension) table contains information associated with a particular entity or subject. For example, a lookup table can hold geographical information on customers such as their names, telephone numbers as well as the cities and countries in which they reside. In Designer, dimension and detail objects are typically derived from lookup tables.

What is a Fact Table

A fact table contains statistical information about transactions. For example, it may contain figures such as Sales Revenue or Profit. In a universe, most but not all, measures are defined from fact tables.

What Types of Join Paths Return Incorrect Results?

Queries can return incorrect results due to the limitations in the way that joins are performed in relational databases. Depending on how the lookup and fact tables in your table schema are related, join paths can produce instances where a query returns too few, or too many rows. The following types of join paths can produce incorrect results:

Detecting and Solving Join Problems

Designer provides a number of methods for detecting and solving join problems.

You can use the following methods to detect and solve join path problems:

Most join path problems can be solved by creating an alias or implementing a context. You can use the automatic loop detection tools in Designer to identify loops in the schema, and automatic context detection to identify where Chasm traps occur. However, to resolve fan traps, you have to be able to visually analyze the schema and create aliases and if necessary contexts manually

Defining aliases

Aliases are references to existing tables in a schema. An Alias is a table that is an exact duplicate of the original table (base table), with a different name. The data in the table is exactly the same as the original table, but the different name "tricks" the SQL of a query to accept that you are using two different tables.

The Beach universe schema contains two alias tables; Resort_Country (the alias for the Country table) and Sponsor (the alias for the Customer table).

How are Aliases Used in a Schema?

You use aliases for two main reasons:

  • To use the table more than once in a query. This is the main reason for using aliases, and includes using aliases to solve loops and fan traps. The example Beach universe contains 2 aliases; Resort_Country for Country, and Sponsor for Customer.
  • To abbreviate the table name to save typing when writing freehand SQL.

Tip: Another possible use of aliases is to create an alias for each table as it is inserted into the schema. You then build the schema using the alias tables, not the original base tables. You place the base tables together away from the main universe structure. This allows you to give meaningful names to tables, and prevents the need to rebuild major sections of a universe structure should a base table need to be aliased at a later stage.

Using aliases to solve loops

The most common use of aliases in universe development is to solve potential loops in the use of common tables. A loop is a set of joins that defines a closed path through a set of tables in a schema. Loops occur when joins form multiple paths between lookup tables You use an alias to break a loop by providing alternative table for an original lookup table that is being used for multiple query paths.

Using aliases to solve fan traps

Aliases are also used to solve potential fan traps. These can occur in a serial one-to-many join path that can return inflated results when aggregates are summed at the "many" end of the joins.

Creating Aliases

You can create aliases manually, or let Designer automatically detect potential aliases that will solve a join path loop.

You need to create an alias manually to solve a fan trap. You also create aliases manually if you are creating a schema using only aliases and not the base tables.

Creating an alias manually

To create an alias manually:

  1. Click the table that you want to use to create an alias.
  2. Select Insert > Alias
  3. Or
    Click the Insert Alias button.

    The Creating an Alias box appears. It prompts you to enter a name for the new alias.

  4. Enter a new name for the aliased table, or keep the one proposed.
  5. Note: The name that you give to an alias should be relevant to the role of the alias to distinguish it from the base table. For example, Resort country is an alias for Country. Resort Country is used for queries returning data for resort countries, the base table Country is used in queries returning data for customer countries.

  6. Click OK.
  7. The aliased table appears in the Structure pane.
  8. Create any joins necessary between the alias and other tables in the schema.

Tip: To avoid confusing base tables with aliases, you can display the alias with the name of the base table it represents in the table title as follows: Select Tools > Options > Graphics, and then select the Aliased Name check box.

Renaming an alias

You can rename an alias at any time. Alias and table naming conventions are RDBMS dependent. You can rename an alias directly by renaming the table, or from a list of aliases in the universe.

Renaming an alias directly

To rename an alias directly:

  1. Click a table and select Edit > Rename Table.
  2. Or
    Right click a table and select Rename table from the contextual menu.

    The Rename Table dialog box appears.

  3. Type a new name in the Table Name box.
  4. The availability of the Owner and Qualification fields is database specific. If they are active, then you can modify these as necessary.

  5. Select the Upper case check box if you want the alias name to be shown as all uppercase.
  6. Or
    Select the Lower case check box if you want the alias name to be shown as all lowercase.
  7. Click OK.

Renaming an alias from a list

To rename an alias from a list:

  1. Select Tools > List of Aliases.
  2. The List of Aliases appears. It lists all the aliases in the active universe.
  3. Click an alias name in the list.
  4. Type a new name for the selected alias in the New Name text box.
  5. Click Apply.
  6. Click OK.

Deleting an alias

You delete an alias in the same way that you delete a table. If you have defined objects using the alias, you must modify these objects before you delete the alias, so that they use another table, or delete the objects if they are no longer necessary. If you do not modify or remove the objects using a deleted alias, queries using those objects will generate errors in Web Intelligence.

To delete an alias:

  1. Click an alias and select Edit > Clear.
  2. Or

    Right click an alias and select Clear from the contextual menu.

    Or

    Click an alias and press the DELETE key.

    If any objects use the alias, the following message appears:

    If no objects use the alias, you do not receive a confirmation box. The alias is deleted immediately.

  3. Click Yes.
  4. The alias is deleted from the Structure pane.

Defining contexts

Contexts are a collection of joins which provide a valid query path for Web Intelligence to generate SQL.

How are Contexts Used in a Schema?

You can use contexts in a universe schema for the following purposes:

  • Solving loops.
  • Solving chasm traps.
  • Assisting in some solutions for fan traps.
  • Assisting in detecting incompatibility for objects using aggregate awareness.

Using contexts to solve loops

The most common use of contexts is to separate two query paths, so that one query returns data for one fact table, and the other query returns data for another fact table. You use contexts to direct join paths in a schema which contains multiple fact tables. Aliases are not appropriate in such schema.

Using contexts to solve chasm and fan traps

Contexts are also used to solve potential chasm traps. These can occur when two many-to- one join paths converge on a single table. Multiple rows can be returned for a single dimension causing inflated results. Contexts can split out the query so that the correct number of rows are returned for the dimension. Contexts can also be used with aliases to solve fan traps.

Using contexts to determine AggregateAwareness incompatibility

You can use contexts to exclude objects that are not compatible with an object using the @AggregateAware function in its definition, from being used in a query with the aggregate aware object. .

Creating a Context

You can let Designer automatically detect contexts, or you can create contexts manually.

If you are using a context to resolve a loop or a chasm trap, you should always let Designer detect the contexts. However, for solving a fan trap (another join path problem), you may have to manually build a context.

Note: When you create one or more contexts, all joins must be included in one or multiple contexts. If a table is linked by a join that is not included in a context, the join will not be considered when a query is run.

The following procedures describe how you can create a context automatically an manually.

Creating a context automatically 

To create a context automatically

  1. Select Tools > Automated Detection >Detect Contexts.
  2. The Candidate Contexts box appears. It proposes candidate contexts for your schema. These candidate contexts may be necessary to solve either loops or a chasm trap, as chasm traps exist at the branch where two contexts meet.

  3. Click a context in the Candidate Contexts list and click the Add button.
  4. Repeat step 2 for each candidate context in the list.
  5. Note: Once you have added the candidate context to the Accepted Contexts list, you can rename a context as follows: Click a context and click the Rename button. An edit box appears. Type the new name and click OK.
  6. Click OK.
  7. The contexts are listed in the Contexts pane when List mode (View > List Mode) is active. The context for invoice Line is shown below.

  8. The context for Invoice_Line is shown below.

Creating a context manually

To create a context manually:

  1. Select Insert > Context.
  2. Or
    Click the Insert Context button.

    The New Context box appears.

  3. Type a name for the context in the Context Name text box.
  4. Select all the joins defining the context in the Current Context Joins list.
  5. You have the following options when creating the context:

  6. Click the Detect button to show the joins making up a suggested context with context name.
  7. Select the Show Selected Only check box to see only selected joins.
  8. Click the Check button.
  9. Designer checks the selected joins for any loops.

  10. Type a description of the data the context returns. This is the help text that a Web Intelligence user sees when they run a query that takes the context path. This text should be useful to the end user.
  11. Click OK.
  12. The context is created.

Editing a context

You can use a context editor to modify the following properties of a context:

  • Name
  • Joins included in the context
  • Description
  • You can also check the context for any unresolved loops.

Editing context properties

To edit context properties:

  1. Select View > List Mode.
  2. The List pane appears above the Structure pane. It contains list boxes for all the
    tables, joins, and contexts in the Structure pane.

  3. Double click a context name in the Contexts list pane.
  4. The Edit Context box appears.

  5. Type a new name in the Context Name box if you want to change the context name.
  6. Click a highlighted join to remove it from the context.
  7. Or
    Click a join that is not highlighted to add it to the context.
  8. Type a description for the context.
  9. Click OK.
  10. The modifications appear in the context.

Deleting a context

You can delete a context at any time from the Context list in the List pane. If you are adding or deleting a table or join within a context, you should delete the context before making the modification to the table or join.

Once the modification is complete, you can either manually recreate the context if it is being used to solve a chasm trap, or use Detect Contexts to automatically detect a new context if it is being used to resolve a loop.

Deleting a context from the Context list

To delete a context from the context list:

  1. Ensure that List mode is active (Select View > List Mode).
  2. Right click a context name in the Contexts list box and select Clear from the contextual menu.
  3. Or
    Click a context name in the Context list box and select Edit > Clear.

    The context is removed from the list.

Updating contexts

Contexts are not updated automatically when the universe structure is changed. If you add or remove any tables to the structure, or if you add or remove any joins, you must update all the contexts.

If you have made only a simple change to the structure, you can update the joins that are included in each context manually using either the Edit Context box or the List pane. However, if you have made significant changes to the universe structure, you should delete the current contexts and re-create them.

Join Paths that Prevent Context Detection

A one-to one-cardinality positioned at the end of a join path can prevent Context Detection in Designer from detecting a context. You resolve this problem by changing the cardinality of the table at the end of the join path to one-to-many.

Example: One-to-one cardinality preventing context detection

The schema below shows a table Sales_Extra_Info that contains particular information about each sale. It is joined by a one-to-one join to the Sales table.

When you visually examine the join paths, there are clearly two contexts in this schema; a reservations context, and a sales context. However, when you automatically detect contexts on this type of join path (Tools > Automated Detection >Detect Contexts), you receive the following message:

Designer has not considered the one-to-one join at the end of the join path in the context detection, so does not consider that there are two contexts.

Changing cardinality to allow the context detection

You solve this problem by setting the cardinality of the join linking Sale_Extra_Info to Sales to one-to-many. It can also be many-to-one, the important factor is not to have the one-to -one join at the end of the join path.The schema below now has a one-to-many join at the end of the join path.

When you run Detect Contexts, the two contexts are detected as shown below:

How do Contexts Affect Queries?

Depending on how you allow Web Intelligence users to use the objects defined on schema structures, contexts can lead to three types of queries being run:

  • Ambiguous queries
  • Inferred queries
  • Incompatible queries

You can run these types of queries in Web Intelligence to test the SQL generated by the contexts. If any of these query types produces an error, or returns incorrect data, you need to analyze the concerned join paths.

Ambiguous queries

An end user is prompted to choose between one query path or another. This occurs when a query includes objects that when used together do not give enough information to determine one context or the other.

When a query is ambiguous, Web Intelligence displays a dialog box that prompts the user to select one of two contexts. When the user selects a context, the corresponding tables and joins are inserted into the SQL query.

Example: Running an ambiguous query

A Web Intelligence user runs the following query:

Give me the services used by each age group of visitors for each resort:

When the query is run, a dialog box appears asking the user to choose a context, in this case either the Reservations or Sales context:

The user must choose if they want information for services reserved by age group, or services paid by age group. If they select the Reservations context, the following SQL is generated:

SELECT Service.service, Age_group.age_range, Resort.re
sort FROM Service, Age_group, Resort, Customer, Reserva
tions, Reservation_Line, Service_Line WHERE ( Resort.re
sort_id=Service_Line.resort_id ) AND ( Ser
vice.sl_id=Service_Line.sl_id ) AND ( Customer.age
between Age_group.age_min and Age_group.age_max ) AND
( Customer.cust_id=Reservations.cust_id ) AND (
Reservation_Line.res_id=Reservations.res_id ) AND (
Reservation_Line.service_id=Service.service_id )

The joins referenced by the other context (Sales) do not appear in the SQL.

Inferred queries

A Web Intelligence query is run without prompting an end user to choose a context. The query contains enough information for the correct context to be inferred. For example, a user runs the following query:

Give me the number of future guests by age group for each available service:

When the query is run, the data is returned without prompting the user to select a context. The Future Guests object is a sum on the Reservation_Line table, which is part of the Reservations context. Web Intelligence infers that the Reservation context is the one to use for the query.

Incompatible queries

Objects from two different contexts are combined in a query. The two Select statements are synchronized to display returned data in separate tables.

Example: Running an incompatible query

A Web Intelligence user runs the following query:

Give me the total number of guests company wide by age group and the months that reservations were made.

When the query is run, no prompt appears as Web Intelligence infers the use of both the Sales and Reservations contexts. The Select statements for both contexts are synchronized as follows:

The query is split into two parts:

  • Age Group and Number of Guests
  • Reservation Month

When retrieving the results of the two queries, Web Intelligence combines the results (using Age Group). It then displays the results in two tables in the same report as follows

.

To allow incompatible queries to be run in Web Intelligence, you must select the Multiple SQL statements in Designer for each context option. This is described below.

Selecting Multiple SQL statements for each context

To select Multiple SQL statements for each context:

  1. Select File > Parameters.
  2. The Universe Parameters dialog box appears.

  3. Click the SQL tab.
  4. The SQL page appears.

  5. Select the Multiple SQL statements for each context check box.
  6. Click OK.

Resolving loops

In a relational database schema, a common type of join path that returns too few rows is called a loop.

What is a Loop?

A loop is a set of joins that defines a closed path through a set of tables in a schema. Loops occur when joins form multiple paths between lookup tables. An example of a loop is shown below.

The schema contains two linked sets of information:

These two sets of information are linked in a common join path forming a loop. The lookup table Country can be the country where a resort is situated, or the country in which a customer lives.

Why loops in a universe schema and not in the database?

In a database, multiple paths between tables may be valid and implemented to meet specific user requirements. When each path is included individually in a query it returns a distinct set of results.

However, the schema that you design in Designer often needs to allow queries that include more than one path, which a relational database may not be designed to handle, so the information returned can be incorrect.

The rows that are returned are an intersection of the results for each path, so fewer rows are returned than expected. It is also often difficult to determine the problem when you examine the results.

How Does a Loop Affect Queries?

If you created a universe based on the above structure, any query run against the tables in the loop would return only results where the country values for resorts and the country values for customer origin are equivalent. This double restriction on the shared lookup Country table returns fewer rows than expected.

Example: Loop returns incorrect results

You create the following objects using the schema that contains the above loop:

You run the following query in Web Intelligence:

For each resort country, give me the number of guests from each country that stay at each resort.

You would expect the following type of result:

For the resorts in France and the US, you have the number of German, Japanese, and US visitors staying in resorts in those countries.

However, when you run the query using the universe containing the loop, you receive the following results:

This suggests that only visitors from the US stayed in resorts in the US. No other visitors came from any other country.

What is the loop doing to the query?

The joins in the Structure are used to create the Where clause in the inferred SQL of a query. The purpose of the joins is to restrict the data that is returned by the query. In a loop, the joins apply more restrictions than you anticipate, and the data returned is incorrect.

The Where clause created by the loop is shown below:

WHERE ( Country.country_id=Resort.country_id ) AND
( Resort.resort_id=Service_Line.resort_id ) AND ( Ser
vice_Line.sl_id=Service.sl_id ) AND ( Service.ser
vice_id=Invoice_Line.service_id ) AND (
Sales.inv_id=Invoice_Line.inv_id ) AND ( Cus
tomer.cust_id=Sales.cust_id ) AND ( City.city_id=Cus
tomer.city_id ) AND ( Region.region_id=City.region_id
) AND ( Country.country_id=Region.country_id ) AND
( Service_Line.service_line = 'Accommodation' )

The following two joins are both applying a restriction to the Country table:

  • Country.country_id=Resort.country_id
  • Country.country_id=Region.country_id

Country is serving two purposes:

  • Lookup for the resort country.
  • Lookup for the customer country of origin.

This creates a restriction so that data is returned only when the resort country is the same as the customer country. The resulting report shows only the number of visitors from the US who visited resorts in the US.

Depending on the nature of the loop, you can resolve the loop in Designer using either an alias to break the join path, or a context to separate the two join paths so that a query can only take one path or the other.

How does an alias break a loop?

An alias breaks a loop by using the same table twice in the same query for a different purpose. The alias is identical to the base table with a different name. The data in the alias is exactly the same as the original table, but the different name "tricks" SQL into accepting that you are using two different tables.

Note: You can resolve the loop satisfactorily by creating only one alias table in the example we have been using. The Region join uses the original Country table, while the Showroom join uses the alias table. However, you could create a separate alias table for each join in the original table. In some relational database systems, this is necessary.

Example: Breaking a loop with an alias

The schema below is the same schema that contained the loop above. It shows a join path in which the Country lookup table receives only the "one" ends of two joins, so it can be used for the following two purposes in the join path:

  • Countries for resorts
  • Countries for customers

You create an alias for Country and rename it Country_Region. The two "one" ended joins are now separated as follows:

  • Country keeps a join to the Resort table.
  • Country_Region is joined to the Region table.

The schema now appears as shown below:

When you run the same query that produced too few rows in the previous example:

For each resort country, give me the number of guests from each country that stay at each resort.

The Where clause for this query is now:

WHERE ( City.city_id=Customer.city_id ) AND (
City.region_id=Region.region_id ) AND ( Country.coun
try_id=Region.country_id ) AND ( Resort_Country.coun
try_id=Resort.country_id ) AND ( Cus
tomer.cust_id=Sales.cust_id ) AND ( In
voice_Line.inv_id=Sales.inv_id ) AND ( In
voice_Line.service_id=Service.service_id ) AND ( Re
sort.resort_id=Service_Line.resort_id ) AND ( Ser
vice.sl_id=Service_Line.sl_id ) AND ( Service_Line.ser
vice_line = 'Accommodation' )

There is now one join applying a restriction on the Country table and another join applying a restriction on the Resort_Country table. The loop has been broken.

When the query is run, the following table is returned:

How does a context resolve a loop?

A context resolves a loop by defining a set of joins that specify one specific path through tables in a loop. It ensures that joins are not included from different paths within the same SQL query.

You often use contexts in schema that contain multiple fact tables ("multiple stars") that share lookup tables.

Example:Resolving a loop with a context

The schema below contains statistical information about sales and reservations. The statistics relating to each type of transaction are stored in the fact tables Sales and Reservations. The schema contains a loop as a join path can follow the sales path or the reservations path to get service information.

If you created an alias for the Customer so that you had a Customer to Reservation join and a Customer_Sales to Sales join, you break the loop, but if you want to add a City table to the schema, you end up with a loop again as shown below:

You must continue creating aliases for each new table you add to the schema. This is difficult to maintain, and also ends up proliferating the number of similar objects using each table in the universe.

The only way to resolve this loop is to leave the loop in place, and create a context that specifies one or the other path around the schema. This ensures that queries answer questions for one transaction or the other, such as: Is the customer information needed from the perspective of sales or reservations

In the example, you can follow two different paths from the Customer table to the Service table:

The Reservation_Line context appears :

The Sales_Line context appears:

You then create different sets of objects from the tables in the different contexts. Users can then run either Reservation queries or Sales queries, depending on the objects they select.

Visually Identifying Loops

You can use the following guidelines to help you analyze your schema to determine whether an alias or context is appropriate for resolving loops.These can be useful to understand your schema, but you should use Detect Aliases and Detect Contexts to formally identify and resolve loops.

Automatically Identifying and Resolving Loops

You can use Designer to automatically detect loops and propose candidate aliases and contexts that you can insert in your schema to resolve the loops.

Cardinalities must be set before detecting loops

Before using the automatic loop detection and resolution features, all cardinalities must be set for all joins in the schema.

It is good design practise to either define cardinalities manually, or manually validate each cardinality that Designer proposes when using the automatic routine.

You can set cardinalities in two ways:

  • Manually.
  • Use Detect Cardinalities.

Designer Features to Detect and Resolve loops

You can use the following features in Designer to identify and resolve loops:

General method for identifying and resolving loops

A general procedure for detecting and resolving loops is given below. The sections that describe the step in detail are also given.

  1. Verify that all cardinalities are set.
  2. Run Detect Aliases to identify if your schema needs an alias to solve any loops.
  3. Insert the candidate aliases proposed by Detect Aliases.
  4. Run Detect Contexts to identify if your schema needs a context to solve a loop that could not be solved with an alias only.
  5. Implement the candidate contexts proposed by Detect Contexts.
  6. Test the resolved loop by creating objects and running queries.

Note: If you are resolving loops for a schema that already has objects defined on the tables, then you must redefine any objects that now use an alias and not the base table.

Detecting and creating an alias

You can use Detect Aliases, to automatically detect and indicate the tables causing loops in the active universe. Detect Aliases proposes candidate tables that you can edit, and insert in the schema.

Note: Before using Detect Aliases, verify that all the tables in schema are linked by joins, and that all cardinalities are set.

To detect and create an alias:

  1. Select Tools > Automated Detection > Deetct Aliases.
  2. Or
    Click the Detect Aliases button.

    The "Candidate Alias" dialog box appears. The left pane lists the table or tables that need an alias. The right pane lists proposed aliases that can be inserted to break the loop.

  3. Select a table in the left pane.
  4. A suggested name for the candidate alias is listed in the right pane.

  5. If you want to rename the proposed alias, click Rename and enter a new name in the "Rename" box.
  6. Click Create.
  7. A message box prompts you to confirm the creation of the alias.

  8. Click OK.
  9. The alias appear in the Structure pane.

  10. Repeat steps 2 to 5 for any remaining tables.
  11. Click Close.

Detecting and creating multiple aliases

Sometimes when you create an alias, you need to create additional aliases to accommodate new join paths. When using Detect Alias, if Designer detects the need for further aliases, the following dialog box appears when you click the Create button.

In such a situation, two options are available to you:

  • You can accept that only the first table proposed will be aliased.
  • You can alias all the tables listed.

Detecting and creating a context

You can use Detect Contexts to automatically detect the need for a context.

Detect Contexts also proposes a candidate context. You can edit the candidate context before it is implemented.

To detect and create a context:

  1. Select Tools > Automated Detection > Detect Contexts.
  2. Or
    Click the Detect Contexts button.

    The Candidate Contexts dialog box appears. The proposed contexts appear in the left pane.

  3. Click a context name.
  4. The tables included in the candidate context are highlighted in the schema.

  5. Click the Add button.
  6. The context name appears in the Accepted Contexts pane. You can remove any context from the right pane by selecting it, and then clicking the Remove button.

  7. Repeat steps 3 and 4, if applicable, to add the other contexts.
  8. If you want to rename a context, select it from the right pane, and then click the Rename button.
  9. The Rename Context dialog box appears. Type a new name.

  10. Click the OK button.

The contexts are listed in the Contexts box in the Universe window.

Note:

If your universe contains a loop that could be ambiguous for a user, you should always give a name to the context resolving the loop that is easy for users to understand. It should be clear to a Web Intelligence user what information path is represented by a context

Automatically detecting loops

You can detect loops in your universe using Detect Loops. This is a feature that automatically checks for loops in the schema, and proposes either an alias or context to solve the loop.

Detect Loops is useful to run quick checks for loops in the schema. It also proposes aliases and contexts to resolve detected loops; however, you have less control over the order that the alias and contexts are created than if you used Detect Aliases and Detect Contexts to resolve a loop.

Note: You can also use Check Integrity to automatically check for errors in universe structures, including joins, cardinalities, and loops. Check Integrity proposes solutions to any errors it discovers.

To detect loops in a schema:

  1. Verify that you have set cardinalities for all joins in the schema.
  2. Select Tools > Automated Detection > Detect Loops.
  3. Or
    Click the Detect Loops button.

    The Loop Detection box appears. It indicates how many loops have been detected and proposes a possible solution.

    The detected join path that forms a loop is simultaneously highlighted in the Structure pane as follows:

  4. Click the forward button to display the next loop and proposed solution.
  5. For each loop that Designer detects, the join path is highlighted in the structure pane.
  6. Click Close.

Creating aliases and contexts automatically

Designer proposes a candidate alias or a context to resolve a loop when you run Detect Loop. You can choose to insert the candidate alias or implement the candidate context directly from the Detect Loops box.

To create an alias using Detect Loop:

  1. Select Tools > Automated Detection > Detect Loops.
  2. The Detect Loops box appears. It indicates one or more loops detected in the schema, and proposes a candidate alias or context for each loop.

  3. Click the forward arrow button until the following message appears for a detected loop:
    This loop can be resolved with an alias.
  4. Click the Insert Alias button.

An alias is automatically inserted in the Structure pane. It is joined to the table that table that is causing the loop in the schema.

Creating a context using Detect Loop To create a context using Detect Loops:

  1. Select Tools > Automated Detection > Detect Loops.
  2. The Detect Loops box appears. It indicates one or more loops detected in the schema, and proposes a candidate alias or context for each loop.

  3. Click the forward arrow button until the following message appears for a detected loop:
  4. This loop is not covered by any context.

  5. Click the Candidate context button.
  6. The Candidate Contexts dialog box appears.

  7. Click a context name.
  8. The tables included in the candidate context are highlighted in the schema.

  9. Click the Add button.
  10. The context name appears in the Accepted Contexts pane. You can remove any< context from the right pane by selecting it, and then clicking the Remove button.

  11. Repeat steps 3 and 4, if applicable, to add the other contexts.
  12. Click OK.
  13. A context confirmation box appears.

  14. Click Close.
  15. The contexts are listed in the Contexts box in the Universe window.

Examples of Resolving Loops

The following are worked examples showing you how to do the following:

  • Creating an alias to break a loop caused by shared lookup tables
  • Creating an alias to break a loop caused by shared lookup tables
  • Determining when an alias is not appropriate to break a loop
  • Creating a context to resolve a loop
  • Using an alias and context together to resolve a loop

These schemas are not based on the Beach universe. They use a schema based on a Shipping company and show another perspective of certain loop resolution.

Creating an alias to break a loop caused by shared lookup tables

A sales database holds information about products sold to customers on a worldwide basis. These customers can:

  • Reside anywhere in the world
  • Order products from the company
  • Request that these products be shipped to a destination in any country

For example, a customer residing in the UK can order a vehicle and then ask for it to be shipped to Brazil.

The schema for this type of database is as follows:

You can interpret this schema as follows:

  • Each customer comes from one country.
  • Each customer can place one or more orders for a product.
  • The company ships each product ordered to a destination country, which may not necessarily be the same as the customer's country of residence.

The tables and their columns are shown below:

You run a query to obtain the following information:

  • Names of customers
  • Customer's country of residence
  • Dates of each order
  • Destination country of the shipment

The SQL to extract this data is as follows:

SELECT CUSTOMERS.LAST_NAME, COUNTRY.COUNTRY, OR
DERS.ORDER_ID, ORDERS.ORDER_DATE, COUNTRY.COUNTRY
FROM CUSTOMERS, ORDERS, COUNTRY WHERE
(CUSTOMERS.CUST_ID=ORDERS.CUST_ID) AND
(ORDERS.SHIP_COUNTRY=COUNTRY.COUNTRY_ID) AND
(CUSTOMER.LOC_COUNTRY=COUNTRY.COUNTRY_ID)

When executed, this SQL returns incomplete results; only those customers who requested a shipment to their country of residence are returned. The customers who chose another country for shipment are not returned.

The returned rows are an intersection of both the customer's country of residence and the destination country of the shipment. Instead of generating the full results shown below

The SQL returns only these results:

You can break the loop by inserting an alias. The first step in creating an alias is to identify the lookup table having more than one purpose in the database structure.

Identifying multi-purpose lookup tables

The COUNTRY table is used to look up both the customer's country of residence and the shipment destination. This type of table is called a shared lookup table.

You create an alias in the schema called DESTINATION.

The three original joins still exist but the loop has been broken by the DESTINATION alias so there is no longer a closed join path.

Referencing the shared lookup table and alias in the FROM clause

You now need to reference the table name twice in the From clause, the first time with its ordinary name and the second time with an alias; so the original name is suffixed with an alternative name.

The resulting SQL is as follows:

SELECT CUSTOMER.NAME, COUNTRY.NAME, ORDERS.OR
DER_DATE DESTINATION.NAME FROM CUSTOMER, ORDERS,
COUNTRY, COUNTRY DESTINATION WHERE (CUS
TOMER.CUST_ID=ORDERS.CUST_ID) AND (ORDERS.SHIP_DEST_ID=
DESTINATION.COUNTRY_ID) AND (CUSTOMER.CUST_LOC_ID=COUN
TRY.COUNTRY_ID)

Creating an alias to break a loop caused by shared lookup tables

A sales database holds information about customers living in different countries. These customers can place orders for goods that can be delivered by a number of couriers or shipping companies.

In this database, the names of the countries and shippers have been normalized into lookup tables. Normalization is a process that refines the relationships of tables by removing redundancies.

For structural reasons, rather than two lookup tables, only one lookup table (SYSLOOKUPS) was created with a code, description and type field. The type field indicates the particular type of information the record holds; for example, country or shipper.

Referred to as a "flexible lookup," this type of table often appears in schemas automatically generated by CASE tools.

The schema and table layout are shown below:

The SYSLOOKUPS table serves more than one purpose so you have to create as many aliases as the table has domains (distinct values for the type field). Based on the two purposes that are represented in the SYSLOOKUPS table, you can create two aliases, COUNTRY and SHIPPERS.

The resulting schema is shown below:

In Designer, you create the object Customer's Country defined as COUNTRY.DESCRIPTION and the object Shipper defined as SHIPPERS.DE SCRIPTION.

The corresponding joins would be:

CUSTOMERS.LOC_COUNTRY=COUNTRY.CODE
ORDERS.SHIP_ID=SHIPPERS.CODE

Using self restricting joins to restrict results

Once you have defined the objects, you now need to restrict each alias so that it returns only its own domain information and not that of the others.

For example, if you wanted to know the names of the shippers who dispatched two orders to customer 101, you would expect two rows to be returned.

However, the following SQL

SELECT ORDERS.ORDER_ID, ORDERS.CUST_ID, ORDERS.OR
DER_DATE, SHIPPERS.DESCRIPTION SHIPPER FROM ORDERS,
SYSLOOKUPS SHIPPERS WHERE (ORDERS.SHIP_ID=SHIP
PERS.CODE)

would produce the results below:

The query has returned the names of countries and shippers. Both "Man With a Van" and "USA" share code 1 while "France" and "Parcel Fun" share code 3.

You can correct the error as follows:

  • Apply a new self restricting join to the SHIPPERS alias. In the Edit Join dialog box, you
    set both Table1 and Table2 to SHIPPERS and enter the SQL expression
    SHIPPERS.TYPE='SHIP'.
  • Apply a new self restricting join to the COUNTRY alias. In the Edit Join dialog box, you set both Table1 and Table2 to COUNTRY and enter the SQL expression COUNTRY.TYPE='CTRY'.

Problems using restrictions

When you add the restriction to either the object's Where clause or to the existing join between the alias and the CUSTOMERS/ORDERS table, this can produce the following problems:

  • When you add the restriction to the Where clause of an object, you must also add the same restriction to every object built from the alias. If you are creating a number of objects on an alias that has many columns, you could have problems maintaining the universe.
  • The restriction to the join between the alias and another table only takes effect when the join is invoked. If you run a simple query containing only the Shipper object, every row in the SHIPPERS alias (including the unwanted Country rows) is returned as there is no reason to include the ORDERS table. As the join is not seen as necessary, the restriction is not applied.

Determining when an alias is not appropriate to break a loop

Creating an alias to resolve the loop described above is not the optimal solution. In this case, the use of contexts is a better solution. The following example describes why aliases are not appropriate, and why contexts are a better solution in this case.

If you try to identify the lookup table used for more than one purpose, it is not clear if it is the PRODUCTS table, or the CUSTOMERS table.

If you decide to create two aliases for the PRODUCTS table as shown :

The two aliases are ORDERED_PRODUCTS and LOANED_PRODUCTS.

This could be confusing for users as they are more likely to understand products, and not ordered products or loaned products.

If you also decide to add a COUNTRY table to indicate that the products are manufactured in several different countries you would have to join it directly to the PRODUCTS table.

The resulting schema would be as follows:

In the schema above, it was necessary to create two new aliases, ORDERED_ PRODUCTS _COUNTRY and LOANED _PRODUCTS _COUNTRY. The use of aliases is obviously an unsatisfactory and complicated solution for this particular schema.

In this case, you should create contexts.

Creating a context to resolve a loop

A database contains information about customers who can either buy or rent products. In this database, there are two different ways to present the relationship between the customers and the products:

  • By products that have been ordered by (or sold to) customers.
  • By products that have been rented to customers.

This database has the following type of schema:

If we wanted to run a query that returns only a list of customer names and a list of products, we could use the ORDER and ORDER_LINES table. The result would be a list of products ordered by each customer.

By using the LOANS and LOAN_LINES tables, we would obtain a list of products rented by each customer.

This schema contains a loop that causes any query involving all six joins simultaneously to result in a list made up of both products sold and rented to customers. If a product has been sold, but never rented to a customer or vice-versa, it would not appear in the list of results.

Using an alias and context together to resolve a loop

You can use contexts and aliases to resolve loops in a universe. The following example shows how to use both aliases and contexts together in a loop resolution.

A universe has the following schema:

You can use aliases and contexts to resolve the loops as follows:

  • Create two aliases for the COUNTRY table: CUST_COUNTRY and PROD_COUNTRY
  • Define two contexts to resolve the CUSTOMERS to PRODUCTS loops (Orders and Loans)
  • Ensure that the two joins between CUSTOMERS and CUST_COUNTRY and PRODUCTS and PROD_COUNTRY appear in both contexts.

The resulting schema appears as follows:

Resolving Chasm Traps

A chasm trap is a common problem in relational database schemas in which a join path returns more data than expected.

What is a Chasm Trap?

A chasm trap is a type of join path between three tables when two "many-to-one" joins converge on a single table, and there is no context in place that separates the converging join paths.

The example below shows a part of the Beach universe schema. The three tables have been separated from the rest of the schema to illustrate the chasm trap. It uses the same Club connection for data. The Service table receives the one ends of two one-to-many joins.

You will get incorrect results only when all the following conditions exist:

  • A "many to one to many relationship" exists among three tables in the universe
    structure.
  • The query includes objects based on two tables both at the "many" end of their
    respective joins.
  • There are multiple rows returned for a single dimension.

The following is an example that shows how queries that are run when all the above conditions exist return a Cartesian product.

Example: A chasm trap inflates results without warning

Using the schema above, a Web Intelligence user runs the following separate queries:

The user now runs a query that includes both paid guests and future guests:

The following results are returned:

The number of guests that have used, and future guests who have reserved to use the Sports service has increased considerably. A Cartesian product has been returned and the results are incorrect. This can be a serious problem if undetected. The above example could lead a manager at Island Resorts to think that sporting activities at the resorts are a more attractive service to guests, than the actual figures would indicate.

How does a chasm trap inflate results?

The chasm trap causes a query to return every possible combination of rows for one measure with every possible combination of rows for the other measure. In the example above, the following has occurred:

  • Number of guests transactions *Number of future guest transactions
  • Number of future guest transactions*Number of guests transactions

The following example examines in detail how a chasm trap returns a Cartesian product:

Example: Examining the Cartesian product of a chasm trap.

We need to examine the rows that are returned by the queries to make the aggregated figures. In our example, we can do this by adding the dimensions Days Billed and Days Reserved to the queries to return individual transaction details.

The Number of Guests report appears as follows:

The Number of Future Guests report appears as follows:

The two reports show the following number of transactions:

  • Number of Guests = 3 transactions
  • Number of Future Guests = 2 transactions

When the two dimensions are both added to the query, the following results are returned:

The query returns every possible combination of Number of Guests rows with every possible combination of Number of Future Guests rows: the Number of Guests transactions each appears twice, and the Number of Future Guests transactions each appears three times.

When a sum is made on the returned data, the summed results are incorrect.

Unlike loops, chasm traps are not detected automatically by Designer, however, you can use Detect Contexts (Tools > Detect Contexts) to automatically detect and propose candidate contexts in your schema.

Detect Contexts examines the many to one joins in the schema. It picks up the table that receives converging many to one joins and proposes contexts to separate the queries run on the table. This is the most effective way to ensure that your schema does not have a chasm trap.

You can also detect chasm traps graphically by analyzing the one-to-many join paths in your schema.

If you do not run Detect Contexts, nor spot the chasm trap in the schema, the only way to see the problem is to look at the detail rows. Otherwise there is nothing to alert you to the situation.

Detecting a Chasm Trap

You can find chasm traps by using Detect Contexts to detect and propose candidate contexts, and then examining the table where any two contexts diverge. This point where two contexts intersect is the source of a chasm trap.

If you have two fact tables with many to one joins converging to a single lookup table, then you have a potential chasm trap.

Resolving a Chasm Trap

To resolve a chasm trap you need to make two separate queries and then combine the results. Depending on the type of objects defined for the fact tables, and the type of end user environment, you can use the following methods to resolve a chasm trap:

  • Create a context for each fact table. This solution works in all cases.
  • Modify the SQL parameters for the universe so you can generate separate SQL queries for each measure. This solution only works for measure objects. It does not generate separate queries for dimension or detail objects.

Each of these methods is described in the following sections.

Using contexts to resolve chasm traps

You can define a context for each table at the "many" end of the joins. In our example you could define a context from SERVICE to RESERVATION_LINE and from SERVICE to INVOICE _ LINE.

When you run a query which includes objects from both contexts, this creates two Select statements that are synchronized to produce two separate tables in Web Intelligence, avoiding the creation of a Cartesian product.

When do you use contexts?

Creating contexts will always solve a chasm trap in a universe. When you have dimension objects in one or both fact tables, you should always use a context.

Using contexts to solve a chasm trap

To use contexts to resolve a chasm trap:

  1. Identify the potential chasm trap by analyzing the "one-to-many-to-one" join path relations in the schema.
  2. Select Tools > Detect Contexts.
    The Candidate Contexts box appears.
  3. Select a proposed context in the Candidate Contexts list box and click the Add button to add it to the Accept Contexts list box.
  4. Repeat for other listed contexts.
  5. The new contexts are listed in the Contexts pane of the List View bar.
  6. Select File > Parameters.
  7. The Universe Parameters dialog box appears.
  8. Click the SQL tab.
  9. The SQL page appears.
  10. Select the Multiple SQL statements for each Context check box.
  11. Click OK.

When you run queries on the tables in the chasm trap, the query is separated for measures and dimensions defined on the affected tables.

Using Multiple SQL Statements for Each Measure

If you have only measure objects defined for both fact tables, then you can use the Universe Parameters option Multiple SQL statements for each measure. This forces the generation of separate SQL queries for each measure that appears in the Query pane.

This solution does not work for dimension and detail objects.

The following table describes when you can use Multiple SQL Statements for Each Measure and when you should avoid its use:

To activate Multiple SQL Statements for Each Measure:

  1. Select File > Parameters from the menu bar.
  2. The Universe Parameters dialog box appears.
  3. Click the SQL tab.
  4. Select the Multiple SQL Statements for Each Measure check box in the Multiple Paths
    group box.
  5. Click OK.

Resolving Fan Traps

A fan trap is a less common problem than chasm traps in a relational database schema. It has the same effect of returning more data than expected.

What is a Fan Trap?

A fan trap is a type of join path between three tables when a "one-to-many" join links a table which is in turn linked by another "one-to-many" join. The fanning out effect of "one- to-many" joins can cause incorrect results to be returned when a query includes objects based on both tables.

A simple example of a fan trap is shown below:

When you run a query that asks for the total number of car models sold by each model line, for a particular customer, an incorrect result is returned as you are performing an aggregate function on the table at the "one" end of the join, while still joining to the "many" end.

Example: A fan trap inflates results without warning

Using the schema above, a Web Intelligence user runs the following query:

The following results are returned:

This result is correct. However, the end user adds the dimension Model ID to the query as follows:

The following report is created with the returned results:

The Sale Value aggregate appears twice. Once for each instance of Model_ID. When these results are aggregated in a report, the sum is incorrect. The fan trap has returned a Cartesian product. Wendy bought two cars for a total of $57,092.00, and not 114,184.00 as summed in the report. The inclusion of Model_ID in the query, caused the Sale Value to be aggregated for as many rows as Model_ID.

The fan trap using dimension objects in the query is solved by using an alias and contexts. The following schema is the solution to the fan trap schema:

The original query which returned the Cartesian product for Wendy Craig, now returns the following table when run with the above solution:

How Do You Detect a Fan Trap?

You cannot automatically detect fan traps. You need to visually examine the direction of the cardinalities displayed in the table schema.

If you have two tables that are referenced by measure objects and are joined in a series of many to one joins, then you may have a potential fan trap.

How Do You Resolve a Fan Trap?

There are two ways to solve a fan trap problem.

  • Create an alias for the table containing the initial aggregation, then use Detect Contexts (Tools > Detect Contexts) to detect and propose a context for the alias table and a context for the original table. This is the most effective way to solve the fan trap problem.
  • Altering the SQL parameters for the universe. This only works for measure objects.

Using aliases and contexts to resolve fan traps

You create an alias table for the table producing the aggregation and then detect and implement contexts to separate the query. You can do this as follows:

To use aliases and contexts to resolve a fan trap:

  1. Identify the potential fan trap by analyzing the "one-to-many-to-one-to-many" join path relations in the schema.
  2. Create an alias for the table that is producing the multiplied aggregation.
  3. For example, SaleValue in the previous example is an aggregate of the Sale_Total column in the Sales table. You create an alias called Sale_Total for Sale.

  4. Create a join between the original table and the alias table.
  5. If you create a one-to-one join, Designer does not detect the context, and you must build the context manually. In most cases you can use a one-to-many which allows automatic detection and implementation of contexts.

    For example you create a one-to-many join between Sale and Sale_Total.

  6. Build the object that is causing the aggregation on the alias tables.
  7. For example the original SaleValue object was defined as follows: sum (SALE. SALE_ TOTAL). The new definition for SaleValue is: sum (Sale_ Total. SALE_ TOTAL).

  8. Select Tools > Detect Contexts.
  9. The Candidate Contexts box appears. It proposes the candidate contexts for the join path for the base table and the new join path for the alias table.

    Note: If you have used a one-to-one join between the alias and the base table, then you need to create the context manually.
  10. Click a candidate context and click Add.
  11. Repeat for the other candidate context.
  12. Click OK.
  13. The contexts are created in the schema. You can view them in the Contexts pane when List Mode is active (View > List Mode). The context for the join path CLIENT>SALE>SALE_MODEL appears as follows:

    And a second context for the CLIENT>SALE>SALE_TOTAL join path:

  14. Select File > Parameters.
  15. The Parameters dialog appears.
  16. Click the SQL tab.SQL page.
  17. The SQL page appears.
  18. Select the Multiple SQL Statements for Each Context check box.
  19. Click OK.
  20. Run queries to test the fan trap solution.

Using Multiple SQL Statements for Each Measure

If you have only measure objects defined for both tables at the many end of the serial one- to-many joins, then you can use the Universe Parameters option Multiple SQL Statements for Each Measure. This forces the generation of separate SQL queries for each measure that appears in the Query pane. You cannot use this method to generate multiple queries for dimensions. If an end user can include dimensions from any of the tables that reference the measure objects in the query, then you must use an alias and context to resolve the fan trap.

Detecting join problems graphically

You can visually detect potential chasm and fan traps in your table schema by arranging the tables in the Structure pane so that the "many" ends of the joins are to one side of the pane, and the "one" ends to the other. The example below shows the Beach universe schema arranged with a one to many flow from left to right.

Potential chasm trap

The potential chasm traps are shown below:

Both of these join paths have been separated using the contexts Sales and Reservations.

Potential fan trap

A universe schema for a car sales database is shown below:

The potential fan traps involve the following tables

  • CUSTOMER, LOAN, and LOANLINE
  • CUSTOMER, SALES, and SALELINE
  • VARIETY, PRODUCT, and SALELINE

Tip: Once you have populated your schema with the necessary tables, don't start defining objects immediately. Allow some time to move tables around so that you have the all the one-to-many joins in the same direction. Designer is a graphic tool, so use the visual capabilities of the product to help you design universes. An hour or so moving tables around could save you a lot of time later in the design process.

Checking the universe

As you design your universe, you should test its integrity periodically. You can verify universe integrity as follows:

Checking Universe Integrity Automatically

You can set the following integrity check options in Designer to parse SQL structures at creation, universe export, and universe opening:

Setting automatic universe check options

To set automatic universe check options:

  1. Select Tools > Options.
  2. The Options dialog box opens to the General page.
  3. Select or clear check boxes for appropriate universe automatic check options in the Integrity group box.
  4. Click OK.

Checking Universe Integrity Manually

You can use Check Integrity to test to verify if the design of your active universe is accurate and up-to-date.

Check Integrity detects the following:

  • Errors in the objects, joins, conditions, and cardinalities of your universe.
  • Loops in join paths.
  • Any necessary contexts.
  • Changes to the target database.

Before examining the elements of the universe against those of the database, the function checks whether the connection to the database is valid. If the connection is not valid, the function stops and returns an error message.

Types of errors detected by Check Integrity

Check Integrity can detect:

  • Invalid syntax in the SQL definition of an object, condition, or join.
  • Loops
  • Isolated tables
  • Isolated joins
  • Loops within contexts
  • Missing or incorrect cardinalities

How does Check Integrity determine changes in a connected database?

The Check Integrity function sends a request to the database for a list of tables. It then compares this list with the tables in the universe. It carries out the same action for columns.

In the Structure pane, Check Integrity marks any tables or columns not matching those in the list as not available. These are tables or columns that may have been deleted or renamed in the database.

Note:

The option Check Cardinalities can be slow to run with large amounts of data. If there is ambiguous or missing data, results can also be inaccurate. If your database is large, and may have incomplete data entries, then you should not select the option Check Cardinalities. If you do use this option, then you can optimize the cardinality detection by modifying the PRM file.

Verifying universe integrity with Check Integrity

To verify universe integrity:

  1. Select Tools > Check Integrity.
  2. Or

    Click the Check Integrity button.

    The Integrity Check dialog box appears.

  3. Select check boxes for components to be verified.
  4. Clear check boxes for components not to be verified.
  5. Select the Quick Parsing check box to verify only the syntax of components.
  6. Or

    Select Thorough Parsing check box to verify both the syntax and semantics of components.

  7. Click OK.
  8. A message box displays the universe check progress.

    If Check Integrity encounters no errors, it displays "OK" beside each error type.

  9. Click the plus sign (+) beside the error type to view the list of components in which the error occurred.
  10. You can double click an item in the list to highlight the corresponding components in the Structure pane.

  11. Click the Print button to print the window contents.
  12. Click OK.

Note: Before selecting the Check for Loops check box, ensure that the cardinalities of joins have already been detected. Otherwise, the function erroneously identifies loops in the joins.

Refreshing the Universe Structure

If Check Integrity indicates that the database of your universe connection has been modified, you can use Refresh Structure to update the contents of the Structure pane.

Refresh Structure can modify the universe structure to comply with changes in the database as follows:

Refreshing a universe

To refresh the universe structure:

  • Select View > Refresh Structure.
  • A message box appears informing you of a change in the database, or that no update is needed if no changes have been made.

Searches relevant to you
Top