You can use the following mechanisms for loading a data warehouse:
Loading a Data Warehouse with SQL*Loader
Before any data transformations can occur within the database, the raw data must become accessible for the database. One approach is to load it into the database. Transportation in Data Warehouses, discusses several techniques for transporting data to an Oracle data warehouse. Perhaps the most common technique for transporting data is by way of flat files.
SQL*Loader is used to move data from flat files into an Oracle data warehouse. During this data load, SQL*Loader can also be used to implement basic data transformations. When using direct-path SQL*Loader, basic data manipulation, such as datatype conversion and simple NULL handling, can be automatically resolved during the data load. Most data warehouses use direct-path loading for performance reasons.
The conventional-path loader provides broader capabilities for data transformation than a direct-path loader: SQL functions can be applied to any column as those values are being loaded. This provides a rich capability for transformations during the data load. However, the conventional-path loader is slower than direct-path loader. For these reasons, the conventional-path loader should be considered primarily for loading and transforming smaller amounts of data.
The following is a simple example of a SQL*Loader controlfile to load data into the sales table of the sh sample schema from an external file sh_sales.dat. The external flat file sh_sales.dat consists of sales transaction data, aggregated on a daily level. Not all columns of this external file are loaded into sales. This external file will also be used as source for loading the second fact table of the sh sample schema, which is done using an external table:
The following shows the controlfile (sh_sales.ctl) to load the sales table:
It can be loaded with the following command:
$ sqlldr sh/sh control=sh_sales.ctl direct=true
Loading a Data Warehouse with External Tables
Another approach for handling external data sources is using external tables. Oracle's external table feature enables you to use external data as a virtual table that can be queried and joined directly and in parallel without requiring the external data to be first loaded in the database. You can then use SQL, PL/SQL, and Java to access the external data.
External tables enable the pipelining of the loading phase with the transformation phase. The transformation process can be merged with the loading process without any interruption of the data streaming. It is no longer necessary to stage the data inside the database for further processing inside the database, such as comparison or transformation. For example, the conversion functionality of a conventional load can be used for a direct-path INSERT AS SELECT statement in conjunction with theSELECT from an external table.
The main difference between external tables and regular tables is that externally organized tables are read-only. No DML operations (UPDATE/INSERT/DELETE) are possible and no indexes can be created on them.
External tables are a mostly compliant to the existing SQL*Loader functionality and provide superior functionality in most cases. External tables are especially useful for environments where the complete external source has to be joined with existing database objects or when the data has to be transformed in a complex manner. For example, unlike SQL*Loader, you can apply any arbitrary SQL transformation anduse the direct path insert method.
You can create an external table named sales_transactions_ext, representing the structure of the complete sales transaction data, represented in the external file sh_sales.dat. The product department is especially interested in a cost analysis on product and time. We thus create a fact table named cost in the sales history schema. The operational source data is the same as for the sales fact table. However, because we are not investigating every dimensional information that is provided, the data in the cost fact table has a coarser granularity than in the sales fact table, for example, all different distribution channels are aggregated. We cannot load the data into the cost fact table without applying the previously mentioned aggregation of the detailed information, due to the suppression of some of the dimensions.
The external table framework offers a solution to solve this. Unlike SQL*Loader, where you would have to load the data before applying the aggregation, you can combine the loading and transformation within a single SQL DML statement, as shown in the following. You do not have to stage the data temporarily before inserting into the target table.
The object directories must already exist, and point to the directory containing the sh_sales.dat file as well as the directory containing the bad and log files.
The external table can now be used from within the database, accessing some columns of the external data only, grouping the data, and inserting it into the costs fact table:
Loading a Data Warehouse with OCI and Direct-Path APIs
OCI and direct-path APIs are frequently used when the transformation and computation are done outside the database and there is no need for flat file staging.
Loading a Data Warehouse with Export/Import
Export and import are used when the data is inserted as is into the target system. No complex extractions are possible.
Data Warehousing Related Interview Questions
|Informatica Interview Questions||Data Warehousing Interview Questions|
|Networking Interview Questions||System Administration Interview Questions|
|Hadoop Interview Questions||MYSQL DBA Interview Questions|
|Data modeling Interview Questions||Hadoop Administration Interview Questions|
|Apache Flume Interview Questions||Informatica Admin Interview Questions|
Data Warehousing Tutorial
Data Warehousing Concepts
Physical Design In Data Warehouses
Hardware And I/o Considerations In Data Warehouses
Parallelism And Partitioning In Data Warehouses
Basic Materialized Views
Advanced Materialized Views
Overview Of Extraction, Transformation, And Loading
Extraction In Data Warehouses
Transportation In Data Warehouses
Loading And Transformation
Maintaining The Data Warehouse
Change Data Capture
Schema Modeling Techniques
Sql For Aggregation In Data Warehouses
Sql For Analysis And Reporting
Sql For Modeling
Olap And Data Mining
Using Parallel Execution
All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
Wisdomjobs.com is one of the best job search sites in India.