SQL*Loader's Control File: Loading Data - Oracle 8i

Loading Logical Records into Tables
This section describes the way in which you specify:

  • Which tables you want to load
  • Which records you want to load into them
  • Default characteristics for the columns in those records

Specifying Table Names
The INTO TABLE keyword of the LOAD DATA statement allows you to identify tables, fields, and datatypes. It defines the relationship between records in the datafile and tables in the database. The specification of fields and datatypes is described in later sections.

INTO TABLE
Among its many functions, the INTO TABLE keyword allows you to specify the table into which you load data. To load multiple tables, you include one INTO TABLE clause for each table you wish to load.

To begin an INTO TABLE clause, use the keywords INTO TABLE, followed by the name of the Oracle table that is to receive the data.

INTO TABLE

The table must already exist. The table name should be enclosed in double quotation marks if it is the same as any SQL or SQL*Loader keyword, if it contains any special characters, or if it is case sensitive.

The user running SQL*Loader should have INSERT privileges on the table. Otherwise, the table name should be prefixed by the username of the owner, as follows:
INTO TABLE SOPHIA.EMP

Table-Specific Loading Method
The INTO TABLE clause may include a table-specific loading method (INSERT, APPEND, REPLACE, or TRUNCATE) that applies only to that table. Specifying one of these methods within the INTO TABLE clause overrides the global table-loading method. The global table-loading method is INSERT, by default, unless a different method was specified before any INTO TABLE clauses.

Table-Specific OPTIONS Keyword
The OPTIONS keyword can be specified for individual tables in a parallel load. It is only valid for a parallel load.)

Choosing Which Records to Load
You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record.

The WHEN clause appears after the table name and is followed by one or more field conditions. The syntax for field_condition is as follows:

Choosing Which Records to Load

For example, the following clause indicates that any record with the value "q" in the fifth column position should be loaded:

WHEN (5) = ’q’

A WHEN clause can contain several comparisons provided each is preceded by AND. Parentheses are optional, but should be used for clarity with multiple comparisons joined by AND. For example

WHEN (DEPTNO = ’10’) AND (JOB = ’SALES’)

To evaluate the WHEN clause, SQL*Loader first determines the values of all fields in the record. Then the WHEN clause is evaluated. A record is inserted into the table only if the WHEN clause is true.

Field conditions are discussed in detail in Specifying Field Conditions.
Case 5: Loading Data into Multiple Tables provides an example of the WHEN clause.

Using The WHEN Clause with LOB Files and Secondary Data Files If a WHEN directive fails on a record, that record is discarded (skipped). The skipped record is assumed to be contained completely in the main datafile; therefore, a secondary data file will not be affected if present.

Specifying Default Data Delimiters
If all data fields are terminated similarly in the datafile, you can use the FIELDS clause to indicate the default delimiters. The syntax is:

Specifying Default Data DelimitersSpecifying Default Data Delimiters

Note: Terminators are strings not limited to a single character. Also, TERMINATED BY EOF applies only to loading LOBs from LOBFILES.

Specifying Default Data Delimiters

Note: Enclosure strings do not have to be a single character.
You can override the delimiter for any given column by specifying it after the column name. Case 3: Loading a Delimited, Free-Format File provides an example.

Handling Short Records with Missing Data
When the control file definition specifies more fields for a record than are present in the record, SQL*Loader must determine whether the remaining (specified) columns should be considered null or whether an error should be generated.

If the control file definition explicitly states that a field’s starting position is beyond the end of the logical record, then SQL*Loader always defines the field as null. If a field is defined with a relative position (such as DNAME and LOC in the following example), and the record ends before the field is found; then SQL*Loader could either treat the field as null or generate an error. SQL*Loader uses the presence or
absence of the TRAILING NULLCOLS clause to determine the course of action.

TRAILING NULLCOLS
TRAILING NULLCOLS tells SQL*Loader to treat any relatively positioned columns that are not present in the record as null columns.

For example, if the following data

10 Accounting is read with the following control file

and the record ends after DNAME, the remaining LOC field is set to null. Without the TRAILING NULLCOLS clause, an error would be generated due to missing data.

Case 7: Extracting Data from a Formatted Report provides an example of TRAILING NULLCOLS.

Index Options
This section describes the SQL*Loader options that control how index entries are created.

SORTED INDEXES Option
The SORTED INDEXES option applies to direct path loads. It tells SQL*Loader that the incoming data has already been sorted on the specified indexes, allowing SQL*Loader to optimize performance. Syntax for this feature is given in High-Level Syntax Diagrams

SINGLEROW Option
The SINGLEROW option is intended for use during a direct path load with APPEND on systems with limited memory, or when loading a small number of records into a large table. This option inserts each index entry directly into the index, one record at a time.

By default, SQL*Loader does not use SINGLEROW to append records to a table. Instead, index entries are put into a separate, temporary storage area and merged with the original index at the end of the load. This method achieves better performance and produces an optimal index, but it requires extra storage space. During the merge, the original index, the new index, and the space for new entries all simultaneously occupy storage space.

With the SINGLEROW option, storage space is not required for new index entries or for a new index. The resulting index may not be as optimal as a freshly sorted one, but it takes less space to produce. It also takes more time because additional UNDO information is generated for each index insert. This option is suggested for use when either of the following situations exists:

  • Available storage is limited
  • The number of records to be loaded is small compared to the size of the table (a ratio of 1:20, or less, is recommended)

Specifying Field Conditions
A field condition is a statement about a field in a logical record that evaluates as true or false. It is used in the NULLIF and DEFAULTIF clauses, as well as in the WHEN clause.

A field condition is similar to the condition in the CONTINUEIF clause, with two important differences. First, positions in the field condition refer to the logical record, not to the physical record. Second, you may specify either a position in the logical record or the name of a column that is being loaded.

Specifying Field ConditionsSpecifying Field Conditions
where:

pos_spec : Specifies the starting and ending position of the comparison
field in the logical record. Either start-end or start:end is
acceptable. If you omit end, the length of the field is determined
by the length of the comparison string. If the lengths are
different, the shorter field is padded. Character strings are
padded with blanks, hexadecimal strings with zeros.
start : Specifies the starting position of the comparison field in the
logical record.
End : Specifies the ending position of the comparison field in the
logical record.
full_fieldname : full_fieldname is the full name of a field specified using dot
notation. If the field col2 is an attribute of a column object col1,
when referring to col2 in one of the directives, you must use the
notation col1.col2. The column_name and the fieldname
referencing or naming the same entity can be different because
column_name never includes the full name of the entity (no dot
notation).
operator : A comparison operator for either equal or not equal.
char_string : A string of characters enclosed in single or double quotation
marks that is compared to the comparison field. If the
comparison is true, the current record is inserted into the table.
X’hex_string’ : A byte string in hexadecimal format that is used in the same way
as char_string.
BLANKS : A keyword denoting an arbitrary number of blanks.

Comparing Fields to BLANKS
The BLANKS keyword makes it possible to determine easily if a field of unknown length is blank.
For example, use the following clause to load a blank field as null:

full_fieldname ... NULLIF column_name=BLANKS

The BLANKS keyword only recognizes blanks, not tabs. It can be used in place of a literal string in any field comparison. The condition is TRUE whenever the column is entirely blank.

The BLANKS keyword also works for fixed-length fields. Using it is the same as specifying an appropriately sized literal string of blanks. For example, the following specifications are equivalent:

fixed_field CHAR(2) NULLIF fixed_field=BLANKS
fixed_field CHAR(2) NULLIF fixed_field=" "

Note: There can be more than one blank in a multibyte character set. It is a good idea to use the BLANKS keyword with these character sets instead of specifying a string of blank characters. The character string will match only a specific sequence of blank characters, while the BLANKS keyword will match combinations of different blank characters.

Comparing Fields to Literals
When a data field is compared to a literal string that is shorter than the data field, the string is padded. Character strings are padded with blanks, for example:

NULLIF (1:4)=" "

This example compares the data in position 1:4 with 4 blanks. If position 1:4 contains 4 blanks, then the clause evaluates as true. Hexadecimal strings are padded with hexadecimal zeros. The clause

NULLIF (1:4)=X’FF’

compares position 1:4 to hexadecimal ’FF000000’.
Specifying Columns and Fields
You may load any number of a table’s columns. Columns defined in the database, but not specified in the control file, are assigned null values (this is the proper way to insert null values).

A column specification is the name of the column, followed by a specification for the value to be put in that column. The list of columns is enclosed by parentheses and separated with commas as follows:

( columnspec, columnspec, ... )

Each column name must correspond to a column of the table named in the INTO TABLE clause. A column name must be enclosed in quotation marks if it is a SQL or SQL*Loader reserved word, contains special characters, or is case sensitive.

If the value is to be generated by SQL*Loader, the specification includes the keyword RECNUM, the SEQUENCE function, or the keyword CONSTANT.

If the column’s value is read from the datafile, the data field that contains the column’s value is specified. In this case, the column specification includes a column name that identifies a column in the database table, and a field specification that describes a field in a data record. The field specification includes position, datatype, null restrictions, and defaults.

It is not necessary to specify all attributes when loading column objects. Any missing attributes will be set to NULL.

Specifying Filler Fields
Filler fields have names but they are not loaded into the table. However, filler fields can be used as arguments to init_specs (for example, NULLIF and DEFAULTIF) as well as to directives (for example, SID, OID, REF, BFILE). Also, filler fields can occur anyplace in the data file. They can be inside of the field list for an object or inside the definition of a VARRAY.

A sample filler field specification looks as follows:

Specifying the Datatype of a Data Field
A field’s datatype specification tells SQL*Loader how to interpret the data in the field. For example, a datatype of INTEGER specifies binary data, while INTEGER EXTERNAL specifies character data that represents a number. A CHAR field, however, can contain any character data.

You may specify one datatype for each field; if unspecified, CHAR is assumed. SQL*Loader Datatypes describes how SQL*Loader datatypes are converted into Oracle datatypes and gives detailed information on each SQL*Loader datatype.

Before the datatype is specified, the field’s position must be specified.

Specifying the Position of a Data Field
To load data from the datafile, SQL*Loader must know a field’s location and its length. To specify a field’s position in the logical record, use the POSITION keyword in the column specification. The position may either be stated explicitly or relative to the preceding field. Arguments to POSITION must be enclosed in parentheses, as follows:

Specifying the Position of a Data Field
where:

start : The starting column of the data field in the logical record. The
first character position in a logical record is 1.
end : The ending position of the data field in the logical record. Either
start-end or start:end is acceptable. If you omit end, the length of
the field is derived from the datatype in the datafile. Note that
CHAR data specified without start or end is assumed to be
length 1. If it is impossible to derive a length from the datatype,
an error message is issued.

* : Specifie that the data field follows immediately after the
previous field. If you use * for the first data field in the control
file, that field is assumed to be at the beginning of the logical
record. When you use * to specify position, the length of the field
is derived from the datatype.

+integer : You can use an on offset, specified as +integer, to offset the
current field from the previous field. A number of characters, as
specified by +integer, are skipped before reading the value for
the current field.
You may omit POSITION entirely. If you do, the position specification for the data field is the same as if POSITION(*) had been used.

For example:

Column ENAME is character data in positions 1 through 20, followed by column EMPNO, which is presumably numeric data in columns 22 through 26. Column ALLOW is offset from the end of EMPNO by +2. Therefore, it starts in column 28 and continues until a slash is encountered.

Using POSITION with Data Containing Tabs
When you are determining field positions, be alert for TABs in the datafile. The following situation is highly likely when using SQL*Loader’s advanced SQL string capabilities to load data from a formatted report:

  • You look at a printed copy of the report, carefully measuring all character positions, and create your control file.
  • The load then fails with multiple "invalid number" and "missing field" errors.

These kinds of errors occur when the data contains tabs. When printed, each tab expands to consume several columns on the paper. In the datafile, however, each Tab is still only one character. As a result, when SQL*Loader reads the datafile, the POSITION specifications are wrong.

To fix the problem, inspect the datafile for tabs and adjust the POSITION specifications, or else use delimited fields.

The use of delimiters to specify relative positioning of fields is discussed in detail in Specifying Delimiters. Especially note how the delimiter WHITESPACE can be used.

Using POSITION with Multiple Table Loads
In a multiple table load, you specify multiple INTO TABLE clauses. When you specify POSITION(*) for the first column of the first table, the position is calculated relative to the beginning of the logical record. When you specify POSITION(*) for the first column of subsequent tables, the position is calculated relative to the last column of the last table loaded.

Thus, when a subsequent INTO TABLE clause begins, the position is not set to the beginning of the logical record automatically. This allows multiple INTO TABLE clauses to process different parts of the same physical record. For an example, see the second example in Extracting Multiple Logical Records.

A logical record may contain data for one of two tables, but not both. In this case, you would reset POSITION. Instead of omitting the position specification or using POSITION(*+n) for the first field in the INTO TABLE clause, use POSITION(1) or POSITION(n).

Examples

If these were the first two column specifications, SITEID would begin in column1, and SITELOC would begin in the column immediately following.

Column ENAME is character data in positions 1 through 20, followed by column EMPNO, which is presumably numeric data in columns 22 through 26. Column ALLOW is offset from the end of EMPNO by +2, so it starts in column 28 and continues until a slash is encountered.

Using Multiple INTO TABLE Statements
Multiple INTO TABLE statements allow you to:

  • Load data into different tables
  • Extract multiple logical records from a single input record
  • Distinguish different input record formats

In the first case, it is common for the INTO TABLE statements to refer to the same table. This section illustrates the different ways to use multiple INTO TABLE statements and shows you how to use the POSITION keyword.

Note: A key point when using multiple INTO TABLE statements is that field scanning continues from where it left off when a new INTO TABLE statement is processed. The remainder of this section details important ways to make use of that behavior. It also describes alternative ways using fixed field locations or the POSITION keyword.

Extracting Multiple Logical Records
Some data storage and transfer media have fixed-length physical records. When the data records are short, more than one can be stored in a single, physical record to use the storage space efficiently.

In this example, SQL*Loader treats a single physical record in the input file as two logical records and uses two INTO TABLE clauses to load the data into the EMP table. For example, assume the data is as follows:

1119 Smith 1120 Yvonne 1121 Albert 1130 Thomas

Then the following control file extracts the logical records:

Relative Positioning
The same record could be loaded with a different specification. The following control file uses relative positioning instead of fixed positioning. It specifies that each field is delimited by a single blank (" "), or with an undetermined number of blanks and tabs (WHITESPACE):

The important point in this example is that the second EMPNO field is found immediately after the first ENAME, although it is in a separate INTO TABLE clause. Field scanning does not start over from the beginning of the record for a new INTO TABLE clause. Instead, scanning continues where it left off.

To force record scanning to start in a specific location, you use the POSITION keyword. That mechanism is described in Distinguishing Different Input Record Formats and in Loading Data into Multiple Tables.

Distinguishing Different Input Record Formats
A single datafile might contain records in a variety of formats. Consider the following data, in which EMP and DEPT records are intermixed:

A record ID field distinguishes between the two formats. Department records have a "1" in the first column, while employee records have a "2". The following control file uses exact positioning to load this data:

Relative Positioning
The records in the previous example could also be loaded as delimited data. In this case, however, it is necessary to use the POSITION keyword. The following control file could be used:

The POSITION keyword in the second INTO TABLE clause is necessary to load this data correctly. This keyword causes field scanning to start over at column 1 when checking for data that matches the second format. Without it, SQL*Loader would look for the RECID field after DNAME.

Loading Data into Multiple Tables
By using the POSITION clause with multiple INTO TABLE clauses, data from a single record can be loaded into multiple normalized tables.

Summary
Multiple INTO TABLE clauses allow you to extract multiple logical records from a single input record and recognize different record formats in the same file.
For delimited data, proper use of the POSITION keyword is essential for achieving the expected results.

When the POSITION keyword is not used, multiple INTO TABLE clauses process different parts of the same (delimited data) input record, allowing multiple tables to be loaded from one record. When the POSITION keyword is used, multiple INTO TABLE clauses can process the same record in different ways, allowing multiple formats to be recognized in one input file.

Generating Data
The functions described in this section provide the means for SQL*Loader to generate the data stored in the database record, rather than reading it from a datafile. The following functions are described:

  • CONSTANT
  • RECNUM
  • SYSDATE
  • SEQUENCE

Loading Data Without Files
It is possible to use SQL*Loader to generate data by specifying only sequences, record numbers, system dates, and constants as field specifications.
SQL*Loader inserts as many records as are specified by the LOAD keyword. The LOAD keyword is required in this situation. The SKIP keyword is not permitted.
SQL*Loader is optimized for this case. Whenever SQL*Loader detects that only generated specifications are used, it ignores any specified datafile—no read I/O is performed.
In addition, no memory is required for a bind array. If there are any WHEN clauses in the control file, SQL*Loader assumes that data evaluation is necessary, and input records are read.

Setting a Column to a Constant Value
This is the simplest form of generated data. It does not vary during the load, and it does not vary between loads.

CONSTANT

To set a column to a constant value, use the keyword CONSTANT followed by a value:
CONSTANT value
CONSTANT data is interpreted by SQL*Loader as character input. It is converted, as necessary, to the database column type.

You may enclose the value within quotation marks, and you must do so if it contains white space or reserved words. Be sure to specify a legal value for the target column. If the value is bad, every record is rejected.

Numeric values larger than 2**32 - 1 (4,294,967,295) must be enclosed in quotation marks.

Note: Do not use the CONSTANT keyword to set a column to null. To set a column to null, do not specify that column at all. Oracle automatically sets that column to null when loading the record. The combination of CONSTANT and a value is a complete column specification.

Setting a Column to the Datafile Record Number
Use the RECNUM keyword after a column name to set that column to the number of the logical record from which that record was loaded. Records are counted sequentially from the beginning of the first datafile, starting with record 1.

RECNUM is incremented as each logical record is assembled. Thus it increments for records that are discarded, skipped, rejected, or loaded. If you use the option SKIP=10, the first record loaded has a RECNUM of 11.

RECNUM
The combination of column name and the RECNUM keyword is a complete column specification.

column_name RECNUM

Setting a Column to the Current Date
A column specified with SYSDATE gets the current system date, as defined by the SQL language SYSDATE function.

SYSDATE
The combination of column name and the SYSDATE keyword is a complete column specification.

column_name SYSDATE

The database column must be of type CHAR or DATE. If the column is of type CHAR, then the date is loaded in the form ’dd-mon-yy.’ After the load, it can be accessed only in that form. If the system date is loaded into a DATE column, then it can be accessed in a variety of forms that include the time and the date.

A new system date/time is used for each array of records inserted in a conventional path load and for each block of records loaded during a direct path load.

Setting a Column to a Unique Sequence Number
The SEQUENCE keyword ensures a unique value for a particular column. SEQUENCE increments for each record that is loaded or rejected. It does not increment for records that are discarded or skipped.

SEQUENCE
The combination of column name and the SEQUENCE function is a complete column specification.

SEQUENCE
where:

column_name : The name of the column in the database to which to assign the
sequence.

EQUENCE : Use the SEQUENCE keyword to specify the value for a column.
integer Specifies the specific sequence number to begin with.

COUNT : The sequence starts with the number of records already in the
table plus the increment.

MAX : The sequence starts with the current maximum value for that

column plus the increment.

incr : The value that the sequence number is to increment after a
record is loaded or rejected.

If a record is rejected (that is, it has a format error or causes an Oracle error), the generated sequence numbers are not reshuffled to mask this. If four rows are assigned sequence numbers 10, 12, 14, and 16 in a particular column, and the row with 12 is rejected; the three rows inserted are numbered 10, 14, and 16, not 10, 12, 14. This allows the sequence of inserts to be preserved despite data errors. When you correct the rejected data and reinsert it, you can manually set the columns to agree with the sequence.

Case 3: Loading a Delimited, Free-Format File provides an example of the SEQUENCE function.

Generating Sequence Numbers for Multiple Tables
Because a unique sequence number is generated for each logical input record, rather than for each table insert, the same sequence number can be used when inserting data into multiple tables. This is frequently useful behavior.

Sometimes, however, you might want to generate different sequence numbers for each INTO TABLE clause. For example, your data format might define three logical records in every input record. In that case, you can use three INTO TABLE clauses, each of which inserts a different part of the record into the same table. Note that, when you use SEQUENCE(MAX), SQL*Loader will use the maximum from each table,which can lead to inconsistencies in sequence numbers.

To generate sequence numbers for these records, you must generate unique numbers for each of the three inserts. There is a simple technique to do so. Use the number of table-inserts per record as the sequence increment and start the sequence numbers for each insert with successive numbers.

Example
Suppose you want to load the following department names into the DEPT table. Each input record contains three department names, and you want to generate the department numbers automatically.

Accounting Personnel Manufacturing Shipping Purchasing Maintenance ...

You could use the following control file entries to generate unique department numbers:

The first INTO TABLE clause generates department number 1, the second number 2, and the third number 3. They all use 3 as the sequence increment (the number of department names in each record). This control file loads Accounting as department number 1, Personnel as 2, and Manufacturing as 3.

The sequence numbers are then incremented for the next record, so Shipping loads as 4, Purchasing as 5, and so on.

SQL*Loader Datatypes
SQL*Loader has a rich palette of datatypes. These datatypes are grouped into portable and nonportable datatypes. Within each of these two groups, the datatypes are subgrouped into length-value datatypes and value datatypes.

The main grouping, portable versus nonportable, refers to the platform dependency of the datatype. This issue arises due to a number of platform specifics such as differences in the byte ordering schemes of different platforms (big-endian versus little-endian), differences in how many bits a particular platform is (16-bit, 32-bit, 64-bit), differences in signed number representation schemes (2’s complement versus 1’s complement), and so on. Note that not all of these problems apply to all nonportable datatypes.

The nonportable datatypes consist of VALUE and LENGTH-VALUE datatypes. VALUE datatypes assume that a datafield has a single part. LENGTH-VALUE datatypes require that the datafield consist of two subfields: length and value. The length subfield specifies the length of the value subfield.

Nonportable Datatypes

VALUE Datatypes LENGTH-VALUE Datatypes INTEGER SMALLINT VARGRAPHIC FLOAT VARCHAR DOUBLE VARRAW BYTEINT LONG VARRAW ZONED (packed) DECIMAL

The syntax for the nonportable datatypes is shown in the syntax diagram for datatype_spec.

INTEGER
The data is a full-word binary integer (unsigned). If you specify start:end in the POSITION clause, end is ignored. The length of the field is the length of a full-word integer on your system. (The datatype is LONG INT in C.) This length cannot be overridden in the control file.

INTEGER

SMALLINT
The data is a half-word binary integer (unsigned). If you specify start:end in the POSITION clause, end is ignored. The length of the field is the length of a half-word integer on your system.

SMALLINT

Additional Information: This is the SHORT INT datatype in the C programming language. One way to determine its length is to make a small control file with no data and look at the resulting log file. This length cannot be overridden in the control file. See your Oracle operating system-specific documentation for details.

FLOAT
The data is a single-precision, floating-point, binary number. If you specify end in the POSITION clause, end is ignored. The length of the field is the length of a single-precision, floating-point binary number on your system. (The datatype is FLOAT in C.) This length cannot be overridden in the control file.

DOUBLE
The data is a double-precision, floating-point binary number. If you specify end in the POSITION clause, end is ignored. The length of the field is the length of a double-precision, floating-point binary number on your system. (The datatype is DOUBLE or LONG FLOAT in C.) This length cannot be overridden in the control file.

DOUBLE

BYTEINT
The decimal value of the binary representation of the byte is loaded. For example, the input character x"1C" is loaded as 28. The length of a BYTEINT field is always 1 byte. If POSITION(start:end) is specified, end is ignored. (The datatype is UNSIGNED CHAR in C.)

The syntax for this datatype is:

ZONED
ZONED data is in zoned decimal format: a string of decimal digits, one per byte, with the sign included in the last byte. (In COBOL, this is a SIGN TRAILING field.) The length of this field is equal to the precision (number of digits) that you specify. The syntax for this datatype is:

ZONED
In this syntax, precision is the number of digits in the number, and scale (if given) is the number of digits to the right of the (implied) decimal point. The following example specifies an 8-digit integer starting at position 32:

sal POSITION(32) ZONED(8),
DECIMAL
DECIMAL data is in packed decimal format: two digits per byte, except for the last byte, which contains a digit and sign. DECIMAL fields allow the specification of an implied decimal point, so fractional values can be represented. The syntax for this datatype is:

DECIMAL
Where:
precision : The number of digits in a value. The character length of the field,
as computed from digits, is (N+1)/2 rounded up.
scale : The scaling factor, or number of digits to the right of the decimal
point. The default is zero (indicating an integer). Scale can be
greater than the number of digits but cannot be negative.
An example is:

sal DECIMAL (7,2)

This example would load a number equivalent to +12345.67. In the data record, this field would take up 4 bytes. (The byte length of a DECIMAL field is equivalent to (N+1)/2, rounded up, where N is the number of digits in the value, and one is added for the sign.)

VARGRAPHIC
The data is a varying-length, double-byte character string. It consists of a length subfield followed by a string of double-byte characters (DBCS).

Additional Information: The size of the length subfield is the size of the SQL*Loader SMALLINT datatype on your system (C type SHORT INT). The length of the current field is given in the first 2 bytes. This length is a count of graphic (double-byte) characters. Therefore, it is multiplied by 2 to determine the number of bytes to read.

The syntax for this datatype is:

Additional Information

A maximum length specified after the VARGRAPHIC keyword does not include the size of the length subfield. The maximum length specifies the number of graphic (double-byte) characters. It is multiplied by 2 to determine the maximum length of the field in bytes.

The default maximum field length is 4Kb graphic characters, or 8 Kb (2 * 4Kb). It is a good idea to specify a maximum length for such fields whenever possible, to minimize memory requirements.

If a position specification is specified (using pos_spec) before the VARGRAPHIC statement, it provides the location of the length subfield, not of the first graphic character. If you specify pos_spec(start:end), the end location determines a maximum length for the field. Both start and end identify single-character (byte) positions in the file. Start is subtracted from (end + 1) to give the length of the field in bytes. If a maximum length is specified, it overrides any maximum length calculated from pos_spec.

If a VARGRAPHIC field is truncated by the end of the logical record before its full length is read, a warning is issued. Because a VARGRAPHIC field’s length is embedded in every occurrence of the input data for that field, it is assumed to be accurate.

VARGRAPHIC data cannot be delimited.

VARCHAR
A VARCHAR field is a length-value datatype. It consists of a binary length subfield followed by a character string of the specified length.

Additional Information: The size of the length subfield is the size of the SQL*Loader SMALLINT datatype on your system (C type SHORT INT).

The syntax for this datatype is:

VARCHAR

A maximum length specified in the control file does not include the size of the length subfield. If you specify the optional maximum length after the VARCHAR keyword, then a buffer of that size is allocated for these fields.

The default buffer size is 4 Kb. Specifying the smallest maximum length that is needed to load your data can minimize SQL*Loader’s memory requirements, especially if you have many VARCHAR fields.

The POSITION clause, if used, gives the location of the length subfield, not of the first text character. If you specify POSITION(start:end), the end location determines a maximum length for the field. Start is subtracted from (end + 1) to give the length of the field in bytes. If a maximum length is specified, it overrides any length calculated from POSITION.

If a VARCHAR field is truncated by the end of the logical record before its full length is read, a warning is issued. Because a VARCHAR field’s length is embedded in every occurrence of the input data for that field, it is assumed to be accurate. VARCHAR data cannot be delimited.

VARRAW
VARRAW is made up of a 2-byte binary length-subfield followed by a RAW string value-subfield.

VARRAW results in a VARRAW with 2 byte length-subfield and a maximum size of 4 Kb (that is, the default). VARRAW(65000) results in a VARRAW with a length subfield of 2 bytes and a maximum size of 65000 bytes.

LONG VARRAW
LONG VARRAW is a VARRAW with a 4-byte length-subfield instead of a 2-byte length-subfield.

LONG VARRAWresults in a VARRAWwith 4-byte length subfield and a maximum size of 4 Kb (that is, the default). LONG VARRAW(300000) results in a VARRAW with a length subfield of 4 bytes and a maximum size of 300000 bytes.

Portable Datatypes VALUE Datatypes LENGTH-VALUE Datatypes CHAR VARCHARC DATE VARRAWC GRAPHIC GRAPHIC EXTERNAL Numeric External (INTEGER, FLOAT, DECIMAL, ZONED) RAW

The character datatypes are CHAR, DATE, and the numeric EXTERNAL datatypes. These fields can be delimited and can have lengths (or maximum lengths) specified in the control file.

CHAR
The data field contains character data. The length is optional and is taken from the POSITION specification if it is not present here. If present, this length overrides the length in the POSITION specification. If no length is given, CHAR data is assumed to have a length of 1. The syntax is:

CHAR

A field of datatype CHAR may also be variable-length that is delimited or enclosed.

Attention: If the column in the database table is defined as LONG or a VARCHAR2, you must explicitly specify a maximum length (maximum for LONG is 2 gigabytes) either with a length specifier on the CHAR keyword or with the POSITION keyword. This guarantees that a large enough buffer is allocated for the value and is necessary even if the data is delimited or enclosed.

DATE
The data field contains character data that should be converted to an Oracle date using the specified date mask. The syntax is:

DATE

For example:

Attention: Whitespace is ignored and dates are parsed from left to right unless delimiters are present.

The length specification is optional, unless a varying-length date mask is specified. In the preceding example, the date mask specifies a fixed-length date format of 11 characters. SQL*Loader counts 11 characters in the mask, and therefore expects a maximum of 11 characters in the field, so the specification works properly. But, with a specification such as

DATE "Month dd, YYYY"

the date mask is 14 characters, while the maximum length of a field such as

September 30, 1991

is 18 characters. In this case, a length must be specified. Similarly, a length is required for any Julian dates (date mask "J"). A field length is required any time the length of the date string could exceed the length of the mask (that is, the count of characters in the mask).

If an explicit length is not specified, it can be derived from the POSITION clause. It is a good idea to specify the length whenever you use a mask, unless you are absolutely sure that the length of the data is less than, or equal to, the length of the mask.

An explicit length specification, if present, overrides the length in the POSITION clause. Either of these overrides the length derived from the mask. The mask may be any valid Oracle date mask. If you omit the mask, the default Oracle date mask of "dd-mon-yy" is used.

The length must be enclosed in parentheses and the mask in quotation marks. Case 3: Loading a Delimited, Free-Format File provides an example of the DATE datatype.
A field of datatype DATE may also be specified with delimiters.
A date field that consists entirely of whitespace produces an error unless NULLIF BLANKS is specified.

GRAPHIC
The data is a string of double-byte characters (DBCS). Oracle does not support DBCS; however, SQL*Loader reads DBCS as single bytes. Like RAW data, GRAPHIC fields are stored without modification in whichever column you specify. The syntax for this datatype is:

GRAPHIC
For GRAPHIC and GRAPHIC EXTERNAL, specifying POSITION(start:end) gives the exact location of the field in the logical record.

If you specify the length after the GRAPHIC (EXTERNAL) keyword, however, then you give the number of double-byte graphic characters. That value is multiplied by 2 to find the length of the field in bytes. If the number of graphic characters is specified, then any length derived from POSITION is ignored. No delimited datafield specification is allowed with GRAPHIC datatype specification.

GRAPHIC EXTERNAL
If the DBCS field is surrounded by shift-in and shift-out characters, use GRAPHIC EXTERNAL. This is identical to GRAPHIC, except that the first and last characters (the shift-in and shift-out) are not loaded. The syntax for this datatype is:

GRAPHIC EXTERNAL

where:

GRAPHIC Data is double-byte characters. EXTERNAL &n

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Oracle 8i Topics