BLOB Types - Firebird

Almost any sort of storable data can be stored in a BLOB: bitmapped graphics images, vector drawings, sound files, video segments, chapter- or book-length documents, or any other kind of multimedia information. Because a BLOB can hold different kinds of information, it requires special processing at the client for reading and writing.

BLOB types can, where practicable, store data files generated by other applications such as word processors, CAD software, or XML editors. The gains can be the benefit of transaction control for dynamic data, protection from external interference, control of versions, and the ability to access externally created data through the medium of SQL statements.

BLOB types cannot be indexed.

Supported BLOB Types

Out of the box, Firebird makes two pre-defined BLOB types available, distinguished by The sub-type attribute (SQL keyword SUB_TYPE), as described in Table.

Pre-Defined BLOB Sub-Types

Pre-Defined BLOB Sub-Types

More About Sub-Types

A BLOB sub-type is a positive or negative integer that identifies the nature of the data contained in the column. Besides the two sub-types pre-defined for general use, Firebird has a number of sub-types that it uses internally. All of these internal subtypes have positive numbers.

Custom sub-types can be added to distinguish and identify special types of data objects, such as HTML, XML, or word-processor documents, JPEG and PNG images, etc.—the choice is up to you. Negative sub-type numbers—from –1 to –32,768—are reserved for assigning to custom sub-types.

BLOB subtyping also allows specific conversion from one sub-type to another. Firebird has extensible support for automatic conversion between a given pair of BLOB sub-types in the form of BLOB filters. Blob filters are a special kind of external function with a single purpose: to take a BLOB object in one format and convert it to a BLOB object in another format. It is possible to create a BLOB filter that converts between a custom (negative) sub-type and a pre-defined sub-type, commonly the TEXT one.

The object code for BLOB filters is placed in shared object libraries. The filter, which is invoked dynamically when required, is recognized at the database (not server) level by way of a declaration in metadata:

Firebird does not check the type or format of BLOB data. When planning to store it, you must make your application code responsible for ensuring that the format of the data agrees with its sub-type, whether pre-defined or custom.

BLOB Segments

BLOB data is stored in a different format from regular column data and apart from it. It is stored as segments in one or more database pages, in a distinct row version that is unlike the format of a row of regular data. Segments are discrete chunks of unformatted data that are usually streamed by the application and passed to the API to be packaged for transmission across the network, one chunk at a time, in contiguous order.

In the regular row structure of the parent row, the BLOB is linked through a BLOB ID that is stored with the regular row data. A BLOB ID is a unique hexadecimal pair that provides a cross-reference between a BLOB and the table it belongs to. On arrival at the server, segments are laid down in the same order as they are received, although not necessarily in chunks of the same size in which they were transported.

Where possible, the BLOB row versions are stored on the same page as parent row. However, large BLOBs can extend across many pages, and this initial “BLOB row” may not contain actual data but an array of pointers to BLOB pages.

Declaration Syntax Examples

The following statement defines two BLOB columns: BLOB1 with sub-type 0 (the default) and BLOB2 with Firebird sub-type 1 (TEXT):

The next statement defines a domain that is a text BLOB to store text in character set ISO_8859_1:

This SQL snippet shows how a local BLOB variable is declared in a PSQL module:

Segment Size

When a BLOB column is defined in a table, the definition can optionally include the expected size of segments that are written to the column. The default—80 bytes—is really quite arbitrary. Mythology says it was chosen because it was exactly the length of one line on a text terminal display!

The segment size setting does not affect Firebird’s performance in processing BLOBs on the server: The server does not use it at all. For DSQL applications—which are what most people write—you can simply ignore it or, if relevant, set it to some size that suits the buffer in which your application stores BLOB data.

For DML operations—SELECT, INSERT, and UPDATE— the length of the segment is specified in an API structure when it is written and can be any size, up to a maximum of 32,767 bytes. Reusable classes for development environments such as Delphi, C++, and Java usually take care of BLOB segmentation in their internal functions and procedures. If you are programming directly to the API, you will need to develop your own routines for constructing segments.

Embedded Applications

In databases for use with embedded applications—here, we’re talking about ESQL apps written for preprocessing by the gpre preprocessor—the segment size must be declared to indicate the maximum number of bytes that an application is expected to write to any segment in the column. Normally, an ESQL application should not attempt to write segments larger than the segment length defined in the table; doing so overflows the internal segment buffer, corrupting memory in the process. It may be advantageous to specify a relatively large segment, to reduce the number of calls to retrieve BLOB data.

The following statement creates two BLOB columns: BLOB1, with a default segment size of 80, and BLOB2, with a specified segment length of 1024.

In this ESQL code fragment, an application inserts a BLOB segment. The segment length is specified in a host variable, segment_length:

Operations on BLOBs

A BLOB is never updated. Every update that “changes” a BLOB causes a new BLOB to be constructed, complete with a new BLOB ID. The original BLOB becomes obsolete once the update is committed.

A BLOB column can be tested for NULL/NOT NULL, but no internal function exists to compare one BLOB with another or to compare a BLOB to a string. Several BLOB UDFs are available from community download sites, including some that compare two BLOBs for equality.

It is not possible to concatenate two BLOBs or to concatenate a string to a BLOB.

String Input to BLOB Columns

When accepting data for input to BLOB columns by way of an INSERT or UPDATE operation, Firebird can take a string as input and transform it into a BLOB, for example:

Note that passing a string to a stored procedure for an input argument that was defined as a BLOB will cause an exception. The following, for example, will fail:

Instead, do one of the following:

  • Define the input argument as a VARCHAR and have your procedure submit the string to the INSERT or UPDATE statement itself.
  • Have your client program take care of converting the string to a text BLOB. This will be the preferred solution if the length of the string is unknown.

When to Use BLOB Types

The BLOB is preferred to character types for storing text data of infinitely variable length. Because it is transported in “mindless chunks,” it is not subject to the 32K length limit of strings, as long as the client application implements the appropriate techniques to pass it in the format required by the server for segmenting it. It is possible to pass a string type as input to a BLOB and leave the server to convert it. Under these conditions, it is not possible to store a BLOB that exceeds the 32K limit for strings.

Because the BLOB is stored apart from regular row data, it is not fetched automatically when row data is selected. Rather, the client requests the data on demand, by way of the BLOB_ID. Consequently, there is big “win” in the time taken to begin fetching rows from a SELECT, compared to the traffic involved when character types are used for storing large text items. On the other hand, some developers may consider it a disadvantage to have to implement “fetch on demand.”

When considering whether to use BLOB for non-text data, other issues arise. The convenience of being able to store images, sound files, and compiled documents has to be balanced against the overhead it adds to backups. It may be an unreasonable design objective to store large numbers of huge objects that are never going to change.

Security

The idea that large binary and text objects are more secure when stored in BLOBs than when stored in filesystem files is, to some extent, an illusion. Certainly, they are more difficult to access from end-user tools. However, database privileges currently do not apply to BLOB and ARRAY types beyond the context of the tables to which they are linked by indirection. It is not absurd to suppose that a malicious hacker who gained access to open the database file could write application code that scanned the file for BLOB_IDs and read the data directly from storage, as external BLOB


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Firebird Topics