Skip to content

HowTo: Implement Automatic Schema Management

David Nichols edited this page Jun 19, 2016 · 21 revisions

HowTo: Implement Automatic Schema Management

Automate creation, upgrades, downgrades, and dropping database schemas

If you need to store data in a database for your application or script, you can use the Schema module for managing the schema, which can automate schema creation, schema upgrades, schema downgrades and dropping the schema for multiple database engines from a single schema description.

Note: for a complete code example, see the schema example program

Schema Alignment

The Schema module's approach is based on schema alignment and works as follows:

  1. Implement a class that inherits AbstractSchema providing a description of the schema (and optionally reference data) as Qore data
  2. Call AbstractSchema::align() to align the database with your description

AbstractSchema::align() provides for automatic creation, upgrades, and even downgrades of schemas.

Note: because schemas are described as data; schema descriptions can be hardcoded in the class or generated with application logic

Schema Module Classes

Qore's Schema module provides a high-level API on top of SqlUtil for automatic schema management.

The Schema module provides two main classes for schema management:

  • AbstractSchema: the base class for schema management; all schema classes must inherit this class, which provides the majority of the schema-management functionality in the Schema module
  • AbstractVersionedSchema: extends AbstractSchema by adding version-aware logic to schema management actions

The main difference between AbstractVersionedSchema and AbstractSchema is that AbstractVersionedSchema will verify the schema version before running the schema alignment process on the schema; the schema alignment process is skipped by default if the schema is already at the required version level (but alignment can be forced even in this case). This is performed by storing the schema version string somewhere in the schema itself as reference data. This is a very useful feature for more complex schemas where the alignment check can take a noticeable amount of time to run.

Schema Descriptions

Schema descriptions are made as data structures returned from methods in the AbstractSchema subclass.

Table Descriptions

Tables are defined as table description hashes as returned by AbstractSchema::getTablesImpl()

The table description hashes describe various attributes of the table in various keys as follows:

For example:

# customer type reference table
const T_CustomerType = (
    "columns": (
        "id": c_number(True, "customer type value; 1 = wholesale, 2 = retail; PK"),
        "description": c_varchar(200, "customer type description"),
    ),
    "primary_key": ("name": "pk_customer_type", "columns": "id"),
    "indexes": (
        # Oracle also needs an index to match the PK constraint
        "driver": (
            "oracle": (
                "pk_customer_type": ("columns": ("id"), "unique": True),
            ),
        ),
    ),
);

Note: the driver key is supported everywhere in schema description object hashes; the value of this key is a hash like above keyed by driver name, hashes underneath the driver name will override definitions above the driver key for that particular driver. In the example above, the pk_customer_type index is only created on Oracle (where unique constraints do not automatically cause unique indexes to be created). This allows schema definitions to support multiple databases simultaneously.

Reference Data

Schema reference data can also be managed with the Schema module.

There are different kinds of reference data as in the following table.

Data Type AbstractSchema method Description
Strict Reference Data getStrictReferenceDataHashImpl() Describes the only data that can appear in the table; extra rows are deleted during alignment
Reference Data getReferenceDataHashImpl() Describes reference data that must appear in the table, but additional rows will be left as-is; future alignments will update the rows if the data in columns not in the primary key is changed
Create Only Reference Data getCreateOnlyReferenceDataHashImpl() Describes reference data that is written to the table only when the table is created; later alignments do not affect the table's contents
Insert Only Reference Data getInsertOnlyReferenceDataHashImpl() Describes reference data that will be inserted if it does not exist; in future alignments the data will not be changed as long as the primary key values are present

The reference data hashes are given as keys giving the table names, each value then gives the row data for the given reference table. The row data is given as a list of hashes where the first list value is a hash giving the column names, and the subsequent list values are hashes giving the row values for the given columns.

Versioned Schema Example Class

Example of a versioned schema class using a reference data table:

%new-style
%strict-args
%require-types
%enable-all-warnings

%requires Schema

class ExampleSchema inherits AbstractVersionedSchema {
    public {
        # the schema version
        const Version = "0.1";

        # table containing schema info including the schema version
        const T_SchemaProperties = (
            "columns": (
                "keyname": c_varchar(100, True, "the schema property key name; PK"),
                "value": c_varchar(200, "the schema property value"),
            ),
            "primary_key": ("name": "pk_schema_properties", "columns": "keyname"),
            "indexes": (
                # Oracle also needs an index to match the PK constraint
                "driver": (
                    "oracle": (
                        "pk_schema_properties": ("columns": ("keyname"), "unique": True),
                    ),
                ),
            ),
            );

        # hash of all tables in the schema
        const Tables = (
            "schema_properties": T_SchemaProperties,
            );

        # reference data for the schema_properties table
        const SRD_SchemaProperties = (
            ("keyname", "value"),
            ("version", Version),
            );

        # hash of strict reference data, describing the only data that can appear in the given tables
        const StrictReferenceData = (
            "schema_properties": SRD_SchemaProperties,
            );
    }

    # creates the object
    constructor(AbstractDatasource ds, *string dts, *string its, *hash opts) : AbstractVersionedSchema(ds, dts, its, opts) {
    }

    # returns the name of the schema
    string getNameImpl() {
        return "ExampleSchema";
    }

    # returns the version of the schema
    string getVersionImpl() {
        return Version;
    }

    # returns the name of the table holding the schema version string
    string getVersionTableImpl() {
        return "schema_properties";
    }

    # returns the name of the column holding the schema version string
    string getVersionColumnImpl() {
        return "value";
    }

    # returns the where clause hash defining the row where the schema version string is located
    hash getVersionWhereImpl() {
        return ("keyname": "version");
    }

     # returns the table descriptions for our schema
    private *hash getTablesImpl() {
        return Tables;
    }
}

Other Objects

The Schema module can handle more than just tables, depending on the database, it can handle:

See the Schema module docs and the SqlUtil Schema Management docs for more information.

Aligning the Database with the Schema

Once you have the schema class defined, your schema object must be instantiated. The AbstractSchema::constructor() method requires a database connection in the form of an AbstractDatasource object.

The database driver used by the AbstractDatasource object will determine the SqlUtil-based DB-specific implementation that will provide the underlying schema management functionality driven by the configuration in your schema class using the high-level API implemented by AbstractSchema.

If your database (and the SqlUtil-based DB-specific implementation) supports tablespaces, then you can also specify the data and index tablespace in the AbstractSchema::constructor().

Once your schema object has been created, then aligning the database with your schema description is straightforward; simply call AbstractSchema::align().

If the database is empty (i.e. has none of your objects in it), then all objects will be created.

Otherwise, the alignment behavior depends on the base class for your schema object.

Aligning with AbstractSchema

AbstractSchema is a "dumb" class, in the sense that it does not know how to work with schema versions, and therefore cannot determine on its own if a schema is being upgraded or downgraded.

Therefore all alignments are treated the same; AbstractSchema will attempt to align existing DB objects with the schema definitions.

AbstractVersionedSchema uses the schema version in the DB and the schema version in the schema definition to determine if an upgrade or a downgrade is being performed.

When calling AbstractSchema::align() with AbstractVersionedSchema and AbstractVersionedSchema determines that a schema upgrade is being performed, then the schema upgrade proceeds automatically.

In case AbstractVersionedSchema determines that a downgrade is attempted, then the schema alignment will fail unless the force flag is set to True in the AbstractSchema::align() call.

Dropping a Schema

To drop a schema, call AbstractSchema::drop().

If the schema class inherits AbstractVersionedSchema, then the force flag must be set to True in the AbstractSchema::drop() call if the schema version cannot be determined or is not the expected version as given in the schema definition (as described by AbstractVersionedSchema).

Example

sub main() {
    GetOpt g(Opts);
    *hash o = g.parse3(\ARGV);
    if (o.help)
        usage();

    *string dsstr = shift ARGV;
    if (!dsstr)
        usage();

    Datasource ds(dsstr);

    ExampleSchema schema(ds, o.data_ts, o.index_ts);

    on_success ds.commit();
    on_error ds.rollback();

    if (o.drop) {
        schema.drop(o.force, o.verbose);
        return;
    }

    schema.align(o.force, o.verbose);
}

Note: for a complete code example, see the schema example program

Clone this wiki locally