Library

Tables. Primary keys and generators

NOTICE: This document is the chapter from the book "The InterBase World" which was written by Alexey Kovyazin and Serg Vostrikov.

InterBase is a relational DBMS. In addition, it means that all the data in InterBase is stored as tables. The table as it is realized from SQL point of view is very similar to the ordinary table, which can be drawn by hand on a sheet of paper or created in the program like Microsoft Excel. Tables in InterBase have columns and strings where data is placed. The table necessarily has a name, unique within one database. Tables are the main storage of the information in a database and consequently you should be very careful when creating tables.

There are rules describing how to create tables in the relational database, reflecting the data of the real world and at the same time allowing to organize effective storage of the information in a database. Process of application of these rules for designing a "correct" database is called normalization. We have knowingly quoted the word "correct", because "normalized database " and "optimized database" are not synonyms. You do not have to conform to the rules of normalization unambiguously – always apply the correction for specification of a given problem.

The normalization of tables in a database is considered in detail in the book [14] and therefore we shall not try to embrace unembracable and we shall return to our theme of discussion - to InterBase tables. Let’s consider a syntax of DDL (DDL - Data Definition Language, for more details see the glossary) sentence that allows to create tables:

CREATE TABLE table [EXTERNAL [FILE] ""] ( [, | ...]);

Here table is a name of the created table, - the description of columns (sometimes we will say - fields) of the created table. The option table [EXTERNAL [FILE] ""] means that the so-called external table which is stored not in a shared database file, but in a separate file with a name will be created. As you can see, everything is simple - we define a table name and columns that it contains. Now we will consider in detail how to define columns. The syntax of creating a column is described by the following DDL sentence:

 

= col { datatype | COMPUTED [BY] (< expr>) | domain}
[DEFAULT { literal | NULL | USER}]
[NOT NULL] [ ]
[COLLATE collation]

This is a rather big definition, however in definition of a column only a small part of given sentences is obligatory. Every column in the table must have a name, unique within the table, as well as a data type defined by sentence datatype, or an expression for computing value of a column (for computing columns), or the domain (see below), defined domain. Data types were considered in the chapter "Data types"; therefore, you can easily understand how SQL-expression for creating a table is formed.

Let's be connected to our database FIRSTBASE.gdb created earlier in the chapter "Create a database ", and we will try to work with tables in practice. When it comes to creating, deleting and updating of tables any of InterBase administrative tools – from those listed in the application "InterBase administrator and developer tools", as well as a standard utility isql.exe from a set of supply of any InterBase clone will be right.

Here is an example of a simple table called TABLE_EXAMPLE and containing 3 fields of various types:

CREATE TABLE Table_example ( ID INTEGER, NAME VARCHAR(80), PRICE_1 DOUBLE PRECISION);

This table illustrates the most frequently occurring case in the process of a database development. However, there are also other methods of defining fields. For example, we can set a field type using domains. The domain is a type defined by a user for the convenience of applying certain combinations of type parameters. For example, it is possible to define domain D_ID for specifying fields of identifiers. Having defined the domain, we can use it for setting type of a field:

CREATE DOMAIN D_ID AS INTEGER; CREATE TABLE Тable_example ( ID D_ID, NAME VARCHAR(80), PRICE_1 DOUBLE PRECISION);

Field ID will have the type defined by domain D_ID. Thus, having defined the type of the field in the domain, required checks and constraints, we can apply this domain many times for creating fields of the same function. For example, monetary without tiresome and harboring danger to be mistaken copying of definitions of variable types. The third way to set a column in the table is to define it as calculated (COMPUTED BY) and to specify a condition according to which its value will be computed. For example, we can wish to have a column calculating 10 % from the value of field PRICE_1 in our table. In this case, the following command should be written:

CREATE TABLE Тable_example ( ID INTEGER, NAME VARCHAR(80), PRICE_1 DOUBLE PRECISION, PRICE_10 COMPUTED BY (PRICE_1*0.1));

But do not think, that now as soon as we insert the data into field PRICE_1, in field PRICE_10 there will be a tenth part from the value of this field. No, the process here is more complicated. Actually, we will get a required tenth part only referring to field PRICE_10, for example when executing inquiry SELECT to this table. That is no data is stored in a calculated field, and fetching calculation of the expression connected with a field is performed, and the result is produced as inquiry answer.

So, we have considered 3 main ways of specifying fields in the table. Now let's consider in detail options which can be set when creating a column. The option [DEFAULT {literal | NULL | USER}] - allows to set a column value by default. It is very convenient for automatic data filling. There are 3 ways to set a default. The first is designated as literal and allows to set defaults as text constants, numbers or dates. For example, we can generate the following expressions for creating a column with text defaults: NAME VARCHAR(80) DEFAULT 'Василий Станиславович'

Thus, all the fields inserted into the table will take on defaults, i.e. if the other value has not been defined for field NAME, the string 'Vasily Stanislavovich' will appear. The second way to set a default is to specify DEFAULT NULL in definition of a column. And in records created again the value of this column will be NULL, if the other value, certainly, has not been obviously set. An example:

PRICE_1 DOUBLE PRECISION DEFAULT NULL

The third way to set a default is to specify DEFAULT USER in definition of a column. Thus in records created again this field will contain the name of the current user, i.e. the user who made a connection with InterBase and executed this insert (for more details about users see the chapter " Security in InterBase: users, their functions and rights " (part 4)). For some fields it is essential that the field had a nonempty value. For example, a field that cannot be empty according to problem specification. In order to set a constraint at the level of a database that a field should have a defined value, it is necessary to make the following addition to the description of a column:

NAME VARCHAR(80) NOT NULL

Thus, there will be a field in which nulls cannot be stored. Usually constraint NOT NULL is combined with option DEFAULT that definitely assigns a correct value to this field. But frequently constraint NOT NULL is not enough. For example, in a case of storing prices in a database it is quite clear that they cannot take on negative values (though it would be great if we were paid extra when buying goods). In order to make a server check values of the prices inserted into a database on a condition of positivity, it is necessary to define a column in the following way:

PRICE_1 DOUBLE PRECISION CHECK (PRICE_1>0)

Values inserted into column PRICE_1 will be checked on a condition of positivity. It should be noted, that different consistent options can be combined, and for example, we can set a nonempty value and check on positivity:

PRICE_1 DOUBLE PRECISION NOT NULL CHECK (PRICE_1>0)

When creating the columns some options cannot be combined, for example it is impossible to set NULL by default and simultaneously nonempty value constraint. It should be marked, that checks can perform a set of useful functions on data management in a database. We will consider their usage in detail in the chapter "Database constraints".

So, we have considered the ways of creating tables and fields with different options. However, there are cases when we have to alter the table that already exists. Certainly, we can re-create the table entirely. First, we should execute the command of deleting the table and then creating it again. For example:

DROP TABLE Table_example; CREATE TABLE Table_example(ID NUMERIC(15,2);

But such way of altering tables has significant drawbacks. When deleting a table using DROP command all the data the table contains are deleted and in order not to lose them, it is necessary to copy them to temporary tables. It is rather troublesome. Therefore, there is ALTER TABLE command for easy alteration of tables’ structure that allows to add new fields, to delete existing, as well as to add / delete constraints of reference integrity.

For example, we want to add one more column to the table intended for storing the data about a patronymic of a person:

ALTER TABLE Table_example ADD Patronimic VARCHAR(80);

After execution of this command our table Table_example will have a new column with name Patronimic and VARCHAR (80) type. If we want to delete a column with name NAME from the table, we should execute the following:

ALTER TABLE Table_example DROP Name;

You can see the complete syntax of ALTER TABLE statement in [1]. It is a very useful command, and we will often use it.

And what to do, you will ask, if it is necessary to modify a column? For example, we have decided that for storing names it is better to use field HUMAN_NAME than NAME. In this case we can apply ALTER TABLE

ALTER TABLE Table_example ALTER COLUMN NAME TO HUMAN_NAME;

If we have decided to alter the type of a field, for example to increase a number of characters, stored in a field, we will have to change the domain of this field using the statement ALTER DOMAIN (see the chapter "Data types" above).

So, we have considered creation and modification of tables in InterBase. Now it is time to go deep into the theory of databases a little. InterBase as it has already been told is a relational database. In addition, it means that every record in the table should have a character according to which one record can be distinguished from another. The special mechanism of unique keys serves this purpose.

Primary keys in tables

Certainly, we can create a table, which does not contain any keys. We are not prohibited to do this. But, as it was said earlier, creation of an efficient database is impossible without observing the rules of normalization. Presence of keys is the most important element of normalization. Therefore, though we do not aim at considering the theory and normalization of databases, we should introduce a definition of keys and review their function in InterBase. We will go step by step and we will begin with the most common type of keys – a primary key.

So, what is a primary key? It is one or more fields in the table, uniquely identifying records within this table. It sounds difficult, however actually everything is very simple. Imagine an ordinary table, for example the accounting sheet. What is the very first column? That is right, the serial number - 1, 2, 3 … This number denotes a unique line within the table, and it is enough to know this number to find a string in this table. In this example, it will be a primary key. Overwhelming majority of tables in a relational database necessarily has a primary key (PK - abbreviation from Primary key). The common guideline when creating the tables is to create a primary key. A primary key can be created when creating a table or later. Let’t assume that by the moment of creating a table we have decided that field ID will be our primary key. Then we can add a primary key in the following way:

CREATE TABLE Table_example ( ID INTEGER NOT NULL, NAME VARCHAR(80), PRICE_1 DOUBLE PRECISION, CONSTRAINT pkTable PRIMARY KEY (ID));

What has be done to create a primary key for the table table_example? Let’s search what has varied in definition of the table? First, column ID has got additional definition NOT NULL. It is important, because the primary key should be unique and barred from undefined values. And NULL as you know is an undefined value. Thus, all the fields included in a primary key, should have constraint NOT NULL. To complete the creation of a primary key the following should be written at the end of the table: CONSTRAINT ()

You can find the complete syntax of constraints in the chapter of "Database constraints" þ. 1, and for our example of a primary key, it will look like:

CONSTRAINT pkTable PRIMARY KEY (ID)

Here pkTable is a name of a primary key, and ID are columns it contains. Such way of defining primary keys for tables is convenient when mass creation of tables (for example, when constructing the prototype of a database on the basis of the scripts received by different CASE-resources). But what to do if we need to add / delete a primary key to the table when it already exists and is filled with the data? For this purpose, one more extension of the command – ALTER TABLE – should be applied. An example of adding a primary key to our table:

ALTER TABLE TABLE_EXAMPLE ADD CONSTRAINT FF PRIMARY KEY (ID);

Thus, the table Table_example will have precisely the same primary key as in the previous example when it was created together with the table. In order To delete a primary key the following command should be entered:

ALTER TABLE Table_example DROP CONSTRAINT pkTable;

Thus, the key with the name pkTable will be removed from a database.

Generators – best friends of primary keys

We have to say some words about implementation of a primary key. As it is meant for supporting uniqueness, no two records in one table can have the same values of this key. That is, to meet condition, when inserting a new record into the table InterBase has to check all the records in the table and find out whether the table contains such values or not. For quick search, InterBase has a mechanism of indexes - special InterBase objects, which allow to find a record in the table very quickly. Therefore, when creating and deleting the primary key the index is created or deleted for that field (or fields) that is included in the primary key.

As it was said earlier, the primary key can contain some fields. Thus, we can notice a uniqueness of a combination of values of these fields. For example, if we define a key for fields ID and NAME, the server will control that there are no identical combinations of these fields in the table. That is, combinations of fields ID and 1 and "Ivanov", 2 and "Ivanov" will be correct as they differ in values of field ID.

Thus, the primary key can include some fields of any types. However, in practice the most common type of the key is the counter - an integer field, which contains increasing values. Why is it so? It is the reflection of old dispute between natural and substitute keys. The concept of natural keys says that as a key we shall try to use the values really existing in data domain a database reflects. For example, if we develop a system of registration of people for a passport office, according to this concept a combination of number and a series of the passport should be taken as a primary key. Really, every person must have a unique combination of a number and a series of the passport. However, what to do with the fact that a person may change the passport during his life (in connection with reaching the defined age, marrying, etc.)? In this case, we will have to change the number and the series of the passport set in correspondence with the concrete person, i.e. actually, to change our primary key. It is undesirable from the point of view of development of databases applications: taking into account a ramified system of communication between tables (the following chapter is devoted to this) the developer will have to take great pains to control this situation.

Therefore, a substitute key is used in most cases. Substitute - means artificial, i.e. not existing in data domain that our database describes, and artificially created - for the convenience of development of database applications. As it was said, usually a counter is a primary key. Some DBMS, such as Paradox and MS SQL, have a special type - the counter (auto increment). When adding a new record to the table, a field value automatically increases with this type for the value of an increment - usually per units. In InterBase there is no field of the counter type, however such behavior can be realized. For creating the field, which would be automatically filled when adding a record to the table, the collection of resources is used: first of them is generator.

What is generator? Speaking in a simple way, generator is a named counter. Within a database, we can create a counter, give a unique name to it within this base and control the values of this counter. It will be a generator. Here is an example of DDL statements that will explain it to you:

CREATE GENERATOR g1; SET GENERATOR g1 TO 2445;

In the first line of this example generator with name g1 is created, and in the second line value 2445 is assigned to this generator. Now there is a question how to use the obtained generator. There is a built-in function GEN_ID in InterBase for getting and changing the values of generators. This function takes as parameters the name of the generator and the value of the increment, which is to be applied to the given generator, and returns the integer value corresponding to the value of the generator, received as a result of adding the increment to it. Here is an example of GEN_ID function call in trigger or stored procedure:

Current_value = GEN_ID (g1, 1)

If we want to receive the value of the generator, we can use the following inquiry:

SELECT GEN_ID(g1, 1)FROM RDB$ DATABASE

As table RDB $ Database always contains only one record we will receive a value of the generator g1 as a result of the given inquiry.

Here current_value is a variable (in the following chapters you will find the information how to use variables in InterBase), g1 - a generator, 1- increment. In this example, the value of generator g1 will get to the variable current_value after adding increment 1 to it, i.e. the following value of the generator. Pay attention, that the increment can be not equal to 1! Moreover, it can be even negative: Current_value = GEN_ID (g1, -23)

As a result of executing this function the current value of the generator g1 will be minus 23. As you can see, a range of possible applications of generators is rather wide - it can be used not only for obtaining values of primary keys, but also for watching global changes in a database.

People familiar with databases may ask a question: "what will happen if simultaneously some clients will try to put the data to the same table and will simultaneously "pull" the generators? Will they receive the same or different values of the generator? ". They will unambiguously receive DIFFERENT values of the generator. How "simultaneous" the attempt to receive the value of the generator was, everyone who applied for would receive the unique value. It is guaranteed by "construction" of generators: they work at the lowest level of a server and no processes of recording and inserting influence them – it is often said that generators work "out of the context of transactions ". If you want to know about transactions, read the chapter "Transactions. Parameters of transactions" (part 1); how generators are arranged –"InterBase database structure" (part 4). Well, on behalf of generators we have a reliable mechanism for creating unique primary keys. However, can we use this mechanism? How to put the value received from the generator to a field of the primary key?

For this aim, there are two ways - to insert a primary key on behalf of a client and on behalf of a server. To master the first way, we should refer to the chapter "Usage of main components of FIBPlus " and to understand the second - to the chapter "Triggers" (part 1). Here we will briefly review the main point of both ways.

In a case of creating a primary key on behalf of a client, the following happens. When the record which will be inserted into a database is generated, GEN_ID function call (, 1) is executed and the received value is substituted for this record. Then inserting into the table takes place, and we are guaranteed to receive a unique primary key.

The second way – creating a primary key on behalf of a server - in general it eliminates any care from a client’s part about what value of a primary key will be. In this case when inserting the record the trigger works - a special database object, which can perform any operations when inserting / deleting/updating the records in tables. And in this trigger the following operations are fulfilled: GEN_ID function call, receiving of the required value of the generator and its inserting into the table. The advantage of the second way is that when developing a client application there is no need to worry about the creation of a primary key at all, the only thing you have to do is to write a necessary trigger once. But the disadvantage is that we cannot receive the value of the generated key in the application just right after inserting! If we use the first way, we can receive the value of the primary key though we should care about its creation every time when inserting it. It is difficult to tell for sure which way is better, all depends on a specific problem. Further in this book we will consider possible variants of settling the questions about the work with a primary key.

Conclusion

So, in this chapter we have reviewed how to create and update tables in InterBase, as well as handle the primary keys. Thus, we have considered the main objects in InterBase that can be conditionally called static, because they store only the information and do not perform its conversion. Further, we will talk about the ways of information control and information conversion within a database.