The attributes of a relation provide the data points for each tuple. In order to add structure to a relation, each attribute is assigned a domain that defines what data values can be represented by the attribute. The domain can place restrictions on the type of data that can be represented by an attribute as well as the range of values that an attribute can have. In the previous example, the api attribute is limited to the domain of integers and is said to be of type integer. Additionally, the domain of the api attribute can be further reduced to the set of positive integers (an upper bound can also be defined if the need arises).
The concept of a domain for a relation is important to the relational model as it allows the relation to establish constraints on attribute data. This becomes useful in maintaining data integrity and ensuring that the attributes of a relation are not misused. In the relation depicted in Table 1.1, a string api value could make certain operations difficult or allow operations to produce unpredictable results. Imagine adding a tuple to the os relation that contains a nonnumeric value for the api attribute, then asking the database to return all os versions with an api value that is greater than 19. The results would be unintuitive and possibly misleading.
The number of attributes in a relation is referred to as its degree. The relation in Table 1.1 has a degree of three because it has three attributes. A relation with a degree of one is called a unary relation. Similarly, a relation with a degree of two is binary, and a relation with a degree of three is called ternary. A relation with a degree higher than three is referred to as an n-ary relation.
Tuples are represented by rows in the tabular representation of a relation. They represent the data of the relation containing values for the relation’s attributes. The number of tuples in a relation is called its cardinality. The relation in Table 1.1 has a cardinality of four since it contains four tuples. An important point regarding a relation’s cardinality and its degree is the level of volatility. A relation’s degree helps define its structure and will change infrequently. A change in the degree is a change in the relation itself. In contrast, a relation’s cardinality will change with high frequency. Every time a tuple is added or removed from a relation, the relation’s cardinality changes. In a large-scale database, the cardinality could change several times per second, but the degree may not change for days at a time, or indeed ever.
A relation’s attributes and the attributes’ domains and any other constraints on attribute values define a relation’s intension. A relation’s tuples define its extension. Since intension and extension are related to cardinality and degree respectively, it is easy to see that a relation’s intension will also remain fairly static whereas it extension is dynamic, changing as tuples are added, deleted, and modified. A relation’s degree is a property of its intension, and its cardinality is a property of its extension.
The structure of a relation is defined by its relational schema. A schema is a list of attributes along with the specification of the domain for those attributes. While the tabular form of a relation (Table 1.1) allows us to deduce the schema of a relation, a schema can also be specified in text. Here is the text representation of the schema from Table 1.1: os(version, codename, api) Notice the name of the relation along with the list of the attributes. In addition, the primary key is sometimes indicated with bold column names. Primary keys are discussed later in the chapter.
Properties of a Relation
Each relation in the relational model must follow a set of rules. These rules allow the relation to effectively represent real-world data models as well as address some of the limitations of older database systems. Relations that adhere to the following set of rules conform to a property known as the first normal form:
■■ Unique name: Each relation must have a name that uniquely identifies it. This allows the relation to be identified in the system.
■■ Uniquely named attributes: In addition to a uniquely named relation, each attribute in a relation must have a unique name. Much like the relation name, the attribute’s unique name allows it to be identified.
■■ Single-valued attributes: Each attribute in a relation can have at most one value associated with it per tuple. In the example in Table 1.1, each api level attribute has only a single integer value. Including a tuple that has multiple values (19 and 20) is considered bad form.
■■ Domain-limited attribute values: As discussed previously, the value of each attribute for a tuple must conform to the attribute’s domain. The domain for an attribute defines the attribute’s “legal” values.
■■ Unique tuples: There should be no duplicate tuples in the relation. While there may be parts of a tuple that have common values for a subset of the relation’s attributes, no two tuples should be identical.
■■ Insignificant attribute ordering: The order of the attributes in a relation has no effect on the representation of the relation of the tuples defined in the relation. This is because each attribute has a unique name that is used to refer to that attribute. For example, in Table 1.1, if the column ordering of the codename and api attributes were switched, the relation would remain the same. This is because the attributes are referred to by their unique names rather than their column ordering.
■■ Insignificant tuple ordering: The order of the tuples in a relation has no effect on the relation. While tuples can be added and removed, their ordering has no significance for the relation.
Most conceptual data models require a relational model that contains multiple relations. Fortunately, the relational model allows relationships between multiple relations to be defined to support this. In order to define relationships between two relations, keys must be defined for them. A key is a set of attributes that uniquely identify a tuple in a relation. A key is frequently used to relate one relation to another and allows for complex data models to be represented as a relational model.
■■ Superkey: A superkey is a set of attributes that uniquely identify a tuple in a relation. There are no limits placed on the number of attributes used to form a superkey. This means that the set of all attributes should define a superkey that is used for all tuples.
■■ Candidate key: A candidate key is the smallest set of attributes that uniquely identify a tuple in a relation. A candidate key is like a superkey with a constraint placed on the maximum number of attributes. No subset of attributes from a candidate key should uniquely identify a tuple. There may be multiple candidate keys in a relation.
■■ Primary key: The primary key is a candidate key that is chosen to be the primary key. It holds all the properties of a candidate key but has the added distinction of being the primary key. While there may be multiple candidate keys in a relation that all uniquely identify a single row, there can be only one primary key.
■■ Foreign key: A foreign key is a set of attributes in a relation that map to a candidate key in another relation.