Attribute
The attributes of a relation provide the data points for each tuple. In order to add structure to a relation, each attribute is assigned a domain that defines what data values can
be represented by the attribute. The domain can place restrictions on the type of data
that can be represented by an attribute as well as the range of values that an attribute
can have. In the previous example, the api attribute is limited to the domain of integers
and is said to be of type integer. Additionally, the domain of the api attribute can be
further reduced to the set of positive integers (an upper bound can also be defined if the
need arises).
The concept of a domain for a relation is important to the relational model as it allows
the relation to establish constraints on attribute data. This becomes useful in maintaining
data integrity and ensuring that the attributes of a relation are not misused. In the relation
depicted in Table 1.1, a string api value could make certain operations difficult or allow
operations to produce unpredictable results. Imagine adding a tuple to the os relation that
contains a nonnumeric value for the api attribute, then asking the database to return all
os versions with an api value that is greater than 19. The results would be unintuitive and
possibly misleading.
The number of attributes in a relation is referred to as its degree. The relation in
Table 1.1 has a degree of three because it has three attributes. A relation with a degree
of one is called a unary relation. Similarly, a relation with a degree of two is binary, and
a relation with a degree of three is called ternary. A relation with a degree higher than
three is referred to as an n-ary relation.
Tuples
Tuples are represented by rows in the tabular representation of a relation. They represent
the data of the relation containing values for the relation’s attributes.
The number of tuples in a relation is called its cardinality. The relation in Table 1.1
has a cardinality of four since it contains four tuples.
An important point regarding a relation’s cardinality and its degree is the level of
volatility. A relation’s degree helps define its structure and will change infrequently.
A change in the degree is a change in the relation itself.
In contrast, a relation’s cardinality will change with high frequency. Every time a tuple
is added or removed from a relation, the relation’s cardinality changes. In a large-scale
database, the cardinality could change several times per second, but the degree may not
change for days at a time, or indeed ever.
Intension/Extension
A relation’s attributes and the attributes’ domains and any other constraints on attribute
values define a relation’s intension. A relation’s tuples define its extension. Since intension and extension are related to cardinality and degree respectively, it is easy to see that a
relation’s intension will also remain fairly static whereas it extension is dynamic, changing
as tuples are added, deleted, and modified. A relation’s degree is a property of its intension,
and its cardinality is a property of its extension.
Schema
The structure of a relation is defined by its relational schema. A schema is a list of
attributes along with the specification of the domain for those attributes. While the tabular
form of a relation (Table 1.1) allows us to deduce the schema of a relation, a schema can
also be specified in text. Here is the text representation of the schema from Table 1.1:
os(version, codename, api)
Notice the name of the relation along with the list of the attributes. In addition, the
primary key is sometimes indicated with bold column names. Primary keys are discussed
later in the chapter.
Properties of a Relation
Each relation in the relational model must follow a set of rules. These rules allow the
relation to effectively represent real-world data models as well as address some of the
limitations of older database systems. Relations that adhere to the following set of rules
conform to a property known as the first normal form:
■■ Unique name: Each relation must have a name that uniquely identifies it. This
allows the relation to be identified in the system.
■■ Uniquely named attributes: In addition to a uniquely named relation, each
attribute in a relation must have a unique name. Much like the relation name, the
attribute’s unique name allows it to be identified.
■■ Single-valued attributes: Each attribute in a relation can have at most one value
associated with it per tuple. In the example in Table 1.1, each api level attribute has
only a single integer value. Including a tuple that has multiple values (19 and 20) is
considered bad form.
■■ Domain-limited attribute values: As discussed previously, the value of each
attribute for a tuple must conform to the attribute’s domain. The domain for an
attribute defines the attribute’s “legal” values.
■■ Unique tuples: There should be no duplicate tuples in the relation. While there
may be parts of a tuple that have common values for a subset of the relation’s
attributes, no two tuples should be identical.
■■ Insignificant attribute ordering: The order of the attributes in a relation has no
effect on the representation of the relation of the tuples defined in the relation. This
is because each attribute has a unique name that is used to refer to that attribute.
For example, in Table 1.1, if the column ordering of the codename and api
attributes were switched, the relation would remain the same. This is because the
attributes are referred to by their unique names rather than their column ordering.
■■ Insignificant tuple ordering: The order of the tuples in a relation has no effect
on the relation. While tuples can be added and removed, their ordering has no
significance for the relation.
Relationships
Most conceptual data models require a relational model that contains multiple relations.
Fortunately, the relational model allows relationships between multiple relations to be
defined to support this. In order to define relationships between two relations, keys must
be defined for them. A key is a set of attributes that uniquely identify a tuple in a relation.
A key is frequently used to relate one relation to another and allows for complex data
models to be represented as a relational model.
■■ Superkey: A superkey is a set of attributes that uniquely identify a tuple in a
relation. There are no limits placed on the number of attributes used to form a
superkey. This means that the set of all attributes should define a superkey that is
used for all tuples.
■■ Candidate key: A candidate key is the smallest set of attributes that uniquely
identify a tuple in a relation. A candidate key is like a superkey with a constraint
placed on the maximum number of attributes. No subset of attributes from a
candidate key should uniquely identify a tuple. There may be multiple candidate
keys in a relation.
■■ Primary key: The primary key is a candidate key that is chosen to be the primary
key. It holds all the properties of a candidate key but has the added distinction of
being the primary key. While there may be multiple candidate keys in a relation that
all uniquely identify a single row, there can be only one primary key.
■■ Foreign key: A foreign key is a set of attributes in a relation that map to a
candidate key in another relation.