The relational database model is one of the more popular models for databases today. Android comes with a built-in database called SQLite that is designed around the relational database model. This chapter covers some of the basic concepts of a relational database. It starts with a brief history of databases, then moves to a discussion of the relational model. Finally, it covers the evolution of database languages. This chapter is meant for the reader who is largely unfamiliar with the concept of a relational database. Readers who feel comfortable with the concepts of a relational database can safely move on to chapters that discuss the unique features of the SQLite database system that comes bundled with Android.
History of Databases
Like other aspects of the world of computing, modern databases evolved over time.
While we tend to talk about NoSQL and relational databases nowadays, it is sometimes important to know “how we got here” to understand why things work the way they do. This section of the chapter presents a little history of how the database evolved into what it is today.
Note: This section of the chapter presents information that may be of interest to some readers but seem superfluous to others. Feel free to move on to the next section to get into the details of how databases work on Android.
The problem of storing, managing, and recalling data is not a new one.
Even decades before computers, people were storing, managing, and recalling data. It is easy to think of a paper-based system where important data was manually written, then organized and stored in a filing cabinet until it needed to be recalled. I need only to look in the corner of my basement to be reminded of the days when this was a common paradigm for data storage. The paper-based approach has obvious limitations, the main one being its inability to scale as the amount of data grows. As the amount of data increases, so does the amount of time it takes to both manage the data store and recall data from the data store. A paper-based approach also implies a highly manual process for data storage and retrieval, making it slow and error prone as well as taking up a lot of space. Early attempts to offload some of this process onto machines followed a very similar approach. The difference was that instead of using hard copies of the data written on paper, data was stored and organized electronically.
In a typical electronic-file-based system, a single file would contain multiple entries of data that were somehow related to other data in the file. While this approach did offer benefits over older approaches, it still had many problems. Typically, these file stores were not centralized. This led to large amounts of redundant data, which made processing slow and took large amounts of storage space. Problems with incompatible file formats were also frequent because there was rarely a common system in charge of controlling the data. In addition, it was often difficult to change the structure of the data as the usage of the data evolved over time. Databases were an attempt to address the problems of decentralized file stores.
Database technology is relatively new when compared to other technological fields, or even other areas of computer science. This is primarily because the computer itself had to evolve to a point where databases provided enough utility to justify their expense. It wasn’t until the early to mid-1960s that computers became cheap enough to be owned by private entities and possessed enough power and storage capacity to make the concept of a database useful. The first databases used models that are different from the relational model discussed in this chapter. In the early days, the two main models in widespread use were the network model and the hierarchical model.
In the hierarchical model, data is organized into a tree structure. The model maintains a one-to-many relationship between parent and child records: each parent node may have multiple children, but each child node has no more than one parent. An initial implementation of the hierarchical model was developed jointly by IBM and Rockwell in the 1960s for the Apollo space program. This implementation was named the IBM Information Management System (IMS). In addition to providing a database, IMS could be used to generate reports. The combination of these two features made IMS one of the major software applications of its time and helped establish IBM as a major player in the computer world. IMS is still a widely used hierarchical database system on mainframes.
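The tree constraint described above can be illustrated with a minimal Python sketch. It is not a representation of how IMS works internally; the `TreeRecord` class and the record names (`mission`, `vehicle`, `engine`) are hypothetical and chosen only for illustration.

```python
class TreeRecord:
    """A record in a hierarchical (tree-shaped) data store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.parent = None    # a record has at most one parent
        self.children = []    # a record may have many children

    def add_child(self, child):
        # Enforce the hierarchical rule: a child may have only one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)

    def path(self):
        # Follow the single-parent chain up to the root.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


# Hypothetical records, for illustration only.
mission = TreeRecord("mission")
vehicle = TreeRecord("vehicle")
engine = TreeRecord("engine")
mission.add_child(vehicle)
vehicle.add_child(engine)
print(engine.path())
```

Because every record has a single parent, each record is reachable by exactly one path from the root, which is what makes the structure a tree.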
The network model was another popular early database model. Unlike the hierarchical model, the network model formed a graph structure, which removed the restriction that each child node have no more than one parent. This structure allowed the model to represent more complex data structures and relations. In addition, the network model was standardized by the Conference on Data Systems Languages (CODASYL) in the late 1960s.
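As a rough sketch of the difference, the following Python fragment lets one record belong to more than one owner, which a tree does not allow. The `NetworkRecord` class and the supplier, project, and part records are hypothetical, and the sketch does not follow the CODASYL specification.

```python
class NetworkRecord:
    """A record in a graph-shaped (network model) data store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.owners = []     # unlike a tree, any number of owners is allowed
        self.members = []    # records this record owns

    def link(self, member):
        # Record the relationship in both directions.
        self.members.append(member)
        member.owners.append(self)


# Hypothetical records: one part is owned by both a supplier and a project.
supplier = NetworkRecord("supplier")
project = NetworkRecord("project")
part = NetworkRecord("part")
supplier.link(part)
project.link(part)
print([owner.name for owner in part.owners])
```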
The Introduction of the Relational Model
The relational database model was introduced by Edgar Codd in 1970 in his paper “A Relational Model of Data for Large Shared Data Banks.” The paper outlined some of the problems of the models of the time and introduced a new model for efficiently storing data. Codd went into detail about how a relational model solved some of the shortcomings of the then-current models and discussed some areas where a relational model needed to be enhanced. The paper is viewed as the introduction of relational databases, and its ideas were refined over time into the relational database systems that we use today. While very few, if any, modern database systems strictly follow the guidelines that Codd outlined in his paper, they do implement most of his ideas and realize many of the benefits.
The Relational Model
The relational model makes use of the mathematical concept of a relation to add structure to data that is stored in a database. The model has a foundation based in set theory and first-order predicate logic. The cornerstone of the relational model is the relation.
In the relational model, conceptual data (the modeling of real-world data and its relationships) is mapped into relations. A relation can be thought of as a table with rows and columns. The columns of a relation represent its attributes, and each row represents an entry in the table, called a tuple. In addition to having attributes and tuples, the relational model mandates that the relation have a formal name.
Let’s consider an example of a relation that can be used to track Android OS versions. In the relation, we want to model a subset of data from the Android dashboard (https://developer.android.com/about/dashboards/index.html). We will name this relation os. The relation depicted in Table 1.1 has three attributes (version, codename, and api) representing the properties of the relation. In addition, the relation has four tuples tracking Android OS versions 5.1, 5.0, 4.4, and 4.3. Each tuple can be thought of as an entry in the relation that has properties defined by the relation attributes.
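To make the mapping from relation to table concrete, here is a minimal sketch that builds the os relation in an in-memory SQLite database. It uses Python's standard sqlite3 module so it can be run on a desktop; it is not the Android API. The column types, and the codename and API-level values for each version, are assumptions taken from public Android version history rather than from the text above.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# The relation's name is "os"; its three attributes become columns.
# The column types are an assumption made for this sketch.
conn.execute("CREATE TABLE os (version TEXT, codename TEXT, api INTEGER)")

# Each tuple of the relation becomes one row.
# Codenames and API levels are assumed from public Android version history.
rows = [
    ("5.1", "Lollipop", 22),
    ("5.0", "Lollipop", 21),
    ("4.4", "KitKat", 19),
    ("4.3", "Jelly Bean", 18),
]
conn.executemany(
    "INSERT INTO os (version, codename, api) VALUES (?, ?, ?)", rows
)

# Read back the attribute names (index 1 of each table_info row is the name).
attributes = [col[1] for col in conn.execute("PRAGMA table_info(os)")]

# Read back the tuples, newest API level first.
tuples = conn.execute(
    "SELECT version, codename, api FROM os ORDER BY api DESC"
).fetchall()

print(attributes)
for entry in tuples:
    print(entry)
```

Each printed row has one value per attribute, which is the sense in which a tuple's properties are defined by the relation's attributes.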