Data is here to stay and growing at an unprecedented rate. Every day, 2.5 quintillion bytes of data are created. By 2025, it’s estimated that the world’s output of data will increase by 163 zettabytes per year (one zettabyte equals one trillion gigabytes). That is a ton of data.
Developers depend on data and use it every day. Data drives the business and thus ultimately drives software. Understanding how it is stored will help you build better software. Developers don’t have to know everything, but they should know the basics.
Most databases are built using what is called the relational model. The relational model refers to the underlying theory of how to logically organize data within a database. Relational models focus on reducing data duplication and increasing reliability.
Let’s take a look at what the relational model is, why developers should care, and how it can work for them.
Getting Relational
The relational model features tables that hold data. The “relational” part of the name comes from how the records in the model are linked together.
Each table has relationships with other tables, usually to show ownership in some way. For example, an invoice has a relationship to a customer and to a set of products.
Let’s take a look at an analogy to help illustrate how relational databases work. Think of the plates and silverware in your kitchen. You have a place for your plates, bowls, and cups. You have your silverware divided with knives, spoons, and forks each in their assigned locations.
Then it’s time to have company over for dinner. You go to the cupboard and grab a plate. You take out a spoon, a knife, and a fork. You join them together to create a place setting. Then you do the same thing for each person and fill up the dining room table.
Relational models work the same way. You have various related entities stored in separate tables. The relations between them allow you to join records together in order to retrieve the complete “place setting,” so to speak. Going back to the invoice example, you can create the invoice by joining together the customer and the products they purchased.
The relational model gives you the tool to build a rich model of data for your application, based on the entities that support your business.
Why Developers Need to Know the Relational Model
You might be sitting there reading this post and saying “Yeah, so what?” While you may not be an expert in data modeling, you need to understand how databases work. It helps you to be a better developer.
In today’s DevOps world, you may need to become a “full-stack” developer, learning and using more skills than just developing application code. Some companies, especially startups, require developers who know how to write code and build databases.
Building a good relational model results in reducing the space required to hold data. Your data model also weighs heavily on the performance of your application. Poorly constructed data models lead to major performance issues for your application. Don’t let a lack of understanding cause major problems down the road.
When you know the relational model well and use it effectively, it helps your application to run more smoothly. You can also become better at building database queries that your code uses to retrieve data from the database.
Learning these skills will make you a better developer and a more marketable one. So let’s practice building an effective relational model.
Building a Model
Here, we’re going to build a relational model for invoices. Invoices are a good place to start because they are both simple and good at illustrating the definitive concepts in creating a good model.
At its heart, a relational model is all about reducing duplication of data. You keep one record that is the source of truth for an entity. Then you reference that record where you need it without writing the same information all over again. Let’s see how that works.
First, you create a table for the customers who will receive the invoice. You’ll need to know what fields you care about in relation to the customer. We’ll keep it simple and just have a name, email address, and phone number.
The Id field is necessary to uniquely identify each record. That way, if there are two John Smiths in the database, the computer will know which one you need when you request it.
You can’t have an invoice without products, so let’s create that table next.
The interesting part of modeling data with relational tables is building relationships between tables. However, we can’t simply add a relationship directly between customers and products. Why not?
A customer can buy many different products from you. The customer may buy a couple of products one day and then come back the next day and buy more products. Also, a single product can be bought by multiple customers.
In relational modeling, this is referred to as a many-to-many relationship. We need a table in between that will tell us how customers relate to products over time. That new table is the invoice table.
We also want to reduce the amount of space necessary to store the invoices. We don’t want to replicate the name, email address, and phone number of the customer every time they buy a product.
We can do this by using the power of referential integrity, which is the principle that an id can be used to refer to one and only one entity in a database. That id can then be stored in different places by itself and used to refer to a record without replicating the data within that record.
We’ll see what that looks like in a minute. Before that, let’s create our invoice table.
Here, we see referential integrity and reduction in data size all in one. For each invoice record, we have a CustomerId that holds a single customer’s identifier. We have the same for a product. Now, when we need to pull back an invoice for a customer, we can join the invoice record with the customer and product records to put together a complete picture.
This is similar to our kitchen example, in which we joined together a plate, knife, fork, and spoon to create a complete place setting. The entire model now looks like this:
Let’s say we have a customer named John Smith with an id of “1.” We have a product named “chainsaw” with an id of “5.” The invoice record will have its own id, a “1” in the CustomerId field, and a “5” in the ProductId field, along with the date of the transaction.
When we need to see an invoice, we can go to the customer table, look for the record with an id of “1,” and find John Smith’s information. The data can then be passed to the user interface for display.
Build Your Relationships
The invoice example is simple but illustrates the power of the relational model. Several smaller tables linked together are easier to understand. Data is not replicated unnecessarily. Data integrity is maintained with ease.
To help with your first model, here are some tips:
- Model what makes sense. Don’t over complicate your model.
- Think in terms of “entities” within your application. Users, products, and invoices are entities in the above example. Each entity should become a table.
- Store only the attributes that are necessary for the business to function. Large database tables will take up too much space and be hard to manage over time.
Data is the heart of software. In today’s world, software is the heart of business. Learn how to use the relational model effectively and every business will value your skills.
Now go out and build your next model!