We talked about relationships in the relationship blog. We talked a little about how to structure relationships in a database, but this blog will be dedicated to specifically that.
One to One
One to one relationships over multiple tables are not very common. This is because a one to one relationship can be stored in one table.
Think of the example of a user table and a social security number (SSN) attribute. This would be unnecessary:
This is unnecessary because we are not adding any features to our database. The user still has only one SSN and one SSN belongs to one user. This is a one to one relationship and does not need to be stored across multiple tables. Not to mention, the SSN is probably not a good thing to use as a primary key (just throwing that out there).
Now, keep in mind though that a social security number is not necessarily 100% unique. I believe there have been cases of repeating social security numbers in America. This means that, potentially, a social security can be used by more than one person. You may say that this is a one to many to relationship, but when we design relationships we want to worry more about the column meaning, not the chance that a value may be repeated. To explain farther, in a one to one relationship between the user and the SSN, we would practically have 100% unique social security numbers with 1 social security number being assigned to one user. Implementing a one to many relationship would change the whole meaning of the social security number, saying that it is proper for a social security number to be assigned to many people.
Because the social security number is a one to one relationship, we could store it like this:
The only thing that is different is that ssn is no longer a foreign key pointing to another table.
The only time we would need to store a one to one relationship over multiple tables is if we had columns describing the second entity. If, for some reason, we had attributes describing the social security number, they would need to be stored in a separate table because they are not attributes about the user, but the social security number. Let’s say we had a simplified credit card company database with a one to one relationship of a person, and an issued credit card. The company only issues an individual one credit card and that credit card can be issued to only one person. Here is a one to to one relationship over two tables:
Here is an example of a one to one relationship.
This would be necessary because if we stored max_amount, issue_date, and expiration_date in the card-holder table, we would have a table that stores information about people and credit cards. We have already learned that we must follow the rule of one (one table stores information about only one entity).
One to Many
One to many relationships are stored over multiple tables. The primary key is the one side and the foreign key is the many side. Let’s update our credit card database to allow a card holder to have multiple cards.
Here you can see the previous example updated with a one to many relationship.
One of the differences in this model compared to the previous is that the end of the blue line breaks off into three lines:
Here you can see the crow feet telling us the relationship between the two tables.
This is called Crow’s foot notation. There is more to crows foot notation than just this, but we will explain everything in an blog dedicated to crow’s foot notation. Anyways…
We also removed the card_id from the card_holder table because we no longer have only one card and it is unnecessary to store this information in the card_holder table. If we wanted to store this information in the card holder_table, we would have this incorrect table:
Better solution: Thanks to the holder_id in the card table, we know who owns every single credit credit card. We can use SQL to generate a table telling us all of the cards owned by a specific user using a WHERE clause. We can select all of the cards where holder_id = 543, for example. This would create a new view giving us every credit card owned by the holder with the holder id of 543. We could also do the same search using first name and last name with a join statement.
Once again, we can see that storing a one to many relationship in two tables makes sense. If we stored it in one table, we would have this incorrect table:
We haven’t studied normal forms, but you will soon learn that this is a violation.
This is incorrect because we are assuming that every card holder will have four cards. If a card holder had only 3, one column would be null. Or, if a card holder had 5 cards, we would have to create a new column. The third and final incorrect solution is to have a column “card_ids” which stores multiple values. All three of these methods are incorrect! That is why we store one to many relationships over two tables.
When it comes to one to many relationships, the one side is the parent and the many side is the child.
Let’s look back at our credit card database:
The one side of the relationship is the parent. The many side of the relationship is the child.
One card holder can have many credit cards. Each credit card has one holder. The parent can have multiple children but the children can only have one parent.
Many to Many
Many to many relationships can not correctly be created in a database. This is because we have a never ending parent-child loop.
Imagine if we allowed a card holder to have multiple credit cards and a credit card can be co-signed by multiple people. This would mean that the credit card is the parent and the card holders are the children (one credit card is owned by many people). The card holder is also the parent and the cards are the children(one card holder can have multiple cards). This would mean both entities are the parents and both entities are the children….freakkkyyyy! This is said to be an unresolved relationship.
If we tried to force a many to many relationship into a database, we would have to have either multiple columns for multiple key values or a single key column containing multiple values. This is not going to work.
The solution to storing a many to many relationship in a database is to break it up into two one to many relationships. We use an intermediary table to connect the two one to many relationships. Let’s illustrate this with the credit card company allowing co-ownership from multiple people:
Here we can see that the holder_id and card_id of the table card_holder_card are both foreign keys to the other two tables.
Although this setup can be improved some, I will explain what is going on here. The intermediary table (card_holder_card) holds two foreign keys, one pointing to each table. The natural key of this table is a compound key of holder_id+card_id. A row of this table will contain a key from each table, telling us who is the owner of which card. Let’s insert some values:
We can tell that some cards are not owned (because they are not in the intermediary table) and some people do not own cards (because they are not in the intermediary table). The fact that some people who do not own cards are in the card_holder table indicates that we may want to update the meaning of our table or remove those people who do not own any cards… but that is a bit off topic. Looking at the intermediary table we can figure out that:
- Jimmy John owns one card ( card 4)
- Jill Johnson owns one card (card 4)
- Carla Jackson owns 3 cards (cards 3, 2, and 4).
This solves the parent child relationship problem, too:
The foreign keys in card_holder_card are both children of the primary keys in other tables.
This eliminates repeating data because every row in the intermediary table is unique and every column only stores one value. If we wanted to add more information about the card we would put it in the card table, if we wanted to store information about the card holder we would store it in the card holder table, but if we want to store information about the card_holder having the actual card, we must store it in the intermediary table. For example, a date of when the card was issued to a specific person is not directly related to the person or the card, but rather it is related to the process of the person owning the card. Therefore, it must go in the intermediary table. This is called a shared attribute and only makes sense when we have a relationship between two tables. The shared attribute stores values pertaining to the card and the person who owns the card, but not either one individually.
The issue of whether or not the intermediary table requires a surrogate primary key is a big one and will be addressed as necessary in blogs to come. For now, the combination of holder_id and card_id can be a compound natural key.
- The relationships between two tables are known as binary relationships.
- We can have relationships between more than two tables (logically…not physically), which we will explain when the time comes.
- One to one relationships usually do not need two tables and therefore are not a relationship between two tables at all.
- The only time a one to one relationship is required (two tables) is when we want to store information that has functional dependencies upon another column.
- one to many relationships require two tables.
- The primary key within a one to many relationship is on the one side and is the parent in the relationship.
- The foreign key is on the many side and is the child in the relationship.
- A many to many relationship has an unresolved parent-child relationship and must be broken up into three tables.
- These three tables consist of two parent-child relationships.
- The intermediary table is the solution to a many to many table.
- an intermediary table has multiple names, including junction table, resolving table, and so forth.
- Data about the many to many relationship needs to be put into the intermediary table.
- In some instances, the intermediary table may require a surrogate key, but for now the combination of the two foreign keys makes up the compound natural key.