Before building a database for yourself or a corporation, you need to understand a bit about database structure.
We are going to go into some depth on database design, which is all about structuring your database the best way possible. First, though, I thought it would be appropriate to give an overview of the Db2 structure and start using the database right away.
Once we have the fundamentals down, we will go back to database design to learn the details. That way we can discuss design more concretely, because we’ll already have some experience with the fundamentals.
The very first thing that you need to know is that data is organized into different tables. We first discussed tables in the Introduction to Db2. Each table describes one type of thing (for example, users). You should see the table as the blueprint describing how the data should be structured. Each row is an instance of whatever the table describes.
When we discuss database design, we will learn how to break up our data into different tables. For now, the important thing to understand is that a database consists of multiple tables related to one another.
In a Db2 database, tables are organized into schemas. A schema is just a fancy word for group. This allows us to organize our database in a more logical way when working with a large database. Tables are not the only things organized into schemas; most other database objects are too.
The things we create in our database are called database objects and they are all going to have a name. These objects include tables, indexes, functions, stored procedures, triggers, etc.
When we create an object, we are going to give it a custom name to identify it. This name is called the identifier.
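The schema name combined with the identifier is the fully qualified name of an object.
As an example, say we have a table named users inside the db2inst1 schema. (By default, Db2 uses the authorization ID of the connected user as the schema name, which is often db2inst1 on a default Linux install.) The fully qualified name of our object would be db2inst1.users. Db2 stores unquoted names in uppercase, so you will also see this written as DB2INST1.USERS.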
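To make the fully qualified name a bit more concrete, here is a minimal sketch in Db2 SQL. The schema name blog and the columns of the users table are made up for illustration; they are not part of any default install.

```sql
-- "blog" is a hypothetical schema name used only for this example.
CREATE SCHEMA blog;

-- Create the table using its fully qualified name: schema.identifier
CREATE TABLE blog.users (
    id   INT NOT NULL,
    name VARCHAR(100)
);

-- Refer to the table by its fully qualified name.
SELECT id, name FROM blog.users;

-- Or make blog the default schema for unqualified names in this session.
-- The name is uppercase because Db2 stored the unquoted name as BLOG.
SET CURRENT SCHEMA = 'BLOG';
SELECT id, name FROM users;
```

The last two statements are equivalent to the first SELECT. When a name is not qualified, Db2 fills in the current schema for you.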
For a minute I want you to try to visualize data. If you are looking at the structure of a company’s data from a thousand miles away, you will see a database or multiple databases. The database is the biggest piece containing our data. If we zoom into one database, we see schemas. These are the next biggest piece, second to the database. Zooming into one schema, we begin to see our database objects, such as tables. Zooming into one table we begin to see our columns and rows. Well, if you zoom in on one column, you will find even more detail.
If that made no sense, I’m sorry. Google Maps doesn’t offer the data-architecture zoom feature any more… Essentially, what I am trying to say is that we not only have to think about the structure from the big perspective, but also the very small details all the way down to the column level.
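If you would rather do the zooming with queries, Db2 keeps a description of its own structure in catalog views. The sketch below assumes Db2 for Linux, UNIX, and Windows (other platforms name their catalog differently) and assumes a USERS table exists in the DB2INST1 schema, which may not be true on your system.

```sql
-- Zoom level 1: the schemas inside the current database.
SELECT schemaname
FROM syscat.schemata;

-- Zoom level 2: the tables inside one schema (type 'T' means table).
SELECT tabname
FROM syscat.tables
WHERE tabschema = 'DB2INST1'
  AND type = 'T';

-- Zoom level 3: the columns of one table, with their data types.
SELECT colname, typename, length
FROM syscat.columns
WHERE tabschema = 'DB2INST1'
  AND tabname = 'USERS'
ORDER BY colno;
```

Each query steps one level further in: database to schemas, schema to tables, table to columns.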
Every column is going to have a data type, which determines the type of data (was that too obvious?) each row can store in that column.
So for example, is the column going to store numbers, strings, single characters, or binary data? The column’s data type determines what is allowed.
Data types are a huge topic, so we cannot cover them all in this introduction. Let’s talk about the basic data types we’ll be using in the early blogs. First, stare at this image from the Db2 docs diagramming the types of data types:
[Image: diagram of the Db2 data types, from the Db2 docs]
As you can see from this chart, there are a lot of data types grouped in different categories.
Later on we will go in depth on the different data types, but for now we are going to learn just a few that we will be using immediately in the beginning of this series.
INT
The integer data type (INT) is used for whole numbers and can be used to give each record an ID. Later we will learn about primary keys (unique identifiers), but for now know that the INT data type is very important.
CHAR and VARCHAR
CHAR and VARCHAR are used for storing string data. A string is a sequence of characters, where a character is any acceptable letter/number/symbol. Strings are often signified by opening and closing quotes: “This is a string” is a perfect example. In SQL itself, strings are written with single quotes, like 'This is a string'.
The CHAR data type is used for strings that are supposed to be the same length for every single row. If you store a shorter value, Db2 pads it with spaces up to the declared length. The VARCHAR data type is used for variable-length strings, up to a maximum length that you choose. Think VARCHAR = VARiable CHARacter.
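Here is a small sketch of when you might pick one over the other. The countries table and its columns are hypothetical, chosen only because a two-letter country code is always the same length while a country name is not.

```sql
-- Hypothetical table for illustration.
CREATE TABLE countries (
    code CHAR(2)      NOT NULL,  -- always exactly two characters, e.g. 'US'
    name VARCHAR(100) NOT NULL   -- length varies from row to row, up to 100
);

-- String values are written with single quotes.
INSERT INTO countries (code, name) VALUES ('US', 'United States');
INSERT INTO countries (code, name) VALUES ('FR', 'France');
```

The number in parentheses is the length: the exact length for CHAR, and the maximum length for VARCHAR.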
BOOLEAN
The BOOLEAN data type is used to store data that has only two possible states: true or false.
DATE, TIME, and TIMESTAMP
The DATE, TIME, and TIMESTAMP data types are used to store dates and times. A DATE holds a calendar date, a TIME holds a time of day, and a TIMESTAMP holds both together. Later on we will get into the specifics of how this data is formatted, but for now just know that they exist and should be used when working with data relating to time and events.
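To see the data types from this blog side by side, here is a rough sketch of one table that uses INT, BOOLEAN, DATE, TIME, and TIMESTAMP. The events table and its column names are invented for the example, and the BOOLEAN column assumes a reasonably recent Db2 version (11.1 or later on Linux, UNIX, and Windows).

```sql
-- Hypothetical table for illustration.
CREATE TABLE events (
    id         INT     NOT NULL,  -- whole number used as the record ID
    is_public  BOOLEAN NOT NULL,  -- TRUE or FALSE
    event_date DATE,              -- calendar date only
    start_time TIME,              -- time of day only
    created_at TIMESTAMP          -- date and time together
);

-- Dates and times can be given as strings in a format Db2 recognizes.
-- CURRENT TIMESTAMP asks Db2 for the current date and time.
INSERT INTO events (id, is_public, event_date, start_time, created_at)
VALUES (1, TRUE, '2024-03-15', '18:30:00', CURRENT TIMESTAMP);
```

Notice that every column states its data type explicitly. Db2 will reject a value that does not fit the type, such as a word in the id column.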
The important thing to take away from this blog is a general understanding of how data is structured in a database and that we have to be explicit when it comes to the type of data. We control the type of data by giving each column a data type.
IMPORTANT: In the next blog we are going to begin using Db2. This means that you should have an edition of Db2 installed and ready to go. To get the Community Edition, use this link. The installation is pretty straightforward and will hopefully not give you any problems!