An entity is some sort of “thing” that a database keeps track of. Entities that are some part of the domain being modeled are called kernel entities or independent entities. There are other entities called associative entities that help relate these kernel entities to one another.
Entities have a set of attributes. Each attribute is a single data element that is part of the entity. A single set of attributes is called an instance.
To clarify this, consider a table of people with first name, last name, phone, address, city, state and zip. The people table is an entity. Each of the first name, last name and so on are attributes. And an instance is a row in the table, with one of each attribute.
Cardinality and Modality
Cardinality defines how many instances in an entity can relate to a single instance in another entity. There are three possible types of relations: one-to-many, many-to-many and one-to-one. Modality defines the minimum number of times that two entities can relate to each other. This is either zero or one.
One-To-Many Relationships
One-to-many relationships are relationships between two entities, where one instance (row) on the “one” side can relate to multiple instances on the “many” side. Modality can be either zero or one on either side.
One-to-many relationships are the “bread and butter” of a database design. In a fully normalized database, all entity relationships (with the exception of perhaps a few one-to-one relationships for reasons mentioned above) have a one-to-many cardinality. (As explained further in the next section, many-to-many relationships are reduced to two one-to-many relationships by inserting an associative entity between them.)
For example, suppose we have a “trees” entity and a “yards” entity. A yard can have any number of trees in it, including none. A tree is either in one yard or somewhere else, such as in a park. These two entities would have a one-to-many relationship, with trees on the “many” side (a yard can have many trees) and yards on the “one” side (a tree can only be in one yard). Also, both sides have a modality of zero: a tree doesn’t have to be in a yard, and a yard doesn’t have to have any trees.
Now, suppose we have a “teachers” entity and a “classes” entity. A teacher can teach many classes, and each class is taught by one teacher. This is another one-to-many relationship, with classes on the “many” side and teachers on the “one” side. Furthermore, a teacher has to teach at least one class, or he isn’t a teacher. And a class has to have a teacher, or it can’t be a class. Therefore, in this relationship, both sides have a modality of one.
Many-To-Many Relationships
Suppose we now want to model the relationships between students and classes. A student can go to more than one class, and a class can have more than one student. So, both sides are a “many” side. This is a many-to-many relationship.
To model many-to-many relationships, we create an associative entity (or join entity, link entity, bridge entity, or any of a number of other terms) that is in one-to-many relationship with each of the two entities. In the case of students and classes, for example, we can create a “seats” entity, each instance of which represents a seat for one student in one class. Seats are the “many” side of the relationship with both students and classes (each student has many seats, one for each class, and each class has many seats, one for each student), while both students and classes are on the “one” side (a seat is for one class and one student).
As for modality, a student has to have a seat in at least one class, and a class has to have at least one seat available. Each seat has to have at least one student and one class. So, both entity relationships (student-seat and class-seat) have a modality of one on both sides.
But what about empty seats? Not every class is full, right? Most classrooms have empty desks in them, after all. Shouldn’t there be a modality of zero on the student side of the student-seat relation? That sounds logical, but if there were a class full of empty desks it wouldn’t run. A “seat” is not a “desk” or available spot in the class (at some point before enrollment is complete it would be, but we’re modeling a completed schedule here); it’s a desk with a student sitting at it when a class is going on. By this definition, there’s a modality of one on both sides.
Now, suppose students have optional extracurricular activities in the form of clubs. Now, we have a many-to-many relationship between students and clubs (clubs have more than one student, students can join more than one club), and we can create an associative entity called “memberships” that is in one-to-many relationship with both students and clubs. A “membership” instance is one student’s membership in one club; both clubs and students can have many memberships.
Since, presumably, a club with no students belonging to it isn’t a club, we have a modality of one on both sides of the club-membership relation. (A club has to have one or more memberships; a membership has to have one club.) However, while a membership must have one and only one student, a student can have zero or more club memberships. So, in the student-membership relation, the student side has a modality of one (a membership must have one student), while the membership side has a modality of zero.
Why is this different from students and seats? How can memberships have a modality of zero here, when the seats entity has to have a modality of one? The answer lies in how we have defined what a “student” is in the business rules. As we have said, if a student isn’t in at least one class, he isn’t a student, but he doesn’t have to join any clubs to be a student. So, a modality of one with seats, and zero with memberships.
Database Design Steps
A basic database design consists of three steps:
- Identifying entities and their attributes
- Identifying one-to-many relationships
- Identifying many-to-many relationships, and resolving them with associative entities
These steps are essential. Beyond them, splitting entities into one-to-one relations for some of the reasons I have described can improve the design.
Applying these steps will solve most database design problems. Relational Database Design: One-to-One Relationships and When to Use Them explains how to use one-to-one relationships to improve efficiency and performance.