To see how the FROM clause and JOIN subclause fit into the SELECT statement, please refer back to Chapter 2.
The FROM Clause
The FROM clause has this basic syntax:
The FROM clause defines the set of data from which the resultset will be selected. If the FROM clause is used, at least one from_item or one select_statement is required.
The FROM clause in its simplest form refers to a single table. That table may be joined to other tables with a join subclause to create a composite of data from multiple tables, from which data can be selected.
The FROM clause can also contain a SELECT statement of its own, called a subquery. The results of the subquery become the dataset from which the outside SELECT statement selects its data. This process can be infinitely repeated, although performance degrades pretty quickly with multiple nested subqueries. (Chapter 5 covers subqueries in more detail.)
The Join Subclause
Joins are defined in the JOIN clause, which is a subclause of the FROM clause. There are several types of joins, and multiple tables can be joined with multiple JOIN clauses in a single FROM clause. (For an explanation of how the JOIN clause fits into the FROM clause syntactically, see Chapter 2.)
The syntax for the JOIN subclause is:
The join_type can be any of these:
[ INNER ] JOINLEFT [ OUTER ] JOINRIGHT [ OUTER ] JOINFULL [ OUTER ] JOINCROSS JOIN
A join expression is an expression with a result of type boolean that specifies how a row from one table matches a row from the other. It is preceded by the ON keyword, so it is sometimes called the ON clause.
In join expressions, the terms INNER and OUTER are usually omitted in practice: JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN. For the purpose of this explanation, I’ll leave them in.
It can be helpful to express the different types of joins in terms of set theory. In terms of set theory, and in terms of two sets A and B:
- An inner join is the intersection of
AandB, orA ∩ B. - A left outer join is the union of the intersection of
AandBandA‘s relative complement ofB(i.e. those members ofAthat are not included inB), or(A ∩ B) ∪ (A - B). - A right outer join is the the union of the intersection of
AandBandB‘s relative complement ofA, or(A ∩ B) ∪ (B - A). - A full outer join is the union of
AandB, orA ∪ B. - A cross join is the Cartesian product of
AandB, orA × B.
To give an example of each type of join, I’ll use these two tables:
fruits
| id | name | color_id |
|---|---|---|
| 1 | orange | 4 |
| 2 | cherry | 1 |
| 3 | lemon | 2 |
| 4 | date | null |
| 5 | pear | 2 |
colors
| id | color |
|---|---|
| 1 | red |
| 2 | yellow |
| 3 | blue |
| 4 | orange |
INNER JOIN
INNER JOIN includes only rows that have matching values as specified in the ON clause (in this case fruits.color_id and colors.id).
This SQL statement:
|
1 2 3 4 5 |
SELECT name, color FROM fruits f INNER JOIN colors c ON f.color_id = c.id; |
Will have this result:
| name | color |
|---|---|
| orange | orange |
| cherry | red |
| lemon | yellow |
| pear | yellow |
LEFT OUTER JOIN
LEFT OUTER JOIN includes all the rows that INNER JOIN does, as well as all other rows in the table that the FROM clause references (the “FROM table,” fruits in this case). Rows in the FROM table that do not have matching values in the JOIN table (the colors table in this case) have null values for all columns in the JOIN table.
This SQL statement:
|
1 2 3 4 5 |
SELECT name, color FROM fruits f LEFT OUTER JOIN colors c ON f.color_id = c.id; |
Will have this result:
| name | color |
|---|---|
| orange | orange |
| cherry | red |
| lemon | yellow |
| date | |
| pear | yellow |
RIGHT OUTER JOIN
RIGHT OUTER JOIN includes all the rows that INNER JOIN does, as well as any other rows in the table that the JOIN clause references. Rows in the JOIN table that do not have matching values in the FROM table have null values for all columns in the FROM table.
This SQL statement:
|
1 2 3 4 5 |
SELECT name, color FROM fruits f RIGHT OUTER JOIN colors c ON f.color_id = c.id; |
Will have this result:
| name | color |
|---|---|
| orange | orange |
| cherry | red |
| lemon | yellow |
| pear | yellow |
| blue |
FULL OUTER JOIN
FULL OUTER JOIN returns a combination of the rows returned by LEFT OUTER JOIN and RIGHT OUTER JOIN.
This SQL statement:
|
1 2 3 4 5 |
SELECT name, color FROM fruits f FULL OUTER JOIN colors c ON f.color_id = c.id; |
Will have this result:
| name | color |
|---|---|
| orange | orange |
| cherry | red |
| lemon | yellow |
| date | |
| pear | yellow |
| blue |
CROSS JOIN
CROSS JOIN returns the specified columns from the Cartesian product of the two tables. A Cartesian product is the set of all possible combinations of each of the elements in two sets. For example, a deck of cards is the Cartesian product of a set of ranks and a set of suits.
This SQL statement:
|
1 2 3 4 5 |
SELECT name, color FROM fruits CROSS JOIN colors; |
Will have this result:
| name | color |
|---|---|
| orange | red |
| orange | yellow |
| orange | blue |
| orange | orange |
| cherry | red |
| cherry | yellow |
| cherry | blue |
| cherry | orange |
| lemon | red |
| lemon | yellow |
| lemon | blue |
| lemon | orange |
| date | red |
| date | yellow |
| date | blue |
| date | orange |
| pear | red |
| pear | yellow |
| pear | blue |
| pear | orange |
Using Cross Joins to Get Missing Combinations
Cross joins can be used to get a list of all members of two entities in many-to-many relationship that have no relation to each other.
A practical example will show why this is useful. Suppose I have some customers in a customers table, and I have some products in a products table. I also keep a list of which customers have bought which products in a customers_products table. I want a list of customers and the products that they haven’t bought yet. In other words, I want a list of all the combinations of customers and products that aren’t in the customers_products table.
Here are tables with some sample data:
| id | name |
|---|---|
| 1 | Anne |
| 2 | Bob |
| 3 | John |
| id | name |
|---|---|
| 1 | widget |
| 2 | frammis |
| 3 | gewgaw |
| cust_id | prod_id |
|---|---|
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
Now, this query:
|
1 2 3 4 5 6 7 8 9 10 |
SELECT c.name AS cust, p.name AS prod FROM customers c CROSS JOIN products p WHERE NOT EXISTS ( SELECT * FROM customers_products WHERE cust_id = c.id AND p.id = prod_id ); |
Will have this result:
| cust | prod |
|---|---|
| Anne | gewgaw |
| Bob | widget |
| Bob | frammis |
Which is what we want: all of the products that each customer hasn’t bought. We are selecting the Cartesian product of customers and products, minus the combinations that exist in the customers_products table.
For more on the WHERE NOT EXISTS predicate, see Chapter 6.
