To see how the FROM
clause and JOIN
subclause fit into the SELECT
statement, please refer back to Chapter 2.
The FROM
Clause
The FROM
clause has this basic syntax:
The FROM
clause defines the set of data from which the resultset will be selected. If the FROM
clause is used, at least one from_item or one select_statement is required.
The FROM
clause in its simplest form refers to a single table. That table may be joined to other tables with a join subclause to create a composite of data from multiple tables, from which data can be selected.
The FROM
clause can also contain a SELECT
statement of its own, called a subquery. The results of the subquery become the dataset from which the outside SELECT
statement selects its data. This process can be infinitely repeated, although performance degrades pretty quickly with multiple nested subqueries. (Chapter 5 covers subqueries in more detail.)
The Join Subclause
Joins are defined in the JOIN
clause, which is a subclause of the FROM
clause. There are several types of joins, and multiple tables can be joined with multiple JOIN
clauses in a single FROM
clause. (For an explanation of how the JOIN
clause fits into the FROM
clause syntactically, see Chapter 2.)
The syntax for the JOIN
subclause is:
The join_type can be any of these:
[ INNER ] JOIN
LEFT [ OUTER ] JOIN
RIGHT [ OUTER ] JOIN
FULL [ OUTER ] JOIN
CROSS JOIN
A join expression is an expression with a result of type boolean
that specifies how a row from one table matches a row from the other. It is preceded by the ON
keyword, so it is sometimes called the ON
clause.
In join expressions, the terms INNER
and OUTER
are usually omitted in practice: JOIN
, LEFT JOIN
, RIGHT JOIN
and FULL JOIN
. For the purpose of this explanation, I’ll leave them in.
It can be helpful to express the different types of joins in terms of set theory. In terms of set theory, and in terms of two sets A
and B
:
- An inner join is the intersection of
A
andB
, orA ∩ B
. - A left outer join is the union of the intersection of
A
andB
andA
‘s relative complement ofB
(i.e. those members ofA
that are not included inB
), or(A ∩ B) ∪ (A - B)
. - A right outer join is the the union of the intersection of
A
andB
andB
‘s relative complement ofA
, or(A ∩ B) ∪ (B - A)
. - A full outer join is the union of
A
andB
, orA ∪ B
. - A cross join is the Cartesian product of
A
andB
, orA × B
.
To give an example of each type of join, I’ll use these two tables:
fruits
id | name | color_id |
---|---|---|
1 | orange | 4 |
2 | cherry | 1 |
3 | lemon | 2 |
4 | date | null |
5 | pear | 2 |
colors
id | color |
---|---|
1 | red |
2 | yellow |
3 | blue |
4 | orange |
INNER JOIN
INNER JOIN
includes only rows that have matching values as specified in the ON
clause (in this case fruits.color_id
and colors.id
).
This SQL statement:
1 2 3 4 5 |
SELECT name, color FROM fruits f INNER JOIN colors c ON f.color_id = c.id; |
Will have this result:
name | color |
---|---|
orange | orange |
cherry | red |
lemon | yellow |
pear | yellow |
LEFT OUTER JOIN
LEFT OUTER JOIN
includes all the rows that INNER JOIN
does, as well as all other rows in the table that the FROM
clause references (the “FROM
table,” fruits
in this case). Rows in the FROM
table that do not have matching values in the JOIN
table (the colors
table in this case) have null values for all columns in the JOIN
table.
This SQL statement:
1 2 3 4 5 |
SELECT name, color FROM fruits f LEFT OUTER JOIN colors c ON f.color_id = c.id; |
Will have this result:
name | color |
---|---|
orange | orange |
cherry | red |
lemon | yellow |
date | |
pear | yellow |
RIGHT OUTER JOIN
RIGHT OUTER JOIN
includes all the rows that INNER JOIN
does, as well as any other rows in the table that the JOIN
clause references. Rows in the JOIN
table that do not have matching values in the FROM
table have null values for all columns in the FROM
table.
This SQL statement:
1 2 3 4 5 |
SELECT name, color FROM fruits f RIGHT OUTER JOIN colors c ON f.color_id = c.id; |
Will have this result:
name | color |
---|---|
orange | orange |
cherry | red |
lemon | yellow |
pear | yellow |
blue |
FULL OUTER JOIN
FULL OUTER JOIN
returns a combination of the rows returned by LEFT OUTER JOIN
and RIGHT OUTER JOIN
.
This SQL statement:
1 2 3 4 5 |
SELECT name, color FROM fruits f FULL OUTER JOIN colors c ON f.color_id = c.id; |
Will have this result:
name | color |
---|---|
orange | orange |
cherry | red |
lemon | yellow |
date | |
pear | yellow |
blue |
CROSS JOIN
CROSS JOIN
returns the specified columns from the Cartesian product of the two tables. A Cartesian product is the set of all possible combinations of each of the elements in two sets. For example, a deck of cards is the Cartesian product of a set of ranks and a set of suits.
This SQL statement:
1 2 3 4 5 |
SELECT name, color FROM fruits CROSS JOIN colors; |
Will have this result:
name | color |
---|---|
orange | red |
orange | yellow |
orange | blue |
orange | orange |
cherry | red |
cherry | yellow |
cherry | blue |
cherry | orange |
lemon | red |
lemon | yellow |
lemon | blue |
lemon | orange |
date | red |
date | yellow |
date | blue |
date | orange |
pear | red |
pear | yellow |
pear | blue |
pear | orange |
Using Cross Joins to Get Missing Combinations
Cross joins can be used to get a list of all members of two entities in many-to-many relationship that have no relation to each other.
A practical example will show why this is useful. Suppose I have some customers in a customers
table, and I have some products in a products
table. I also keep a list of which customers have bought which products in a customers_products
table. I want a list of customers and the products that they haven’t bought yet. In other words, I want a list of all the combinations of customers
and products
that aren’t in the customers_products
table.
Here are tables with some sample data:
id | name |
---|---|
1 | Anne |
2 | Bob |
3 | John |
id | name |
---|---|
1 | widget |
2 | frammis |
3 | gewgaw |
cust_id | prod_id |
---|---|
1 | 1 |
1 | 2 |
2 | 3 |
3 | 1 |
3 | 2 |
3 | 3 |
Now, this query:
1 2 3 4 5 6 7 8 9 10 |
SELECT c.name AS cust, p.name AS prod FROM customers c CROSS JOIN products p WHERE NOT EXISTS ( SELECT * FROM customers_products WHERE cust_id = c.id AND p.id = prod_id ); |
Will have this result:
cust | prod |
---|---|
Anne | gewgaw |
Bob | widget |
Bob | frammis |
Which is what we want: all of the products that each customer hasn’t bought. We are selecting the Cartesian product of customers and products, minus the combinations that exist in the customers_products
table.
For more on the WHERE NOT EXISTS
predicate, see Chapter 6.