Skip to content

Joins

Overview

JOIN combines rows from two tables or other table-like sources (views, table functions, etc.) into a new combined row or rows.

Joins are particularly useful when data in tables is organized relationally and the data in the different tables is linked together.

For example:

sql
WITH
planets AS (SELECT 1 AS id, 'Jupiter' AS name, 142984 as diameter, 'km' as  m_units
       UNION ALL
       SELECT 2 AS id, 'Saturn' AS name, 120500 as diameter, 'km' as  m_units
       UNION ALL 
       SELECT 3 AS id, 'Uranus' AS name, 51118 as diameter, 'km' as  m_units
       UNION ALL 
       SELECT 4 as id, 'Earth' AS name, 12742 as diameter, 'km' as  m_units
       UNION ALL 
       SELECT 5 as id, 'Venus' AS name, 12104 as diameter, 'km' as  m_units
      ), 
top2_moons AS (   SELECT 1 AS planet_id, 'Ganymede' AS name, 5268.2 AS diameter, 'km' as  m_units
                  UNION ALL 
                  SELECT 1 AS planet_id, 'Callisto' AS name, 4820.6 as diameter, 'km' as  m_units
                  UNION ALL 
                  SELECT 2 AS planet_id, 'Titan' AS name, 5149.46 AS diameter, 'km' as  m_units
                  UNION ALL 
                  SELECT 2 AS planet_id, 'Rhea' AS name, 1527.6 as diameter, 'km' as  m_units
                  UNION ALL 
                  SELECT 3 AS planet_id, 'Titania' AS name, 1576.8 AS diameter, 'km' as  m_units
                  UNION ALL 
                  SELECT 3 AS planet_id, 'Oberon' AS name, 1522.8 as diameter, 'km' as  m_units
                  UNION ALL 
                  SELECT 4 AS planet_id, 'Moon' AS name, 3474.8 as diameter, 'km' as  m_units
                  UNION ALL
                  SELECT 6 as planet_id, 'Phobos' AS name, 22.2 as diameter, 'km' as m_units
                  UNION ALL
                  SELECT 6 as planet_id, 'Deimos' AS name, 12.6 as diameter, 'km' as m_units
                  )
SELECT p.name, p.diameter as planet_diameter, m.name as moon_name,  m.diameter as moon_diameter
    FROM planets p INNER JOIN top2_moons m
        ON p.id= m.planet_id
    ORDER BY 2 DESC, 4 DESC;

txt
+---------+-----------------+-----------+---------------+
| NAME    | PLANET_DIAMETER | MOON_NAME | MOON_DIAMETER |
+---------+-----------------+-----------+---------------+
| Jupiter | 142984          | Ganymede  | 5268.20       |
| Jupiter | 142984          | Callisto  | 4820.60       |
| Saturn  | 120500          | Titan     | 5149.46       |
| Saturn  | 120500          | Rhea      | 1527.60       |
| Uranus  | 51118           | Titania   | 1576.80       |
| Uranus  | 51118           | Oberon    | 1522.80       |
| Earth   | 12742           | Moon      | 3474.80       |
+---------+-----------------+-----------+---------------+

The related tables (or table-like objects) (in our example these are Common Table Expression (CTE)s named planets and top2_moons) can contain one or more columns (in our example these are column id in the table planets which contains values ​​that match the column planet_id in the table top2_moons) whose values ​​are related.

These columns can be used to join rows from one object to rows from the other object.

The JOIN statement mostly contains information about which criterion or criteria are decisive for which rows from one object are linked to the rows from the other object by specifying the names of the common columns (in our example it is ON p.id = m.planet_id). See CROSS JOIN for an example of a join without an ON clause and explicitly specified join condition.

A single JOIN statement can only link 2 table-like objects. However, several JOIN statements can be chained together to create complex linked rows that can contain information from several objects. See Multiple JOIN example for an example of possible syntax.

The result of a JOIN statement is a table-like object.

For more information about the JOIN syntax, see JOIN in the 'Query Syntax' section.

JOIN compatible Objects

Any table-like objects can be joined with the JOIN statement. For example:

  • tables
  • views
  • an expression such as a result set returned by a subquery or table function. It is important that the expression evaluates to the equivalent of a table.

Types of Joins

The following types of joins are supported by MetaKraftwerk:

  • INNER join
  • OUTER (FULL, RIGHT or LEFT) join
  • CROSS join

We create the objects planets and top2_moons as tables and use them in the further examples:

sql
CREATE TABLE IF NOT EXISTS planets (id INTEGER, name STRING, diameter DECIMAL, m_units STRING); 
INSERT INTO planets VALUES 
                            (1,'Jupiter',142984, 'km'),
                            (2,'Saturn', 120500, 'km'),
                            (3,'Uranus', 51118 , 'km'),
                            (4,'Earth',  12742 , 'km'),
                            (5,'Venus',  12104 , 'km');

SELECT * FROM planets;
txt
+--------+---------+----------+---------+
| ID     | NAME    | DIAMETER | M_UNITS |
+--------+---------+----------+---------+
| 1      | Jupiter | 142984   | km      |
| 2      | Saturn  | 120500   | km      |
| 3      | Uranus  | 51118    | km      |
| 4      | Earth   | 12742    | km      |
| 5      | Venus   | 12104    | km      |
+--------+---------+----------+---------+
sql
CREATE TABLE IF NOT EXISTS top2_moons (planet_id INTEGER, name STRING, diameter DECIMAL, m_units STRING); 
INSERT INTO top2_moons VALUES 
(1, 'Ganymede', 	5268.2 , 'km'),
(1, 'Callisto', 	4820.6 , 'km'),
(2, 'Titan', 	    5149.46, 'km'),
(2, 'Rhea', 		1527.6 , 'km'),
(3, 'Titania', 	    1576.8 , 'km'),
(3, 'Oberon', 	    1522.8 , 'km'),
(4, 'Moon', 		3474.8 , 'km'),
(6, 'Phobos', 	    22.2   , 'km'),
(6, 'Deimos', 	    12.6   , 'km');
SELECT * FROM top2_moons;
txt
+-----------+----------+----------+---------+
| PLANET_ID | NAME     | DIAMETER | M_UNITS |
+-----------+----------+----------+---------+
| 1         | Ganymede | 5268     | km      |
| 1         | Callisto | 4821     | km      |
| 2         | Titan    | 5149     | km      |
| 2         | Rhea     | 1528     | km      |
| 3         | Titania  | 1577     | km      |
| 3         | Oberon   | 1523     | km      |
| 4         | Moon     | 3475     | km      |
| 6         | Phobos   | 22       | km      |
| 6         | Deimos   | 13       | km      |
+-----------+----------+----------+---------+

INNER Join

An INNER JOIN joins each row in first object with the matching row or rows in the second object.

In our example above we used an INNER JOIN to join the CTE's by planet id and each row from planets was "extended" by one or more rows from top2_moons (compare Earth and other three planets).

The output of an INNER JOIN includes only rows that match the join condition in both joined objects. The example above does not contain rows for the moons of Venus (contained in in plantets) and Mars (contained in top2_moons with planet_id = 6) in the output because no corresponding rows were found in the other CTE.

See INNER JOIN description and INNER JOIN example for more information.

OUTER Joins

An outer join lists all rows in the specified table, in contrast to INNER JOIN.

See OUTER join descriptions and OUTER join examples for more information.

LEFT OUTER Join

LEFT OUTER JOIN retains all rows of the "left" object and fill the the row attributes of the "right" object, that has no related rows in the "left" object with NULLs:

Compare this results with the results of INNER JOIN:

sql
SELECT p.name, p.diameter as planet_diameter, m.name as moon_name,  m.diameter as moon_diameter
    FROM planets p LEFT OUTER JOIN top2_moons m
        ON p.id= m.planet_id
    ORDER BY 2 DESC, 4 DESC;

txt
+---------+-----------------+-----------+---------------+
| NAME    | PLANET_DIAMETER | MOON_NAME | MOON_DIAMETER |
+---------+-----------------+-----------+---------------+
| Jupiter | 142984          | Ganymede  | 5268          |
| Jupiter | 142984          | Callisto  | 4821          |
| Saturn  | 120500          | Titan     | 5149          |
| Saturn  | 120500          | Rhea      | 1528          |
| Uranus  | 51118           | Titania   | 1577          |
| Uranus  | 51118           | Oberon    | 1523          |
| Earth   | 12742           | Moon      | 3475          |
| Venus   | 12104           | null      | null          |
+---------+-----------------+-----------+---------------+

RIGHT OUTER join

RIGHT OUTER JOIN retains all rows of the right object and fill the the row attributes of the left object, that has no related rows in the right object with NULLs:

You can compare the results with the results of the INNER JOIN and LEFT OUTER JOIN.

sql
SELECT p.name, p.diameter as planet_diameter, m.name as moon_name,  m.diameter as moon_diameter
    FROM planets p RIGHT OUTER JOIN top2_moons m
        ON p.id= m.planet_id
    ORDER BY 2 DESC, 4 DESC;

txt
+---------+-----------------+-----------+---------------+
| NAME    | PLANET_DIAMETER | MOON_NAME | MOON_DIAMETER |
+---------+-----------------+-----------+---------------+
| null    | null            | Phobos    | 22            |
| null    | null            | Deimos    | 13            |
| Jupiter | 142984          | Ganymede  | 5268          |
| Jupiter | 142984          | Callisto  | 4821          |
| Saturn  | 120500          | Titan     | 5149          |
| Saturn  | 120500          | Rhea      | 1528          |
| Uranus  | 51118           | Titania   | 1577          |
| Uranus  | 51118           | Oberon    | 1523          |
| Earth   | 12742           | Moon      | 3475          |
+---------+-----------------+-----------+---------------+

FULL OUTER join

FULL OUTER join causes all rows of the "right" and "left" object to be displayed in the result. As with LEFT or RIGHT OUTER JOIN, the missing information in the rows is completed with NULL.

sql
SELECT p.name, p.diameter as planet_diameter, m.name as moon_name,  m.diameter as moon_diameter
    FROM planets p FULL OUTER JOIN top2_moons m
        ON p.id= m.planet_id
    ORDER BY 2 DESC, 4 DESC;

txt
+---------+-----------------+-----------+---------------+
| NAME    | PLANET_DIAMETER | MOON_NAME | MOON_DIAMETER |
+---------+-----------------+-----------+---------------+
| null    | null            | Phobos    | 22            |
| null    | null            | Deimos    | 13            |
| Jupiter | 142984          | Ganymede  | 5268          |
| Jupiter | 142984          | Callisto  | 4821          |
| Saturn  | 120500          | Titan     | 5149          |
| Saturn  | 120500          | Rhea      | 1528          |
| Uranus  | 51118           | Titania   | 1577          |
| Uranus  | 51118           | Oberon    | 1523          |
| Earth   | 12742           | Moon      | 3475          |
| Venus   | 12104           | null      | null          |
+---------+-----------------+-----------+---------------+

CROSS JOIN

A CROSS JOIN creates a "Cartesian product" of the rows of both objects. These are all possible combinations of the rows.

Such results often occur when the join condition is not specified and cannot be implicitly inferred. Results of CROSS JOIN can contain a very large number of rows and place a heavy load on the system.

The number of resulting rows of CROSS JOIN can be calculated using the formula "N x M", where N is the number of rows in the first object and M is the number of rows in the second object.

See CROSS join description and CROSS join example for more information.

For example:

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT 3 AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 2 AS id, 'eleven' AS name
       UNION ALL 
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t CROSS JOIN right_t;   

-- OR 

WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT 3 AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 2 AS id, 'eleven' AS name
       UNION ALL 
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t JOIN right_t;   

Without ON-clause the following table results:

txt
+----+-------+----+--------+
| ID | NAME  | ID | NAME   |
+----+-------+----+--------+
| 1  | one   | 1  | ten    |
| 1  | one   | 2  | eleven |
| 1  | one   | 3  | twelve |
| 2  | two   | 1  | ten    |
| 2  | two   | 2  | eleven |
| 2  | two   | 3  | twelve |
| 3  | three | 1  | ten    |
| 3  | three | 2  | eleven |
| 3  | three | 3  | twelve |
+----+-------+----+--------+

CROSS JOIN can be useful and helpful if, for example, you use the WHERE clause in conjunction with it to filter the results.

The output in this case is equal to an INNER JOIN with an ON clause.

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT 3 AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 2 AS id, 'eleven' AS name
       UNION ALL 
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t JOIN right_t   
WHERE left_t.id = right_t.id;       
txt
+----+-------+----+--------+
| ID | NAME  | ID | NAME   |
+----+-------+----+--------+
| 1  | one   | 1  | ten    |
| 2  | two   | 2  | eleven |
| 3  | three | 3  | twelve |
+----+-------+----+--------+

Compare with:

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT 3 AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 2 AS id, 'eleven' AS name
       UNION ALL 
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t INNER JOIN right_t   
    ON  left_t.id = right_t.id;       
txt
+----+-------+----+--------+
| ID | NAME  | ID | NAME   |
+----+-------+----+--------+
| 1  | one   | 1  | ten    |
| 2  | two   | 2  | eleven |
| 3  | three | 3  | twelve |
+----+-------+----+--------+

Important Notes

Using JOINS in conjunction with WHERE and FROM... ON... clauses can produce different outputs under certain conditions. Thus, the procedures are not identical.

Example 1

For example, WHERE clause rows that contain NULL values ​​in the columns relevant to the WHERE clause are filtered out:

Compare:

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT NULL AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 2 AS id, 'eleven' AS name
       UNION ALL 
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t LEFT OUTER JOIN right_t   
    ON  left_t.id = right_t.id;       
txt
+------+-------+------+--------+
| ID   | NAME  | ID   | NAME   |
+------+-------+------+--------+
| 1    | one   | 1    | ten    |
| 2    | two   | 2    | eleven |
| null | three | null | null   |
+------+-------+------+--------+

And

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT NULL AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 2 AS id, 'eleven' AS name
       UNION ALL 
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t JOIN right_t   
    WHERE  left_t.id = right_t.id;       
txt
+------+-------+------+--------+
| ID   | NAME  | ID   | NAME   |
+------+-------+------+--------+
| 1    | one   | 1    | ten    |
| 2    | two   | 2    | eleven |
+------+-------+------+--------+

Example 2

If the rows of the “right” table after the join contain only NULL values, the WHERE clause filters out such rows from the output:

Compare:

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT 3 AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t LEFT OUTER JOIN right_t   
    ON  left_t.id = right_t.id;       
txt
+----+-------+------+--------+
| ID | NAME  | ID   | NAME   |
+----+-------+------+--------+
| 1  | one   | 1    | ten    |
| 2  | two   | null | null   |
| 3  | three | 3    | twelve |
+----+-------+------+--------+

And

sql
WITH
left_t AS (SELECT 1 AS id, 'one' AS name
       UNION ALL
       SELECT 2 AS id, 'two' AS name
       UNION ALL 
       SELECT 3 AS id, 'three' AS name
      ), 
right_t AS (SELECT 1 AS id, 'ten' AS name
       UNION ALL
       SELECT 3 AS id, 'twelve' AS name
      ) 
SELECT * FROM left_t JOIN right_t   
    WHERE  left_t.id = right_t.id;       
txt
+----+-------+----+--------+
| ID | NAME  | ID | NAME   |
+----+-------+----+--------+
| 1  | one   | 1  | ten    |
| 3  | three | 3  | twelve |
+----+-------+----+--------+

Syntax spelling

There are in general two ways to join tables:

  • ... FROM left_obj JOIN right_obj ... ON ... : this form is more flexible and has no problem with accidentally filtering rows that contain NULL values.
  • ... FROM left_obj JOIN right_obj ... WHERE ....

The following additional spellings are allowed:

  • left_obj LEFT OUTER JOIN right_obj ON ...;= left_obj LEFT JOIN ... ON ; (the same applies to RIGHT ... and FULL OUTER JOIN)
  • left_obj INNER JOIN right_obj ON ...; = left_obj JOIN right_obj ON ...;
  • left_obj CROSS JOIN right_obj ...; = left_obj JOIN right_obj ...;
  • left_obj CROSS JOIN right_obj WHERE...; = left_obj JOIN right_obj WHERE ...;
  • left_obj CROSS JOIN right_obj WHERE ...; = left_obj, right_obj WHERE...;