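What is SQL JOIN, and what are the different types?
Current answer
Definition:
A join is a way to query and combine data from multiple tables in a single statement.
Types of joins:
In RDBMS terms, there are 5 types of joins: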
1. Equi-Join: Combines the common records of two tables based on an equality condition. Technically, the join uses the equality operator (=) to compare the primary key values of one table with the foreign key values of another, so the result set contains the common (matched) records from both tables. For implementation see INNER-JOIN.
2. Natural-Join: An enhanced version of Equi-Join in which the SELECT operation omits the duplicate column. For implementation see INNER-JOIN.
3. Non-Equi-Join: The reverse of Equi-Join: the joining condition uses an operator other than equality (=), e.g. !=, <=, >=, >, < or BETWEEN. For implementation see INNER-JOIN.
4. Self-Join: A customized join in which a table is combined with itself; typically needed for querying self-referencing tables (unary relationship entities). For implementation see INNER-JOIN.
5. Cartesian Product: Cross-combines all records of both tables without any condition. Technically, it returns the result set of a query that has neither a join condition nor a WHERE clause.
As far as SQL syntax and its evolution are concerned, there are 3 types of joins, and every RDBMS join listed above can be implemented with them.
1. INNER-JOIN: Merges (combines) matched rows from two tables. Matching is based on common columns of the tables and a comparison operation. If the condition is equality-based, an EQUI-JOIN is performed; otherwise a Non-EQUI-Join.
2. OUTER-JOIN: Merges (combines) matched rows from two tables, plus unmatched rows padded with NULL values. Which unmatched rows are kept is chosen through the sub-types:
2.1. LEFT OUTER JOIN (a.k.a. LEFT JOIN): Returns matched rows from both tables and unmatched rows from the LEFT (i.e. first) table only.
2.2. RIGHT OUTER JOIN (a.k.a. RIGHT JOIN): Returns matched rows from both tables and unmatched rows from the RIGHT table only.
2.3. FULL OUTER JOIN (a.k.a. FULL JOIN): Returns matched and unmatched rows from both tables.
3. CROSS-JOIN: Does not match rows on a condition; instead it performs a Cartesian product.
Note: A Self-JOIN can be achieved with INNER-JOIN, OUTER-JOIN or CROSS-JOIN as required, but the table must be joined with itself.
Examples:
1.1: INNER-JOIN: Equi-join implementation
SELECT *
FROM Table1 A
INNER JOIN Table2 B ON A.<Primary-Key> = B.<Foreign-Key>;
1.2: INNER-JOIN: Natural-join implementation
SELECT A.*, B.Col1, B.Col2 -- but no B.ForeignKeyColumn in the SELECT list
FROM Table1 A
INNER JOIN Table2 B ON A.Pk = B.Fk;
1.3: INNER-JOIN with Non-Equi-join implementation
SELECT *
FROM Table1 A INNER JOIN Table2 B ON A.Pk <= B.Fk;
1.4: INNER-JOIN with Self-JOIN
SELECT *
FROM Table1 A1 INNER JOIN Table1 A2 ON A1.Pk = A2.Fk;
2.1: OUTER JOIN (full outer join)
SELECT *
FROM Table1 A FULL OUTER JOIN Table2 B ON A.Pk = B.Fk;
2.2: LEFT JOIN
SELECT *
FROM Table1 A LEFT OUTER JOIN Table2 B ON A.Pk = B.Fk;
2.3: RIGHT JOIN
SELECT *
FROM Table1 A RIGHT OUTER JOIN Table2 B ON A.Pk = B.Fk;
3.1: CROSS JOIN
SELECT *
FROM TableA CROSS JOIN TableB;
3.2: CROSS JOIN with Self-JOIN
SELECT *
FROM Table1 A1 CROSS JOIN Table1 A2;
-- OR, equivalently, with the comma syntax:
SELECT *
FROM Table1 A1, Table1 A2;
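The placeholder queries above can be tried on concrete data. The following is only a minimal sketch, assuming Python's built-in sqlite3 module; the columns Name, Id and Item and all sample rows are hypothetical and not part of the answer. It shows how the same ON condition yields different row sets under INNER, LEFT OUTER and CROSS JOIN.

```python
import sqlite3

# Hypothetical sample data: Table1 holds the primary key (Pk),
# Table2 references it through a nullable foreign key column (Fk).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Pk INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (Id INTEGER PRIMARY KEY, Fk INTEGER, Item TEXT);
    INSERT INTO Table1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO Table2 VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c'), (13, NULL, 'd');
""")


def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Only rows that have a partner on the other side.
inner = rows("""
    SELECT A.Pk, B.Item
    FROM Table1 A INNER JOIN Table2 B ON A.Pk = B.Fk
    ORDER BY A.Pk, B.Item
""")

# Same matches, plus Table1 rows without a partner (padded with NULL).
left = rows("""
    SELECT A.Pk, B.Item
    FROM Table1 A LEFT OUTER JOIN Table2 B ON A.Pk = B.Fk
    ORDER BY A.Pk, B.Item
""")

# Every Table1 row paired with every Table2 row, no condition at all.
cross = rows("SELECT A.Pk, B.Item FROM Table1 A CROSS JOIN Table2 B")

print(inner)
print(left)
print(len(cross))
```

In this data set, Pk 3 has no matching Fk, so it only shows up in the LEFT OUTER JOIN result, and the Table2 row with a NULL Fk never matches anything.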
Other answers
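An example from W3Schools:
Interestingly, most of the other answers suffer from these two problems:
- They focus on basic forms of join only.
- They (ab)use Venn diagrams, which are an inaccurate tool for visualising joins (they are much better suited to unions).
I recently wrote an article on this topic, "A Probably Incomplete, Comprehensive Guide to the Many Different Ways to JOIN Tables in SQL", which I'll summarise here.
First and foremost: JOINs are cartesian products
This is why Venn diagrams explain them so inaccurately: a JOIN creates a cartesian product between the two joined tables. Wikipedia illustrates it nicely:
The SQL syntax for a cartesian product is CROSS JOIN. For example: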
SELECT *
-- This just generates all the days in January 2017
FROM generate_series(
'2017-01-01'::TIMESTAMP,
'2017-01-01'::TIMESTAMP + INTERVAL '1 month -1 day',
INTERVAL '1 day'
) AS days(day)
-- Here, we're combining all days with all departments
CROSS JOIN departments
This combines all rows from one table with all rows from the other table:
Source:
+--------+   +------------+
| day    |   | department |
+--------+   +------------+
| Jan 01 |   | Dept 1     |
| Jan 02 |   | Dept 2     |
| ...    |   | Dept 3     |
| Jan 30 |   +------------+
| Jan 31 |
+--------+
Result:
+--------+------------+
| day    | department |
+--------+------------+
| Jan 01 | Dept 1     |
| Jan 01 | Dept 2     |
| Jan 01 | Dept 3     |
| Jan 02 | Dept 1     |
| Jan 02 | Dept 2     |
| Jan 02 | Dept 3     |
| ...    | ...        |
| Jan 31 | Dept 1     |
| Jan 31 | Dept 2     |
| Jan 31 | Dept 3     |
+--------+------------+
If we just write a comma-separated list of tables, we get the same result:
-- CROSS JOINing two tables:
SELECT * FROM table1, table2
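To check that the comma syntax and CROSS JOIN really produce the same rows, here is a minimal sketch using Python's built-in sqlite3 module. The tiny days and departments tables are hypothetical stand-ins for the generated series above, not the article's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: 3 days and 2 departments.
    CREATE TABLE days (day TEXT);
    CREATE TABLE departments (department TEXT);
    INSERT INTO days VALUES ('Jan 01'), ('Jan 02'), ('Jan 03');
    INSERT INTO departments VALUES ('Dept 1'), ('Dept 2');
""")

# Explicit cartesian product.
explicit = conn.execute("""
    SELECT day, department
    FROM days CROSS JOIN departments
    ORDER BY day, department
""").fetchall()

# Comma-separated table list: the same cartesian product.
comma = conn.execute("""
    SELECT day, department
    FROM days, departments
    ORDER BY day, department
""").fetchall()

print(len(explicit), explicit == comma)
```

With 3 days and 2 departments, both queries return 3 x 2 = 6 rows.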
INNER JOIN (Theta-JOIN)
An INNER JOIN is just a filtered CROSS JOIN, where the filter predicate is called Theta in relational algebra.
For instance:
SELECT *
-- Same as before
FROM generate_series(
'2017-01-01'::TIMESTAMP,
'2017-01-01'::TIMESTAMP + INTERVAL '1 month -1 day',
INTERVAL '1 day'
) AS days(day)
-- Now, exclude all days/departments combinations for
-- days before the department was created
JOIN departments AS d ON day >= d.created_at
Note that the keyword INNER is optional (except in MS Access).
(See the article for result examples)
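The "filtered CROSS JOIN" view can be verified directly. This is a minimal sketch, assuming Python's built-in sqlite3 module and hypothetical days and departments tables with ISO-formatted date strings (so that >= compares them chronologically); it is an illustration, not the article's own example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; dates are ISO strings so text comparison
    -- matches chronological order.
    CREATE TABLE days (day TEXT);
    CREATE TABLE departments (department TEXT, created_at TEXT);
    INSERT INTO days VALUES ('2017-01-09'), ('2017-01-10'), ('2017-01-11');
    INSERT INTO departments VALUES
        ('Dept 1', '2017-01-10'),
        ('Dept 2', '2017-01-11');
""")

# Theta join: the predicate lives in the ON clause.
joined = conn.execute("""
    SELECT day, department
    FROM days
    JOIN departments AS d ON day >= d.created_at
    ORDER BY day, department
""").fetchall()

# Cartesian product first, then the same predicate as a WHERE filter.
filtered = conn.execute("""
    SELECT day, department
    FROM days
    CROSS JOIN departments AS d
    WHERE day >= d.created_at
    ORDER BY day, department
""").fetchall()

print(joined)
print(joined == filtered)
```

Both queries drop the day/department combinations that fall before the department's creation date and return identical rows.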
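EQUI JOIN
One special kind of Theta-JOIN is the equi JOIN, which is the one we use most. The predicate joins the primary key of one table with the foreign key of another table. Using the Sakila database for illustration, we can write: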
SELECT *
FROM actor AS a
JOIN film_actor AS fa ON a.actor_id = fa.actor_id
JOIN film AS f ON f.film_id = fa.film_id
This combines all actors with their films.
Or also, on some databases:
SELECT *
FROM actor
JOIN film_actor USING (actor_id)
JOIN film USING (film_id)
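The USING() syntax lets you specify a column that must be present on both sides of the JOIN operation's tables, and it creates an equality predicate on those two columns.
NATURAL JOIN
Other answers have listed this "JOIN type" separately, but that doesn't make sense. It's just syntax sugar for equi JOIN, which is itself a special case of Theta-JOIN or INNER JOIN. NATURAL JOIN simply collects all columns that are common to both tables being joined and joins them with USING(). This is hardly ever useful, because of accidental matches (like the LAST_UPDATE columns in the Sakila database).
Here's the syntax: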
SELECT *
FROM actor
NATURAL JOIN film_actor
NATURAL JOIN film
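The accidental-match problem is easy to reproduce. The sketch below assumes Python's built-in sqlite3 module and two hypothetical, heavily simplified tables that only borrow the Sakila column names actor_id and last_update; the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables that share TWO column names:
    -- actor_id (intended join key) and last_update (audit column).
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, name TEXT, last_update TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER, last_update TEXT);
    INSERT INTO actor VALUES (1, 'A', '2017-01-01'), (2, 'B', '2017-01-02');
    INSERT INTO film_actor VALUES (1, 100, '2017-01-01'), (2, 200, '2017-03-15');
""")

# Explicit USING: joins on actor_id only.
using_rows = conn.execute("""
    SELECT actor_id, film_id
    FROM actor JOIN film_actor USING (actor_id)
    ORDER BY actor_id
""").fetchall()

# NATURAL JOIN: silently joins on actor_id AND last_update.
natural_rows = conn.execute("""
    SELECT actor_id, film_id
    FROM actor NATURAL JOIN film_actor
    ORDER BY actor_id
""").fetchall()

print(using_rows)
print(natural_rows)
```

Actor 2 disappears from the NATURAL JOIN result only because the two last_update values happen to differ, which is almost never what the query author meant.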
OUTER JOIN
Now, OUTER JOIN is a bit different from INNER JOIN, as it creates a UNION of several cartesian products. We can write:
-- Convenient syntax:
SELECT *
FROM a LEFT JOIN b ON <predicate>
-- Cumbersome, equivalent syntax:
SELECT a.*, b.*
FROM a JOIN b ON <predicate>
UNION ALL
SELECT a.*, NULL, NULL, ..., NULL
FROM a
WHERE NOT EXISTS (
SELECT * FROM b WHERE <predicate>
)
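No one wants to write the latter, so we write OUTER JOIN (which is usually better optimised by databases).
Like INNER, the keyword OUTER is optional here.
OUTER JOIN comes in three flavours:
- LEFT [OUTER] JOIN: The left table of the JOIN expression is added to the union as shown above.
- RIGHT [OUTER] JOIN: The right table of the JOIN expression is added to the union as shown above.
- FULL [OUTER] JOIN: Both tables of the JOIN expression are added to the union as shown above.
All of these can be combined with the keyword USING() or with NATURAL (I actually had a real-world use case for a NATURAL FULL JOIN recently).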
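The equivalence between LEFT JOIN and the cumbersome UNION ALL form shown earlier in this section can be checked on a small data set. This is a minimal sketch, assuming Python's built-in sqlite3 module; the columns id, x, a_id, y and the rows are hypothetical, and a.id = b.a_id stands in for <predicate>.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: a.id = 2 has no partner in b.
    CREATE TABLE a (id INTEGER, x TEXT);
    CREATE TABLE b (a_id INTEGER, y TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO b VALUES (1, 'b1'), (1, 'b2'), (3, 'b3');
""")

# Convenient syntax.
left_join = conn.execute("""
    SELECT a.id, a.x, b.a_id, b.y
    FROM a LEFT JOIN b ON a.id = b.a_id
""").fetchall()

# Cumbersome equivalent: matched rows, plus unmatched rows of a
# padded with one NULL per column of b.
emulated = conn.execute("""
    SELECT a.id, a.x, b.a_id, b.y
    FROM a JOIN b ON a.id = b.a_id
    UNION ALL
    SELECT a.id, a.x, NULL, NULL
    FROM a
    WHERE NOT EXISTS (
        SELECT * FROM b WHERE a.id = b.a_id
    )
""").fetchall()

# Row order is not guaranteed, so compare after sorting on a stable key.
same = sorted(left_join, key=repr) == sorted(emulated, key=repr)
print(same, len(left_join))
```

Both forms return four rows here: three matches and one NULL-padded row for the unmatched row of a.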
Alternative syntaxes
There are some historic, deprecated syntaxes in Oracle and SQL Server, which supported OUTER JOIN before the SQL standard had a syntax for it:
-- Oracle
SELECT *
FROM actor a, film_actor fa, film f
WHERE a.actor_id = fa.actor_id(+)
AND fa.film_id = f.film_id(+)
-- SQL Server
SELECT *
FROM actor a, film_actor fa, film f
WHERE a.actor_id *= fa.actor_id
AND fa.film_id *= f.film_id
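Having said that, don't use this syntax. I only list it here so you can recognise it in old blog posts and legacy code.
Partitioned OUTER JOIN
Few people know this, but the SQL standard specifies partitioned OUTER JOIN (and Oracle implements it). You can write things like this: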
WITH
-- Using CONNECT BY to generate all dates in January
days(day) AS (
SELECT DATE '2017-01-01' + LEVEL - 1
FROM dual
CONNECT BY LEVEL <= 31
),
-- Our departments
departments(department, created_at) AS (
SELECT 'Dept 1', DATE '2017-01-10' FROM dual UNION ALL
SELECT 'Dept 2', DATE '2017-01-11' FROM dual UNION ALL
SELECT 'Dept 3', DATE '2017-01-12' FROM dual UNION ALL
SELECT 'Dept 4', DATE '2017-04-01' FROM dual UNION ALL
SELECT 'Dept 5', DATE '2017-04-02' FROM dual
)
SELECT *
FROM days
LEFT JOIN departments
PARTITION BY (department) -- This is where the magic happens
ON day >= created_at
The result looks like this:
+--------+------------+------------+
| day    | department | created_at |
+--------+------------+------------+
| Jan 01 | Dept 1     |            | -- Didn't match, but still get row
| Jan 02 | Dept 1     |            | -- Didn't match, but still get row
| ...    | Dept 1     |            | -- Didn't match, but still get row
| Jan 09 | Dept 1     |            | -- Didn't match, but still get row
| Jan 10 | Dept 1     | Jan 10     | -- Matches, so get join result
| Jan 11 | Dept 1     | Jan 10     | -- Matches, so get join result
| Jan 12 | Dept 1     | Jan 10     | -- Matches, so get join result
| ...    | Dept 1     | Jan 10     | -- Matches, so get join result
| Jan 31 | Dept 1     | Jan 10     | -- Matches, so get join result
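The point here is that all rows from the partitioned side of the join end up in the result, regardless of whether the JOIN matched anything on the "other side of the JOIN". Long story short: this is for filling up sparse data in reports. Very useful!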
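On databases without PARTITION BY in the join clause, the same gap-filling effect can be emulated. The sketch below is an assumption-laden illustration, not the standard syntax: it uses Python's built-in sqlite3 module, hypothetical sample rows, and replaces the partitioned join with a CROSS JOIN against the distinct department names followed by an ordinary LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data; dates are ISO strings so >= compares
    -- them chronologically.
    CREATE TABLE days (day TEXT);
    CREATE TABLE departments (department TEXT, created_at TEXT);
    INSERT INTO days VALUES ('2017-01-09'), ('2017-01-10'), ('2017-01-11');
    INSERT INTO departments VALUES
        ('Dept 1', '2017-01-10'),
        ('Dept 4', '2017-04-01');
""")

# Emulation: build every (day, department) combination first, then
# outer join the department row only where the day predicate holds.
result = conn.execute("""
    SELECT d.day, p.department, dep.created_at
    FROM days AS d
    CROSS JOIN (SELECT DISTINCT department FROM departments) AS p
    LEFT JOIN departments AS dep
        ON dep.department = p.department
       AND d.day >= dep.created_at
    ORDER BY p.department, d.day
""").fetchall()

for row in result:
    print(row)
```

Every department gets one row per day, and created_at stays NULL on the days before the department existed, which mirrors the shape of the result table above.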
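SEMI JOIN
Seriously? No other answer got this? Of course not, because unfortunately it has no native syntax in SQL (just like ANTI JOIN below). But we can use IN() and EXISTS(), e.g. to find all actors who have played in films: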
SELECT *
FROM actor a
WHERE EXISTS (
SELECT * FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
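The WHERE a.actor_id = fa.actor_id predicate acts as the semi join predicate. If you don't believe it, check out the execution plans, e.g. in Oracle. You'll see that the database executes a SEMI JOIN operation, not the EXISTS() predicate.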
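What distinguishes a semi join from an ordinary INNER JOIN is that each actor appears at most once, however many films match. A minimal sketch, assuming Python's built-in sqlite3 module and hypothetical, cut-down actor and film_actor tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: actor 1 played in two films,
    -- actor 2 in one, actor 3 in none.
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    INSERT INTO actor VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO film_actor VALUES (1, 100), (1, 101), (2, 100);
""")

# Semi join via EXISTS(): one row per actor that has at least one film.
semi_exists = conn.execute("""
    SELECT a.actor_id
    FROM actor a
    WHERE EXISTS (
        SELECT * FROM film_actor fa
        WHERE a.actor_id = fa.actor_id
    )
    ORDER BY a.actor_id
""").fetchall()

# Semi join via IN(): same result.
semi_in = conn.execute("""
    SELECT a.actor_id
    FROM actor a
    WHERE a.actor_id IN (SELECT fa.actor_id FROM film_actor fa)
    ORDER BY a.actor_id
""").fetchall()

# INNER JOIN: one row per (actor, film) match, so actor 1 is duplicated.
inner = conn.execute("""
    SELECT a.actor_id
    FROM actor a
    JOIN film_actor fa ON a.actor_id = fa.actor_id
    ORDER BY a.actor_id
""").fetchall()

print(semi_exists, semi_in, inner)
```

The INNER JOIN would need a DISTINCT to give the same answer, which is why EXISTS() or IN() is usually the more direct way to express "has at least one match".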
ANTI JOIN
This is just the opposite of SEMI JOIN (be careful not to use NOT IN, though, as it has an important caveat).
Here are all the actors without films:
SELECT *
FROM actor a
WHERE NOT EXISTS (
SELECT * FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
Some folks (especially MySQL people) also write ANTI JOIN like this:
SELECT *
FROM actor a
LEFT JOIN film_actor fa
USING (actor_id)
WHERE film_id IS NULL
I think the historic reason is performance.
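The answer does not spell out the NOT IN caveat; the usual one, assumed here, is its behaviour when the subquery returns a NULL. The sketch below uses Python's built-in sqlite3 module and hypothetical rows, including one film_actor row whose actor_id is NULL, to compare the three spellings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: note the NULL actor_id in film_actor.
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    INSERT INTO actor VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO film_actor VALUES (1, 100), (NULL, 101);
""")

# Anti join via NOT EXISTS(): actors without any film.
not_exists = conn.execute("""
    SELECT a.actor_id
    FROM actor a
    WHERE NOT EXISTS (
        SELECT * FROM film_actor fa
        WHERE a.actor_id = fa.actor_id
    )
    ORDER BY a.actor_id
""").fetchall()

# Anti join via LEFT JOIN ... IS NULL: same result.
left_is_null = conn.execute("""
    SELECT a.actor_id
    FROM actor a
    LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
    WHERE fa.film_id IS NULL
    ORDER BY a.actor_id
""").fetchall()

# NOT IN: "2 NOT IN (1, NULL)" evaluates to NULL, not TRUE,
# so every candidate row is filtered out.
not_in = conn.execute("""
    SELECT a.actor_id
    FROM actor a
    WHERE a.actor_id NOT IN (SELECT fa.actor_id FROM film_actor fa)
    ORDER BY a.actor_id
""").fetchall()

print(not_exists, left_is_null, not_in)
```

A single NULL in the subquery makes NOT IN return no rows at all, while NOT EXISTS() and the LEFT JOIN form still find actors 2 and 3.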
LATERAL JOIN
OMG, this one is too cool. Am I the only one to mention it? Here's a cool query:
SELECT a.first_name, a.last_name, f.*
FROM actor AS a
LEFT OUTER JOIN LATERAL (
SELECT f.title, SUM(amount) AS revenue
FROM film AS f
JOIN film_actor AS fa USING (film_id)
JOIN inventory AS i USING (film_id)
JOIN rental AS r USING (inventory_id)
JOIN payment AS p USING (rental_id)
WHERE fa.actor_id = a.actor_id -- JOIN predicate with the outer query!
GROUP BY f.film_id
ORDER BY revenue DESC
LIMIT 5
) AS f
ON true
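It finds the TOP 5 revenue-producing films per actor. Whenever you need a TOP-N-per-something query, LATERAL JOIN will be your friend. If you're a SQL Server person, you know this JOIN type under the name APPLY: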
SELECT a.first_name, a.last_name, f.*
FROM actor AS a
OUTER APPLY (
SELECT TOP 5 f.title, SUM(p.amount) AS revenue -- SQL Server uses TOP, not LIMIT
FROM film AS f
JOIN film_actor AS fa ON f.film_id = fa.film_id
JOIN inventory AS i ON f.film_id = i.film_id
JOIN rental AS r ON i.inventory_id = r.inventory_id
JOIN payment AS p ON r.rental_id = p.rental_id
WHERE fa.actor_id = a.actor_id -- JOIN predicate with the outer query!
GROUP BY f.film_id, f.title -- SQL Server needs every non-aggregated column here
ORDER BY revenue DESC
) AS f
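OK, perhaps that's cheating, because a LATERAL JOIN or APPLY expression is really a "correlated subquery" that produces several rows. But if we allow for "correlated subqueries", we can also talk about...
MULTISET
This is only really implemented by Oracle and Informix (to my knowledge), but it can be emulated in PostgreSQL using arrays and/or XML, and in SQL Server using XML.
MULTISET produces a correlated subquery and nests the resulting set of rows in the outer query. The query below selects all actors and, for each actor, collects their films in a nested collection: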
SELECT a.*, MULTISET (
SELECT f.*
FROM film AS f
JOIN film_actor AS fa USING (film_id)
WHERE a.actor_id = fa.actor_id
) AS films
FROM actor AS a -- the alias a is referenced above, so it must be declared here
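As you have seen, there are more types of JOIN than just the "boring" INNER, OUTER, and CROSS JOIN that are usually mentioned. More details are in my article. And please, stop using Venn diagrams to illustrate them.
I'm going to push my pet peeve: the USING keyword.
If the tables on both sides of the JOIN have their foreign keys properly named (i.e. the same name, not just "id"), then this can be used: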
SELECT ...
FROM customers JOIN orders USING (customer_id)
I find this very practical and readable, yet it is not used often enough.
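One practical consequence of USING is that the shared column appears only once in SELECT *, whereas an ON join returns it from both tables. A minimal sketch, assuming Python's built-in sqlite3 module; the name, order_id and total columns and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data with a properly named foreign key.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 9.5), (11, 1, 20.0);
""")

# JOIN ... USING: the join column is reported once.
using_cursor = conn.execute("""
    SELECT *
    FROM customers JOIN orders USING (customer_id)
    ORDER BY order_id
""")
using_columns = [col[0] for col in using_cursor.description]
using_rows = using_cursor.fetchall()

# JOIN ... ON: the join column is reported once per table.
on_cursor = conn.execute("""
    SELECT *
    FROM customers c JOIN orders o ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""")
on_columns = [col[0] for col in on_cursor.description]

print(using_columns)
print(on_columns)
```

The USING form therefore avoids ambiguous duplicate column names in the result, in addition to being shorter to write.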
I've created an illustration that, in my opinion, explains it better than words: