SQL中的EXISTS子句和IN子句有什么区别?

什么时候应该使用EXISTS,什么时候应该使用IN?


当前回答

EXISTS is much faster than IN when the subquery results is very large. IN is faster than EXISTS when the subquery results is very small. CREATE TABLE t1 (id INT, title VARCHAR(20), someIntCol INT) GO CREATE TABLE t2 (id INT, t1Id INT, someData VARCHAR(20)) GO INSERT INTO t1 SELECT 1, 'title 1', 5 UNION ALL SELECT 2, 'title 2', 5 UNION ALL SELECT 3, 'title 3', 5 UNION ALL SELECT 4, 'title 4', 5 UNION ALL SELECT null, 'title 5', 5 UNION ALL SELECT null, 'title 6', 5 INSERT INTO t2 SELECT 1, 1, 'data 1' UNION ALL SELECT 2, 1, 'data 2' UNION ALL SELECT 3, 2, 'data 3' UNION ALL SELECT 4, 3, 'data 4' UNION ALL SELECT 5, 3, 'data 5' UNION ALL SELECT 6, 3, 'data 6' UNION ALL SELECT 7, 4, 'data 7' UNION ALL SELECT 8, null, 'data 8' UNION ALL SELECT 9, 6, 'data 9' UNION ALL SELECT 10, 6, 'data 10' UNION ALL SELECT 11, 8, 'data 11' Query 1 SELECT FROM t1 WHERE not EXISTS (SELECT * FROM t2 WHERE t1.id = t2.t1id) Query 2 SELECT t1.* FROM t1 WHERE t1.id not in (SELECT t2.t1id FROM t2 ) If in t1 your id has null value then Query 1 will find them, but Query 2 cant find null parameters. I mean IN can't compare anything with null, so it has no result for null, but EXISTS can compare everything with null.

其他回答

如果你可以用where in代替where exists,那么where in可能更快。

使用where in或where exists 将遍历父结果的所有结果。不同之处在于where exists将导致大量依赖子查询。如果你可以防止依赖子查询,那么where in将是更好的选择。

例子

假设我们有10,000家公司,每家公司有10个用户(因此我们的用户表有100,000个条目)。现在假设您希望通过用户名或公司名查找用户。

下面使用were exists查询的执行时间为141ms:

select * from `users` 
where `first_name` ='gates' 
or exists 
(
  select * from `companies` 
  where `users`.`company_id` = `companies`.`id`
  and `name` = 'gates'
)

这是因为对每个用户执行一个依赖子查询:

然而,如果我们避免exists查询并使用:

select * from `users` 
where `first_name` ='gates' 
or users.company_id in  
(
    select id from `companies` 
    where  `name` = 'gates'
)

然后避免依赖子查询,查询将在0,012毫秒内运行

EXISTS is much faster than IN when the subquery results is very large. IN is faster than EXISTS when the subquery results is very small. CREATE TABLE t1 (id INT, title VARCHAR(20), someIntCol INT) GO CREATE TABLE t2 (id INT, t1Id INT, someData VARCHAR(20)) GO INSERT INTO t1 SELECT 1, 'title 1', 5 UNION ALL SELECT 2, 'title 2', 5 UNION ALL SELECT 3, 'title 3', 5 UNION ALL SELECT 4, 'title 4', 5 UNION ALL SELECT null, 'title 5', 5 UNION ALL SELECT null, 'title 6', 5 INSERT INTO t2 SELECT 1, 1, 'data 1' UNION ALL SELECT 2, 1, 'data 2' UNION ALL SELECT 3, 2, 'data 3' UNION ALL SELECT 4, 3, 'data 4' UNION ALL SELECT 5, 3, 'data 5' UNION ALL SELECT 6, 3, 'data 6' UNION ALL SELECT 7, 4, 'data 7' UNION ALL SELECT 8, null, 'data 8' UNION ALL SELECT 9, 6, 'data 9' UNION ALL SELECT 10, 6, 'data 10' UNION ALL SELECT 11, 8, 'data 11' Query 1 SELECT FROM t1 WHERE not EXISTS (SELECT * FROM t2 WHERE t1.id = t2.t1id) Query 2 SELECT t1.* FROM t1 WHERE t1.id not in (SELECT t2.t1id FROM t2 ) If in t1 your id has null value then Query 1 will find them, but Query 2 cant find null parameters. I mean IN can't compare anything with null, so it has no result for null, but EXISTS can compare everything with null.

如果使用IN操作符,SQL引擎将扫描从内部查询中获取的所有记录。另一方面,如果我们使用EXISTS, SQL引擎将在找到匹配项后立即停止扫描过程。

EXISTS的性能比in快。 如果大多数过滤条件在子查询中,那么最好使用in,如果大多数过滤条件在主查询中,那么最好使用EXISTS。

我假设您知道它们的作用,因此使用方式不同,所以我将把您的问题理解为:什么时候重写SQL以使用IN而不是EXISTS是一个好主意,或者反之亦然。

这个假设合理吗?


编辑:我问的原因是,在许多情况下,您可以重写基于in的SQL来使用EXISTS,反之亦然,对于某些数据库引擎,查询优化器将区别对待这两者。

例如:

SELECT *
FROM Customers
WHERE EXISTS (
    SELECT *
    FROM Orders
    WHERE Orders.CustomerID = Customers.ID
)

可以改写为:

SELECT *
FROM Customers
WHERE ID IN (
    SELECT CustomerID
    FROM Orders
)

或者用join:

SELECT Customers.*
FROM Customers
    INNER JOIN Orders ON Customers.ID = Orders.CustomerID

所以我的问题仍然存在,原来的海报想知道什么是IN和EXISTS,因此如何使用它,或者他问是否重写SQL使用IN使用EXISTS,反之亦然,将是一个好主意?