I'm wondering whether there is any difference in performance between the following:

SELECT ... FROM ... WHERE someFIELD IN(1,2,3,4)

SELECT ... FROM ... WHERE someFIELD between 0 AND 5

SELECT ... FROM ... WHERE someFIELD = 1 OR someFIELD = 2 OR someFIELD = 3 ... 

Or will MySQL optimize the SQL the same way compilers optimize code?


EDIT

Changed the ANDs to ORs for the reason stated in the comments.


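Current answer

I think one explanation for sunseeker's observation is that MySQL actually sorts the values in the IN statement if they are all static values, and then uses binary search, which is more efficient than the plain OR alternative. I can't remember where I read that, but sunseeker's result seems to support it.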
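To make the claimed difference concrete, here is a minimal Python sketch of the two evaluation strategies. It is only a model of the idea described above (sort the constants once, then binary-search per row), not MySQL's actual implementation; the function names `matches_or_chain` and `make_in_matcher` are made up for illustration.

```python
from bisect import bisect_left

def matches_or_chain(value, constants):
    """Model `x = c1 OR x = c2 OR ...`: compare against each constant in turn (O(n))."""
    for c in constants:
        if value == c:
            return True
    return False

def make_in_matcher(constants):
    """Model the IN() idea: sort the constants once, then binary-search per row (O(log n))."""
    sorted_constants = sorted(set(constants))

    def matches(value):
        i = bisect_left(sorted_constants, value)
        return i < len(sorted_constants) and sorted_constants[i] == value

    return matches

constants = [296172, 295093, 293626]
matches_in = make_in_matcher(constants)
rows = [293626, 293627, 296172, 1]
matched = [r for r in rows if matches_in(r)]
print(matched)
```

Both functions give the same answer for every input; only the number of comparisons per row differs, which matters more as the list of constants grows.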

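Other answers

OR is the slowest. Whether IN or BETWEEN is faster depends on your data, but I would expect BETWEEN to be faster normally, since it can simply take a range from the index (assuming someField is indexed).

I'd bet they are the same. You can run a test by doing the following:

Loop over the "IN (1,2,3,4)" version 500 times and see how long it takes. Loop over the "= 1 OR = 2 OR = 3 ..." version 500 times and see how long it takes.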
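The timing loop described above could look roughly like the following Python sketch. Big assumption: it uses an in-memory SQLite database as a stand-in, only because SQLite ships with Python, so the numbers it prints say nothing about MySQL. To test MySQL you would swap in a MySQL connection and your own table. The table `t`, the index name and the test data are all hypothetical.

```python
import sqlite3
import time

# Stand-in database (SQLite, NOT MySQL): 10000 rows, values 0..99 repeated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (someField INTEGER)")
conn.execute("CREATE INDEX idx_somefield ON t (someField)")
conn.executemany("INSERT INTO t VALUES (?)", [(i % 100,) for i in range(10000)])

QUERIES = {
    "IN": "SELECT COUNT(*) FROM t WHERE someField IN (1, 2, 3, 4)",
    "OR": "SELECT COUNT(*) FROM t WHERE someField = 1 OR someField = 2 "
          "OR someField = 3 OR someField = 4",
}

def time_query(sql, repeats=500):
    """Run `sql` `repeats` times; return (last result, elapsed seconds)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = conn.execute(sql).fetchone()[0]
    return result, time.perf_counter() - start

results = {}
for name, sql in QUERIES.items():
    count, elapsed = time_query(sql)
    results[name] = count
    print(f"{name}: {count} rows matched, {elapsed:.4f}s for 500 runs")
```

Checking that both variants return the same count before comparing times guards against benchmarking two queries that are not actually equivalent.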

You could also try the join approach. If someField is indexed and your table is large, it could be faster...

SELECT ... 
    FROM ... 
        INNER JOIN (SELECT 1 as newField UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) dt ON someFIELD =newField

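I tried the join method above on my SQL Server and it was nearly the same as IN (1,2,3,4); both resulted in a clustered index seek. I'm not sure how MySQL will handle them.

Just when you thought it was safe...

What is your value of eq_range_index_dive_limit? In particular, do you have more or fewer items in your IN clause?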
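As a rough guide to why that question matters, the rule can be sketched in a few lines of Python. This reflects my reading of the MySQL manual (index dives are used when the number of equality ranges is below the limit, and a limit of 0 means always dive); treat it as an assumption and check the documentation for your version. The function name is made up for illustration.

```python
def uses_index_dives(num_equality_ranges, eq_range_index_dive_limit):
    """Return True if the optimizer would estimate rows by diving into the index.

    Assumed rule: a limit of 0 means "always dive"; otherwise dives are used
    only while the number of equality ranges is below the limit. Above that,
    the optimizer falls back to index statistics.
    """
    if eq_range_index_dive_limit == 0:
        return True
    return num_equality_ranges < eq_range_index_dive_limit

# With a limit of 10, a short list gets index dives and a long one does not.
print(uses_index_dives(8, 10))
print(uses_index_dives(111, 10))
```

Index dives give more accurate row estimates but cost more at planning time, which is why the optimizer stops using them for long lists.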

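This will not include a benchmark, but it will peer into the inner workings a bit. Let's use a tool to see what is going on: Optimizer Trace.

Query: SELECT * FROM canada WHERE id ...

With 3 values in an OR, part of the trace looks like this: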

       "condition_processing": {
          "condition": "WHERE",
          "original_condition": "((`canada`.`id` = 296172) or (`canada`.`id` = 295093) or (`canada`.`id` = 293626))",
          "steps": [
            {
              "transformation": "equality_propagation",
              "resulting_condition": "(multiple equal(296172, `canada`.`id`) or multiple equal(295093, `canada`.`id`) or multiple equal(293626, `canada`.`id`))"
            },

...

              "analyzing_range_alternatives": {
                "range_scan_alternatives": [
                  {
                    "index": "id",
                    "ranges": [
                      "293626 <= id <= 293626",
                      "295093 <= id <= 295093",
                      "296172 <= id <= 296172"
                    ],
                    "index_dives_for_eq_ranges": true,
                    "chosen": true

...

        "refine_plan": [
          {
            "table": "`canada`",
            "pushed_index_condition": "((`canada`.`id` = 296172) or (`canada`.`id` = 295093) or (`canada`.`id` = 293626))",
            "table_condition_attached": null,
            "access_type": "range"
          }
        ]

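Note how ICP (index condition pushdown) is being given the ORs. This implies that OR is not turned into IN, and InnoDB will be performing a bunch of = tests through ICP. (I don't think it is worth considering MyISAM.)

(This is Percona's 5.6.22-71.0-log; id is a secondary index.)

Now for IN() with a few values

eq_range_index_dive_limit = 10; there are 8 values.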

        "condition_processing": {
          "condition": "WHERE",
          "original_condition": "(`canada`.`id` in (296172,295093,293626,295573,297148,296127,295588,295810))",
          "steps": [
            {
              "transformation": "equality_propagation",
              "resulting_condition": "(`canada`.`id` in (296172,295093,293626,295573,297148,296127,295588,295810))"
            },

...

              "analyzing_range_alternatives": {
                "range_scan_alternatives": [
                  {
                    "index": "id",
                    "ranges": [
                      "293626 <= id <= 293626",
                      "295093 <= id <= 295093",
                      "295573 <= id <= 295573",
                      "295588 <= id <= 295588",
                      "295810 <= id <= 295810",
                      "296127 <= id <= 296127",
                      "296172 <= id <= 296172",
                      "297148 <= id <= 297148"
                    ],
                    "index_dives_for_eq_ranges": true,
                    "chosen": true

...

        "refine_plan": [
          {
            "table": "`canada`",
            "pushed_index_condition": "(`canada`.`id` in (296172,295093,293626,295573,297148,296127,295588,295810))",
            "table_condition_attached": null,
            "access_type": "range"
          }
        ]

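Note that IN does not seem to be turned into OR.

A side note: notice that the constant values are sorted in the ranges. This can be beneficial in two ways:

First, by jumping around less, there may be better caching and less I/O to get to all the values. Second, if two similar queries come from separate connections and they are in transactions, overlapping lists are more likely to cause a delay than a deadlock.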
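The sorting is easy to reproduce. This small Python sketch turns an unsorted list of constants into the ordered single-point ranges that show up in the trace. It only mimics the output format; dropping duplicates is my assumption, and `equality_ranges` is a made-up helper name.

```python
def equality_ranges(values, column="id"):
    """Build sorted single-point ranges like those listed under "ranges" in the trace.

    Assumption: duplicate constants collapse into one range.
    """
    return [f"{v} <= {column} <= {v}" for v in sorted(set(values))]

# Constants in the order they were written in the query.
ranges = equality_ranges([296172, 295093, 293626])
for r in ranges:
    print(r)
```

Because every connection visits the shared values in the same ascending order, two transactions with overlapping lists tend to queue behind each other rather than each holding a lock the other needs.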

Finally, IN() with a lot of values

      {
        "condition_processing": {
          "condition": "WHERE",
          "original_condition": "(`canada`.`id` in (293831,292259,292881,293440,292558,295792,292293,292593,294337,295430,295034,297060,293811,295587,294651,295559,293213,295742,292605,296018,294529,296711,293919,294732,294689,295540,293000,296916,294433,297112,293815,292522,296816,293320,293232,295369,291894,293700,291839,293049,292738,294895,294473,294023,294173,293019,291976,294923,294797,296958,294075,293450,296952,297185,295351,295736,296312,294330,292717,294638,294713,297176,295896,295137,296573,292236,294966,296642,296073,295903,293057,294628,292639,293803,294470,295353,297196,291752,296118,296964,296185,295338,295956,296064,295039,297201,297136,295206,295986,292172,294803,294480,294706,296975,296604,294493,293181,292526,293354,292374,292344,293744,294165,295082,296203,291918,295211,294289,294877,293120,295387))",
          "steps": [
            {
              "transformation": "equality_propagation",
              "resulting_condition": "(`canada`.`id` in (293831,292259,292881,293440,292558,295792,292293,292593,294337,295430,295034,297060,293811,295587,294651,295559,293213,295742,292605,296018,294529,296711,293919,294732,294689,295540,293000,296916,294433,297112,293815,292522,296816,293320,293232,295369,291894,293700,291839,293049,292738,294895,294473,294023,294173,293019,291976,294923,294797,296958,294075,293450,296952,297185,295351,295736,296312,294330,292717,294638,294713,297176,295896,295137,296573,292236,294966,296642,296073,295903,293057,294628,292639,293803,294470,295353,297196,291752,296118,296964,296185,295338,295956,296064,295039,297201,297136,295206,295986,292172,294803,294480,294706,296975,296604,294493,293181,292526,293354,292374,292344,293744,294165,295082,296203,291918,295211,294289,294877,293120,295387))"
            },

...

              "analyzing_range_alternatives": {
                "range_scan_alternatives": [
                  {
                    "index": "id",
                    "ranges": [
                      "291752 <= id <= 291752",
                      "291839 <= id <= 291839",
                      ...
                      "297196 <= id <= 297196",
                      "297201 <= id <= 297201"
                    ],
                    "index_dives_for_eq_ranges": false,
                    "rows": 111,
                    "chosen": true

...

        "refine_plan": [
          {
            "table": "`canada`",
            "pushed_index_condition": "(`canada`.`id` in (293831,292259,292881,293440,292558,295792,292293,292593,294337,295430,295034,297060,293811,295587,294651,295559,293213,295742,292605,296018,294529,296711,293919,294732,294689,295540,293000,296916,294433,297112,293815,292522,296816,293320,293232,295369,291894,293700,291839,293049,292738,294895,294473,294023,294173,293019,291976,294923,294797,296958,294075,293450,296952,297185,295351,295736,296312,294330,292717,294638,294713,297176,295896,295137,296573,292236,294966,296642,296073,295903,293057,294628,292639,293803,294470,295353,297196,291752,296118,296964,296185,295338,295956,296064,295039,297201,297136,295206,295986,292172,294803,294480,294706,296975,296604,294493,293181,292526,293354,292374,292344,293744,294165,295082,296203,291918,295211,294289,294877,293120,295387))",
            "table_condition_attached": null,
            "access_type": "range"
          }
        ]

Side note: I needed this, due to the size of the trace:

SET @@global.optimizer_trace_max_mem_size = 32222;
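Traces this large are tedious to read by eye, so it can help to pull out just the fields of interest. Here is a minimal Python sketch that walks the trace JSON and collects every value stored under a given key. It assumes you have already fetched the trace text (for example from INFORMATION_SCHEMA.OPTIMIZER_TRACE); the `trace_text` below is a hand-trimmed fragment of the 3-value trace shown earlier, not complete server output, and `find_key` is a made-up helper.

```python
import json

def find_key(node, key):
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)

# Trimmed fragment of the trace above (illustrative, not a full trace).
trace_text = """
{
  "analyzing_range_alternatives": {
    "range_scan_alternatives": [
      {
        "index": "id",
        "ranges": [
          "293626 <= id <= 293626",
          "295093 <= id <= 295093",
          "296172 <= id <= 296172"
        ],
        "index_dives_for_eq_ranges": true,
        "chosen": true
      }
    ]
  }
}
"""

trace = json.loads(trace_text)
dives = list(find_key(trace, "index_dives_for_eq_ranges"))
range_count = len(next(find_key(trace, "ranges")))
print("index dives used:", dives, "number of ranges:", range_count)
```

The same helper works for other keys such as "access_type" or "pushed_index_condition".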

I also ran a test for future Googlers. The total number of rows returned was 7264 out of 10000.

SELECT * FROM item WHERE id = 1 OR id = 2 ... id = 10000

This query took 0.1239 seconds

SELECT * FROM item WHERE id IN (1,2,3,...10000)

This query took 0.0433 seconds

IN is roughly 3 times faster than OR (0.1239 / 0.0433 ≈ 2.9)

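From what I understand, as long as there is an index on someField, BETWEEN will use it to quickly find one end and then walk to the other end. This is the most efficient.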
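The "find one end, then walk to the other" behaviour can be modelled with a sorted Python list standing in for the index. This is a conceptual sketch only: a real B-tree index is not a flat list, and `between` and the sample keys are made up for illustration.

```python
from bisect import bisect_left

def between(sorted_keys, low, high):
    """Model an index range scan for `key BETWEEN low AND high` (both ends inclusive)."""
    i = bisect_left(sorted_keys, low)  # one binary search to find the low end
    matched = []
    while i < len(sorted_keys) and sorted_keys[i] <= high:
        matched.append(sorted_keys[i])  # walk forward until past the high end
        i += 1
    return matched

index_keys = [0, 1, 1, 2, 3, 5, 8, 13]  # hypothetical indexed values, already sorted
hits = between(index_keys, 0, 5)
print(hits)
```

Only one search is needed regardless of how many values fall inside the range, whereas a list of N discrete values needs N separate lookups.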

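Every EXPLAIN I have looked at shows that "IN (...)" and "... OR ..." are interchangeable and equally efficient. That is what you would expect, since the optimizer has no way of knowing whether or not the values form an interval. It is also equivalent to a UNION ALL of SELECTs on the individual values.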