在MySQL中有一个很好的方法来复制SQL Server函数ROW_NUMBER()?
例如:
SELECT
col1, col2,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS intRow
FROM Table1
然后,例如,我可以添加一个条件,将intRow限制为1,以获得每个(col1, col2)对的col3最高的单行。
在MySQL中有一个很好的方法来复制SQL Server函数ROW_NUMBER()?
例如:
SELECT
col1, col2,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS intRow
FROM Table1
然后,例如,我可以添加一个条件,将intRow限制为1,以获得每个(col1, col2)对的col3最高的单行。
当前回答
我总是遵循这个模式。给定这个表格:
+------+------+
| i | j |
+------+------+
| 1 | 11 |
| 1 | 12 |
| 1 | 13 |
| 2 | 21 |
| 2 | 22 |
| 2 | 23 |
| 3 | 31 |
| 3 | 32 |
| 3 | 33 |
| 4 | 14 |
+------+------+
你可以得到这样的结果:
+------+------+------------+
| i | j | row_number |
+------+------+------------+
| 1 | 11 | 1 |
| 1 | 12 | 2 |
| 1 | 13 | 3 |
| 2 | 21 | 1 |
| 2 | 22 | 2 |
| 2 | 23 | 3 |
| 3 | 31 | 1 |
| 3 | 32 | 2 |
| 3 | 33 | 3 |
| 4 | 14 | 1 |
+------+------+------------+
通过运行这个不需要定义任何变量的查询:
SELECT a.i, a.j, count(*) as row_number FROM test a
JOIN test b ON a.i = b.i AND a.j >= b.j
GROUP BY a.i, a.j
其他回答
我发现最好的解决方案是使用这样的子查询:
SELECT
col1, col2,
(
SELECT COUNT(*)
FROM Table1
WHERE col1 = t1.col1
AND col2 = t1.col2
AND col3 > t1.col3
) AS intRow
FROM Table1 t1
分区BY列只是用'='进行比较,并用and分隔。ORDER BY列将与'<'或'>'进行比较,并以or分隔。
我发现这是非常灵活的,即使它有点昂贵。
这并不是最健壮的解决方案——但是如果您只是想在一个只有几个不同值的字段上创建一个分区的秩,当逻辑上有许多变量时,使用某些情况可能并不笨拙。
这样的方法在过去对我很有效:
SELECT t.*,
CASE WHEN <partition_field> = @rownum1 := @rownum1 + 1
WHEN <partition_field> = @rownum2 := @rownum2 + 1
...
END AS rank
FROM YOUR_TABLE t,
(SELECT @rownum1 := 0) r1, (SELECT @rownum2 := 0) r2
ORDER BY <rank_order_by_field>
;
希望这对你有帮助!
我没有看到任何简单的答案涵盖“PARTITION BY”部分,所以我的答案是:
SELECT
*
FROM (
select
CASE WHEN @partitionBy_1 = l THEN @row_number:=@row_number+1 ELSE @row_number:=1 END AS i
, @partitionBy_1:=l AS p
, t.*
from (
select @row_number:=0,@partitionBy_1:=null
) as x
cross join (
select 1 as n, 'a' as l
union all
select 1 as n, 'b' as l
union all
select 2 as n, 'b' as l
union all
select 2 as n, 'a' as l
union all
select 3 as n, 'a' as l
union all
select 3 as n, 'b' as l
) as t
ORDER BY l, n
) AS X
where i > 1
The ORDER BY clause must reflect your ROW_NUMBER need. Thus there's already a clear limitation: you can't have several ROW_NUMBER "emulation" of this form at the same time. The order of the "computed column" matters. If you have mysql compute those column in another order, it might not work. In this simple example I only put one but you can have several "PARTITION BY" parts CASE WHEN @partitionBy_1 = part1 AND @partitionBy_2 = part2 [...] THEN @row_number:=@row_number+1 ELSE @row_number:=1 END AS i , @partitionBy_1:=part1 AS P1 , @partitionBy_2:=part2 AS P2 [...] FROM ( SELECT @row_number:=0,@partitionBy_1:=null,@partitionBy_2:=null[...] ) as x
重要提示:请考虑升级到MySQL 8+,并使用已定义和文档化的ROW_NUMBER()函数,并抛弃老黑客绑定到一个功能有限的古老版本的MySQL
下面是其中一个小技巧:
这里大部分/全部使用查询变量的答案似乎忽略了一个事实,即文档说(转述):
不要依赖于SELECT列表中的项按从上到下的顺序求值。不要在一个SELECT项中分配变量,而在另一个SELECT项中使用它们
因此,他们有可能会得出错误的答案,因为他们通常会做一个
select
(row number variable that uses partition variable),
(assign partition variable)
如果从下向上计算,行号将停止工作(没有分区)
所以我们需要使用一些有执行顺序保证的东西。输入CASE当:
SELECT
t.*,
@r := CASE
WHEN col = @prevcol THEN @r + 1
WHEN (@prevcol := col) = null THEN null
ELSE 1 END AS rn
FROM
t,
(SELECT @r := 0, @prevcol := null) x
ORDER BY col
作为大纲ld, prevcol的赋值顺序很重要——在我们从当前行为其赋值之前,必须将prevcol与当前行的值进行比较(否则它将是当前行的col值,而不是前一行的col值)。
以下是它们之间的联系:
The first WHEN is evaluated. If this row's col is the same as the previous row's col then @r is incremented and returned from the CASE. This return led values is stored in @r. It's a feature of MySQL that assignment returns the new value of what is assigned into @r into the result rows. For the first row on the result set, @prevcol is null (it is initialised to null in the subquery) so this predicate is false. This first predicate also returns false every time col changes (current row is different to previous row). This causes the second WHEN to be evaluated. The second WHEN predicate is always false, and it exists purely to assign a new value to @prevcol. Because this row's col is different to the previous row's col (we know this because if it were the same, the first WHEN would have been used), we have to assign the new value to keep it for testing next time. Because the assignment is made and then the result of the assignment is compared with null, and anything equated with null is false, this predicate is always false. But at least evaluating it did its job of keeping the value of col from this row, so it can be evaluated against the next row's col value Because the second WHEN is false, it means in situations where the column we are partitioning by (col) has changed, it is the ELSE that gives a new value for @r, restarting the numbering from 1
我们会遇到这样的情况:
SELECT
t.*,
ROW_NUMBER() OVER(PARTITION BY pcol1, pcol2, ... pcolX ORDER BY ocol1, ocol2, ... ocolX) rn
FROM
t
具有一般形式:
SELECT
t.*,
@r := CASE
WHEN col1 = @pcol1 AND col2 = @pcol2 AND ... AND colX = @pcolX THEN @r + 1
WHEN (@pcol1 := pcol1) = null OR (@pcol2 := col2) = null OR ... OR (@pcolX := colX) = null THEN null
ELSE 1
END AS rn
FROM
t,
(SELECT @r := 0, @pcol1 := null, @pcol2 := null, ..., @pcolX := null) x
ORDER BY pcol1, pcol2, ..., pcolX, ocol1, ocol2, ..., ocolX
脚注:
The p in pcol means "partition", the o in ocol means "order" - in the general form I dropped the "prev" from the variable name to reduce visual clutter The brackets around (@pcolX := colX) = null are important. Without them you'll assign null to @pcolX and things stop working It's a compromise that the result set has to be ordered by the partition columns too, for the previous column compare to work out. You can't thus have your rownumber ordered according to one column but your result set ordered to another You might be able to resolve this with subqueries but I believe the docs also state that subquery ordering may be ignored unless LIMIT is used and this could impact performance I haven't delved into it beyond testing that the method works, but if there is a risk that the predicates in the second WHEN will be optimised away (anything compared to null is null/false so why bother running the assignment) and not executed, it also stops. This doesn't seem to happen in my experience but I'll gladly accept comments and propose solution if it could reasonably occur It may be wise to cast the nulls that create @pcolX to the actual types of your columns, in the subquery that creates the @pcolX variables, viz: select @pcol1 := CAST(null as INT), @pcol2 := CAST(null as DATE)
我总是遵循这个模式。给定这个表格:
+------+------+
| i | j |
+------+------+
| 1 | 11 |
| 1 | 12 |
| 1 | 13 |
| 2 | 21 |
| 2 | 22 |
| 2 | 23 |
| 3 | 31 |
| 3 | 32 |
| 3 | 33 |
| 4 | 14 |
+------+------+
你可以得到这样的结果:
+------+------+------------+
| i | j | row_number |
+------+------+------------+
| 1 | 11 | 1 |
| 1 | 12 | 2 |
| 1 | 13 | 3 |
| 2 | 21 | 1 |
| 2 | 22 | 2 |
| 2 | 23 | 3 |
| 3 | 31 | 1 |
| 3 | 32 | 2 |
| 3 | 33 | 3 |
| 4 | 14 | 1 |
+------+------+------------+
通过运行这个不需要定义任何变量的查询:
SELECT a.i, a.j, count(*) as row_number FROM test a
JOIN test b ON a.i = b.i AND a.j >= b.j
GROUP BY a.i, a.j