Query:

SELECT
year, id, rate
FROM h
WHERE year BETWEEN 2000 AND 2009
AND id IN (SELECT rid FROM table2)
GROUP BY id, year
ORDER BY id, rate DESC

Yields:

year    id  rate
2006    p01 8
2003    p01 7.4
2008    p01 6.8
2001    p01 5.9
2007    p01 5.3
2009    p01 4.4
2002    p01 3.9
2004    p01 3.5
2005    p01 2.1
2000    p01 0.8
2001    p02 12.5
2004    p02 12.4
2002    p02 12.2
2003    p02 10.3
2000    p02 8.7
2006    p02 4.6
2007    p02 3.3

What I'd like is only the top 5 results for each id:

2006    p01 8
2003    p01 7.4
2008    p01 6.8
2001    p01 5.9
2007    p01 5.3
2001    p02 12.5
2004    p02 12.4
2002    p02 12.2
2003    p02 10.3
2000    p02 8.7

Is there a way to do this using some kind of LIMIT-like modifier that works within the GROUP BY?


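Current answer

You want to find the top n rows per group. This answer provides a generic solution using example data that differs from the OP's.

In MySQL 8 or later you can use the ROW_NUMBER, RANK or DENSE_RANK function, depending on the exact definition of "top 5". Below are the numbers these functions generate when rows are sorted by value in descending order. Note how ties are handled; an asterisk marks a row that the function would keep under a `<= 5` filter: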

pkid  catid  value  row_number  rank  dense_rank
1     p01    100    *1          *1    *1
2     p01    90     *2          *2    *2
3     p01    90     *3          *2    *2
4     p01    80     *4          *4    *3
5     p01    80     *5          *4    *3
6     p01    80     6           *4    *3
7     p01    70     7           7     *4
8     p01    60     8           8     *5
9     p01    50     9           9     6
10    p01    40     10          10    7
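
The table above can be reproduced with a query along the following lines. This is only a sketch for MySQL 8 or later: the `CREATE TABLE` definition and the column aliases `row_num`, `rnk` and `dense_rnk` are assumptions made for illustration (the function names themselves are reserved words in MySQL 8, so they cannot be used as unquoted aliases). `pkid` is added as a tie-breaker for ROW_NUMBER only, so that its numbering is deterministic.

```sql
-- Hypothetical setup reproducing the sample data shown above.
CREATE TABLE t (
    pkid  INT PRIMARY KEY,
    catid VARCHAR(5) NOT NULL,
    value INT NOT NULL
);

INSERT INTO t (pkid, catid, value) VALUES
    (1, 'p01', 100), (2, 'p01', 90), (3, 'p01', 90), (4, 'p01', 80),
    (5, 'p01', 80),  (6, 'p01', 80), (7, 'p01', 70), (8, 'p01', 60),
    (9, 'p01', 50),  (10, 'p01', 40);

-- Compute the three rankings side by side.
-- RANK and DENSE_RANK share the named window w, so equal values tie.
-- ROW_NUMBER uses its own window with pkid as a tie-breaker.
SELECT pkid, catid, value,
       ROW_NUMBER() OVER (PARTITION BY catid ORDER BY value DESC, pkid) AS row_num,
       RANK()       OVER w AS rnk,
       DENSE_RANK() OVER w AS dense_rnk
FROM t
WINDOW w AS (PARTITION BY catid ORDER BY value DESC)
ORDER BY catid, value DESC, pkid;
```

Without the `pkid` tie-breaker, ROW_NUMBER would still return 1 through 10, but which of the tied rows receives which number would not be guaranteed.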

Once you have chosen the function, use it like this:

SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY catid ORDER BY value DESC) AS n
    FROM t
) AS x
WHERE n <= 5


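In MySQL 5.x, you can use a poor man's rank over partition to achieve the desired result: outer join the table with itself and, for each row, count the number of rows that come before it (for example, a row with a higher value comes before it).

The following produces results similar to the RANK function: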

SELECT t.pkid, t.catid, t.value, COUNT(b.value) + 1 AS rank
FROM t
LEFT JOIN t AS b ON b.catid = t.catid AND b.value > t.value
GROUP BY t.pkid, t.catid, t.value
HAVING COUNT(b.value) + 1 <= 5
ORDER BY t.catid, t.value DESC, t.pkid

Make the following change to produce results similar to the DENSE_RANK function:

COUNT(DISTINCT b.value)

Or make the following change to the join condition to produce results similar to the ROW_NUMBER function:

ON b.catid = t.catid AND (b.value > t.value OR b.value = t.value AND b.pkid < t.pkid)
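
Put together, the ROW_NUMBER-like variant might look like the sketch below. It assumes the answer's sample table `t(pkid, catid, value)` with a unique `pkid`, which serves as the tie-breaker; the alias `rn` is a choice made here for illustration and also avoids `rank`, which became a reserved word in MySQL 8.

```sql
-- Emulates ROW_NUMBER() OVER (PARTITION BY catid ORDER BY value DESC, pkid)
-- without window functions. For each row of t, the self-join matches every
-- row b of the same category that sorts before it: a higher value, or an
-- equal value with a smaller pkid.
SELECT t.pkid, t.catid, t.value, COUNT(b.value) + 1 AS rn
FROM t
LEFT JOIN t AS b
       ON b.catid = t.catid
      AND (b.value > t.value
           OR (b.value = t.value AND b.pkid < t.pkid))
GROUP BY t.pkid, t.catid, t.value
HAVING COUNT(b.value) + 1 <= 5      -- keep the first five rows per category
ORDER BY t.catid, t.value DESC, t.pkid;
```

Because every row is compared against every other row in its category, the cost of this self-join grows roughly quadratically with group size, so it is best suited to small groups.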

Other answers

I just created a TOP operation for MySQL. The code is simple.

drop table if exists h;
create table h(id varchar(5), year int, rate numeric(8,2), primary key(id,year));
insert into h(year, id, rate) values
(2006,'p01',8),
(2003,'p01',7.4),
(2008,'p01',6.8),
(2001,'p01',5.9),
(2007,'p01',5.3),
(2009,'p01',4.4),
(2002,'p01',3.9),
(2004,'p01',3.5),
(2005,'p01',2.1),
(2000,'p01',0.8),
(2001,'p02',12.5),
(2004,'p02',12.4),
(2002,'p02',12.2),
(2003,'p02',10.3),
(2000,'p02',8.7),
(2006,'p02',4.6),
(2007,'p02',3.3);

select id, year, rate
from 
(
    select id, year, rate, @last, if(@last=id,@top:=@top+1, @top:=0) as ztop, @last:=id update_last
    from h
    order by id, rate desc, year desc
) t2
where ztop<5

Please try the stored procedure below. I have verified it. I get the correct result, but without using GROUP BY.

CREATE DEFINER=`ks_root`@`%` PROCEDURE `first_five_record_per_id`()
BEGIN
DECLARE query_string text;
DECLARE datasource1 varchar(24);
DECLARE done INT DEFAULT 0;
DECLARE tenants varchar(50);
DECLARE cur1 CURSOR FOR SELECT rid FROM demo1;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    SET @query_string='';

      OPEN cur1;
      read_loop: LOOP

      FETCH cur1 INTO tenants ;

      IF done THEN
        LEAVE read_loop;
      END IF;

      SET @datasource1 = tenants;
      SET @query_string = concat(@query_string,'(select * from demo  where `id` = ''',@datasource1,''' order by rate desc LIMIT 5) UNION ALL ');

       END LOOP; 
      close cur1;

    SET @query_string  = TRIM(TRAILING 'UNION ALL' FROM TRIM(@query_string));  
  select @query_string;
PREPARE stmt FROM @query_string;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

END

SELECT year, id, rate
FROM (SELECT
  year, id, rate, row_number() over (partition by id order by rate DESC) AS rn
  FROM h
  WHERE year BETWEEN 2000 AND 2009
  AND id IN (SELECT rid FROM table2)
  GROUP BY id, year
  ORDER BY id, rate DESC) as subquery
WHERE rn <= 5

The subquery is almost identical to your query. The only change is the addition of

row_number() over (partition by id order by rate DESC)

For those who, like me, had queries time out: I wrote the procedure below to apply a LIMIT (and anything else) per specific group.

DELIMITER $$
CREATE PROCEDURE count_limit200()
BEGIN
    DECLARE a INT Default 0;
    DECLARE stop_loop INT Default 0;
    DECLARE domain_val VARCHAR(250);
    DECLARE domain_list CURSOR FOR SELECT DISTINCT domain FROM db.one;

    OPEN domain_list;

    SELECT COUNT(DISTINCT(domain)) INTO stop_loop 
    FROM db.one;
    -- BEGIN LOOP
    loop_thru_domain: LOOP
        FETCH domain_list INTO domain_val;
        SET a=a+1;

        INSERT INTO db.two(book,artist,title,title_count,last_updated) 
        SELECT * FROM 
        (
            SELECT book,artist,title,COUNT(ObjectKey) AS titleCount, NOW() 
            FROM db.one 
            WHERE book = domain_val
            GROUP BY artist,title
            ORDER BY book,titleCount DESC
            LIMIT 200
        ) a ON DUPLICATE KEY UPDATE title_count = titleCount, last_updated = NOW();

        IF a = stop_loop THEN
            LEAVE loop_thru_domain;
        END IF;
    END LOOP loop_thru_domain;
END $$

It loops through a list of domains and inserts at most 200 rows for each one.

Try this:

SELECT h.year, h.id, h.rate 
FROM (SELECT h.year, h.id, h.rate, IF(@lastid = (@lastid:=h.id), @index:=@index+1, @index:=0) indx 
      FROM (SELECT h.year, h.id, h.rate 
            FROM h
            WHERE h.year BETWEEN 2000 AND 2009 AND id IN (SELECT rid FROM table2)
            GROUP BY id, h.year
            ORDER BY id, rate DESC
            ) h, (SELECT @lastid:='', @index:=0) AS a
    ) h 
WHERE h.indx < 5;