How can I best write a query that selects 10 rows at random from a total of 600k?
Current answer
This is how I do it:
select *
from table_with_600k_rows
where rand() < 10/600000
limit 10
I like it because it doesn't need any other tables, it is simple to write, and it is very fast to execute.
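One thing worth checking about this query is how often the filter actually lets 10 rows through. The following is a rough Python sketch, not part of the original answer (the function name `prob_at_least` is made up for illustration). It assumes RAND() is evaluated independently for every row, so the number of rows that pass the filter is binomial with n = 600000 and p = 10/600000:

```python
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent rows pass a filter with pass probability p."""
    # Sum the binomial probabilities for 0..k-1 passing rows, then take the complement.
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - below

n_rows = 600_000
p_pass = 10 / n_rows  # same threshold as in the WHERE clause

print(f"P(at least 10 rows pass) = {prob_at_least(10, n_rows, p_pass):.3f}")
```

Under that assumption the probability is only a little above one half, so the query will quite often return fewer than 10 rows. A commonly used variation is to raise the threshold (say to 100/600000) and add ORDER BY RAND() before the LIMIT, which sorts only the small filtered set.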
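Other answers
Another simple solution is to rank the rows and then fetch rows starting from a random rank. With this solution you don't need any "Id"-based column in the table.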
SELECT d.* FROM (
    -- RANK is a reserved word in MySQL 8.0+, so the alias is row_rank here
    SELECT t.*, @rownum := @rownum + 1 AS row_rank
    FROM mytable AS t,
        (SELECT @rownum := 0) AS r,
        (SELECT @cnt := (SELECT RAND() * (SELECT COUNT(*) FROM mytable))) AS n
) d WHERE row_rank >= @cnt LIMIT 10;
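You can change the limit value to fetch as many rows as you need, but they will mostly be consecutive rows.
However, if you don't want consecutive rows, you can fetch a larger sample and select randomly from it. Something like this: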
SELECT * FROM (
    SELECT d.* FROM (
        -- RANK is a reserved word in MySQL 8.0+, so the alias is row_rank here
        SELECT c.*, @rownum := @rownum + 1 AS row_rank
        FROM buildbrain.`commits` AS c,
            (SELECT @rownum := 0) AS r,
            (SELECT @cnt := (SELECT RAND() * (SELECT COUNT(*) FROM buildbrain.`commits`))) AS rnd
    ) d
    WHERE row_rank >= @cnt LIMIT 10000
) t ORDER BY RAND() LIMIT 10;
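Use the simple query below to get random data from a table. (As written, though, it orders by a per-group count rather than by RAND(), so the rows returned are arbitrary rather than truly random.)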
SELECT user_firstname ,
COUNT(DISTINCT usr_fk_id) cnt
FROM userdetails
GROUP BY usr_fk_id
ORDER BY cnt ASC
LIMIT 10
A great post that handles several cases, from simple, to gaps, to non-uniform with gaps:
http://jan.kneschke.de/projects/mysql/order-by-rand/
For the most general case, here is how you do it:
SELECT name
FROM random AS r1 JOIN
(SELECT CEIL(RAND() *
(SELECT MAX(id)
FROM random)) AS id)
AS r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 1
This assumes that the ids are evenly distributed and that the id list may contain gaps. See the article for more advanced examples.
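To see why the distribution of ids matters, here is a small Python sketch of the arithmetic, offered as an illustration rather than anything from the linked article. It assumes positive integer ids and a random value r drawn uniformly from 1..MAX(id); the function name `pick_probabilities` and the sample ids are hypothetical.

```python
from fractions import Fraction

def pick_probabilities(ids: list[int]) -> dict[int, Fraction]:
    """Chance that each id is returned by `WHERE id >= r ORDER BY id LIMIT 1`,
    where r is drawn uniformly from 1..max(ids)."""
    ordered = sorted(ids)
    max_id = ordered[-1]
    probs = {}
    previous = 0
    for current in ordered:
        # Every r in (previous, current] resolves to `current`,
        # so a row that follows a gap absorbs the whole gap.
        probs[current] = Fraction(current - previous, max_id)
        previous = current
    return probs

# Hypothetical table in which ids 4..9 have been deleted.
for row_id, prob in pick_probabilities([1, 2, 3, 10]).items():
    print(row_id, prob)
```

In this example the row with id 10 is returned 7 times out of 10, while each of the other rows is returned 1 time out of 10. The query still works with gaps, but it is only uniform when the gaps are spread evenly.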
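I needed a query to return a large number of random rows from a rather large table. This is what I came up with. First get the maximum record id: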
SELECT MAX(id) FROM table_name;
Then substitute that value into:
SELECT * FROM table_name WHERE id > FLOOR(RAND() * max) LIMIT n;
Here max is the maximum record id in the table and n is the number of rows you want in your result set. The assumption is that there are no gaps in the record ids, although I doubt it would affect the result if there were (I haven't tried it, though). I also created this stored procedure to be more generic: pass in the table name and the number of rows to be returned. I'm running MySQL 5.5.38 on Windows 2008 (32GB, dual 3GHz E5450), and on a table with 17,361,264 rows it's fairly consistent at ~.03 sec / ~11 sec to return 1,000,000 rows. (Times are from MySQL Workbench 6.1; you could also use CEIL instead of FLOOR in the second SELECT statement, depending on your preference.)
DELIMITER $$
USE [schema name] $$
DROP PROCEDURE IF EXISTS `random_rows` $$
CREATE PROCEDURE `random_rows`(IN tab_name VARCHAR(64), IN num_rows INT)
BEGIN
SET @t = CONCAT('SET @max=(SELECT MAX(id) FROM ',tab_name,')');
PREPARE stmt FROM @t;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SET @t = CONCAT(
'SELECT * FROM ',
tab_name,
' WHERE id>FLOOR(RAND()*@max) LIMIT ',
num_rows);
PREPARE stmt FROM @t;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END
$$
DELIMITER ;
Then call it:
CALL [schema name].random_rows([table name], n);
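One caveat may be worth testing before relying on this approach: the MySQL manual states that RAND() in a WHERE clause is evaluated for every row, so the comparison above does not use a single random cut-off. The Python sketch below is an illustrative simulation, not the author's code. It assumes ids 1..max without gaps and that rows are scanned in ascending id order, which MySQL does not guarantee without an ORDER BY; the function name `simulate_query` is made up.

```python
import math
import random

def simulate_query(max_id: int, limit: int, rng: random.Random) -> list[int]:
    """Mimic `WHERE id > FLOOR(RAND() * max) LIMIT n` with RAND() redrawn per row."""
    picked = []
    for row_id in range(1, max_id + 1):  # assumed scan order: ascending id
        threshold = math.floor(rng.random() * max_id)  # fresh draw for this row
        if row_id > threshold:
            picked.append(row_id)
            if len(picked) == limit:  # LIMIT stops the scan early
                break
    return picked

rng = random.Random(42)  # fixed seed so the run is repeatable
largest = max(max(simulate_query(1000, 10, rng)) for _ in range(200))
print("largest id picked in 200 runs:", largest, "out of 1000")
```

Under these assumptions the picked ids all come from the low end of the table. If the same happens on a real table, one option is to compute the cut-off once, for example into a user variable, and compare id against that variable.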
It is a very simple, single-line query.
SELECT * FROM Table_Name ORDER BY RAND() LIMIT 0,10;
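Conceptually, ORDER BY RAND() gives every row a random sort key and keeps the rows with the smallest keys, which yields a uniform sample without repeats. The Python sketch below mirrors that idea as an analogy, not as a description of MySQL internals; the function name `order_by_rand_limit` and the `row-N` values are stand-ins invented for the example.

```python
import heapq
import random

def order_by_rand_limit(rows: list[str], limit: int, rng: random.Random) -> list[str]:
    """Tag every row with a random key and keep the `limit` rows with the smallest keys."""
    keyed = [(rng.random(), row) for row in rows]  # one random key per row
    return [row for _, row in heapq.nsmallest(limit, keyed)]

rows = [f"row-{i}" for i in range(600_000)]  # stand-in for the 600k-row table
sample = order_by_rand_limit(rows, 10, random.Random())
print(sample)
```

The sample is unbiased, but every row has to be read and given a random value first, so the cost grows with the size of the table rather than with the number of rows requested.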