我感兴趣的是从数据库表中选择第n行的一些(理想的)数据库不可知的方法。看看如何使用以下数据库的本机功能来实现这一点也很有趣:
SQL Server
MySQL
PostgreSQL
SQLite
甲骨文
我目前正在SQL Server 2005中做以下事情,但我有兴趣看到其他更不可知论的方法:
WITH Ordered AS (
SELECT ROW_NUMBER() OVER (ORDER BY OrderID) AS RowNumber, OrderID, OrderDate
FROM Orders)
SELECT *
FROM Ordered
WHERE RowNumber = 1000000
以上SQL: Firoz Ansari的Weblog的功劳
更新:参见Troels Arvin关于SQL标准的回答。Troels,你有我们可以引用的链接吗?
在我看来,为了提高效率,您需要1)生成一个小于数据库记录数量的0到1之间的随机数,2)能够选择该位置的行。不幸的是,不同的数据库有不同的随机数生成器,以及在结果集中的某个位置选择一行的不同方法——通常您指定要跳过多少行和需要多少行,但不同的数据库有不同的做法。下面是在SQLite中为我工作的一些东西:
select *
from Table
limit abs(random()) % (select count(*) from Words), 1;
It does depend on being able to use a subquery in the limit clause (which in SQLite is LIMIT <recs to skip>,<recs to take>) Selecting the number of records in a table should be particularly efficient, being part of the database's meta data, but that depends on the database's implementation. Also, I don't know if the query will actually build the result set before retrieving the Nth record, but I would hope that it doesn't need to. Note that I'm not specifying an "order by" clause. It might be better to "order by" something like the primary key, which will have an index - getting the Nth record from an index might be faster if the database can't get the Nth record from the database itself without building the result set.
在我看来,为了提高效率,您需要1)生成一个小于数据库记录数量的0到1之间的随机数,2)能够选择该位置的行。不幸的是,不同的数据库有不同的随机数生成器,以及在结果集中的某个位置选择一行的不同方法——通常您指定要跳过多少行和需要多少行,但不同的数据库有不同的做法。下面是在SQLite中为我工作的一些东西:
select *
from Table
limit abs(random()) % (select count(*) from Words), 1;
It does depend on being able to use a subquery in the limit clause (which in SQLite is LIMIT <recs to skip>,<recs to take>) Selecting the number of records in a table should be particularly efficient, being part of the database's meta data, but that depends on the database's implementation. Also, I don't know if the query will actually build the result set before retrieving the Nth record, but I would hope that it doesn't need to. Note that I'm not specifying an "order by" clause. It might be better to "order by" something like the primary key, which will have an index - getting the Nth record from an index might be faster if the database can't get the Nth record from the database itself without building the result set.