假设我有一个带有数字列的表(让我们称之为“score”)。

我想生成一个计数表,显示分数在每个范围内出现的次数。

例如:

score range  | number of occurrences
-------------------------------------
   0-9       |        11
  10-19      |        14
  20-29      |         3
   ...       |       ...

在这个示例中,有11行分数在0到9之间,14行分数在10到19之间,3行分数在20到29之间。

有什么简单的方法吗?你有什么建议吗?


create table scores (
   user_id int,
   score int
)

select t.range as [score range], count(*) as [number of occurences]
from (
      select user_id,
         case when score >= 0 and score < 10 then '0-9'
         case when score >= 10 and score < 20 then '10-19'
         ...
         else '90-99' as range
     from scores) t
group by t.range

select cast(score/10 as varchar) + '-' + cast(score/10+9 as varchar), 
       count(*)
from scores
group by score/10

在postgres中(其中||是字符串连接操作符):

select (score/10)*10 || '-' || (score/10)*10+9 as scorerange, count(*)
from scores
group by score/10
order by 1

给:

 scorerange | count 
------------+-------
 0-9        |    11
 10-19      |    14
 20-29      |     3
 30-39      |     2

下面是如何在T-SQL中做到这一点:

DECLARE @traunch INT = 1000;

SELECT 
    CONCAT
    ( 
      FORMAT((score / @traunch) * @traunch, '###,000,000') 
      , ' - ' , 
      FORMAT((score / @traunch) * @traunch + @traunch - 1, '###,000,000') 
    ) as [Range]
  , FORMAT(MIN(score), 'N0') as [Min]
  , FORMAT(AVG(score), 'N0') as [Avg]
  , FORMAT(MAX(score), 'N0') as [Max]
  , FORMAT(COUNT(score), 'N0') as [Count]
  , FORMAT(SUM(score), 'N0') as [Sum]
FROM scores
GROUP BY score / @traunch
ORDER BY score / @traunch


也许你问的是如何让这样的事情继续下去……

当然,您将为查询调用全表扫描,如果包含需要统计(聚合)的分数的表很大,您可能想要一个性能更好的解决方案,您可以创建一个辅助表并使用规则,例如关于插入—您可能会研究它。

不过,并不是所有的RDBMS引擎都有规则!


declare @RangeWidth int

set @RangeWidth = 10

select
   Floor(Score/@RangeWidth) as LowerBound,
   Floor(Score/@RangeWidth)+@RangeWidth as UpperBound,
   Count(*)
From
   ScoreTable
group by
   Floor(Score/@RangeWidth)

我在这里看到的答案在SQL Server的语法中行不通。我会用:

select t.range as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as range
  from scores) t
group by t.range

编辑:见评论


在我看来,James Curran的回答是最简洁的,但输出并不正确。对于SQL Server,最简单的语句如下:

SELECT 
    [score range] = CAST((Score/10)*10 AS VARCHAR) + ' - ' + CAST((Score/10)*10+9 AS VARCHAR), 
    [number of occurrences] = COUNT(*)
FROM #Scores
GROUP BY Score/10
ORDER BY Score/10

这假设了一个我用来测试它的#Scores临时表,我只是用0到99之间的随机数填充了100行。


在SQL Server 2000上,投票最多的答案都不正确。也许他们用的是另一个版本。

下面是在SQL Server 2000上这两个软件的正确版本。

select t.range as [score range], count(*) as [number of occurences]
from (
  select case  
    when score between 0 and 9 then ' 0- 9'
    when score between 10 and 19 then '10-19'
    else '20-99' end as range
  from scores) t
group by t.range

or

select t.range as [score range], count(*) as [number of occurrences]
from (
      select user_id,
         case when score >= 0 and score< 10 then '0-9'
         when score >= 10 and score< 20 then '10-19'
         else '20-99' end as range
     from scores) t
group by t.range

另一种方法是将范围存储在表中,而不是将它们嵌入到查询中。你最终会得到一个表,命名为Ranges,它看起来像这样:

LowerLimit   UpperLimit   Range 
0              9          '0-9'
10            19          '10-19'
20            29          '20-29'
30            39          '30-39'

查询如下所示:

Select
   Range as [Score Range],
   Count(*) as [Number of Occurences]
from
   Ranges r inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
group by Range

这确实意味着要建立一个表,但是当所需的范围发生变化时,维护这个表是很容易的。不需要更改代码!


select t.blah as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then ' 0-9 '
    when score between 10 and 19 then '10-19'
    when score between 20 and 29 then '20-29'
    ...
    else '90-99' end as blah
  from scores) t
group by t.blah

如果在MySQL中,请确保使用'range'以外的单词,否则在运行上述示例时会出现错误。


因为要排序的列(Range)是一个字符串,所以使用字符串/单词排序而不是数值排序。

只要字符串有0来填充数字长度,排序在语义上仍然是正确的:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN '00-09'
                    WHEN score BETWEEN 10 AND 19 THEN '10-19'
                    ELSE '20-99'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

如果范围是混合的,简单地填充一个额外的0:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN '000-009'
                    WHEN score BETWEEN 10 AND 19 THEN '010-019'
                    WHEN score BETWEEN 20 AND 99 THEN '020-099'
                    ELSE '100-999'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

我会用一种稍微不同的方式来做,这样它就不必定义每种情况:

select t.range as [score range], count(*) as [number of occurences]
from (
  select FLOOR(score/10) as range
  from scores) t
group by t.range

没有测试过,但你懂的…


这将允许您不必指定范围,并且应该与SQL server无关。数学增值!

SELECT CONCAT(range,'-',range+9), COUNT(range)
FROM (
  SELECT 
    score - (score % 10) as range
  FROM scores
)

Try

SELECT (str(range) + "-" + str(range + 9) ) AS [Score range], COUNT(score) AS [number of occurances]
FROM (SELECT  score,  int(score / 10 ) * 10  AS range  FROM scoredata )  
GROUP BY range;

select t.range as score, count(*) as Count 
from (
      select UserId,
         case when isnull(score ,0) >= 0 and isnull(score ,0)< 5 then '0-5'
                when isnull(score ,0) >= 5 and isnull(score ,0)< 10 then '5-10'
                when isnull(score ,0) >= 10 and isnull(score ,0)< 15 then '10-15'
                when isnull(score ,0) >= 15 and isnull(score ,0)< 20 then '15-20'               
         else ' 20+' end as range
         ,case when isnull(score ,0) >= 0 and isnull(score ,0)< 5 then 1
                when isnull(score ,0) >= 5 and isnull(score ,0)< 10 then 2
                when isnull(score ,0) >= 10 and isnull(score ,0)< 15 then 3
                when isnull(score ,0) >= 15 and isnull(score ,0)< 20 then 4             
         else 5  end as pd
     from score table
     ) t

group by t.range,pd order by pd

我在这里是因为我有类似的问题,但我发现简短的答案是错误的,一个连续的“情况下”是太多的工作,看到任何重复在我的代码伤害我的眼睛。这就是解

SELECT --MIN(score), MAX(score),
    [score range] = CAST(ROUND(score-5,-1)AS VARCHAR) + ' - ' + CAST((ROUND(score-5,-1)+10)AS VARCHAR),
    [number of occurrences] = COUNT(*)
FROM order
GROUP BY  CAST(ROUND(score-5,-1)AS VARCHAR) + ' - ' + CAST((ROUND(score-5,-1)+10)AS VARCHAR)
ORDER BY MIN(score)



对于PrestoSQL/Trino应用Ken https://stackoverflow.com/a/232463/429476的答案

select t.range, count(*) as "Number of Occurance", ROUND(AVG(fare_amount),2) as "Avg",
  ROUND(MAX(fare_amount),2) as "Max" ,ROUND(MIN(fare_amount),2) as "Min" 
from (
  select 
   case 
      when trip_distance between  0 and  9 then ' 0-9 '
      when trip_distance between 10 and 19 then '10-19'
      when trip_distance between 20 and 29 then '20-29'
      when trip_distance between 30 and 39 then '30-39'
      else '> 39' 
   end as range ,fare_amount 
  from nyc_in_parquet.tlc_yellow_trip_2022) t
  where fare_amount > 1 and fare_amount < 401092
group by t.range;

 range | Number of Occurance |  Avg   |  Max  | Min  
-------+---------------------+--------+-------+------
  0-9  |             2260865 |  10.28 | 720.0 | 1.11 
 30-39 |                1107 | 104.28 | 280.0 |  5.0 
 10-19 |              126136 |   43.8 | 413.5 |  2.0 
 > 39  |               42556 |  39.11 | 668.0 | 1.99 
 20-29 |               19133 |  58.62 | 250.0 |  2.5