Counting DISTINCT over multiple columns

Is there a better way of doing a query like this:

SELECT COUNT(*) 
FROM (SELECT DISTINCT DocumentId, DocumentSessionId
      FROM DocumentOutputItems) AS internalQuery

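I need to count the number of distinct items in this table, but the distinct is over two columns.

My query works fine, but I was wondering whether I can get the final result using just one query (without a sub-query).

Current answer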

If you only had one field to "DISTINCT", you could use:

SELECT COUNT(DISTINCT DocumentId) 
FROM DocumentOutputItems

and that does return the same query plan as the original, as tested with SET SHOWPLAN_ALL ON. However, you are using two fields, so you could try something crazy like:

    SELECT COUNT(DISTINCT convert(varchar(15),DocumentId)+'|~|'+convert(varchar(15), DocumentSessionId)) 
    FROM DocumentOutputItems

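but you'll have problems if NULLs are involved: if either column is NULL, the concatenated value is NULL too, and COUNT(DISTINCT ...) ignores it. I'd just stick with the original query.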
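The NULL problem mentioned above can be reproduced with a small sketch. This is not SQL Server: it uses Python's built-in `sqlite3` module as a stand-in, with the `||` operator in place of `+` and `CAST(... AS TEXT)` in place of `convert(varchar(15), ...)`. The sample rows are made up for illustration, and other engines may treat NULL concatenation differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DocumentOutputItems (DocumentId INTEGER, DocumentSessionId INTEGER)"
)
# Hypothetical sample data: two pairs contain a NULL session id.
conn.executemany(
    "INSERT INTO DocumentOutputItems VALUES (?, ?)",
    [(1, 10), (1, 10), (1, None), (2, None), (2, 20)],
)

# Original approach: DISTINCT in a derived table, then COUNT(*).
subquery_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT DocumentId, DocumentSessionId "
    "FROM DocumentOutputItems) AS internalQuery"
).fetchone()[0]

# Concatenation approach: any NULL operand makes the whole key NULL,
# and COUNT(DISTINCT ...) does not count NULL keys.
concat_count = conn.execute(
    "SELECT COUNT(DISTINCT CAST(DocumentId AS TEXT) || '|~|' || "
    "CAST(DocumentSessionId AS TEXT)) FROM DocumentOutputItems"
).fetchone()[0]

print(subquery_count, concat_count)  # prints "4 2"
```

In this sketch the sub-query version counts (1, NULL) and (2, NULL) as two separate combinations, while the concatenated version silently drops both.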

2009-09-24 13:34:03

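Other answers

What is it about your existing query that you don't like? If you are concerned that DISTINCT across two columns does not return just the unique permutations, why not try it?

It certainly works as you might expect in Oracle.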

SQL> select distinct deptno, job from emp
  2  order by deptno, job
  3  /

    DEPTNO JOB
---------- ---------
        10 CLERK
        10 MANAGER
        10 PRESIDENT
        20 ANALYST
        20 CLERK
        20 MANAGER
        30 CLERK
        30 MANAGER
        30 SALESMAN

9 rows selected.


SQL> select count(*) from (
  2  select distinct deptno, job from emp
  3  )
  4  /

  COUNT(*)
----------
         9

SQL>

Edit

I went down a blind alley with analytics, but the answer was obvious...

SQL> select count(distinct concat(deptno,job)) from emp
  2  /

COUNT(DISTINCTCONCAT(DEPTNO,JOB))
---------------------------------
                                9

SQL>

Edit 2

Given the following data, the concatenating solution provided above will miscount:

col1  col2
----  ----
A     AA
AA    A

So we need to include a separator...

select col1 || '*' || col2 from t23
/

Obviously the chosen separator must be a character, or set of characters, which can never appear in either column.
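A minimal sketch of both points, the miscount without a separator and the requirement that the separator never occurs in the data. It runs against SQLite through Python's `sqlite3` module rather than Oracle, and the two extra rows containing `*` are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t23 (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO t23 VALUES (?, ?)", [("A", "AA"), ("AA", "A")])


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


REFERENCE = "SELECT COUNT(*) FROM (SELECT DISTINCT col1, col2 FROM t23)"
NO_SEP = "SELECT COUNT(DISTINCT col1 || col2) FROM t23"
WITH_SEP = "SELECT COUNT(DISTINCT col1 || '*' || col2) FROM t23"

reference = scalar(REFERENCE)      # 2 distinct pairs
no_separator = scalar(NO_SEP)      # 1: both rows concatenate to 'AAA'
with_separator = scalar(WITH_SEP)  # 2: 'A*AA' and 'AA*A'

# If the separator can occur in the data, the collision comes back:
# ('A*', 'B') and ('A', '*B') both become 'A**B'.
conn.executemany("INSERT INTO t23 VALUES (?, ?)", [("A*", "B"), ("A", "*B")])
reference_clash = scalar(REFERENCE)      # 4 distinct pairs
with_separator_clash = scalar(WITH_SEP)  # 3: one pair is lost
```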

2009-09-24 12:41:18

To run this as a single query, concatenate the columns, then count the distinct values of the concatenated string.

SELECT count(DISTINCT concat(DocumentId, DocumentSessionId)) FROM DocumentOutputItems;

In MySQL you can do the same thing without the concatenation step, as follows:

SELECT count(DISTINCT DocumentId, DocumentSessionId) FROM DocumentOutputItems;

This feature is mentioned in the MySQL documentation:

http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_count-distinct

2016-07-28 20:21:27

I had a similar problem, but my query was a sub-query that had to be compared against data in the main query. Something like:

Select code, id, title, name,
(select count(distinct col1) from mytable where code = a.code and length(title) >0)
from mytable a
group by code, id, title, name
--needs distinct over col2 as well as col1

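Ignoring the complexities of this, I realized I couldn't get the value of a.code into the sub-query using the double sub-query described in the original question: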

Select count(1) from (select distinct col1, col2 from mytable where code = a.code...)
--this doesn't work because the sub-query doesn't know what "a" is

So eventually I figured out I could cheat and combine the columns:

Select count(distinct(col1 || col2)) from mytable where code = a.code...

This is what ended up working.
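A rough, runnable version of that correlated sub-query, using SQLite through Python's `sqlite3` module. The table contents are hypothetical, and a `'|'` separator is added (assumed never to occur in the data) to avoid the collision problem described in the other answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (code TEXT, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("x", "a", "b"),
        ("x", "a", "b"),  # duplicate pair for code 'x'
        ("x", "a", "c"),
        ("y", "a", "b"),
    ],
)

# The scalar sub-query sits one level below the outer query, so it can
# refer to a.code directly while still counting distinct (col1, col2) pairs.
rows = conn.execute(
    """
    SELECT a.code,
           (SELECT COUNT(DISTINCT b.col1 || '|' || b.col2)
              FROM mytable b
             WHERE b.code = a.code) AS pair_count
      FROM mytable a
     GROUP BY a.code
     ORDER BY a.code
    """
).fetchall()

print(rows)  # [('x', 2), ('y', 1)]
```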

2019-03-12 15:29:59

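If you're working with fixed-length data types, you can cast them to binary to do this very easily and very quickly. Assuming DocumentId and DocumentSessionId are both ints, and are therefore 4 bytes long...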

SELECT COUNT(DISTINCT CAST(DocumentId as binary(4)) + CAST(DocumentSessionId as binary(4)))
FROM DocumentOutputItems

My specific problem required me to divide a SUM by the COUNT of the distinct combination of various foreign keys and a date field, grouping by another foreign key and occasionally filtering by certain values or keys. The table is very large, and using a sub-query dramatically increased the query time. And due to the complexity, statistics simply wasn't a viable option. The CHECKSUM solution was also far too slow in its conversion, particularly as a result of the various data types, and I couldn't risk its unreliability.

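However, using the above solution added hardly anything to the query time (compared with simply using the SUM), and it should be completely reliable! It should be able to help others in a similar situation, so I'm posting it here.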
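Why fixed-width binary keys need no separator can be illustrated outside the database. The sketch below is plain Python rather than T-SQL: `struct.pack(">ii", ...)` stands in for the two `CAST(... AS binary(4))` calls, on the assumption that both values are 4-byte signed integers, and the sample pairs are made up.

```python
import struct

# Hypothetical (DocumentId, DocumentSessionId) pairs; the third repeats the first.
pairs = [(1, 23), (12, 3), (1, 23)]

# Variable-width text keys: "1" + "23" and "12" + "3" both give "123".
string_keys = {f"{doc_id}{session_id}" for doc_id, session_id in pairs}

# Fixed-width keys: each int always occupies exactly 4 bytes, so the boundary
# between the two values is always at byte 4 and no two pairs can collide.
binary_keys = {struct.pack(">ii", doc_id, session_id) for doc_id, session_id in pairs}

print(len(string_keys), len(binary_keys))  # prints "1 2"
```

Every key in `binary_keys` is exactly 8 bytes long, which is what makes the concatenation unambiguous.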

2019-09-18 00:27:22
