SQL - APPROX_COUNT_DISTINCT() Function



The SQL APPROX_COUNT_DISTINCT() function returns the approximate number of unique non-null values of an expression. It is an alternative to COUNT(DISTINCT expression) that uses less memory than the exact count-distinct operation. It is one of the new functions introduced in SQL Server 2019.

The APPROX_COUNT_DISTINCT() function is mostly used in big data scenarios. It returns the approximate number of unique non-null values in a group; rows in which the expression is null are ignored. It is intended for large data sets with more than a million rows and for aggregating columns with many distinct values, in situations where responsiveness matters more than absolute precision. Microsoft documents the implementation as guaranteeing up to a 2% error rate within a 97% probability.
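
The null handling described above can be checked with a small, self-contained sketch. The table variable @samples and its column val are hypothetical names used only for illustration, and the batch assumes SQL Server 2019 or later.

```sql
-- Hypothetical sample data: two distinct non-null values and two NULLs.
DECLARE @samples TABLE (val INT NULL);

INSERT INTO @samples (val)
VALUES (10), (10), (20), (NULL), (NULL);

SELECT COUNT(*)                   AS Total_Rows,      -- counts all 5 rows, NULLs included
       COUNT(DISTINCT val)        AS Exact_Distinct,  -- exact count; NULLs are ignored
       APPROX_COUNT_DISTINCT(val) AS Approx_Distinct  -- approximate count; NULLs are ignored
FROM @samples;
```

The exact distinct count here is 2. On a set this small the approximate count would be expected to agree, although the function does not promise an exact result.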

Syntax

Following is the syntax of the SQL APPROX_COUNT_DISTINCT() function −

APPROX_COUNT_DISTINCT( expression ) 

Parameters

  • expression − an expression of any type except image, sql_variant, ntext, or text.

Example

In the following example, we fetch the approximate number of distinct values in the AGE column of the customers table by using the APPROX_COUNT_DISTINCT() function. Assume we have created a table named customers using the following query −

CREATE TABLE customers(ID INT NOT NULL, 
NAME VARCHAR(30) NOT NULL, 
AGE INT NOT NULL, 
ADDRESS CHAR(30), 
SALARY DECIMAL(18, 2));

The table stores the ID, NAME, AGE, ADDRESS, and SALARY of each customer. Now we insert 9 records into the customers table using the INSERT statement.

INSERT INTO customers VALUES(1, 'Ramesh', 32, 'Ahmedabad', 2000.00);
INSERT INTO customers VALUES(2, 'Khilan', 25, 'Delhi', 1500.00);
INSERT INTO customers VALUES(3, 'kaushik', 23, 'Kota', 2000.00);
INSERT INTO customers VALUES(4, 'Chaitali', 25, 'Mumbai', 6500.00);
INSERT INTO customers VALUES(5, 'Hardik', 27, 'Bhopal', 8500.00);
INSERT INTO customers VALUES(6, 'Komal', 22, 'MP', 4500.00);
INSERT INTO customers VALUES(7, 'Aman', 23, 'Ranchi', 5000.00);
INSERT INTO customers VALUES(8, 'Aman', 23, 'Delhi', 3000.00);
INSERT INTO customers VALUES(9, 'Khilan', 25, 'Delhi', 3000.00);

After the inserts, the customers table contains the following rows −

+----+----------+-----+-----------+---------+
| ID | NAME     | AGE | ADDRESS   | SALARY  |
+----+----------+-----+-----------+---------+
|  1 | Ramesh   |  32 | Ahmedabad | 2000.00 |
|  2 | Khilan   |  25 | Delhi     | 1500.00 |
|  3 | kaushik  |  23 | Kota      | 2000.00 |
|  4 | Chaitali |  25 | Mumbai    | 6500.00 |
|  5 | Hardik   |  27 | Bhopal    | 8500.00 |
|  6 | Komal    |  22 | MP        | 4500.00 |
|  7 | Aman     |  23 | Ranchi    | 5000.00 |
|  8 | Aman     |  23 | Delhi     | 3000.00 |
|  9 | Khilan   |  25 | Delhi     | 3000.00 |
+----+----------+-----+-----------+---------+

The following SQL query displays the approximate count of distinct AGE values in the customers table −

SELECT APPROX_COUNT_DISTINCT(AGE) AS Approx_Distinct_AGE FROM customers;

Output

Following is the output of the above SQL query −

+---------------------+
| Approx_Distinct_AGE |
+---------------------+
|                   5 |
+---------------------+
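
Since the function is described as an alternative to COUNT(DISTINCT expression), it can help to run both side by side. The following is a minimal sketch against the customers table above; the column aliases Exact_Distinct_AGE and Approx_Distinct_AGE are chosen for illustration only.

```sql
-- Compare the exact and the approximate distinct counts in one pass.
SELECT COUNT(DISTINCT AGE)        AS Exact_Distinct_AGE,   -- exact result
       APPROX_COUNT_DISTINCT(AGE) AS Approx_Distinct_AGE   -- approximate result
FROM customers;
```

On a table with only a few rows the two values would normally be the same. Any difference, and the memory saving, is expected to become visible only on large data sets with many distinct values.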

Example

In the following example, we use the APPROX_COUNT_DISTINCT() function with the GROUP BY clause on the customers table. The query returns each name along with the approximate count of distinct AGE values for that name −

SELECT NAME, APPROX_COUNT_DISTINCT(AGE) AS Approx_Distinct_AGE FROM customers GROUP BY NAME;

Output

Following is the output of the above SQL query. A name that appears in more than one row, such as Khilan, still yields a count of 1 because its rows share the same age and the function counts only distinct values −

+----------+---------------------+
| NAME     | Approx_Distinct_AGE |
+----------+---------------------+
| Ramesh   |                   1 |
| Hardik   |                   1 |
| Aman     |                   1 |
| kaushik  |                   1 |
| Chaitali |                   1 |
| Khilan   |                   1 |
| Komal    |                   1 |
+----------+---------------------+
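
Grouping by NAME gives a count of 1 for every group. As a variation, and only as a sketch, grouping the same customers table by ADDRESS produces a group with more than one distinct age.

```sql
-- Approximate count of distinct ages per address.
SELECT ADDRESS,
       APPROX_COUNT_DISTINCT(AGE) AS Approx_Distinct_AGE
FROM customers
GROUP BY ADDRESS
ORDER BY ADDRESS;
```

Delhi is the only address shared by several customers (IDs 2, 8, and 9), and those customers have two different ages, so its exact distinct count is 2. Every other address belongs to a single customer.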

Example

In the following example, we fetch the ID and SALARY columns and use the APPROX_COUNT_DISTINCT() function to count the distinct salaries, together with the GROUP BY and ORDER BY clauses. Because each group holds a single ID and SALARY pair, the count for every group is 1.

The following SQL query fetches the ID, the salary, and the approximate count of distinct salaries from the customers table −

SELECT ID, SALARY, APPROX_COUNT_DISTINCT(SALARY) AS Approx_Distinct_SALARY FROM customers GROUP BY ID, SALARY ORDER BY ID;

Output

Following is the output of the above SQL query −

+----+---------+------------------------+
| ID | SALARY  | Approx_Distinct_SALARY |
+----+---------+------------------------+
|  1 | 2000.00 |                      1 |
|  2 | 1500.00 |                      1 |
|  3 | 2000.00 |                      1 |
|  4 | 6500.00 |                      1 |
|  5 | 8500.00 |                      1 |
|  6 | 4500.00 |                      1 |
|  7 | 5000.00 |                      1 |
|  8 | 3000.00 |                      1 |
|  9 | 3000.00 |                      1 |
+----+---------+------------------------+