Thursday, April 28, 2016

COUNT FUNCTION

This SQL tutorial explains how to use the SQL COUNT function with syntax, examples, and practice exercises.

DESCRIPTION

The SQL COUNT function is used to count the number of rows returned in a SELECT statement.

SYNTAX

The syntax for the SQL COUNT function is:
SELECT COUNT(aggregate_expression)
FROM tables
[WHERE conditions];
OR the syntax for the COUNT function when grouping the results by one or more columns is:
SELECT expression1, expression2, ... expression_n,
       COUNT(aggregate_expression)
FROM tables
[WHERE conditions]
GROUP BY expression1, expression2, ... expression_n;

Parameters or Arguments

expression1, expression2, ... expression_n
Expressions that are not encapsulated within the COUNT function and must be included in the GROUP BY clause at the end of the SQL statement.
aggregate_expression
This is the column or expression whose non-null values will be counted.
tables
The tables that you wish to retrieve records from. There must be at least one table listed in the FROM clause.
WHERE conditions
Optional. These are conditions that must be met for the records to be selected.

ONLY INCLUDES NOT NULL VALUES

Not everyone realizes this, but the SQL COUNT function will only include the records in the count where the value of expressionin COUNT(expression) is NOT NULL. When expression contains a NULL value, it is not included in the COUNT calculations.
Let's look at a SQL COUNT function example that demonstrates how NULL values are evaluated by the COUNT function.
For example, if you have the following table called suppliers:
supplier_idsupplier_namestate
1IBMCA
2Microsoft
3NVIDIA
And if you ran the following SQL SELECT statement that uses the SQL COUNT function:
SELECT COUNT(supplier_id)
FROM suppliers;

Result: 3
This SQL COUNT example will return 3 since all supplier_id values in the query's result set are NOT NULL.
However, if you ran the next SQL SELECT statement that uses the SQL COUNT function:
SELECT COUNT(state) 
FROM suppliers;

Result: 1
This SQL COUNT example will only return 1, since only one state value in the query's result set is NOT NULL. That would be the first row where the state = 'CA'. It is the only row that is included in the COUNT function calculation.

EXAMPLE - WITH SINGLE EXPRESSION

The simplest way to use the SQL COUNT function would be to return a single field that returns the COUNT of something.
For example, you might wish to know how many employees have a salary that is above $25,000 / year.
SELECT COUNT(*) AS "Number of employees"
FROM employees
WHERE salary > 25000;
In this SQL COUNT function example, we've aliased the COUNT(*) expression as "Number of employees". As a result, "Number of employees" will display as the field name when the result set is returned.

EXAMPLE - USING SQL DISTINCT CLAUSE

You can use the SQL DISTINCT clause within the SQL COUNT function.
For example, the SQL statement below returns the number of unique departments where at least one employee makes over $25,000 / year.
SELECT COUNT(DISTINCT department) AS "Unique departments"
FROM employees
WHERE salary > 25000;
Again, the COUNT(DISTINCT department) field is aliased as "Unique departments". This is the field name that will display in the result set.

EXAMPLE - USING SQL GROUP BY CLAUSE

In some cases, you will be required to use the SQL GROUP BY clause with the SQL COUNT function.
For example, you could use the SQL COUNT function to return the name of the department and the number of employees (in the associated department) that make over $25,000 / year.
SELECT department, COUNT(*) AS "Number of employees"
FROM employees
WHERE salary > 25000
GROUP BY department;
Because you have listed one column in your SQL SELECT statement that is not encapsulated in the SQL COUNT function, you must use the SQL GROUP BY clause. The department field must, therefore, be listed in the GROUP BY section.

TIP: PERFORMANCE TUNING WITH SQL COUNT

Since the SQL COUNT function will return the same results regardless of what NOT NULL field(s) you include as the SQL COUNT function parameters (ie: within the parentheses), you can change the syntax of the SQL COUNT function to COUNT(1) to get better performance as the database engine will not have to fetch back the data fields.
For example, based on the example above, the following syntax would result in better performance:
SELECT department, COUNT(1) AS "Number of employees"
FROM employees
WHERE salary > 25000
GROUP BY department;
Now, the SQL COUNT function does not need to retrieve all fields from the employees table as it had to when you used the COUNT(*) syntax. It will merely retrieve the numeric value of 1 for each record that meets your criteria.

PRACTICE EXERCISE #1:

Based on the employees table populated with the following data, count the number of employees whose salary is over $55,000 per year.
CREATE TABLE employees
( employee_number int NOT NULL,
  employee_name char(50) NOT NULL,
  salary int,
  CONSTRAINT employees_pk PRIMARY KEY (employee_number)
);

INSERT INTO employees (employee_number, employee_name, salary)
VALUES (1001, 'John Smith', 62000);

INSERT INTO employees (employee_number, employee_name, salary)
VALUES (1002, 'Jane Anderson', 57500);

INSERT INTO employees (employee_number, employee_name, salary)
VALUES (1003, 'Brad Everest', 71000);

INSERT INTO employees (employee_number, employee_name, salary)
VALUES (1004, 'Jack Horvath', 42000);

Solution for Practice Exercise #1:

Although inefficient in terms of performance, the following SQL SELECT statement would return the number of employees whose salary is over $55,000 per year.
SELECT COUNT(*) AS "Number of employees"
FROM employees
WHERE salary > 55000;
It would return the following result set:
Number of employees
3
A more efficient implementation of the same solution would be the following SQL SELECT statement:
SELECT COUNT(1) AS "Number of employees"
FROM employees
WHERE salary > 55000;
Now, the SQL COUNT function does not need to retrieve all of the fields from the table (ie: employee_number, employee_name, and salary), but rather whenever the condition is met, it will retrieve the numeric value of 1. Thus, increasing the performance of the SQL statement.

PRACTICE EXERCISE #2:

Based on the suppliers table populated with the following data, count the number of distinct cities in the suppliers table:
CREATE TABLE suppliers
( supplier_id int NOT NULL,
  supplier_name char(50) NOT NULL,
  city char(50),
  CONSTRAINT suppliers_pk PRIMARY KEY (supplier_id)
);

INSERT INTO suppliers (supplier_id, supplier_name, city)
VALUES (5001, 'Microsoft', 'New York');

INSERT INTO suppliers (supplier_id, supplier_name, city)
VALUES (5002, 'IBM', 'Chicago');

INSERT INTO suppliers (supplier_id, supplier_name, city)
VALUES (5003, 'Red Hat', 'Detroit');

INSERT INTO suppliers (supplier_id, supplier_name, city)
VALUES (5004, 'NVIDIA', 'New York');

INSERT INTO suppliers (supplier_id, supplier_name, city)
VALUES (5005, 'NVIDIA', 'LA');

Solution for Practice Exercise #2:

The following SQL SELECT statement would return the number of distinct cities in the suppliers table:
SELECT COUNT(DISTINCT city) AS "Distinct Cities"
FROM suppliers;
It would return the following result set:
Distinct Cities
4

PRACTICE EXERCISE #3:

Based on the customers table populated with the following data, count the number of distinct cities for each customer_name in the customers table:
CREATE TABLE customers
( customer_id int NOT NULL,
  customer_name char(50) NOT NULL,
  city char(50),
  CONSTRAINT customers_pk PRIMARY KEY (customer_id)
);

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7001, 'Microsoft', 'New York');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7002, 'IBM', 'Chicago');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7003, 'Red Hat', 'Detroit');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7004, 'Red Hat', 'New York');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7005, 'Red Hat', 'San Francisco');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7006, 'NVIDIA', 'New York');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7007, 'NVIDIA', 'LA');

INSERT INTO customers (customer_id, customer_name, city)
VALUES (7008, 'NVIDIA', 'LA');

Solution for Practice Exercise #3:

The following SQL SELECT statement would return the number of distinct cities for each customer_name in the customerstable:
SELECT customer_name, COUNT(DISTINCT city) AS "Distinct Cities"
FROM customers
GROUP BY customer_name;
It would return the following result set:
CUSTOMER_NAMEDistinct Cities
IBM1
Microsoft1
NVIDIA2
Red Hat3

No comments:

Post a Comment