Use and precautions of aggregate functions

1. Introduction to aggregate functions

Written in the front: For the convenience of understanding, a table can be regarded as a two-dimensional matrix, a piece of data is a row, and a column of data is for a field!

What is an aggregate function: The aggregate function performs calculations for a certain field (column) and returns a single value (a row).

When aggregate functions are used, NULL values ​​are automatically ignored, which is a major feature of aggregate functions.

Commonly used aggregate functions: count() count, sum() summation, max() maximum value, min minimum value, avg average value; these are the five most commonly used aggregate functions, of course, with the rapid development of SQL, the aggregate function has been expanded more and more!



2. The construction of database tables

create table `emp` (
	`EMPNO` int(4) not null,
	`ENAME` varchar(10) default null,
	`JOB` varchar(9) default null,
	`MGR` int(4) default null,
	`HIREDATE` date default null,
	`SAL` double(7,2) default null,
	`COMM` double(7,2) default null,
	`DEPTNO` int(2) default null,
	primary key (`EMPNO`)
) engine=innodb default charset=utf8
Insert picture description here





3. Basic use

The aggregate function automatically ignores NULL values

When NULL values ​​participate in mathematical operations, the result must be NULL

3.1, count the number of people

Query the total number of people, because the aggregate function will automatically ignore NULL, so when there is a NULL in a field, the results will be inconsistent. In short * query the number of rows of the two-dimensional matrix, the field (column) is the number of queries that are not NULL values!

*进行通配
select count(*) from emp; 				//结果14

select count(ename) from emp; 			//结果14

select count(comm) from emp;			//结果是4

select count(*), count(ename), count(comm) from emp;		//结果14 14 4


3.2, sum query total allowance

Similarly, the aggregate function will ignore NULL, so there is no need to judge the NULL value separately when using it

select sum(comm) from emp;
select sum(comm) from emp where comm is not null

The results of these two SQL statements are the same, but the second one should be 0 points in the written test. In fact, it is to examine the special handling of NULL by aggregate functions!


3.3. Query the annual salary of employees (1)

The employee's annual salary = (monthly salary + allowance) * 12, that is, yearsalary = (sal + comm) * 12.

select ename, (sal + comm)*12 as yearsalary from emp;
Insert picture description here


Obviously the result is wrong. Although most employees have a NULL value for their commendation (comm), they still have a basic monthly salary, so why is the annual salary NULL? This is because the result of all NULL participating operations must be NULL!

3.4. Query the employee's annual salary (2)

MySQL provides the ifnull (field, specified value) function, which can convert a field that may be a null value to a specified value. But ifnull() is not an aggregate function!

select ename, (sal + ifnull(comm, 0))*12 yearsalary from emp;
Insert picture description here


4. The difference between count (column name), count (1), count (*)

The difference between the three is mainly from: NULL value processing, and comparison of execution speed, generally speaking, * and 1 will be faster, and column names are slower!

4.1, NULL value processing

count(*): Including all columns, counting all rows, not ignoring NULL values.

count(1): Ignore all columns, count all rows, and don't ignore NULL values.

count (column name): only include the specified column, NULL values ​​will be ignored when counting the results.


4.2, execution speed

The execution speed mainly considers whether there is a primary key and whether there is an index.

The column name is the primary key: count (1) = count (*) = count (column name) The execution efficiency of the three is the same!

The column name is not the primary key and no index is created for the column name: count(1) = count(*)> count(column name).

The column name is not the primary key and the column name has an index created: count (1) = count (*) = count (column name).