SQL GROUP BY Clause – Count, Sum, And More

When working with databases and doing general data investigation the GROUP BY clause is going to become a part of your regular vocabulary. It’s very useful when writing SQL – whether it’s for reporting or trouble-shooting data issues and It’s especially useful for finding duplicates.

Note: When writing a SQL statement with a GROUP BY clause, you cannot or should not select more columns than you are grouping by.

Some databases allow you to select more columns than selected in a GROUP BY such as the DBeaver SQLite sample database, but if you don’t know the data inside and out this is a recipe for disaster. Usually the SQL editor will display an error from your database as you’ve selected more columns that you are grouping by.

Let’s give an example of using a GROUP BY. Let’s say that we want to know how many employees we have at their respective title within our organization. We could write a query like this:

WHERE TITLE = ‘IT Staff’ — Example to show that a WHERE clause must come before GROUP BY

Looking At Practical Examples Of GROUP BY
sql group by employee example

We can also rename the COUNT(*) to something more relevant:

sql employee count example

Let’s say instead of getting an employee count, we want to get the invoice totals by state. We could use the GROUP BY with a SUM:

sql sum group by example

The number of aggregate functions can vary based on the database, but other aggregate functions that can be used with a SQL GROUP BY include:

  • AVG()
  • MAX()
  • MIN()

These sorts of functions can answer a wide variety of questions such as:

  • Which company is bringing us the most income?
  • Where are most of our companies located?
  • What is the average amount invoiced per customer?
  • How many employees live outside of the U.S.?

This list of questions is never-ending, but hopefully you can see how these sorts of questions can be very useful. These types of data questions can offer insights to a company. Similarly, insights to a business are also referred to as Business Intelligence.

Next Section: The HAVING Clause – Filtering Aggregates