Primary Key in SQL – What Is It Used For?

We use primary keys in SQL on most tables in an OLTP or OLAP data model. These keys act as a unique identifier for a table. It’s usually an integer data type and values are almost always incremental (ex. 1,2,3, and so on).

But why do we need them?

We need these keys since they are used primarily as a pointing reference to another table. As a result, a pointer to a primary key on another table is referred to as a foreign key.

We can see in the image below how a primary key is used as a reference (foreign key) on another table.

primary key example in an rdbms database
Primary/Foreign Key Relationship

Primary keys are almost always indexed fields. As a result, the database optimizes and prioritizes these columns for performance reasons. As mentioned previously, this is because these keys are references as foreign keys to other tables. In the above example we can see how the customer data gets tied to the invoice data through the reference of this key.

Uniqueness & Enforced Integrity

These keys are also primarily used to enforce integrity of a table meaning there can be only one primary key value per table. These values will always be unique because they have enforced uniqueness. If we were to try and insert a duplicate value then we would see an error similar to the one below.

sqlite primary key constraint error
Primary Key Constraint Error in SQLite

On top of being unique, they cannot be NULL. Empty values are not allowed. This is not to be confused with a unique key. Unique keys will allow NULL values to exist.

All in all, the important things to remember with primary keys are:

  1. Always NOT NULL
  2. They must be UNIQUE
  3. Indexed by the database (usually by default)
  4. Likely to be used as a foreign key on another table
  5. Used as a point of record
Final Thoughts

At the end of the day these keys are a crucial piece of any relational database table. As a result, it’s important to understand these keys so we can write performant queries and create appropriate and optimized data models.