Understanding how comfortable someone is with databases and SQL often comes down to the features they use. In PostgreSQL, one such feature that distinguishes more advanced users is the MERGE
command, introduced in version 15 and expanded in version 17 (in beta at the time of writing this article). Before MERGE
, developers typically relied on INSERT ... ON CONFLICT DO UPDATE
for upserts—a method introduced in PostgreSQL 9.5 that has since become a staple in many developers' toolkits.
While ON CONFLICT
offers a straightforward solution for simple upsert scenarios, it can quickly become limiting as business logic grows in complexity. This is where the MERGE
command excels. Introduced in the SQL:2003 standard, MERGE
allows for more sophisticated data synchronisation tasks by combining multiple operations—such as conditional inserts, updates, and deletes—into a single, atomic statement.
In this article, we’ll explore the capabilities of the MERGE
command, comparing it with traditional upsert methods and examining how it can streamline database operations. Through practical examples, we'll illustrate how MERGE
can simplify even the most complex workflows, making it an indispensable tool for developers working with PostgreSQL.
This introduction sets the stage by highlighting the evolution from ON CONFLICT
to MERGE
, explains the context in which MERGE
becomes valuable, and clearly outlines what the reader will gain from the article.
Hands-on Example: Managing a Score Points System for a Mobile Game
Imagine you’re managing a tournament score system for an online game, where players earn special status by participating in weekly tournaments. The simple business logic would be:
- Players can earn score points by participating in the tournament.
- Players must participate in all tournaments to maintain their score (i.e., maintain the streak).
- Players who participate in a second consecutive tournament can achieve the status of
veteran
(if they reach 1,000 score points) or gain the status ofstar
(if they reach 100 or more score points).
The score points are updated weekly, based on the outcome of the previous week's participation.
We will start with a simple database schema consisting of a table tournament_scores
that tracks (as the name suggests) only active users.
CREATE TABLE tournament_scores (
player_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
player_name TEXT UNIQUE NOT NULL,
score INT NOT NULL DEFAULT 0,
status TEXT NOT NULL DEFAULT 'newbie' CHECK (status IN ('newbie', 'veteran', 'star'))
);
And populate it with some sample data:
INSERT INTO tournament_scores (player_name, score, status)
VALUES
('PlayerOne', 900, 'newbie'), -- Regular player, close to Veteran promotion
('PlayerTwo', 1200, 'veteran'), -- Already a Veteran player
('PlayerThree', 300, 'newbie'); -- Regular player with a lower score
Upsert using ON CONFLICT
When activity data is received, there are multiple ways to process it. ON CONFLICT
is used here for demonstration purposes.
INSERT INTO tournament_scores (player_name, score)
VALUES
('PlayerOne', 50), -- Add points for PlayerOne
('PlayerTwo', 120), -- Add points for PlayerTwo
('PlayerFour', 70) -- New player without an account, PlayerFour
ON CONFLICT (player_name)
DO UPDATE SET
score = tournament_scores.score + EXCLUDED.score,
status = CASE WHEN (tournament_scores.score + EXCLUDED.score) > 1000 THEN 'veteran' ELSE 'newbie' END;
This performs the basic requirements, updating the scores and possibly evaluating the status for existing tournament users. To remove users who no longer participated (and hence broke the streak) and apply other conditional logic, you would need to use separate statements.
Handling Upserts with MERGE
Now, let’s introduce the MERGE
command and implement the same logic as above.
MERGE INTO tournament_scores ts
USING (
VALUES
('PlayerOne', 50), -- Add points for PlayerOne
('PlayerTwo', 120), -- Add points for PlayerTwo
('PlayerFour', 70) -- New player without an account, PlayerFour
) AS v(player_name, score_added)
ON ts.player_name = v.player_name
WHEN MATCHED THEN
UPDATE SET
score = ts.score + v.score_added,
status = CASE WHEN (ts.score + v.score_added) > 1000 THEN 'veteran' ELSE 'newbie' END
WHEN NOT MATCHED THEN
INSERT (player_name, score)
VALUES (v.player_name, v.score_added);
Here, we define the table tournament_scores
as the target and the list of VALUES as the source. Using a conditional clause, we define the logic for both matched and unmatched entries, giving us both INSERT and UPDATE paths.
The evaluation paths in this case are called when_clauses
. The MERGE
statement allows you to specify multiple clauses with different conditions, for example, allowing you to award users star
status if they gain more than 100 score points within a given tournament.
MERGE INTO tournament_scores ts
USING (
VALUES
('PlayerOne', 50), -- Add points for PlayerOne
('PlayerTwo', 120), -- Add points for PlayerTwo
('PlayerFour', 70) -- New player without an account, PlayerFour
) AS v(player_name, score_added)
ON ts.player_name = v.player_name
WHEN MATCHED AND v.score_added > 100 THEN
UPDATE SET score = ts.score + v.score_added, status = 'star'
WHEN MATCHED THEN
UPDATE SET
score = ts.score + v.score_added,
status = CASE WHEN (ts.score + v.score_added) > 1000 THEN 'veteran' ELSE 'newbie' END
WHEN NOT MATCHED THEN
INSERT (player_name, score)
VALUES (v.player_name, v.score_added);
Here’s a summary of how it works:
- The
MERGE
command evaluates thewhen_clauses
in the order they are written. - If a row matches the condition specified in a clause, the action defined in that clause is performed, and the row is no longer eligible to be matched against subsequent
when_clauses
.
While we introduced multiple evaluation paths, the MERGE
command in PostgreSQL goes beyond that and expands the WHEN NOT MATCHED
clause, which effectively becomes WHEN NOT MATCHED [BY TARGET]
. PostgreSQL allows you to specify WHEN NOT MATCHED BY SOURCE
to perform the necessary merge statement for the data not present in the source table.
Handling DELETEs with MERGE
The functionality completely missing from regular upserts with ON CONFLICT
is the ability to delete missing entries. In our sample scenario, we want to penalise players who broke their streak and did not participate in last week's tournament by effectively deleting their entries from the target table.
MERGE INTO tournament_scores ts
USING (
VALUES
('PlayerOne', 50), -- Add points for PlayerOne
('PlayerTwo', 120), -- Add points for PlayerTwo
('PlayerFour', 70) -- New player without an account, PlayerFour
) AS v(player_name, score_added)
ON ts.player_name = v.player_name
WHEN MATCHED AND v.score_added > 100 THEN
UPDATE SET score = ts.score + v.score_added, status = 'star'
WHEN MATCHED THEN
UPDATE SET score = ts.score + v.score_added
WHEN NOT MATCHED THEN
INSERT (player_name, score)
VALUES (v.player_name, v.score_added)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
The introduction of the DELETE
merge operation complements all possible MERGE
outcomes:
- The first is
merge_update
, applicable toWHEN MATCHED
andWHEN NOT MATCHED
clauses, consisting of regularUPDATE
operations. - The second is
merge_insert
for theWHEN NOT MATCHED
clause, allowing (as the name implies) data to be inserted. - The
DELETE
we introduce is part ofmerge_delete
.
Technically, there's also the ability to specify DO NOTHING
for all available when_clauses
, similar to upserts using ON CONFLICT
.
Using MERGE
Output
Starting from PostgreSQL 17, the MERGE
command has been extended to support RETURNING
clauses to process merged data further. When an INSERT
or UPDATE
action is performed, the new values of the target table's columns are used. When a DELETE
is performed, the old values of the target table's columns are used.
Conclusion
The MERGE
command significantly enhances the ability to handle complex INSERT
and UPDATE
logic by allowing multiple operations within a single query while also capturing even the most intricate scenarios. In this article, we have explored the syntax and common use cases of MERGE
. For further exploration, note that the source
dataset is where you can perform any data transformations required, making MERGE
particularly well-suited for ETL operations, data archival, cleanup tasks, and more.
Incorporating MERGE
into your PostgreSQL toolkit can simplify database operations, reduce the risk of data inconsistencies, and streamline your