Combining tables with SQL UNION
How to combine rows from two tables using SQL UNION.
What is SQL UNION
Let’s say you want to look at the number of people you have at each stage of the sales pipeline, but there’s a problem: leads and prospects live in one table (although with different filters), trial data live in another table, and customer data in another one. You need to count all of those and combine that info into one table, but they all have different logic and use different sources. SQL has a special operator UNION
that lets you combine query results from different tables.
There are two UNION
operators in SQL - UNION
and UNION ALL
. Most of this article applies both to UNION
and UNION ALL
.
Unions vs joins vs intersections: when to use each one
There are (at least) three ways to combine results of queries in SQL: JOIN
, UNION
, and INTERSECT
. They serve different purposes:
- Use
UNION
when you want to add more rows to your results. - Use
JOIN
to add more columns to your results. - Use
INTERSECT
when you want to find the common set of rows between tables.
Combine query results with SQL UNION
To see what UNION
does, let’s run the following SQL query (for example, you can run it in Metabase):
SELECT "Kitten" as pet_type, 0.5 as pet_age
UNION ALL
SELECT "Puppy", 2
UNION ALL
SELECT "Bird", 17;
The result is:
| pet_type | pet_age |
| -------- | ------- |
| Kitten | 0.5 |
| Puppy | 2 |
| Bird | 17 |
What’s going on:
- SQL executes each
SELECT
query between theUNION
clauses. - Stacks the rows of the results.
- Uses the aliases (
AS "Kitten"
) from the first query as the column headings.
Queries that are getting unioned (is that a word?..) don’t need to return a single line - the queries can include filters, grouping, CTEs, and so on. For example, let’s say you have two queries:
First query
SELECT
breed,
count(*)
FROM kittens
GROUP BY breed;
And the first query returns:
| BREED | COUNT(*) |
| ------------------ | --------- |
| Domestic shorthair | 23 |
| Domestic longhair | 10 |
| Siamese | 3 |
Second query
SELECT
breed,
count(*)
FROM puppies
GROUP BY breed;
And the second query returns:
| BREED | COUNT(*) |
| ------------- | --------- |
| Beagle | 3 |
| Borzoi | 1 |
| Corgi | 5 |
| Very good boy | 37 |
Then the union of these queries is:
-- First query
SELECT
breed,
count(*)
FROM kittens
GROUP BY color
UNION ALL
-- Second query
SELECT
breed,
count(*)
FROM puppies
GROUP BY color
And the result of the UNION ALL
query:
| BREED | COUNT(*) |
| ------------------ | --------- |
| Domestic shorthair | 23 |
| Domestic longhair | 10 |
| Siamese | 3 |
| Beagle | 3 |
| Borzoi | 1 |
| Corgi | 5 |
| Very good boy | 37 |
So UNION ALL
just executed queries one by one, and stacked the results. There are some nuances in how UNION (ALL)
handles column names, column order, and column types, but at the most basic level, it’s just stacking the rows.
UNION ALL
takes all the rows, UNION
tries to deduplicate rows
UNION ALL
takes query results and stacks them together.UNION
(without theALL
) does the same, but also tries to deduplicate rows.
So for example, let’s say you have two queries:
First query
SELECT
color,
count(*)
FROM kittens
GROUP BY color;
| COLOR | COUNT(*) |
| ------ | --------- |
| Orange | 18 |
| Black | 10 |
| White | 8 |
Second query
SELECT
color,
count(*)
FROM puppies
GROUP BY color;
| COLOR | COUNT(*) |
| ----- | --------- |
| Black | 10 |
| Brown | 23 |
| White | 15 |
Then the result of UNION ALL
is:
| COLOR | COUNT(*) |
| ------ | --------- |
| Orange | 18 |
| Black | 10 |
| White | 8 |
| Black | 10 |
| Brown | 23 |
| White | 15 |
Some colors (“Black” and “White”) have multiple rows.
UNION
will try deduplicate those repeated rows:
| COLOR | COUNT(*) |
| ------ | --------- |
| Orange | 18 |
| Black | 10 |
| White | 8 |
| Brown | 23 |
| White | 15 |
The row | Black | 10 |
appears in the results of both queries, so UNION
will drop one of the rows. As for the White
rows: even though the color White
appears twice, these rows are not duplicates. One row is | White | 8 |
and the other is | White | 10 |
, so UNION
keeps both.
SQL UNION
performs deduplication after the query results are retrieved, so if you have any LIMIT
or ORDER
clauses in individual queries, the result of a UNION
might not respect those.
Why (in general) you should prefer UNION ALL
to UNION
When you use UNION
instead of UNION ALL
, SQL will try to deduplicate the results - even if there are no duplicates. Deduplication can take a long time because SQL needs to go through all the records in each query.
Only use UNION
if you really need to drop duplicate rows.
SQL UNION
of complete tables
You can’t UNION ALL
plain tables. You can only union query results: This won’t work:
--this won't work:
SELECT * FROM (
kittens
UNION ALL
puppies
);
If you want to stack several tables together, you’ll need to select all records from each table:
-- UNION of two full tables
SELECT * FROM kittens
UNION ALL
SELECT * FROM puppies;
(although generally you should avoid using SELECT *
with UNION
, and instead specify the columns explicitly, see Unioned tables must have the same number of columns).
UNION ALL
example: building a sales funnel in SQL
Let’s say you store information about your sales funnel in three tables:
leads
table with information about sales leadsprospects
table with the data about prospectscustomers
table with (you guessed it) information about customers
You can run COUNT(*)
on each of those tables to find the numbers of leads, prospects, and customers independently, but it’d be more helpful to see those numbers together in a single table (for example if you want to understand conversion numbers). That’s where UNION ALL
can come in handy:
SELECT COUNT(*) from leads
UNION ALL
SELECT COUNT(*) from prospects
UNION ALL
SELECT COUNT(*) from customers;
This query return something like:
| COUNT(*) |
| --------- |
| 1749 |
| 832 |
| 562 |
So you can see the drop-off at each stage of the query - for example, about 48% of leads become prospects.
We use UNION ALL
here to speed up the query, but also make sure duplicates are not getting excluded: for example, if you have a killer app with a 100% conversion rate, and your prospect numbers are the same as customer numbers, using UNION
instead of UNION ALL
would mean that those duplicate customer numbers would be dropped.
There are two problems with these results: first, the column name COUNT(*)
is not helpful, and second, the numbers don’t have a lot of context: which one corresponds to which stage? So let’s add column names and an additional field storing the name of the stage:
SELECT 'Leads' as stage, COUNT(*) as ct from leads
UNION ALL
SELECT 'Prospects', COUNT(*) from prospects
UNION ALL
SELECT 'Customers', COUNT(*) from customers;
| stage | ct |
| --------- | ---- |
| Leads | 1749 |
| Prospects | 832 |
| Customers | 562 |
Notice that we only needed to add column names to the first query in the UNION
- that’s because SQL UNION takes column names only from the first query.
UNION ALL
example: tracking user flow in SQL
For a more complicated example, let’s say you’re running an eCommerce website, and you want to track the number of people going through the flow of:
- Catalog visit: Look at the product catalog.
- Product visit: Go to the specific product’s page.
- Add to cart: they’ve almost given you money.
- Order. They actually gave you money.
Your data is in multiple tables:
- Views to pages on the website (like the catalog page or the product’s page) are in
page_views
table. - Events like button clicks are tracked separately in the
events
table. - Actual customer orders are in the
orders
table.
You can count the total number of page views to the page with the list of products, like this:
-- Count all views to the catalog page
SELECT
COUNT(*)
FROM page_views
WHERE
page_url = 'https://www.whskr.co/catalog'
Next, you want to count visits to the page of a particular product - but not all views, just the views coming from the catalog page (as opposed to, for example, search or direct links). This uses the same page_views
table but with a different filter:
-- Count views to the product page coming from the catalog page
SELECT
COUNT(*)
FROM page_views
WHERE
page_url = 'https://www.whskr.co/product/smart-mouse-1234567'
AND referer_url = 'https://www.whskr.co/catalog'
Next, you want to count people who add the product to cart, so you filter all the interaction events. It can look something like this:
-- Clicks on "Add to cart" button on the product page
SELECT
COUNT(*)
FROM events
WHERE
event_type = 'button_clicked'
AND event_subtype = 'add-to-cart'
AND source_page = 'https://www.whskr.co/product/smart-mouse-1234567'
Finally, you want to know the total number of times people have ordered this product:
-- Number of orders for the product
SELECT COUNT(*)
FROM orders
WHERE product_id = 1234567
(Here we’re assuming the only way to place an order is from the product’s page.)
Now, you want to put all these counts into one table so that you can compare the number of people going though each stage. You’ll also want to add a column identifying each stage. That’s where UNION (ALL)
comes in:
-- Count all views to the catalog page
SELECT
'Catalog visits' as stage,
COUNT(*)
FROM page_views
WHERE
page_url = 'https://www.whskr.co/catalog'
UNION ALL
-- Count views to the product page coming from the catalog page
SELECT
'Product visits',
COUNT(*)
FROM page_views
WHERE
page_url = 'https://www.whskr.co/product/smart-mouse-1234567'
AND referer_url = 'https://www.whskr.co/catalog'
UNION ALL
-- Clicks on "Add to cart" button on the product page
SELECT
'Add to cart',
COUNT(*)
FROM interaction_events
WHERE
event_type = 'button_clicked'
AND event_subtype = 'cart'
AND source_page = 'https://www.whskr.co/product/smart-mouse-1234567'
UNION ALL
-- Number of orders for the product
SELECT
'Orders',
COUNT(*)
FROM orders
WHERE product_id = 1234567
Here we’re combining two queries on the page_views
table with two different filters, a query on the events
table, and a query on the orders
table. We’re using UNION ALL
because this isn’t a situation where we’d need to deduplicate the data.
The result will look something like this:
| stage | COUNT(*) |
| -------------- | --------- |
| Catalog visits | 5842 |
| Product visits | 851 |
| Add to cart | 592 |
| Orders | 346 |
UNION ALL
example: combine data from historical tables
Sometimes when tables are large and take a long time to query, people break one big table into several tables by date. For example, you could have a separate table with data for each year, like a table with data for 2023, 2024, 2025… The idea that in 2025, you’re probably not constantly querying data from, like, 1998, so there’s no need to make the database work to go through that data unless absolutely necessary.
But sometimes in setups like this you do need to retrieve data from all the previous years, and that’s where UNION (ALL)
can come in handy. For example, if you wanted to get all the order ids, dates, totals over all the years, you could write a query like this (assuming the world began in 2023):
SELECT id, created_at, total FROM orders_2023
UNION ALL
SELECT id, created_at, total FROM orders_2024
UNION ALL
SELECT id, created_at, total FROM orders_2023
UNION ALL
example: add a summary row
Let’s say you’re counting the number of products by category using the following query:
SELECT
category,
COUNT(*) as ct
FROM products
GROUP BY category;
with the following result:
| CATEGORY | ct |
| --------- | --- |
| Doohickey | 42 |
| Gadget | 53 |
| Gizmo | 51 |
| Widget | 54 |
You can use UNION ALL
to add a row with total count of records:
-- Query results
SELECT
category,
COUNT(*) as ct
FROM products
GROUP BY category
ORDER BY ct
UNION ALL
-- Total row
SELECT
'Total',
COUNT(*)
FROM products;
The result will look like this:
| CATEGORY | ct |
| --------- | --- |
| Doohickey | 42 |
| Gadget | 53 |
| Gizmo | 51 |
| Widget | 54 |
| Total | 200 |
You can use the same idea to add other kinds of summaries. For example, you can add a row for the average count across categories:
WITH ct_by_category AS (
SELECT
category,
COUNT(*) as ct
FROM products
GROUP BY category
)
SELECT category, ct FROM ct_by_category
UNION ALL
SELECT 'Average', AVG(ct) FROM ct_by_category;
Here we pulled the COUNT
ing into a CTE to make the query cleaner.
How SQL UNION
works
SQL UNION (ALL)
assumes you know what you’re doing, and doesn’t get in your way - but also doesn’t try to be smart and help you. SQL UNION (ALL)
will just execute the queries that you’re unioning (yep, that’s a word now), and try to stack the result rows. If something goes wrong - like the number of columns is different or or column types doesn’t match - UNION (ALL)
will just give up.
Here are some of things to keep in mind:
- SQL
UNION
takes columns from the first query. UNION
columns must be in the same order.- Unioned tables must have the same number of columns.
- Column data types must match in SQL
UNION
s.
SQL UNION
takes columns from the first query
UNION (ALL)
will take the column names from the first query, and ignore any column names in the subsequent queries.
So let’s say you have a query like this:
SELECT 'Kitten' as pet_type, 0.5 as pet_age
UNION ALL
SELECT 'Bird' as category, 17 as age_in_years;
The result will be:
| pet_type | pet_age |
| -------- | ------- |
| Kitten | 0.5 |
| Bird | 17 |
SQL UNION (ALL)
will not create new columns. So as long as the number of columns and the data types match, SQL UNION (ALL)
will just stack the rows. This might not might not be what you expect!
For example, take the following query:
SELECT 'Kitten' as pet_type, 0.5 as pet_age
UNION ALL
SELECT 'Senior Software Engineer' as job_title, 32768 as favorite_number;
What you might want:
| pet_type | pet_age | job_title | favorite_number |
| -------- | ------- | ------------------------ | --------------- |
| Kitten | 0.5 | | |
| | | Senior Software Engineer | 32768 |
but what the query actually returns:
| pet_type | pet_age |
| ------------------------ | ------- |
| Kitten | 0.5 |
| Senior Software Engineer | 32768 |
If what you want to keep the different columns and add nulls into the cells that don’t have corresponding data, you could try full outer join instead.
SQL UNION
columns must be in the same order
If you have columns that you want SQL UNION (ALL)
to “match” to each other, you have to put them in the same order in every query. SQL UNION (ALL)
will just try to put all the values in the first column for all the queries together, all the values in all the second columns together, etc, and will not try to match columns based on names, types or any other characteristics.
For example, here the query creator might have intended to have two columns pet_type
and pet_age
, but the column order is different in two queries:
-- This will cause an error in most databases:
SELECT 'Kitten' as pet_type, 0.5 as pet_age
UNION ALL
SELECT 17 as pet_age, 'Bird' as pet_type;
What you might expect:
-- Expected, but not actual the result of the query above
| pet_type | pet_age |
| -------- | ------- |
| Kitten | 0.5 |
| Bird | 17 |
but the actual result will depend on the database. Most databases will give an error, something like:
ERROR: Data conversion error converting "Kitten";
That’s the error message you get on Metabase Sample Database - other databases might give a different error message.
Some other databases (like MySQL) will give:
| pet_type | pet_age |
| -------- | ------- |
| Kitten | 0.5 |
| 17 | Bird |
But the result will never be:
-- Can't get this result with query above
| pet_type | pet_age |
| -------- | ------- |
| Kitten | 0.5 |
| Bird | 17 |
What’s happening here?
-
First of all, SQL
UNION
doesn’t care about the column names in the second query, so it’ll ignore them and will try to put'Kitten'
and17
together into one column, and0.5
and'Bird'
into another column. -
Then, since the types don’t match -
'Kitten'
is a string and17
is a number - most databases will just reject the operation. MySQL is a bit more generous, and will convert17
into a string'17'
.
This behavior can be especially confusing when the types are not mismatched, and so there’s no error. For example, SQL will happily execute the following query without error, because all columns are strings:
-- This query will not error out but will produce unexpected results
SELECT 'Kitten' as pet_type, 'Mittens' as pet_name
UNION ALL
SELECT 'Professor Beakman' as pet_name, 'Bird' as pet_type;
and return
| pet_type | pet_name |
| ----------------- | -------- |
| Kitten | Mittens |
| Professor Beakman | Bird |
which is probably not what you wanted.
Unioned tables must have the same number of columns
If the results of the queries have varying number of columns, the UNION (ALL)
just won’t work - even if there’s overlap between columns. For example, the following query won’t work regardless of what comes in ...
.
--- This query won't work
SELECT category, week_num, COUNT(*) FROM ...
UNION ALL
SELECT category, COUNT(*) FROM ...
The first query returns three columns, and the second query returns two columns, so UNION ALL
can’t stack those results. It will not create an additional column with a null value for the missing column. Instead, your database will return an error, something like :
ERROR: each UNION query must have the same number of columns
This can cause unexpected errors when you’re using SELECT *
instead of specifying columns explicitly, especially if what you’re selecting FROM
can change. Let’s say you have separate tables storing orders information for each year: orders_2024
, orders_2025
etc. To get all the orders over the years, you might run:
SELECT * FROM orders_2023
UNION ALL
SELECT * FROM orders_2024
UNION ALL
SELECT * FROM orders_2025;
But this query will start failing if someone adds a new column to orders_2025
.
So when you’re using UNION (ALL)
, always specify exactly which columns you’re selecting (this is good practice when writing any SELECT
in general, UNION
or not):
SELECT id, created_at, amount FROM orders_2023
UNION ALL
SELECT id, created_at, amount FROM orders_2024
UNION ALL
SELECT id, created_at, amount FROM orders_2025;
Column data types must match in SQL UNION
s
The types of columns in the results of queries must match. So for example, the first column in every query in the UNION ALL
would be a number, every second column would be text, etc. If the types don’t match - for example, one of the queries has a text column as its first SELECTed column, and another query has a number column as its first - then the UNION
query might not work. Whether it works depends on the database and the data types. Most databases will show an error if the types don’t match exactly. Some databases like MySQL will attempt type conversions (e.g., number to string) but will show an error if conversion isn’t possible. Implicit type conversion can also lead to unexpected results.
For example, take the following query:
SELECT true
UNION ALL
SELECT DATE '2025-04-19'
Here we’re trying to stack a boolean value and a date. Most databases (for example, Postgres), will just give an error.
MySQL will instead return
| TRUE |
| ---------- |
| 1 |
| 2025-04-19 |
Here MySQL makes TRUE
the header of the column because we didn’t supply any other option. If you check the data type of the returned column, you’ll see that it’s now a VARCHAR
column. So MySQL converted both true
and 2025-04-19
to strings, and true
is converted to string value 1
because MySQL doesn’t actually have boolean data types, and instead boolean values are stored as integers 1
or 0
, which in this query MySQL also converts to a string. So you can see how this can be confusing! Make sure that the data types of the columns in UNION
match to avoid unexpected results like this.