Combining tables with SQL UNION

How to combine rows from two tables using SQL UNION.

What is SQL UNION

Let’s say you want to look at the number of people you have at each stage of the sales pipeline, but there’s a problem: leads and prospects live in one table (although with different filters), trial data live in another table, and customer data in another one. You need to count all of those and combine that info into one table, but they all have different logic and use different sources. SQL has a special operator UNION that lets you combine query results from different tables.

There are two UNION operators in SQL - UNION and UNION ALL. Most of this article applies both to UNION and UNION ALL.

Unions vs joins vs intersections: when to use each one

There are (at least) three ways to combine results of queries in SQL: JOIN, UNION, and INTERSECT. They serve different purposes:

  • Use UNION when you want to add more rows to your results.
  • Use JOIN to add more columns to your results.
  • Use INTERSECT when you want to find the common set of rows between tables.

Union Join Intersect

Combine query results with SQL UNION

To see what UNION does, let’s run the following SQL query (for example, you can run it in Metabase):

SELECT "Kitten" as pet_type, 0.5 as pet_age

UNION ALL

SELECT "Puppy", 2

UNION ALL

SELECT "Bird", 17;

The result is:

| pet_type | pet_age |
| -------- | ------- |
| Kitten   | 0.5     |
| Puppy    | 2       |
| Bird     | 17      |

What’s going on:

  1. SQL executes each SELECT query between the UNION clauses.
  2. Stacks the rows of the results.
  3. Uses the aliases (AS "Kitten") from the first query as the column headings.

Queries that are getting unioned (is that a word?..) don’t need to return a single line - the queries can include filters, grouping, CTEs, and so on. For example, let’s say you have two queries:

First query

SELECT
    breed,
    count(*)
FROM kittens
GROUP BY breed;

And the first query returns:

| BREED              | COUNT(*)  |
| ------------------ | --------- |
| Domestic shorthair | 23        |
| Domestic longhair  | 10        |
| Siamese            | 3         |

Second query

SELECT
    breed,
    count(*)
FROM puppies
GROUP BY breed;

And the second query returns:

| BREED         | COUNT(*)  |
| ------------- | --------- |
| Beagle        | 3         |
| Borzoi        | 1         |
| Corgi         | 5         |
| Very good boy | 37        |

Then the union of these queries is:

-- First query
SELECT
    breed,
    count(*)
FROM kittens
GROUP BY color

UNION ALL

-- Second query
SELECT
    breed,
    count(*)
FROM puppies
GROUP BY color

And the result of the UNION ALL query:

| BREED              | COUNT(*)  |
| ------------------ | --------- |
| Domestic shorthair | 23        |
| Domestic longhair  | 10        |
| Siamese            | 3         |
| Beagle             | 3         |
| Borzoi             | 1         |
| Corgi              | 5         |
| Very good boy      | 37        |

So UNION ALL just executed queries one by one, and stacked the results. There are some nuances in how UNION (ALL) handles column names, column order, and column types, but at the most basic level, it’s just stacking the rows.

UNION ALL takes all the rows, UNION tries to deduplicate rows

  • UNION ALL takes query results and stacks them together.
  • UNION (without the ALL) does the same, but also tries to deduplicate rows.

So for example, let’s say you have two queries:

First query

SELECT
    color,
    count(*)
FROM kittens
GROUP BY color;
| COLOR  | COUNT(*)  |
| ------ | --------- |
| Orange | 18        |
| Black  | 10        |
| White  | 8         |

Second query

SELECT
    color,
    count(*)
FROM puppies
GROUP BY color;
| COLOR | COUNT(*)  |
| ----- | --------- |
| Black | 10        |
| Brown | 23        |
| White | 15        |

Then the result of UNION ALL is:

| COLOR  | COUNT(*)  |
| ------ | --------- |
| Orange | 18        |
| Black  | 10        |
| White  | 8         |
| Black  | 10        |
| Brown  | 23        |
| White  | 15        |

Some colors (“Black” and “White”) have multiple rows. UNION will try deduplicate those repeated rows:

| COLOR  | COUNT(*)  |
| ------ | --------- |
| Orange | 18        |
| Black  | 10        |
| White  | 8         |
| Brown  | 23        |
| White  | 15        |

The row | Black | 10 | appears in the results of both queries, so UNION will drop one of the rows. As for the White rows: even though the color White appears twice, these rows are not duplicates. One row is | White | 8 | and the other is | White | 10 |, so UNION keeps both.

SQL UNION performs deduplication after the query results are retrieved, so if you have any LIMIT or ORDER clauses in individual queries, the result of a UNION might not respect those.

Why (in general) you should prefer UNION ALL to UNION

When you use UNION instead of UNION ALL, SQL will try to deduplicate the results - even if there are no duplicates. Deduplication can take a long time because SQL needs to go through all the records in each query.

Only use UNION if you really need to drop duplicate rows.

SQL UNION of complete tables

You can’t UNION ALL plain tables. You can only union query results: This won’t work:

--this won't work:
SELECT * FROM (
    kittens
    UNION ALL
    puppies
);

If you want to stack several tables together, you’ll need to select all records from each table:

-- UNION of two full tables
SELECT * FROM kittens
UNION ALL
SELECT * FROM puppies;

(although generally you should avoid using SELECT * with UNION, and instead specify the columns explicitly, see Unioned tables must have the same number of columns).

UNION ALL example: building a sales funnel in SQL

Let’s say you store information about your sales funnel in three tables:

  • leads table with information about sales leads
  • prospects table with the data about prospects
  • customers table with (you guessed it) information about customers

You can run COUNT(*) on each of those tables to find the numbers of leads, prospects, and customers independently, but it’d be more helpful to see those numbers together in a single table (for example if you want to understand conversion numbers). That’s where UNION ALL can come in handy:

SELECT COUNT(*) from leads
UNION ALL
SELECT COUNT(*) from prospects
UNION ALL
SELECT COUNT(*) from customers;

This query return something like:

| COUNT(*)  |
| --------- |
| 1749      |
| 832       |
| 562       |

So you can see the drop-off at each stage of the query - for example, about 48% of leads become prospects.

We use UNION ALL here to speed up the query, but also make sure duplicates are not getting excluded: for example, if you have a killer app with a 100% conversion rate, and your prospect numbers are the same as customer numbers, using UNION instead of UNION ALL would mean that those duplicate customer numbers would be dropped.

There are two problems with these results: first, the column name COUNT(*) is not helpful, and second, the numbers don’t have a lot of context: which one corresponds to which stage? So let’s add column names and an additional field storing the name of the stage:

SELECT 'Leads' as stage, COUNT(*) as ct from leads
UNION ALL
SELECT 'Prospects', COUNT(*) from prospects
UNION ALL
SELECT 'Customers', COUNT(*) from customers;
| stage     | ct   |
| --------- | ---- |
| Leads     | 1749 |
| Prospects | 832  |
| Customers | 562  |

Notice that we only needed to add column names to the first query in the UNION - that’s because SQL UNION takes column names only from the first query.

UNION ALL example: tracking user flow in SQL

For a more complicated example, let’s say you’re running an eCommerce website, and you want to track the number of people going through the flow of:

  1. Catalog visit: Look at the product catalog.
  2. Product visit: Go to the specific product’s page.
  3. Add to cart: they’ve almost given you money.
  4. Order. They actually gave you money.

Your data is in multiple tables:

  • Views to pages on the website (like the catalog page or the product’s page) are in page_views table.
  • Events like button clicks are tracked separately in the events table.
  • Actual customer orders are in the orders table.

You can count the total number of page views to the page with the list of products, like this:

-- Count all views to the catalog page
SELECT
   COUNT(*)
FROM page_views
WHERE
    page_url = 'https://www.whskr.co/catalog'

Next, you want to count visits to the page of a particular product - but not all views, just the views coming from the catalog page (as opposed to, for example, search or direct links). This uses the same page_views table but with a different filter:

-- Count views to the product page coming from the catalog page
SELECT
    COUNT(*)
FROM page_views
WHERE
    page_url = 'https://www.whskr.co/product/smart-mouse-1234567'
    AND referer_url = 'https://www.whskr.co/catalog'

Next, you want to count people who add the product to cart, so you filter all the interaction events. It can look something like this:

-- Clicks on "Add to cart" button on the product page
SELECT
    COUNT(*)
FROM events
WHERE
    event_type = 'button_clicked'
    AND event_subtype = 'add-to-cart'
    AND source_page = 'https://www.whskr.co/product/smart-mouse-1234567'

Finally, you want to know the total number of times people have ordered this product:

-- Number of orders for the product
SELECT COUNT(*)
FROM orders
WHERE product_id = 1234567

(Here we’re assuming the only way to place an order is from the product’s page.)

Now, you want to put all these counts into one table so that you can compare the number of people going though each stage. You’ll also want to add a column identifying each stage. That’s where UNION (ALL) comes in:

-- Count all views to the catalog page
SELECT
   'Catalog visits' as stage,
   COUNT(*)
FROM page_views
WHERE
    page_url = 'https://www.whskr.co/catalog'

UNION ALL

-- Count views to the product page coming from the catalog page
SELECT
    'Product visits',
    COUNT(*)
FROM page_views
WHERE
    page_url = 'https://www.whskr.co/product/smart-mouse-1234567'
    AND referer_url = 'https://www.whskr.co/catalog'


UNION ALL

-- Clicks on "Add to cart" button on the product page
SELECT
    'Add to cart',
    COUNT(*)
FROM interaction_events
WHERE
    event_type = 'button_clicked'
    AND event_subtype = 'cart'
    AND source_page = 'https://www.whskr.co/product/smart-mouse-1234567'

UNION ALL

-- Number of orders for the product
SELECT
    'Orders',
     COUNT(*)
FROM orders
WHERE product_id = 1234567

Here we’re combining two queries on the page_views table with two different filters, a query on the events table, and a query on the orders table. We’re using UNION ALL because this isn’t a situation where we’d need to deduplicate the data.

The result will look something like this:

| stage          | COUNT(*)  |
| -------------- | --------- |
| Catalog visits | 5842      |
| Product visits | 851       |
| Add to cart    | 592       |
| Orders         | 346       |

UNION ALL example: combine data from historical tables

Sometimes when tables are large and take a long time to query, people break one big table into several tables by date. For example, you could have a separate table with data for each year, like a table with data for 2023, 2024, 2025… The idea that in 2025, you’re probably not constantly querying data from, like, 1998, so there’s no need to make the database work to go through that data unless absolutely necessary.

But sometimes in setups like this you do need to retrieve data from all the previous years, and that’s where UNION (ALL) can come in handy. For example, if you wanted to get all the order ids, dates, totals over all the years, you could write a query like this (assuming the world began in 2023):

SELECT id, created_at, total FROM orders_2023
UNION ALL
SELECT id, created_at, total FROM orders_2024
UNION ALL
SELECT id, created_at, total FROM orders_2023

UNION ALL example: add a summary row

Let’s say you’re counting the number of products by category using the following query:

SELECT
  category,
  COUNT(*) as ct
FROM products
GROUP BY category;

with the following result:

| CATEGORY  | ct  |
| --------- | --- |
| Doohickey | 42  |
| Gadget    | 53  |
| Gizmo     | 51  |
| Widget    | 54  |

You can use UNION ALL to add a row with total count of records:

-- Query results
SELECT
  category,
  COUNT(*) as ct
FROM products
GROUP BY category
ORDER BY ct

UNION ALL

-- Total row
SELECT
'Total',
COUNT(*)
FROM products;

The result will look like this:

| CATEGORY  | ct  |
| --------- | --- |
| Doohickey | 42  |
| Gadget    | 53  |
| Gizmo     | 51  |
| Widget    | 54  |
| Total     | 200 |

You can use the same idea to add other kinds of summaries. For example, you can add a row for the average count across categories:


WITH ct_by_category AS (
  SELECT
    category,
    COUNT(*) as ct
  FROM products
  GROUP BY category
)

SELECT category, ct FROM ct_by_category
UNION ALL
SELECT 'Average', AVG(ct) FROM ct_by_category;

Here we pulled the COUNTing into a CTE to make the query cleaner.

How SQL UNION works

SQL UNION (ALL) assumes you know what you’re doing, and doesn’t get in your way - but also doesn’t try to be smart and help you. SQL UNION (ALL) will just execute the queries that you’re unioning (yep, that’s a word now), and try to stack the result rows. If something goes wrong - like the number of columns is different or or column types doesn’t match - UNION (ALL) will just give up.

Here are some of things to keep in mind:

SQL UNION takes columns from the first query

UNION (ALL) will take the column names from the first query, and ignore any column names in the subsequent queries.

So let’s say you have a query like this:

SELECT 'Kitten' as pet_type, 0.5 as pet_age
UNION ALL
SELECT 'Bird' as category, 17 as age_in_years;

The result will be:

| pet_type | pet_age |
| -------- | ------- |
| Kitten   | 0.5     |
| Bird     | 17      |

SQL UNION (ALL) will not create new columns. So as long as the number of columns and the data types match, SQL UNION (ALL) will just stack the rows. This might not might not be what you expect!

For example, take the following query:

SELECT 'Kitten' as pet_type, 0.5 as pet_age
UNION ALL
SELECT 'Senior Software Engineer' as job_title, 32768 as favorite_number;

What you might want:

| pet_type | pet_age | job_title                | favorite_number |
| -------- | ------- | ------------------------ | --------------- |
| Kitten   | 0.5     |                          |                 |
|          |         | Senior Software Engineer | 32768           |

but what the query actually returns:

| pet_type                 | pet_age |
| ------------------------ | ------- |
| Kitten                   | 0.5     |
| Senior Software Engineer | 32768   |

If what you want to keep the different columns and add nulls into the cells that don’t have corresponding data, you could try full outer join instead.

SQL UNION columns must be in the same order

If you have columns that you want SQL UNION (ALL) to “match” to each other, you have to put them in the same order in every query. SQL UNION (ALL) will just try to put all the values in the first column for all the queries together, all the values in all the second columns together, etc, and will not try to match columns based on names, types or any other characteristics.

For example, here the query creator might have intended to have two columns pet_type and pet_age, but the column order is different in two queries:

-- This will cause an error in most databases:
SELECT 'Kitten' as pet_type, 0.5 as pet_age
UNION ALL
SELECT 17 as pet_age, 'Bird' as pet_type;

What you might expect:

-- Expected, but not actual the result of the query above

| pet_type | pet_age |
| -------- | ------- |
| Kitten   | 0.5     |
| Bird     | 17      |

but the actual result will depend on the database. Most databases will give an error, something like:

ERROR: Data conversion error converting "Kitten";

That’s the error message you get on Metabase Sample Database - other databases might give a different error message.

Some other databases (like MySQL) will give:

| pet_type | pet_age |
| -------- | ------- |
| Kitten   | 0.5     |
| 17       | Bird    |

But the result will never be:

-- Can't get this result with query above

| pet_type | pet_age |
| -------- | ------- |
| Kitten   | 0.5     |
| Bird     | 17      |

What’s happening here?

  • First of all, SQL UNION doesn’t care about the column names in the second query, so it’ll ignore them and will try to put 'Kitten' and 17 together into one column, and 0.5 and 'Bird' into another column.

  • Then, since the types don’t match - 'Kitten' is a string and 17 is a number - most databases will just reject the operation. MySQL is a bit more generous, and will convert 17 into a string '17'.

This behavior can be especially confusing when the types are not mismatched, and so there’s no error. For example, SQL will happily execute the following query without error, because all columns are strings:

-- This query will not error out but will produce unexpected results
SELECT 'Kitten' as pet_type, 'Mittens' as pet_name
UNION ALL
SELECT 'Professor Beakman' as pet_name, 'Bird' as pet_type;

and return

| pet_type          | pet_name |
| ----------------- | -------- |
| Kitten            | Mittens  |
| Professor Beakman | Bird     |

which is probably not what you wanted.

Unioned tables must have the same number of columns

If the results of the queries have varying number of columns, the UNION (ALL) just won’t work - even if there’s overlap between columns. For example, the following query won’t work regardless of what comes in ....

--- This query won't work

SELECT category, week_num, COUNT(*) FROM ...

UNION ALL

SELECT category, COUNT(*) FROM ...

The first query returns three columns, and the second query returns two columns, so UNION ALL can’t stack those results. It will not create an additional column with a null value for the missing column. Instead, your database will return an error, something like :

ERROR: each UNION query must have the same number of columns

This can cause unexpected errors when you’re using SELECT * instead of specifying columns explicitly, especially if what you’re selecting FROM can change. Let’s say you have separate tables storing orders information for each year: orders_2024, orders_2025 etc. To get all the orders over the years, you might run:

SELECT * FROM orders_2023
UNION ALL
SELECT * FROM orders_2024
UNION ALL
SELECT * FROM orders_2025;

But this query will start failing if someone adds a new column to orders_2025.

So when you’re using UNION (ALL), always specify exactly which columns you’re selecting (this is good practice when writing any SELECT in general, UNION or not):

SELECT id, created_at, amount FROM orders_2023
UNION ALL
SELECT id, created_at, amount FROM orders_2024
UNION ALL
SELECT id, created_at, amount FROM orders_2025;

Column data types must match in SQL UNIONs

The types of columns in the results of queries must match. So for example, the first column in every query in the UNION ALL would be a number, every second column would be text, etc. If the types don’t match - for example, one of the queries has a text column as its first SELECTed column, and another query has a number column as its first - then the UNION query might not work. Whether it works depends on the database and the data types. Most databases will show an error if the types don’t match exactly. Some databases like MySQL will attempt type conversions (e.g., number to string) but will show an error if conversion isn’t possible. Implicit type conversion can also lead to unexpected results.

For example, take the following query:

SELECT true

UNION ALL

SELECT DATE '2025-04-19'

Here we’re trying to stack a boolean value and a date. Most databases (for example, Postgres), will just give an error.

MySQL will instead return

| TRUE       |
| ---------- |
| 1          |
| 2025-04-19 |

Here MySQL makes TRUE the header of the column because we didn’t supply any other option. If you check the data type of the returned column, you’ll see that it’s now a VARCHAR column. So MySQL converted both true and 2025-04-19 to strings, and true is converted to string value 1 because MySQL doesn’t actually have boolean data types, and instead boolean values are stored as integers 1 or 0, which in this query MySQL also converts to a string. So you can see how this can be confusing! Make sure that the data types of the columns in UNION match to avoid unexpected results like this.

Further reading

Was this helpful?

Thanks for your feedback!

Join our weekly digest with the latest data and AI insights you simply can't miss