SQL filtering by text
Learn SQL text filtering: use WHERE, LIKE, IN, NOT IN, TRIM, UPPER, LOWER, regex, and variables to find and filter string data in tables.
Learn SQL with MetabaseDownload Metabase for free, or sign up for a free trial of Metabase Cloud
What we’ll cover
- Filter rows by exact text matches with
=
(or exclude with!=
› - Ignore case and spaces with
UPPER()
,LOWER()
, andTRIM()
› - Include or exclude rows with
IN
andNOT IN
› - Find partial matches with
LIKE
and wildcards › - Match by position with
SUBSTRING()
› - Filter by multiple columns with
AND
andOR
› - Work with missing values using
IS NULL
andIS NOT NULL
› - Use regular expressions for advanced text filtering ›
- Handle empty strings vs NULL values ›
- Parameterize your filters with variables ›
You can filter rows with SQL’s WHERE
keyword. This guide covers common ways to filter tables by text columns (columns with string or varchar data types).
SQL to filter rows by exact match
To search for an exact, case-sensitive match, use the WHERE
clause with the =
operator. The term should be in single quotes.
SELECT
*
FROM
products
WHERE
title = 'Lightweight Wool Computer' -- note single quotes
If you want to exclude everything BUT products where the title is “Lightweight wool computer”, use !=
:
SELECT
*
FROM
products
WHERE
-- Get all the rows that don't match this term
title != 'Lightweight Wool Computer'
SQL to handle text casing with UPPER()
and LOWER()
If you want to ignore case, you can convert both sides of the comparison to the same case. Here’s how you’d find all products with a title that matches, no matter the capitalization:
SELECT
*
FROM
products
WHERE
LOWER(title) = LOWER('LIGHTWEIGHT WOOL COMPUTER')
SQL to ignore spaces with TRIM()
If your strings could include extra whitespace, you can ignore leading or trailing spaces in your text columns with TRIM()
:
SELECT
*
FROM
products
WHERE
-- Remove leading and trailing white space characters
TRIM(title) = 'Lightweight Wool Computer';
SQL to filter rows by multiple search terms with IN
To search for multiple terms, you can use =
and OR
(s), like this:
SELECT
*
FROM
products
WHERE
-- The equal operator is case sensitive
title = 'Lightweight Wool Computer'
OR title = 'Intelligent Paper Hat'
But it’s better to use IN
(it’s easier to read), like so:
SELECT
*
FROM
products
WHERE
-- List of exact terms to match, enclosed by parentheses
title IN (
'Lightweight Wool Computer', -- Terms separated by commas
'Intelligent Paper Hat' -- No comma at the end of the list
);
SQL to exclude rows that contain exact matches with NOT IN
You can also get all rows that are NOT the search terms with NOT IN
:
SELECT
*
FROM
products
WHERE
-- List of exact terms you want to exclude, enclosed by parentheses
title NOT IN (
'Lightweight Wool Computer', -- Terms separated by commas
'Intelligent Paper Hat'
);
SQL to filter rows that contain part of some text with LIKE
To filter rows by whether a column contains part of a text/string, use the LIKE
key word. For example, if we want to search for all of the Lightweight products (case sensitive):
| Title | Category |
|----------------------------|----------|
| Lightweight Paper Bottle | Gadget |
| Lightweight Wool Bag | Gadget |
| Lightweight Leather Gloves | Gadget |
You can use LIKE
, like so:
SELECT
title,
category
FROM
products
WHERE
-- Term is case sensitive
title LIKE 'Lightweight%'
LIKE
typically supports two wildcard characters:
%
is a wildcard operator to match zero or more characters._
matches a single character.
If your search term includes a literal %
or _
character, you can tell the db engine to treat it as a regular character with ESCAPE
. This code filters for titles like Wool %
:
SELECT
*
FROM
products
WHERE
title LIKE 'Wool \%' ESCAPE '\';
\
is a classic escape character. But if \
has special meaning in your data, you can supply different characters to the ESCAPE
function, e.g., ESCAPE !
.
You can also use the case conversion trick with LIKE
for partial matches:
SELECT
*
FROM
products
WHERE
-- Use UPPER (or LOWER) to make sure casing doesn't matter
UPPER(title) LIKE 'LIGHTWEIGHT%';
Leading wildcard operators trigger full table scans. Meaning, the database engine will have to check every row; it won’t be able to use any indexes set up on the column. So be careful with leading wildcard operators (like
%wool
). See SQL querying best practices.
SQL to filter part of a string based on position
To filter based on part of a string based on the position, use SUBSTRING()
. For example, to find products where the title starts with “Small”.
| Title | Category |
|-------------------------|------------|
| Small Marble Shoes | Doohickey |
| Small Marble Hat | Doohickey |
| Small Plastic Computer | Doohickey |
| ... | ... |
You could use SUBSTRING
:
SELECT
title,
category
FROM
products
WHERE
-- Filter for the first five letters in the string
-- FROM determines the starting point (index starts at 1)
-- FOR determines the number of characters from the starting point
-- Case sensitive
SUBSTRING(title FROM 1 FOR 5) = 'Small';
SQL to filter rows by multiple columns
You can filter by multiple columns in a single WHERE
clause with AND
and OR
.
To get results where results must be both “Lightweight” and in the Gizmo category:
| Title | Category |
|------------------------------|----------|
| Lightweight Linen Coat | Gizmo |
| Lightweight Linen Bottle | Gizmo |
| Lightweight Steel Knife | Gizmo |
| Lightweight Leather Bench | Gizmo |
You’d use a WHERE
clause with AND
:
SELECT
title,
category
FROM
products
WHERE
-- AND will return rows that satisfy both criteria
title LIKE 'Lightweight%'
AND category = 'Gizmo';
If you want rows where either the product is lightweight, or in the Gizmo category, like this:
| Title | Category |
|---------------------------|----------|
| Rustic Paper Wallet | Gizmo |
| Mediocre Wooden Table | Gizmo |
| Sleek Paper Toucan | Gizmo |
| Synergistic Steel Chair | Gizmo |
| Lightweight Paper Bottle | Gadget |
| ... | ... |
Use OR
:
SELECT
title,
category
FROM
products
WHERE
-- OR will return rows that satisfy either criteria
title LIKE 'Lightweight%'
OR category = 'Gizmo';
With parentheses, you can get fancy with conditionals. Here’s how you’d search for lightweight products in either the Gizmo
or Widget
categories.
SELECT
title,
category
FROM
products
WHERE
title LIKE 'Lightweight%'
-- Scope the OR conditional with parentheses
AND (
category = 'Gizmo'
OR category = 'Widget'
)
;
The parentheses here are required, otherwise you’d return rows where the wool product is in the Gizmo category, plus all Widget products, regardless of whether sheep were involved.
SQL to filter for, or exclude, rows with missing values
Sometimes you want to find rows where a text column is empty (null). You can do that with IS NULL
:
SELECT
*
FROM
products
WHERE
title IS NULL;
That query gives you all the products where the title is missing. If you want the opposite—rows where the title is present—use IS NOT NULL
:
SELECT
*
FROM
products
WHERE
title IS NOT NULL;
SQL to filter text with regular expressions
For surgically precise searches, some databases let you use regular expressions to search your data.
Let’s say (for some reason) we want to search for all products where the last word of the product title has exactly five letters, like this:
| Title | Category |
|------------------------------|------------|
| Small Marble Shoes | Doohickey |
| Synergistic Granite Chair | Doohickey |
| Enormous Aluminum Shirt | Doohickey |
| Enormous Steel Watch | Doohickey |
| Mediocre Wooden Table | Gizmo |
| ... | ... |
We can use a regular expression to filter the rows:
SELECT
title,
category
FROM
products
WHERE
-- You may need to use a different regex function name;
-- check which function name your database supports
REGEXP_LIKE (title, '\b[a-zA-Z]{5}\W*$', 'i');
\b[a-zA-Z]{5}\W*$
is alien for “last word in the string that has exactly 5 characters. Good luck learning regular expressions; most people just look up how to do specific regexes as needed. generative models are getting pretty good at regexes. Just for context, here’s what those symbols mean:
\b
: Word boundary, so you match the whole word, not part of another word.[a-zA-Z]
: Matches any uppercase or lowercase letter (ASCII).{5}
: “exactly five” of those letters.\W*
: Matches zero or more non-word characters (anything not a letter, number, or underscore), in case there is any punctuation at the end that could screw up our count.$
: The position at end of the string.
The i
in the third position of the function call stands for insensitive
, as in case-insensitive. Databases differ on whether you can use regexes, and what functions are available, so you’ll need to look up how your DB does it. Here are some examples:
- PostgreSQL:
~
,~*
- MySQL:
REGEXP
,RLIKE
, orREGEXP_LIKE()
In general, only use regexes when you can’t filter rows with the above methods. Regexes are only legible to telepathic entities, and databases won’t be able to take advantage of indexes to speed up query times.
SQL to handle empty strings vs NULL values
Empty strings (''
) and NULL
values are different in SQL, but people often confuse them:
- An empty string is an actual value that contains zero characters
NULL
represents the absence of any value (so, not even a blank string)
To find rows where a text column exists but is empty (a blank string ''
):
SELECT
title
FROM
products
WHERE
title = '';
To find rows where a column is either NULL
or empty:
SELECT
title
FROM
products
WHERE
title IS NULL OR description = '';
This distinction matters when you’re trying to find truly “missing” data versus fields that were intentionally left blank. Some database systems treat empty strings as NULL
, but it’s better to be explicit about what you’re looking for.
SQL filter by a variable
In Metabase (this only works in Metabase), you can parameterize SQL queries so people can plug values into the WHERE
clause. Here we create a variable title
(wrapped in double braces) and use CONCAT
to bolt on the %
wildcard operators to the variable:
SELECT
title,
category,
vendor
FROM
products
WHERE
-- You need CONCAT to build the string with the interpolated value
-- This variable syntax only works in Metabase
title LIKE CONCAT ({{title}}, '%');
That way people enter Lightweight
in the filter widget of a dashboard and the query will return only lightweight products.
Check out SQL parameters