
Custom expressions in the notebook editor

Jul 16, 2020 by The Metabase Team

In mathematics, expressions are collections of symbols that together express a value. If you’ve ever worked with spreadsheet software before, expressions are formulas, like =SUM(A1, B1).

Custom expressions in Metabase’s notebook editor are powerful tools that can cover the vast majority of analytics use cases without the need to take SQL out of the toolbox. In fact, there are big advantages to using Metabase’s notebook editor that you don’t get when you use SQL:

  • Drill-through. Drill-through allows people to filter, break out, zoom in, or x-ray the data on click. Metabase Enterprise adds even more functionality by allowing you to customize the drill-through experience so that you can determine what happens when people click through your charts. Drill-through is not available for questions composed in the SQL editor.

  • Extensibility. Using the notebook editor to build queries allows those with whom you share your questions to learn from and build on your questions without needing to know any SQL.

And you can always switch to SQL at any point during your development by converting an existing notebook editor question to a native SQL question.

Using custom expressions

There are three places in the notebook editor where we can use custom expressions:

  • Custom columns use functions to compute values (like + to add numbers) or manipulate text (like lower to change text to lowercase).
  • Custom filters use functions like contains that evaluate to either true or false.
  • Custom summaries use functions like count or sum to aggregate records.

Custom columns

We can add a custom column to our data using an expression to compute a new column.

Let’s see an expression in action. Here is the Orders Table from the Sample Data included with Metabase.

Sample dataset orders

Let’s say we want to know the discount percentage applied to orders based on the pre-tax subtotal.

For example, if we gave a $1 discount on a $10 order, we’d want to see a column show that we had discounted that order by 10%.

Unfortunately, a quick scan of the columns in the preview tells us that the database does not store that computation (i.e. there is no “discount percentage” column). We only have the subtotal of the order, and the discount total.

Thanks to math, however, we can use the discount total and order subtotal to compute the percentage. Here’s where expressions come into play: we can use an expression to compute the discount percentage for each row, and store that computed value in a new column.

Let’s walk through how to create a custom column.

When in the notebook editor, we select Add custom column in the Data section.

Add custom column

To calculate the discount percentage, we’ll need the original total (i.e. the subtotal + the discount), and then we’ll need to divide the discount by the original total to get the discount percentage.

In expressions, we reference columns using brackets. For example, we can refer to the Discount column in the Orders table as [Discount]. If we need to reference a column from another table linked by a foreign key, we can use a . between the table and the column, as in [Table.Column]. Alternatively, you can select [Table → Column] from the dropdown menu that appears when you type an open bracket ([). For example, we could enter [Product.Category], which will resolve to [Product → Category].

For now, we’re just interested in columns in the Orders table, so there’s no need to reference another table. Here’s the expression (or formula) we’ll use to compute our custom discount percentage column:

= [Discount] / ([Discount] + [Subtotal])

Enter that expression in the field formula, then give the new column a name: “Discount Percentage”.

Field formula
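To make the arithmetic concrete, here is a minimal Python sketch of what the expression computes for a single row. The function name and the sample values are hypothetical, and the handling of a missing discount is an assumption modeled on how NULL propagates through arithmetic in SQL; this is not Metabase's implementation.

```python
def discount_percentage(discount, subtotal):
    """Mirror of [Discount] / ([Discount] + [Subtotal]) for one row."""
    if discount is None:
        # Assumption: a missing discount yields a missing result,
        # the way NULL arithmetic behaves in SQL.
        return None
    original_total = discount + subtotal
    return discount / original_total


# A $1 discount on an order whose original total was $10,
# which means the stored subtotal is $9.
print(discount_percentage(1.0, 9.0))  # 0.1
```

Note that the denominator is the original total (discount plus subtotal), not the subtotal alone, which is why a $1 discount on a $10 order comes out to 10%.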

Click Done, then click Visualize to see your new column.

Since the value in our new Discount Percentage column concerns discounts, let’s move the column next to the Discount column. You can move columns around on tables by clicking on the column header and dragging the column to your target location, like so:

Changing column order

Since we’re computing a percentage, let’s fix the formatting so it’s easier to read. Click on the Discount Percentage column, and select Formatting.

Column formatting

Metabase will slide out a sidebar with formatting options.

Let’s change the Style to Percent, and bump up the number of decimal places to 2. And since the title Discount Percentage takes up a lot of space, let’s rename the column to Discount %.

There’s an option to add a mini bar chart as well. This bar chart won’t show the percentage with respect to 100%; instead the mini bar chart will show us the discount percentage relative to the discount percentage given to other orders. Let’s leave the mini bar chart off for now.

Here’s the finished question with the added Discount % column:

Discount percentage column

Custom filters

Metabase comes with a lot of filtering options out of the box, but you can design more sophisticated filters using custom filter expressions. These are particularly useful for creating filters that use OR statements, and that’s what we’ll be covering here.

Normally in the notebook editor, when we add multiple filters to our question, Metabase implicitly combines the filters with an AND operator. For example, if we add a filter for products that are Enormous and a filter for products that are Aerodynamic, our question will only return products that are both Enormous AND Aerodynamic. No such products exist in Metabase’s Sample Dataset, so that question would return no results.

To filter for products that are either Enormous OR Aerodynamic, we’ll select Custom Expression from the Filter dropdown, and use the contains function to check if the product has either Enormous or Aerodynamic somewhere in the title.

contains(string1, string2)

contains checks to see if string1 contains string2 within it. So string1 is the string to check (the haystack), and string2 is the text to look for (the needle). And since we want to look for either Enormous or Aerodynamic products, we can write two contains expressions with an OR operator in between:

= contains([Title], "Enormous") OR contains([Title], "Aerodynamic")

Filter expression

The resulting data set will contain products that are either Enormous or Aerodynamic:

Enormous and aerodynamic products
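The difference between the implicit AND and the explicit OR can be sketched in a few lines of Python. The product titles are hypothetical, and the contains helper is a stand-in for the Metabase function; treating the match as case-sensitive is an assumption of this sketch.

```python
# Hypothetical product titles, not actual Sample Dataset rows.
titles = [
    "Enormous Wool Car",
    "Aerodynamic Paper Computer",
    "Small Marble Shoes",
]


def contains(haystack, needle):
    """Stand-in for contains(string1, string2): is needle inside haystack?"""
    return needle in haystack  # assumption: case-sensitive match


# Two separate filters behave like AND: a title must match both.
both = [t for t in titles if contains(t, "Enormous") and contains(t, "Aerodynamic")]

# The custom filter expression uses OR: a title may match either.
either = [t for t in titles if contains(t, "Enormous") or contains(t, "Aerodynamic")]

print(both)    # []
print(either)  # ['Enormous Wool Car', 'Aerodynamic Paper Computer']
```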

Note that custom filter expressions must always resolve to either true or false. In the case of the contains function used above, the expression evaluates as true if the title has either Enormous or Aerodynamic in it, otherwise the expression evaluates as false.

You can, however, nest expressions that do not resolve to true or false within statements, like:

= contains(concat([First Name], [Last Name]), "Wizard")

because the outermost function (contains) resolves to either true or false. Whereas you couldn’t use concat([First Name], [Last Name]) as a filter, as it would resolve to a string of text (though you could use concat to create a custom column like Full Name).
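A short Python sketch shows why the nesting matters: the inner function produces text, and only the outer function produces the true or false value a filter needs. The rows and helper functions below are hypothetical stand-ins, not Metabase code.

```python
def concat(*parts):
    """Stand-in for concat: joins its arguments into one string."""
    return "".join(parts)


def contains(haystack, needle):
    """Stand-in for contains: True if needle appears in haystack."""
    return needle in haystack


# Hypothetical (first name, last name) rows.
people = [("Ada", "Wizard"), ("Grace", "Baker")]

# concat alone yields strings: fine for a custom column, not for a filter.
full_names = [concat(first, last) for first, last in people]

# Wrapping it in contains yields booleans: valid as a filter.
matches = [contains(name, "Wizard") for name in full_names]

print(full_names)  # ['AdaWizard', 'GraceBaker']
print(matches)     # [True, False]
```

As the output shows, concat as written adds no space between the names, so a Full Name column would likely need a separator such as " " passed as an extra argument.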

Custom summaries

Custom expressions unlock many different ways to aggregate our data. Let’s consider the Share function, which returns the percent of rows in the data that match the condition, as a decimal. For example, say we want to know the total percentage of paper products in our product line, i.e. what share of our product line is composed of paper products?

To start, we’ll select the Products table from the Sample Dataset. Next, we’ll click on Summarize in the notebook editor, and select Custom Expressions. Then, we’ll select Share from the dropdown menu, which will prompt us for a condition. In this case, we want to know which products have “Paper” in their title, so we’ll use the contains function to search through Title.

= Share(contains([Title], "Paper"))

Share of paper products

Then we name our expression (e.g., Percentage of Paper Products) and click Done. Click Visualize, and Metabase will compute the share of paper products.

To change the formatting, select Settings in the bottom-left to bring up the settings sidebar, and change Number options → Style to Percent.

Percentage of paper products
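Conceptually, Share is the number of rows that satisfy the condition divided by the total number of rows. The Python sketch below illustrates that idea with hypothetical titles; the share helper is an illustration, not how Metabase computes the value internally.

```python
# Hypothetical product titles, not actual Sample Dataset rows.
titles = [
    "Rustic Paper Wallet",
    "Small Marble Shoes",
    "Sleek Paper Toucan",
    "Heavy Steel Hat",
]


def share(conditions):
    """Fraction of rows for which the condition is True, as a decimal."""
    conditions = list(conditions)
    # True counts as 1 and False as 0, so the sum is the matching row count.
    return sum(conditions) / len(conditions)


paper_share = share("Paper" in title for title in titles)

print(paper_share)           # 0.5
print(f"{paper_share:.0%}")  # 50%
```

The second print plays the role of the Percent style: the underlying value stays a decimal, and only its display changes.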

Putting it all together

Let’s create a fairly complex (contrived) question using expressions. Say we’ve been tasked to find the average net inflow for wool and cotton products by month in 2019, with net inflow being the selling price minus the cost we paid for the product. In other words: for each wool and cotton product unit sold, how much money on average did we make (or lose) per unit each month in 2019?

To get these fascinating numbers, we’ll need to use expressions to:

  • Compute the selling price per unit (custom column)
  • Filter results to only include wool or cotton products (custom filter), and limit those results to 2019.
  • Compute the average net inflow (custom summary), and group by month.

Let’s go:

  1. We create a custom column, named Unit price. To compute the Unit price, we’ll use an expression to divide the subtotal by the number of units sold (Quantity):

    = [Subtotal] / [Quantity]
    
  2. Next, we’ll use a custom filter expression to filter for orders of Wool and Cotton products (i.e. for products that contain “Wool” or “Cotton” somewhere in their Product.Title).

  3. We’ll also filter for orders between 01/01/2019 and 12/31/2019.

  4. We’ll use a custom expression to create a custom summary. Let’s assume keystone pricing, the common retail practice of setting the price at twice the cost (a 100% markup on cost, which is a 50% margin). So if the Product.Price is $2, we’ll assume the product cost us $1 to acquire per unit. Given this assumption, we can simply define net inflow per unit sold to be the Unit price minus half of the Product.Price. Then we’ll summarize that data by taking the average of those numbers across orders.

    = Average([Unit price] - [Product → Price] / 2)
    
  5. Lastly, we’ll group those orders by Orders.Created_At by month.
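The steps above can be sketched end to end in Python to check the logic on a handful of rows. Everything here is hypothetical: the rows, the field names, and the function name are made up for illustration, and the rows are assumed to be already joined to their product. For step 2, the filter expression would presumably follow the earlier pattern, something like contains([Product → Title], "Wool") OR contains([Product → Title], "Cotton").

```python
from collections import defaultdict
from datetime import date

# Hypothetical order rows, already joined to their product.
orders = [
    {"created_at": date(2019, 1, 5), "subtotal": 60.0, "quantity": 2,
     "title": "Rustic Wool Hat", "price": 40.0},
    {"created_at": date(2019, 1, 20), "subtotal": 50.0, "quantity": 1,
     "title": "Sleek Cotton Shirt", "price": 60.0},
    {"created_at": date(2019, 2, 3), "subtotal": 45.0, "quantity": 3,
     "title": "Heavy Wool Coat", "price": 20.0},
    {"created_at": date(2019, 2, 9), "subtotal": 80.0, "quantity": 2,
     "title": "Small Paper Lamp", "price": 30.0},
    {"created_at": date(2020, 1, 2), "subtotal": 30.0, "quantity": 1,
     "title": "Light Wool Scarf", "price": 20.0},
]


def average_net_inflow_by_month(rows, year=2019):
    """Average per-unit net inflow for wool or cotton orders, keyed by month."""
    by_month = defaultdict(list)
    for row in rows:
        # Step 1: custom column, Unit price = Subtotal / Quantity.
        unit_price = row["subtotal"] / row["quantity"]
        # Step 2: custom filter, title contains "Wool" OR "Cotton".
        if not ("Wool" in row["title"] or "Cotton" in row["title"]):
            continue
        # Step 3: keep only orders created within the given year.
        if not (date(year, 1, 1) <= row["created_at"] <= date(year, 12, 31)):
            continue
        # Step 4: net inflow per unit, assuming cost is half the product price.
        net_inflow = unit_price - row["price"] / 2
        # Step 5: group by month of creation.
        by_month[row["created_at"].strftime("%Y-%m")].append(net_inflow)
    return {month: sum(v) / len(v) for month, v in sorted(by_month.items())}


result = average_net_inflow_by_month(orders)
print(result)  # {'2019-01': 15.0, '2019-02': 5.0}
```

In January the two qualifying orders net $10 and $20 per unit, averaging $15; in February only the wool coat qualifies, at $5. The paper lamp and the 2020 order are filtered out.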

Here’s our notebook:

Wool and cotton notebook

And lo, our chart:

Wool and cotton chart

Plus, since we composed this question using the notebook editor, we can drill through our data:

Drill through wool

Further reading