Menu Close Log in Get started

Lesson

Working with the Metabase API

An introduction to Metabase's API.

This article explains how to automate tasks using Metabase’s API. We use that API ourselves to connect the front end and the back end, so you can script almost everything that Metabase can do.

Warning: the Metabase API can change

There are two caveats before you start:

  1. The API is subject to change. The API is tightly coupled with the front end and is subject to change between releases. The endpoints likely won’t change that much (existing API endpoints are changed infrequently, and removed rarely), but if you write code to use the API, you might have to update it in the future.
  2. The API is not versioned. Meaning: it can change version to version, so don’t expect to stay on a particular version of Metabase in order to use a “stable” API.

Getting started with the Metabase API

To keep things simple, we’ll use the venerable command line utility curl for our API call examples; you could also consider a dedicated tool for developing API requests (like Postman). To follow along, you can spin up a fresh local instance of Metabase and play around.

Authenticate your requests with a session token

You’ll need a session token to authenticate your requests, or Metabase will refuse to talk with you. To get a session token, submit a request to the /api/session endpoint with your username and password:

curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"username": "person@metabase.com", "password": "fakepassword"}' \
  http://localhost:3000/api/session

If you’re working with a remote server, you’ll need replace localhost:3000 with your server address. This request will return a JSON object with a key called id and the token as the key’s value, e.g.:

 {"id":"38f4939c-ad7f-4cbe-ae54-30946daf8593"}

You’ll need to include that session token in the headers of your subsequent requests like this:

"X-Metabase-Session: 38f4939c-ad7f-4cbe-ae54-30946daf8593"

Some things to note about sessions:

  • By default, sessions are good for 14 days. You can configure this session duration by setting the environment variable MAX_SESSION_AGE (value is in minutes).
  • You should cache credentials to reuse them until they expire, because logins are rate-limited for security.
  • Invalid and expired session tokens return a 401 (Unauthorized) status code.
  • Handle 401 status codes gracefully. We recommend writing your code to fetch a new session token and automatically retry a request when the API returns a 401 status code.
  • Some endpoints require that the user be an admin, also known as a superuser. Endpoints that require admin or superuser status (admin = superuser) generally say so in their documentation. They will return a 403 (Forbidden) status code if the current user is not an admin.
  • If you would like an alternative authentication mechanism feel free to upvote this feature request.

Example GET request

Here’s an example API request (note the session token) that hits the /api/user/current endpoint, which returns information about the current user:

curl -X GET \
  -H "Content-Type: application/json" \
  -H "X-Metabase-Session: 38f4939c-ad7f-4cbe-ae54-30946daf8593" \
  http://localhost:3000/api/user/current

The above request returns a JSON object (formatted for readability):

 {
    "email": "person@metabase.com",
    "ldap_auth": false,
    "first_name": "Human",
    "locale": null,
    "last_login": "2020-08-31T13:08:50.203",
    "is_active": true,
    "is_qbnewb": false,
    "updated_at": "2020-08-31T13:08:50.203",
    "group_ids": [
        1,
        2
    ],
    "is_superuser": true,
    "login_attributes": null,
    "id": 1,
    "last_name": "Person",
    "date_joined": "2020-08-19T10:50:46.547",
    "personal_collection_id": 1,
    "common_name": "Human Person",
    "google_auth": false
}

Example POST request

You can also use a file to store the JSON payload for a POST request. This makes it easy to have a pre-defined set of requests you want to make to the API.

curl -H @header_file.txt -d @payload.json http://localhost/api/card

Here’s an example of a JSON file (the @payload.json in the command above) that creates a question:

{
  "visualization_settings": {
    "table.pivot_column": "QUANTITY",
    "table.cell_column": "SUBTOTAL"
  },
  "description value": "A card generated by the API",
  "collection_position": null,
  "result_metadata": null,
  "metadata_checksum": null,
  "collection_id": null,
  "name": "API-generated question",
  "dataset_query": {
    "database": 1,
    "query": {
      "source-table": 2
    },
    "type": "query"
  },
  "display": "table"
}

That request generated the question shown in figure 1.

<em>Fig. 1</em>. A question in Metabase generated by the API: a list of the <code>Orders</code> table in the Sample Dataset
Fig. 1. A question in Metabase generated by the API: a list of the Orders table in the Sample Dataset

Use developer tools to see how Metabase makes requests

If the auto-generated API docs are unclear, you can use the developer tools that ship with browsers like Firefox, Chrome, and Edge to view Metabase’s requests and responses (figure 2).

<em>Fig. 2</em>. Using Firefox's network tab to inspect the JSON request payload that Metabase sent when a user clicked to save a question created in the Notebook Editor.
Fig. 2. Using Firefox's network tab to inspect the JSON request payload that Metabase sent when a user clicked to save a question created in the Notebook Editor.

In the Metabase application, perform the action that you’d like to script, such as adding a user or creating a dashboard. Then use the developer tools in your browser to view the request Metabase made to the server when you performed that action.

A few things you can do with the Metabase API

Provision a Metabase instance

In addition to using environment variables, you can use the Metabase API to setup an instance of Metabase. Once you have installed Metabase using your preferred method, and the Metabase server is up and running, you can create the first user (as an Admin) by posting to a special endpoint, /api/setup. This /api/setup endpoint:

  • Creates the first user as an Admin (superuser).
  • Logs them in.
  • Returns a session ID.

You can then configure settings using the /api/settings endpoint, set up email using the /api/email endpoint, and use the /api/setup/admin_checklist endpoint to verify your setup progress. See figure 3 for the graphical representation of the checklist in the Admin panel.

<em>Fig. 3</em>. Admin checklist for setting up Metabase to make the most of your application.
Fig. 3. Admin checklist for setting up Metabase to make the most of your application.

Add a data source

You can add a new database using the POST /api/database/ endpoint, and validate that database’s connection details using the /api/setup/validate endpoint. Once you’ve connected the database to your Metabase instance, you can rescan the database and update the schema metadata. You can even add our trusty Sample Dataset as a new database to your instance with POST /api/database/sample_dataset.

Here’s an example database creation call for a Redshift database.

curl -s -X POST \
    -H "Content-type: application/json" \
    -H "X-Metabase-Session: ${MB_TOKEN}" \
    http://localhost:3000/api/database \
    -d '{
        "engine": "redshift",
        "name": "Redshift",
        "details": {
            "host": "redshift.aws.com",
            "port": "5432",
            "db": "dev",
            "user": "root",
            "password": "password"
        }
    }'

Set up users, groups, and permissions

You can use the /api/user endpoints to create, update, and disable users, or the /api/permissions endpoints to set up groups or add users to them. Here’s an example curl command to create a user:

curl -s "http://localhost:3000/api/user" \
    -H 'Content-Type: application/json' \
    -H "X-Metabase-Session: ${MB_TOKEN}" \
    -d '{
    "first_name":"Basic",
    "last_name":"User",
    "email":"basic@somewhere.com",
    "password":"Sup3rS3cure_:}"
}'

Generate reports

In Metabase, “reports” are referred to as dashboards. You can interact with dashboards using the /api/dashboard endpoint. You can create a new dashboard with POST /api/dashboard/, and add a saved question to a dashboard with [POST/api/dashboard/:id/cards].

Useful endpoints

The links in the Endpoint column below will take you to you to the first action available for that endpoint, which alphabetically is usually the DELETE action. You can scroll down in the API documentation to see the full list of actions and URLs for that endpoint, and view descriptions of each.

Domain Description Endpoint
Collections Collections are a great way to organize your dashboards, saved questions, and pulses. /api/collection
Dashboards Dashboards are reports that comprise a set of questions and text cards. /api/dashboard
Databases Fetch databases, fields, schemas, primary (entity) keys, lists of tables, and more. /api/database
Email Update emails settings and send test emails. /api/email
Embedding Use signed JWTs to fetch info on embedded cards and dashboards. /api/embed
Metrics Metrics are saved computations (like Revenue). Create and update metrics, return related entities, revert to prior versions, and more. /api/metric
Permissions Metabase manages permissions to databases and collections with groups. Create permission groups, add and remove users to groups, retrieve a graph of all permissions groups, and more. /api/permissions
Search Search cards (questions), dashboards, collections and pulses for a substring. /api/search
Segments Segments are named sets of filters (like “Active Users”). Create and update segments, revert to previous versions, and more. /api/segment
Sessions Reset passowrds with tokens, login with Google Auth, send passord reset emails, and more. /api/sessions
Settings Create/update global application settings. /api/setting
Queries Use the API to execute queries and return their results in a specified format. /api/dataset
Questions Questions (known as cards in the API) are queries and their visualized results. /api/card

There are some other cool endpoints to check out, like api/database/:virtual-db/metadata, which is used to “fool” the frontend so that it can treat Saved Questions as if they were tables in a virtual database. This is how Metabase lets you use Saved Questions as if they were data sources.

The documentation contains a complete list of API endpoints along with documentation for each endpoint, so dig around and see what other cool endpoints you can find.

The endpoint reference is periodically updated with new versions of Metabase. You can also generate the reference by running:

java -jar metabase.jar api-documentation

Running Custom Queries

Queries written with our Notebook Editor are saved in our custom JSON-based query language, MBQL. You can view the definitive grammar for MBQL, as well as an (incomplete) MBQL reference document to learn about some of the design philosophy behind MBQL.

To familiarize yourself with MBQL, we recommend using the Metabase application to build a custom question, then use your browser’s developer tools to see how Metabase formatted the request body with the query.

Examples in three languages

Curl is a handy tool for exploring APIs, but if you are integrating Metabase into a large data ecosystem, you will probably use something else. To show how you can access the API with Python, R, and Node.js, let’s create two questions. The first, shown in figure 4, finds the average pre-tax value of orders over $100 grouped by category. It is shared publicly—this tutorial explains how to do that. The second question, shown in figure 5, counts the number of people in the database. It is not shared: we have included it to show how to distinguish shared from unshared questions.

<em>Fig. 4</em>. The notebook of a public question calculating the average value of orders over $100 by product category.
Fig. 4. The notebook of a public question calculating the average value of orders over $100 by product category.
<em>Fig. 5</em>. The notebook of a non-public question calculating the number of people in the database.
Fig. 5. The notebook of a non-public question calculating the number of people in the database.

Python

Our first example is written in Python. Like most data science programs it uses the requests library to send HTTP requests and Pandas to manage tabular data, so we start by importing those:

import requests
import pandas as pd

The next step is to get a session token to authenticate all future requests. (The example uses my credentials—you can replace them with your username and password, but be sure that you don’t commit those values to a version control repository.) To get the token out of the result, we convert the latter to JSON and look up the ID. We store this in a dictionary with the right key for future use:

response = requests.post('http://localhost:3000/api/session',
                         json={'username': 'greg@metabase.com',
                               'password': 'database1'})
session_id = response.json()['id']
headers = {'X-Metabase-Session': session_id}

We can now ask Metabase which questions have public IDs, i.e., which ones have been shared so that we can invoke them remotely. As the code below suggests, when we ask for all cards we get a list with some information about all of the questions; only the ones with a public_uuid field are callable:

response = requests.get('http://localhost:3000/api/card',
                         headers=headers).json()
questions = [q for q in response if q['public_uuid']]
print(f'{len(questions)} public of {len(response)} questions')

Sure enough, the output tells us that there are two questions, but only one is public:

1 public of 2 questions

Let’s get some information about that public question and print its title:

uuid = questions[0]['public_uuid']
response = requests.get(f'http://localhost:3000/api/public/card/{uuid}',
                        headers=headers)
print(f'First title: {response.json()["name"]}')
First title: Average value of orders over $100 grouped by category

Finally, we can pull down data from the first question in the list. The 'data' key in the JSON response has a lot of information; what we’re most interested in are the values under the sub-key 'rows', which stores the result table in the usual list-of-lists form. Let’s convert that to a Pandas dataframe and print it:

response = requests.get(f'http://localhost:3000/api/public/card/{uuid}/query',
                        headers=headers)
rows = response.json()['data']['rows']
data = pd.DataFrame(rows, columns=['Category', 'Average'])
print('First data')
print(data)
First data
    Category     Average
0  Doohickey  114.679742
1     Gadget  123.530916
2      Gizmo  120.897286
3     Widget  122.078721

R with the Tidyverse

The R version of our example has the same structure as the Python version. Like most data scientists we use the tidyverse family of libraries, so let’s load those along with httr for managing HTTP requests, jsonlite for parsing JSON, and glue for string formatting:

library(tidyverse)
library(httr)
library(jsonlite)
library(glue)

Once again we get a session ID and save it for future use:

data <- POST(
    'http://localhost:3000/api/session',
    body = list(username = 'greg@metabase.com', password = 'database1'),
    encode = 'json'
  ) %>%
  content(as = 'text', encoding = 'UTF-8') %>%
  fromJSON()
session_id <- data$id
headers <- add_headers('X-Metabase-Session' = session_id)

We then get information about all of the questions and ask which ones are public:

data <- GET('http://localhost:3000/api/card', headers) %>%
  content(as = 'text', encoding = 'UTF-8') %>%
  fromJSON()
num_questions <- data %>%
  nrow()
num_public <- data %>%
  pull(public_uuid) %>%
  discard(is.na) %>%
  length()
glue('{num_public} public of {num_questions} questions')
1 public of 2 questions

Displaying the title of the first public card gives the same result as it did with Python, which is reassuring:

uuid <- data %>%
  pull(public_uuid) %>%
  discard(is.na) %>%
  first()
data <- glue('http://localhost:3000/api/public/card/{uuid}') %>%
  GET(headers) %>%
  content(as = 'text', encoding = 'UTF-8') %>%
  fromJSON()
glue('First title: {data$name}')
First title: Average value of orders over $100 grouped by category

And the data associated with that card is the same as well once we convert it to a tibble, though R’s default display doesn’t give us as many decimal places:

data <- glue('http://localhost:3000/api/public/card/{uuid}/query') %>%
  GET(headers) %>%
  content(as = 'text', encoding = 'UTF-8') %>%
  fromJSON()
rows <- data$data$rows
colnames(rows) <- c('Category', 'Average')
rows <- rows %>% as_tibble()
rows$Average <- as.numeric(rows$Average)
glue('First data')
rows
First data
# A tibble: 4 x 2
  Category  Average
  <chr>       <dbl>
1 Doohickey    115.
2 Gadget       124.
3 Gizmo        121.
4 Widget       122.

Node.js

JavaScript is an increasingly popular language for server-side scripting, but unlike Python and R, it doesn’t have a single predominant library for data tables. For large projects we are fond of data-forge, but for small examples we stick to Dataframe-js. We also use got for HTTP requests instead of the older request package, as the latter has now been deprecated. Finally, since we find async/await syntax a lot easier to read than promises or callbacks, we put all of our code in an async function that we then call immediately:

const got = require("got");
const DataFrame = require("dataframe-js").DataFrame;

const main = async () => {
  // ...program goes here...
};

main();

Once again we start by authenticating ourselves:

// Get a session token to authenticate all future requests.
let response = await got.post("http://localhost:3000/api/session", {
  json: { username: "greg@metabase.com", password: "database1" },
  responseType: "json",
});
session_id = response.body.id;
headers = { "X-Metabase-Session": session_id };

We then ask for the complete list of questions and filter them to select the public ones:

response = await got.get("http://localhost:3000/api/card", {
  responseType: "json",
  headers: headers,
});
// filter for public questions
questions = response.body.filter((q) => q.public_uuid);
console.log(`${questions.length} public of ${response.body.length} questions`);
1 public of 2 questions

The first public card still has the title we’ve seen before:

const uuid = questions[0].public_uuid;
response = await got.get(`http://localhost:3000/api/public/card/${uuid}`, {
  responseType: "json",
  headers: headers,
});
console.log(`First title: ${response.body.name}`);
First title: Average value of orders over $100 grouped by category

When we pull down its data we get the same values, though the numbers are shown in yet another slightly different way:

response = await got.get(
  `http://localhost:3000/api/public/card/${uuid}/query`,
  {
    responseType: "json",
    headers: headers,
  }
);
const rows = response.body.data.rows;
const df = new DataFrame(rows, ["Category", "Average"]);
df.show();
| Category  | Average   |
------------------------
| Doohickey | 114.67... |
| Gadget    | 123.53... |
| Gizmo     | 120.89... |
| Widget    | 122.07... |

Have fun

If you have found this tutorial interesting, you can spin up a local instance of Metabase, experiment with the API, and have fun! If you get stuck, check out our forum to see if anyone’s run into a similar issue, or post a new question.

Thanks for your feedback!

Get articles like this one in your inbox every month