Metabase needs to know what’s in your database in order to show tables and fields, populate dropdown menus, and suggest good visualizations, but loading all the data would be very slow (or simply impossible if you have a lot of data). It therefore does three things:
Metabase periodically asks the database what tables are available, then asks which columns are available for each table. We call this syncing, and it happens hourly or daily depending on how you’ve configured it. It’s very fast with most relational databases, but can be slower with MongoDB and some community-built database drivers.
Metabase fingerprints each column the first time it synchronizes. Fingerprinting fetches the first 10,000 rows from each column and uses that data to estimate how many unique values each column has, what the minimum and maximum values are for numeric and timestamp columns, and so on. Metabase only fingerprints each column once, unless the administrator explicitly tells it to fingerprint the column again, or in the rare event that a new release of Metabase changes the fingerprinting logic.
A scan is similar to fingerprinting, but runs every 24 hours (unless it’s configured to run less often or is disabled). When a field is set to “A list of all values” in the Data Model, scanning fetches the first 5,000 distinct values in ascending order, and Metabase uses them to display options in dropdowns. If the textual result of scanning a column is more than 10 kilobytes long, Metabase displays a search box instead of a dropdown.
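To make the difference between fingerprinting and scanning concrete, here is a minimal Python sketch of the two steps as described above. It is an illustration only, not Metabase's actual implementation: the function names `fingerprint` and `scan` are hypothetical, the column is modelled as a plain in-memory sequence, and "10 kilobytes" is assumed to mean 10 × 1024 bytes of UTF-8 text.

```python
from itertools import islice
from typing import Any, Iterable

FINGERPRINT_ROW_LIMIT = 10_000    # rows sampled when fingerprinting
SCAN_DISTINCT_LIMIT = 5_000       # distinct values kept by a scan
DROPDOWN_BYTE_LIMIT = 10 * 1024   # assumption: "10 kilobytes" read as 10 KiB


def fingerprint(column: Iterable[Any]) -> dict:
    """Estimate basic statistics from the first 10,000 rows of a column."""
    sample = [v for v in islice(column, FINGERPRINT_ROW_LIMIT) if v is not None]
    return {
        "distinct_count": len(set(sample)),
        "min": min(sample, default=None),
        "max": max(sample, default=None),
    }


def scan(column: Iterable[Any]) -> tuple:
    """Collect up to 5,000 distinct values in ascending order and pick a widget."""
    distinct = sorted({v for v in column if v is not None})[:SCAN_DISTINCT_LIMIT]
    # Total size of the values when rendered as text.
    total_bytes = sum(len(str(v).encode("utf-8")) for v in distinct)
    widget = "search box" if total_bytes > DROPDOWN_BYTE_LIMIT else "dropdown"
    return distinct, widget
```

In this sketch, a column with a handful of short values yields a dropdown, while a column with thousands of distinct strings exceeds the size limit and falls back to a search box.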
If the credentials Metabase is using to connect to the database don’t give it privileges to read the tables, the first sign will often be a failure to sync, which in turn prevents fingerprinting and scanning.
How to detect this: You can’t see any of the tables in the database, or columns that have just been added to your data source don’t show up in Metabase.
How to fix this: This guide explains how to troubleshoot database connections. The relevant steps for solving this problem are:
Note that we only get the first 10,000 documents when scanning a MongoDB collection, so if you’re not seeing some new fields, those fields might not exist in the documents we looked at. Please see this discussion for more details.
How to detect this:
How to fix this:
Metabase syncs and scans regularly, but if the database administrator has just changed the database schema, or if a lot of data is added automatically at specific times, you may want to write a script that uses the Metabase API to force sync or scan to take place right away. Our API provides two ways to do this:
Using an endpoint with a session token: /api/database/:id/sync_schema or /api/database/:id/rescan_values. These do the same things as going to the database in the Admin Panel and choosing Sync database schema now or Re-scan field values now respectively. In this case you have to authenticate as a Metabase user and pass the resulting session token in the header of your request.
Using an endpoint with an API key: /api/notify/db/:id. This endpoint was made to notify Metabase to sync after an ETL operation finishes. In this case you must define the API key in the MB_API_KEY environment variable on the Metabase server and pass that key with your request.
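As a rough sketch of what such a script could look like, the Python below builds the requests for both approaches using only the standard library. Several details are assumptions not stated above: that the endpoints accept POST, that the session token travels in an `X-Metabase-Session` header and the API key in an `X-METABASE-APIKEY` header, and the helper names `session_request`, `notify_request`, and `send`, which are hypothetical. Check the API reference for your Metabase version before relying on it.

```python
import urllib.request

VALID_ACTIONS = ("sync_schema", "rescan_values")


def session_request(base_url: str, db_id: int, session_token: str,
                    action: str = "sync_schema") -> urllib.request.Request:
    """Build a request for /api/database/:id/sync_schema or .../rescan_values."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}")
    url = f"{base_url.rstrip('/')}/api/database/{db_id}/{action}"
    # Assumption: the session token is sent in the X-Metabase-Session header.
    return urllib.request.Request(
        url, method="POST", headers={"X-Metabase-Session": session_token})


def notify_request(base_url: str, db_id: int,
                   api_key: str) -> urllib.request.Request:
    """Build a request for /api/notify/db/:id, for use after an ETL job."""
    url = f"{base_url.rstrip('/')}/api/notify/db/{db_id}"
    # Assumption: the key must match MB_API_KEY on the Metabase server and
    # is sent in the X-METABASE-APIKEY header.
    return urllib.request.Request(
        url, method="POST", headers={"X-METABASE-APIKEY": api_key})


def send(request: urllib.request.Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, which is
    usually the first hint that the token, key, or database ID is wrong.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A script would then call something like `send(notify_request("http://localhost:3000", 2, api_key))` at the end of an ETL job, where the host, the database ID `2`, and `api_key` are placeholders for your own values.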
How to detect this: Your script fails to run.
How to detect this: Sync and scan take a long time to complete.
You can “fix” this by disabling scan entirely: go to the database in the Admin Panel, tell Metabase, “This is a large database,” and then go to the Scheduling tab. However, sync is necessary: without it, Metabase won’t know what tables exist or what columns they contain.