This article gives an overview of how to set up and scale self-service analytics. We’ll stay at a high-level and link out to more detailed guides on individual features, and our focus here will be on the administration of Metabase, particularly of people in your organization. For operational scaling — the actual running of the Metabase application — check out Metabase at scale.
Managing people in Metabase
The goal with democratizing data in your organization is to give people the information they need to make better decisions. And the best way to do that is to give everyone access to as much data as you can, while ensuring that they can’t see any data they’re not supposed to see.
To that end, we recommend you organize your Metabase with the goal of simplifying permissions. You probably collect sensitive data for various reasons (payments, taxes, liability, and so on), but that data isn’t relevant for business intelligence (and it’s probably mixed in with data that is relevant for analysis). The idea will be to put into place hard boundaries in key places so you don’t have to worry about who can see what, then set up lightweight systems — or even just conventions — that keep the place tidy as the number of people grows.
Here’s our recommendation for how to manage people in your Metabase to set up self-service analytics.
- Create groups in Metabase
- Assign permissions to those groups, including sandboxes
- Set up SSO in your Metabase
- Synchronize SSO and Metabase groups
- Tell people they can create a MB account just by logging in
- Use the Audit section to monitor how people are using Metabase
1. Create groups in Metabase
Metabase manages permissions using groups, which is far more efficient than managing permissions for each person. We recommend that you set up your Metabase so that the groups in your identity provider map to the same groups in Metabase. In general, your groups should map to departments in your org — and possibly also to project or mission teams that cut across departments. That way, once the new person in Accounting starts, they’ll log into Metabase and already have access to the same databases and collections that other people in Accounting have.
2. Assign permissions to those groups, including sandboxes
There are fundamentally two types of permissions in Metabase: data permissions, which determine access to databases, and collection permissions, which determine access to items in Metabase like questions and dashboards.
These permissions work at the table-level or Collection-level, but what if you need to restrict access to rows or columns?
Hiding irrelevant or technical data for all users
If there are fields (or entire tables) that aren’t very helpful or relevant, administrators can hide those in the Data Model section. Note that SQL queries are not affected by this setting — users with SQL editor access for a database can always access all tables and fields in that database.
Like it says in the description in figure 1, setting a field’s visibility to Do not include will exclude the field from menus and tables in GUI questions. The disadvantage of changing the field visibility at the data model level is that the action is global, so it’s not very flexible. If, however, you want to selectively grant access to rows or columns to different groups of people, you’ll need to sandbox the data.
Data sandboxing is an Enterprise feature that, when combined with SSO (either SAML or JWT), lets you restrict access to rows or columns based on user attributes. You can add these attributes manually in Metabase, or via your authentication service. You can set up row-level access keyed to a user attribute, or you can restrict access to columns by creating custom views of tables that exclude certain fields. Note that sandboxing only applies to GUI questions, which brings us to our next section:
SQL editor access and sandboxes are mutually exclusive
An important thing you need to know about permissions is that groups with SQL access to a database can access all data in the database. Even if a field is not visible in GUI menus or in the data browser, people with SQL access will still be able to query the tables (and all rows and columns) in the database. More specifically, they’ll be able to query any data available to the user account in the database that you used to connect Metabase to that database. Which brings us to a workaround:
Metabase doesn’t have table-level SQL permissions; you either grant a group SQL editor access to a database, or you don’t. But there’s a workaround: since you can set SQL editor permissions at the database level, you can create two (or more) connections to the same database, each with different connection strings for different user accounts in that database. For example, you could set up two connections, each with access to a different set of tables:
- Connection 1: access to the whole shebang
- Connection 2: access limited to tables A, B, and C
You can then grant most of your groups access to connection 2 (the less permissive one), and grant select users (like dedicated data analysts) access to connection 1 (the whole database). Metabase will treat these connections as if they were two separate databases, even though they are just two different access levels to the same database. From the perspective of each person, however, they’ll only see one database (the one their group has access to).
3. Set up SSO in your Metabase
While we know Metabase will always have a special place in your heart, it’s not the only piece of software you’re using of course. If your org is starting to grow, chances are you’re working with a single sign-on (SSO) identity provider like Okta, Auth0, or OneLogin that lets people authenticate once and get access to all the apps your org uses. Metabase integrates with services that use the SAML and JWT standards, which will give you fine-grained control over access to data.
There are currently four basic options for authentication in Metabase:
Open source edition
SAML (Security Assertion Markup Language) is an open protocol for exchanging data between identity and service providers using XML. JWT token is similar, though less formal — it’s a token, not a protocol. Both standards are used by identity providers like Okta and Auth0 to create authentication services (essentially a global password manager for the people in your organization). With Okta, for example, can sign in to your identity provider once, then they’ll be able to use all the services they have access to without having to constantly re-enter their login and password — or different logins and passwords. The identity provider (in this case Okta) will handle the handshakes with each service provider. To learn more, check out Auth0’s overview of SAML.
The big advantage with setting up SSO with SAML or JWT is that you can pass user attributes to Metabase, which allows you to sandbox data based on who the user is, limiting access to the rows and/or columns of a table.
4. Synchronize SSO and Metabase groups
Now that you’ve created groups, set permissions, and connected your SSO provider to Metabase, it’s time to synchronize groups. Check out our documentation to learn how to synchronize group membership with your identity provider. Note that you can also synchronize groups using LDAP (an option available for the free, open source edition of Metabase).
5. Tell people they can create a Metabase account just by logging in
At this point, you should have everything set up. People should be able to log in to Metabase and see everything they need to see, and no more.
6. Use the Audit section to monitor how people are using Metabase
Lastly, if you’re using the Enterprise version, you can explore the Audit logs to verify what people are looking at and confirm that your permissions work as you expect. You can see which dashboards and questions people view, the contents of SQL queries they run, and what data they download. It’s also a great way to see how your rollout is going, like how many new users are logging in over time, and how much stuff everyone’s looking at.
The Audit section is also useful for checking the performance of your workhorse dashboards and questions. One of the issues with democratizing data is that people of all skill ranges will be asking questions, and that can sometimes lead to some less efficient queries. You can use the Audit logs to find commonly-viewed items that are running slowly, then check out our posts for tips on making dashboards faster and best practices for writing SQL queries.
To learn more about how the auditing features work, check out our article on auditing users and data.