From VLOOKUP/XLOOKUP to Joins

An introduction to database joins based on Excel's VLOOKUP and XLOOKUP functions.

In spreadsheets software like Excel and Google Sheets, the XLOOKUP function (and its predecessor, VLOOKUP) connects data that has been broken up into different sheets or parts of a spreadsheet. If you’ve used XLOOKUP before, you might not have realized that you basically performed the same operation as a database join that combines two database tables. Here is a look at how XLOOKUP works, and how it’s related to simple database joins.

A look at XLOOKUP

These two lookup functions serve the same purpose: to find a record in a spreadsheet based on a lookup value. They differ in some relatively minor ways that we’ll ignore for the purposes of this discussion. XLOOKUP is the more flexible and powerful version of VLOOKUP, which is why we’ll only refer to XLOOKUP for the rest of this article.

Product and Order sheets

For example, in the image above, the left sheet contains a list of orders, while the spreadsheet on the right contains information about products. Each order only has one kind of product, but the quantity ordered can vary. We want to compute the total for each product. The result will be another spreadsheet, which looks like this.

The resulting sheet of order totals

The Orders and Products sheets have some data in common, a list of Product IDs that allows us to link products to orders.

Product and Order spreadsheets with ProductID columns highlighted

In the Products sheet, the values in the ProductID column identify rows in the Products sheet itself. In the Orders sheet, the values in the ProductID column refer to rows in a different sheet, in this case the Products sheet.

To compute the total for each order, we have to look up the product in the Products table, get its price from the Price column, and then multiply it by the entry in the order’s Quantity column.

Looking up using XLOOKUP

The XLOOKUP function takes several arguments:

  • lookup_value: the value to look up (which column)
  • lookup_array: where to look it up (which table)
  • return_array: what to return
  • (Plus a few optional arguments, which we’ll ignore here)

In our case, the value we want to look up is a value in the ProductID column of our Orders table. Where to look it up is the first column of the Products sheet which is also called ProductID.

XLOOKUP between the Orders and Products tables

XLOOKUP scans the column until it finds a matching ProductID, then returns a value from the “what to return” argument from the matching row. We’ll specify the Price column as “what to return”, so XLOOKUP will return the value in the Price column of the row with the matching ProductID. XLOOKUP then inserts this price into our Totals sheet.

Inserting the found value into our Order Totals table

At this point, we can get our order total with standard Excel or Google Sheets functions. We simply multiply the order’s Quantity by its Price.

To perform this operation on the entire table, we can simply copy the formula across all rows, again just like you would in any spreadsheet. One thing to be careful with is to make sure the lookup and return ranges don’t slip as you copy the formula down the column. For details on how to use XLOOKUP in Excel, see the XLOOKUP documentation.

Using a JOIN instead

In a database, the same operation we just performed is done with a join. A join works basically the same way as XLOOKUP, but operates on the entire table at once. Under the hood, a join “looks up” each row in the orders table, just like XLOOKUP.

If we have a database with the same tables as in our spreadsheet example above, we can create the same totals table. We’ll first use Metabase’s query builder, and then look at how to do it using a SQL query.

Joining tables using the query builder

Joining Orders and Product tables using the query builder

Here, we’re picking the two tables to be joined on the left, Orders and Products. Then we define how to join them on the right, which is done by comparing the respective ProductID columns in each.

We can then create a custom column that multiples the values in the Quantity column of the Orders table by the Price in Products.

We end up with the same result as using XLOOKUP above, except now coming from our database.

The result of using a join on a database

Specifying the join in SQL

SQL, the Structured Query Language, is the native way of talking to a database. For many questions, a visual editor like the one above is the way to go, but for some advanced queries, SQL is necessary. The example here just serves to show how a simple SQL query works.

The query consists of three parts:

  • select the relevant fields from the Orders table (OrderID, Quantity, etc.)
  • multiply the Price value by the Quantity (in the line ending in AS Total)
  • create the join using the LEFT JOIN statement and specify which fields have to match
SELECT
  Orders.OrderID,
  Orders.ProductID,
  Orders.Quantity,
  Products.Price * Orders.Quantity AS Total,
  Products.Name

FROM
  Orders
 
LEFT JOIN Products ON Orders.ProductID = Products.ProductID

The details of the join aside, the main takeaway is that we’ve specified the same elements as in XLOOKUP: the join is defined using one column each from two different tables, and we’re specifying which value to take from the matched row for further operations.

Joins vs. XLOOKUP

There are certainly differences between database joins and XLOOKUP in spreadsheets. There are different types of joins, for example, and we’re only covering left outer joins here. Joins can also match on multiple fields, use functions as part of the matching, etc.

However, for the purpose of combining data from two tables that are matched on a single column, XLOOKUP and (left outer) joins perform essentially the same operation.

Where databases really shine is handling more complex operations. Perhaps you want to compute the sum of all totals per year or month, or compute the difference between values and a rolling average. In a database, aggregation and other operations can be done in the same query as a join.

Was this helpful?

Thanks for your feedback!
Weekly tips for analysts
Get actionable insights
on AI and data directly to your inbox