What is a join key in data studio?

If youre a Google Data Studio advanced user, chances are youve already used the data blending feature.

Its a great feature that allows you to enrich and unlock the potential of your data quickly. Especially if you dont have the time to pull data from multiple sources and combine them in spreadsheets.

However, data blending also comes with some limitations that could slow your report down at best and affect your data accuracy at worst.

In this article, we teamed up with two experts from our team, Bartosz Schneider and Evan Kaeding, to discuss the good and the bad of data blending in Data Studio, and how you can avoid all the headaches.

To make sure were on the same page, lets look at the basics first.

The basics of data joining

What is data joining?

Lets say youre managing an online store. Youre running paid ads across popular social media platforms. You want to know what channels bring in the most revenue. To do this, you need to combine your paid social data with data from Shopify.

Or you want to see how your ecommerce funnel looks. For example, what pages customers visited and what products they added to their shopping cart before purchasing. In this case, you can connect Google Analytics data with Shopify data.

Thats a rough description of data joining. Whenever you join data from multiple data sources into a single dataset, youre performing data joining. Data joining works when your joined data sources share at least one common dimension, or a join key.

Typically, business accumulates data from different sources. Without combining all the data, youre missing the whole picture of your performance. Data joining helps you:

  • Uncover valuable insights from separate data sources.
  • Discover meaningful relationships among data sets.
  • Make better data-informed decisions.

Different types of joins

  • Inner join means combining data from both sources matching it where the join keys are the same and dropping the data that doesnt match.
  • Outer join means taking all the data from both sources matching it where the join keys are the same. And finally, in the joined table, padding the non-matching columns with empty values.
  • Left join means taking all the data from the left table and the matching data from the right table where the join keys are the same.
  • Right join, similarly, means taking all the data from the right table and the matching data from the left table where the join keys are the same.
What is a join key in data studio?
What is a join key in data studio?

So, does data joining have anything to do with data blending?


Spoiler alert: It does. Data blending in Data Studio is a left outer join.

Data blending in Google Data Studio

By default, when you create a chart in Google Data Studio, youre pulling data from a single data source. However, you can connect multiple data sources and visualize them together in a chart or a table with data blending.

Data blending is a left outer join

To blend data, you need to choose:

  • A primary source: the first (or leftmost) data source that you add to the Blend data panel.
  • At least one secondary source: any data sources added to the right of the primary source
  • A join key

Since data blending in Data Studio is a left outer join, the blended data will include all data from the primary data source and matching data from secondary sources that share the same join key.

Lets take a look at the example below.

Here, Google Ads has conversion data from five countries: the United States, Germany, Finland, France, and Australia.

Facebook Ads has conversion data from seven countries: the United States, Germany, France, Ireland, India, Singapore, and Spain.

What is a join key in data studio?
What is a join key in data studio?

If you pick Google Ads as your primary data source, the blended result will show conversions from the United States, Germany, Finland, France, and Australia only. Since Ireland, India, Singapore, and Spain arent included in Google Ads (your primary source), the data will be excluded from the blended table.

You can also see Finland, which has data in the Google Ads table and not in the Facebook Ads table, will stay in the blended table. However, its Facebook Ads conversion value will be null.

What is a join key in data studio?
What is a join key in data studio?

Alternatively, if Facebook Ads is your primary source, your results will be conversion data from the United States, Germany, France, Ireland, India, Singapore, and Spain.

What is a join key in data studio?
What is a join key in data studio?
What is a join key in data studio?
What is a join key in data studio?
Any data that is not present in your primary table will be lost, and thats the first thing you have to watch out for. It makes sense to choose the longest table as the primary source (the left table for the blend). Otherwise, you may lose some important data during the process.
Bartosz Schneider, Lead Analytics Consultant, Supermetrics

In Google Data Studio, the first data source you bring into the Blend data view is your primary source. Changing the order of the data source is pretty straightforward. All you have to do is drag and drop the data source to the position you want.

What is a join key in data studio?
What is a join key in data studio?

How to create a blended data source

There are two approaches you can use to blend your data.

The first approach is quite quick and easy. If you have two tables with a common dimension, you can select both tables, right-click, and choose Blend data. Data Studio will quickly combine two tables into one. Then, automatically generate a blended data view based on the fields provided in the source tables.

What is a join key in data studio?
What is a join key in data studio?

The second approach requires more steps, but it gives you a little more control of your data.

To start, click on Resource Manage blended data.

What is a join key in data studio?
What is a join key in data studio?

Next, open your Blend data view by clicking on Add a data view.

Then, add the data sources you want to blend. Remember, the first data source you add to the view will become your primary source.

What is a join key in data studio?
What is a join key in data studio?

From here, you can choose the join keys, dimensions, and metrics you want to blend.

Tip: Give your blended data source a name so its easy to distinguish from other sources later on.

What is a join key in data studio?
What is a join key in data studio?

After youre happy with the setting, click Save.

Start building charts with your blended data source by adding it to the Data source field.

What is a join key in data studio?
What is a join key in data studio?

The limitations of data blending in Google Data Studio

Accuracy

Traditionally, when you join data in a spreadsheet, you can use different formulas to tell the computer precisely what data you want to retrieve. This lets you see whats happening with your data in each step. If an error occurs, you can always go back to the raw data and trace the problem.


However, with Data Studio, the join happens under the hood, so if the blended data has errors, you wouldnt know what caused the problem.

What is a join key in data studio?
What is a join key in data studio?
In Data Studio, you really have no way to check your joined data. So youre kind of flying blind. Its very hard to inspect. Theres no way to know what Google does, whether Data Studio is producing duplicates, or theres no way to understand if fields are dropped if they dont match.
Evan Kaeding, Senior Sales Engineer, Supermetrics

Supports only left outer join

As mentioned above, data joining in Google Data Studio is always a left outer join. This can be somewhat limiting if youre used to using different types of joins to enrich your data.

You have to be extra careful when blending data, especially the order in which you join them. One problem with the primary sources can harm the accuracy of your blended results.

Speed

You probably notice Google Data Studio can take its sweet time loading your reports. Things get worse when you bring data blending to the picture.

Whenever you create a blended data source, Google has to go through different APIs to retrieve data. And that process requires quite a bit of computational power.

The more blended data sources you add, the slower your dashboard will be.

A limited number of blended sources

Another frustrating limitation is that you can blend a maximum of five data sources. While this number sounds like a lot, it isnt. Occasionally, in many advanced and in-depth reports, you need to blend data from more than five sources. Youll easily cross the limit if you want to create a very detailed table with many columns.

So, should you just save yourself from all the trouble and avoid data blending?

In fairness, Google Data Studio does a splendid job with a simple and light blending. So if you want to blend one to two data sources with a simple join key like date, you can stick with Data Studio.

On the other hand, if youre looking to gain more control over your data and do more advanced blending, Google Sheets is the way to go.

Data blending in Google Sheets

When data blending in Google Data Studio becomes a bit of a hassle, you can blend your data in Google Sheets and bring it back together in Data Studio for reporting.

This approach gives you more flexibility with your data. You can take advantage of the Google Sheets formulas to enrich your data. Additionally, its much faster to load blended data from a Google Sheet than from several sources.

In addition, you can use Supermetrics to pull data into Google Sheets automatically. Youll have more time to do what youre good at analyzing the data and getting meaningful insights.

What is a join key in data studio?

Move your data into Google Sheets in minutes

Start a 14-day free Supermetrics trial. Full features. No credit card required.

Start free trial

Lets take a look at some tips for joining data in Google Sheets.

Manage your data in Google Sheets

It can get messy quickly when you bring data from different sources to Google Sheets for blending. A good way to stay organized with your data is to divide them into separate tabs.

What is a join key in data studio?
What is a join key in data studio?
Its a good idea to separate your data into three buckets raw data, blended data, and reporting data. So basically, you want to be able to make changes to both your raw layer and blended layer. But then you want to keep the final report as clean as possible.
Even Kaeding, Senior Sales Engineer, Supermetrics

The raw data tab is where you store all your unformatted raw data from your data sources. In this example report, we use Supermetrics to pull data from Facebook, Microsoft, and Google Ads into three separate tabs.

What is a join key in data studio?
What is a join key in data studio?

The blended data tab is where the magic happens. You can match your data together and perform some calculations to get more insights from your data.

What is a join key in data studio?
What is a join key in data studio?

The reporting data tab is where you put the last piece of the puzzle. When youre done enriching and transforming the data, you can present them in a separate tab where its easier to monitor.

Additionally, you can connect the reporting data tab to Google Data Studio to bring the final results to your dashboard. You can find the Google Sheets connector in the connector gallery.

What is a join key in data studio?
What is a join key in data studio?

Next, lets take a look at some functions you need to know when blending data in Google Sheets.

Three useful functions for joining data in Google Sheets

VLOOKUP

VLOOKUP is one of the most used functions for data joining. It lets you search for a value in one table and use it in another table.

The syntax for VLOOKUP is:

VLOOKUP (search_key, range, index, [is_sort])

  • search_key: the value you want to look up.
  • range: the range that contains the value you want to look up. Note that VLOOKUP will search from the first column in your range.
  • index: the column number (within your chosen range) that contains the returning value.
  • is_sort: this parameter is optional. Here, you can specify if you want to receive an exact match (FALSE) or the nearest match value (TRUE). In the case of data joining, youll want to set it to an exact match.

Youre telling Google Sheets what value you want to search for, where you want to search for it, the column number in the range that has the value to return, and finally, if you want to receive an exact match (FALSE) or the nearest match (TRUE).

Lets say you have two tables:

  • A marketing table with data about date, source, medium, campaign, impressions, cost, and clicks
  • A conversion table with data about date, source, medium, transactions, and revenue.
What is a join key in data studio?
What is a join key in data studio?

According to Bartosz, there are two steps to connecting the puzzles.

First, you need to create composite keys for two tables using the TEXTJOIN function. Each composite key can be used to uniquely identify each row of the table. Without the composite keys, youre likely to run into one-to-many relationships. Additionally, you can use them as join keys for VLOOKUP.

Your composite keys will include campaigns date, source, medium, and campaign (which means campaign name in this case). Itll look something like this.

What is a join key in data studio?
What is a join key in data studio?

Next, use VLOOKUP to join two tables. For example, the formula for combining transaction data with the marketing table is:

VLOOKUP($A4,$A$22:$J$33,6,0)

Tip: Using absolute reference makes it easier for Google to search for the value and for you to drag the formula across your spreadsheet.

Simply put, Google searches the first column for the composite keys and returns the corresponding transactions.

What is a join key in data studio?
What is a join key in data studio?

IF + REGEXPMATCH

The first step is to remap the campaign name to new values with an IF function (columns F and N). That new cleaned-up name is then used as a join-key to generate the metrics table on the right side of the sheet, where metrics from two sources are aggregated together where the previously remapped campaign name matches.

The function were looking at next is a nested function IF + REGEXPMATCH, where

  • IF starts a conditional evaluation.
  • REGEXPMATCH checks the target for a text match

Bartosz finds that this function comes in handy when he needs to remap campaign names from one or many different data sources.

Lets take a look at the table below. As you can see, it has different naming conventions, for example, Google Data Studio and googledatastudio or Enterprise and enterprise.

What is a join key in data studio?
What is a join key in data studio?

You can put all your Google Data Studio campaigns in one basket and Enterprise campaigns in one basket using this formula:

=IF(

REGEXMATCH(A7,"Data Studio|datastudio"),"Data Studio Campaigns",

IF(REGEXMATCH(A7,"Enterprise|enterprise"),"Enterprise campaigns"

))

In simpler terms, your function searches in column A7 for Data Studio or datastudio and returns Data Studio Campaigns. If there is no such value, search for Enterprise or enterprise and returns Enterprise campaigns.

You can remap campaign names from different sources and use them as your join key.

Conditional aggregation

In Google Sheets, you can use different aggregation functions to summarize your data calculating the sum, average, or counting the number of data points. However, in reality, you may not want to aggregate all the data you have. In that case, you can use conditional aggregation to specify which data you want to aggregate.

Conditional aggregation is a function that tells Google to perform data aggregation over a set of data when it meets certain criteria. Well take a look at some common conditional aggregation functions.

The SUMIF function tells Google to calculate the sum of the data that meets a predefined condition in a range. The syntax for the SUMIF function is:

SUMIF (range, criterion, [sum_range])

  • range: you want to specify the data range you want to apply the condition to.
  • criterion: you should specify the condition that defines which cells will be summed.
  • sum_range: you should specify the range to be summed if different from range. This is optional.

Take the table below as an example. Lets say you want to calculate the impressions from the US. You can do so by using SUMIF (B3:J12, US, D3:D12).

What is a join key in data studio?
What is a join key in data studio?

The AVERAGEIF function returns the average value of data that meets certain criteria in a range. The syntax for the AVERAGEIF function is:

AVERAGEIF (criteria_range, criterion, [average_range])

  • criteria_range: you should choose the data range you want to apply the condition to.
  • criterion: specify the condition that defines which cells will be averaged.
  • average_range: you should specify the range to be averaged if different from criteria_range. This is optional.

For example, if you want to calculate the average cost from the US, you can use AVERAGEIF(B3:J12, US, E3:E12).

What is a join key in data studio?
What is a join key in data studio?

Similarly, the COUNTIF function performs a conditional count over your data. The syntax for COUNTIF is:

COUNTIF (range, criterion)

  • range: the range you want to count.
  • criterion: the condition you want to apply.

For example, you want to count how many countries have CPC greater than 1. You can do so by using COUNTIF(H3:H12, >1).

What is a join key in data studio?
What is a join key in data studio?

Different ways to use data blending

There are many ways you can put data blending into practice. Well take a look at some examples in this section. Additionally, youll also find some ready-made templates with blended data that you can use right away.

Note that connecting the data sources to the templates will automatically start your 14-day free Supermetrics trial.

Compare your Facebook Ads vs. Google Ads performance

Google Ads and Facebook Ads are among the most popular advertising platforms. Even though it isnt exactly an exact comparison, combining Facebook Ads data with Google Ads data can tell you which types of campaigns work best on which channels.

For example, in the Google Ads vs. Facebook Ads dashboard below, you can easily see:

  • A side-by-side performance of each channel.
  • The split in your cost, impressions, clicks, and conversions between two channels.
  • Your best-performing campaigns by channels.
What is a join key in data studio?
What is a join key in data studio?

Swipe the Google Ads vs. Facebook Ads template >>

Organic social media

Managing your companys social media accounts isnt a walk in the park. For one thing, you have to manage at least three different accounts, all of which have different algorithms and requirements for content.

Blending data from social media platforms helps you manage your performance easily and stay on top of your social game.

For example, in the dashboard below, we combine data from four popular social media channels Facebook, Instagram, Twitter, and LinkedIn. This dashboard is great for:

  • Monitoring your performance across channels.
  • Comparing the performance of different channels.
  • Diving deeper into each channels performance.
What is a join key in data studio?
What is a join key in data studio?

Swipe the social media mix dashboard >>

You probably have performance marketing data in paid channel platforms and sessions and conversion data in Google Analytics. Blending paid ad data with web analytics data helps you understand which campaigns and channels drive high-quality traffic.

For example, in this paid channel mix dashboard, we blend paid data from Google, LinkedIn, Twitter, Facebook, Microsoft with Google Analytics data. With it, youll see:

  • A general overview of your paid campaigns.
  • A comparison of your performance with the previous period.
  • A table showing the key metrics of your best-performing channels.
What is a join key in data studio?
What is a join key in data studio?

Get the paid channel mix dashboard >>

Organic search vs. paid search analytics

Its not about organic search versus paid search. To grow your business, you need both. For example, performance marketers can look at the high-ranking search phrase and decide if it makes sense to bid on those keywords.

Similarly, content marketers can also use paid search data to fuel their content strategy.

In this organic search vs. paid search analytics template, our friends at OIKIO agency combine data from Google Ads and Google Search Console. It helps you drive conversions in both channels. With it, you can determine:

  • The Google Ads phrase performance that you havent included in your organic search.
  • The organic search phrase performance that you havent bid on with Google Ads yet.
  • The PPC/SEO ratio for a particular search phrase.
What is a join key in data studio?
What is a join key in data studio?

Swipe the organic search vs. paid search analytics template >>

Organic traffic and keyword analysis

It was pretty frustrating when Google removed search phrase data from Google Analytics. But worry not because theres a workaround. By combining data from Google Search Console with Google Analytics data, youll figure out which organic keywords bring in traffic to your website.

Lets take a look at this organic traffic and keyword analysis template by our friends at OIKIO. With this template, you can:

  • Figure out which search phrases bring in the most traffic.
  • Compare the performance of different landing pages.
What is a join key in data studio?
What is a join key in data studio?

Get the organic traffic and keyword analysis template >>

Over to you

Congrats, youve made it to the end of this article. Now pat yourself on the back.

After all, data blending helps you make the most of your data and uncover more meaningful insights.

If youre working with a small and manageable amount of data, you can totally take advantage of the data blending feature in Google Data Studio.

On the other hand, if youre handling a much bigger dataset and want to have better control over your data, Google Sheets is a better solution.

And remember, whenever you need help with moving your data to Google Sheets, you can start your 14-day Supermetrics free trial.