Google Analytics 4 (GA4)

This page contains the setup guide and reference information for the Google Analytics 4 (GA4) source connector.

Google Analytics 4 (GA4) is the latest version of Google Analytics, introduced in 2020. It offers a new data model that emphasizes events and user properties, rather than pageviews and sessions. This updated model allows for more flexibility and customization in reporting, and provides more accurate measurement of user behavior across various devices and platforms.

note

The Google Analytics Universal Analytics (UA) connector utilizes the older version of Google Analytics, which was the standard for tracking website and app user behavior before the introduction of GA4. Please note that the UA connector is being deprecated in favor of this one. As of July 1, 2023, standard Universal Analytics properties no longer process hits. For further reading on the transition from UA to GA4, refer to Google's official support page.

Prerequisites

  • A Google Analytics account with access to the GA4 property (identified by its Property ID) that you want to sync

Setup guide

Step 1: Set up Google Analytics 4 (GA4)

Create a Service Account for authentication

  1. Sign in to the Google Account you are using for Google Analytics as an admin.
  2. Go to the Service Accounts page in the Google Developers console.
  3. Select the project you want to use (or create a new one) and click Continue.
  4. Click + Create Service Account at the top of the page.
  5. Enter a name for the service account, and optionally, a description. Click Create and Continue.
  6. Choose the role for the service account. We recommend the Viewer role (Read & Analyze permissions). Click Continue.
  7. Select your new service account from the list, and open the Keys tab. Click Add Key > Create New Key.
  8. Select JSON as the Key type. This will generate and download the JSON key file that you'll use for authentication. Click Continue.
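Before pasting the downloaded key into the connector, it can help to confirm that the file is intact. The following is a minimal sketch, not part of the connector: the helper name `check_service_account_key` is hypothetical, and the field list reflects what a Google service account key file normally contains.

```python
import json

# Fields that a Google service account JSON key file normally contains.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list[str]:
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value must be an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not key.get(name)]
    # A key downloaded from the Service Accounts page has type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r} (expected 'service_account')")
    return problems
```

Calling `check_service_account_key(open("key.json").read())` (with `key.json` standing in for your downloaded file) returns an empty list when the key has the expected shape. It does not verify that the key is authorized for your property.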

Enable the Google Analytics APIs

Before you can use the service account to access Google Analytics data, you need to enable the required APIs:

  1. Go to the Google Analytics Reporting API dashboard. Make sure you have selected the associated project for your service account, and enable the API. You can also set quotas and check usage.
  2. Go to the Google Analytics API dashboard. Make sure you have selected the associated project for your service account, and enable the API.
  3. Go to the Google Analytics Data API dashboard. Make sure you have selected the associated project for your service account, and enable the API.

For Airbyte Cloud:

  1. Log into your Airbyte Cloud account.

  2. Click Sources and then click + New source.

  3. On the Set up the source page, select Google Analytics 4 (GA4) from the Source type dropdown.

  4. Enter a name for the Google Analytics 4 (GA4) connector.

  5. Select Authenticate via Google (Oauth) from the dropdown menu and click Authenticate your Google Analytics 4 (GA4) account. This will open a pop-up window where you can log in to your Google account and grant Airbyte access to your Google Analytics account.

  6. Enter the Property ID whose events are tracked. This ID should be a numeric value, such as 123456789. If you are unsure where to find this value, refer to Google's documentation.

    note

    If the Property Settings shows a "Tracking Id" such as "UA-123...-1", this denotes that the property is a Universal Analytics property, and the Analytics data for that property cannot be reported on using this connector. You can create a new Google Analytics 4 property by following these instructions.

  7. (Optional) In the Start Date field, use the provided datepicker or enter a date programmatically in the format YYYY-MM-DD. All data added from this date onward will be replicated. Note that this setting is not applied to custom Cohort reports.

  8. (Optional) In the Custom Reports field, describe any custom reports you want to sync from Google Analytics. See the Custom Reports section below for more information on formulating these reports.

  9. (Optional) In the Data Request Interval (Days) field, you can specify the interval in days (ranging from 1 to 364) used when requesting data from the Google Analytics API. The bigger this value is, the faster the sync will be, but the more likely that sampling will be applied to your data, potentially causing inaccuracies in the returned results. We recommend setting this to 1 unless you have a hard requirement to make the sync faster at the expense of accuracy. This field does not apply to custom Cohort reports. See the Data Sampling section below for more context on this field.

caution

It's important to consider how dimensions like month or yearMonth are specified. These dimensions organize the data according to your preferences. However, keep in mind that the data presentation is also influenced by the chosen date range for the report. In cases where a very specific date range is selected, such as a single day (Data Request Interval (Days) set to one day), duplicated data entries for each day might appear. To mitigate this, we recommend adjusting the Data Request Interval (Days) value to 364. By doing so, you can obtain more precise results and prevent the occurrence of duplicated data.

  10. Click Set up source and wait for the tests to complete.
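The distinction between a numeric GA4 Property ID and a Universal Analytics "Tracking Id" described in the steps above can be checked before you fill in the form. This is a minimal sketch under the assumption that UA tracking IDs follow the `UA-<digits>-<digits>` shape shown in the note; the function name `classify_property_id` is hypothetical and not part of the connector.

```python
import re


def classify_property_id(value: str) -> str:
    """Classify an identifier as 'ga4', 'universal_analytics', or 'unknown'."""
    value = value.strip()
    # GA4 Property IDs are purely numeric, such as 123456789.
    if re.fullmatch(r"[0-9]+", value):
        return "ga4"
    # Universal Analytics tracking IDs look like UA-<account>-<index>.
    if re.fullmatch(r"UA-[0-9]+-[0-9]+", value, flags=re.IGNORECASE):
        return "universal_analytics"
    return "unknown"
```

Only values classified as `ga4` are usable with this connector; a `universal_analytics` result means you need to create or locate a GA4 property instead.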

For Airbyte Open Source:

  1. Navigate to the Airbyte Open Source dashboard.

  2. In the left navigation bar, click Sources. In the top-right corner, click + New source.

  3. Find and select Google Analytics 4 (GA4) from the list of available sources.

  4. Select Service Account Key Authentication from the dropdown list and enter the Service Account JSON Key from Step 1.

  5. Enter the Property ID whose events are tracked. This ID should be a numeric value, such as 123456789. If you are unsure where to find this value, refer to Google's documentation.

    note

    If the Property Settings shows a "Tracking Id" such as "UA-123...-1", this denotes that the property is a Universal Analytics property, and the Analytics data for that property cannot be reported on in the Data API. You can create a new Google Analytics 4 property by following these instructions.

  6. (Optional) In the Start Date field, use the provided datepicker or enter a date programmatically in the format YYYY-MM-DD. All data added from this date onward will be replicated. Note that this setting is not applied to custom Cohort reports.

note

If no start date is provided, the default is used: two years before the date of the initial sync.

caution

Many analyses and data investigations may require 24-48 hours to process information from your website or app. To ensure the accuracy of the data, we subtract two days from the starting date. For more details, please refer to Google's documentation.

  7. (Optional) Toggle the Keep Empty Rows switch if you want rows in which all metrics equal 0 to be returned.
  8. (Optional) In the Custom Reports field, describe any custom reports you want to sync from Google Analytics. See the Custom Reports section below for more information on formulating these reports.
  9. (Optional) In the Data Request Interval (Days) field, you can specify the interval in days (ranging from 1 to 364) used when requesting data from the Google Analytics API. The bigger this value is, the faster the sync will be, but the more likely that sampling will be applied to your data, potentially causing inaccuracies in the returned results. We recommend setting this to 1 unless you have a hard requirement to make the sync faster at the expense of accuracy. This field does not apply to custom Cohort reports. See the Data Sampling section below for more context on this field.
  10. (Optional) In the Lookback window (Days) field, you can specify how many days in the past to refresh on every run. Because attribution can change after the event date and Google Analytics has a data processing latency, this keeps the synced data consistent. For example, with a value of 5, every sync starts from the last bookmark date minus 5 days.

caution

It's important to consider how dimensions like month or yearMonth are specified. These dimensions organize the data according to your preferences. However, keep in mind that the data presentation is also influenced by the chosen date range for the report. In cases where a very specific date range is selected, such as a single day (Data Request Interval (Days) set to one day), duplicated data entries for each day might appear. To mitigate this, we recommend adjusting the Data Request Interval (Days) value to 364. By doing so, you can obtain more precise results and prevent the occurrence of duplicated data.

  11. Click Set up source and wait for the tests to complete.
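The date arithmetic described in the steps above (the two-year default, the two-day adjustment to the start date, and the lookback window applied to the bookmark) can be sketched as follows. This is an illustration only, not the connector's code: the function names are hypothetical, and applying the two-day adjustment to the default start date as well is an assumption.

```python
from datetime import date, timedelta
from typing import Optional


def effective_start_date(start_date: Optional[date], today: date) -> date:
    """Start date actually requested, per the notes above (sketch)."""
    if start_date is None:
        # Default: two years before the initial sync.
        try:
            start_date = today.replace(year=today.year - 2)
        except ValueError:
            # today is Feb 29 and the target year is not a leap year.
            start_date = today.replace(year=today.year - 2, day=28)
    # Two days are subtracted to allow for Google's 24-48 hour processing delay.
    return start_date - timedelta(days=2)


def lookback_start(bookmark: date, lookback_window_days: int) -> date:
    """First date re-fetched on an incremental sync: bookmark minus the lookback window."""
    return bookmark - timedelta(days=lookback_window_days)
```

For example, with a bookmark of 2024-06-10 and a lookback window of 5, `lookback_start` returns 2024-06-05, so the last five days before the bookmark are requested again.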

Supported sync modes

The Google Analytics 4 (GA4) source connector supports the following sync modes:

Supported Streams

This connector outputs the following incremental streams:

Connector-specific features

Custom Reports

Custom reports in Google Analytics allow for flexibility in querying specific data tailored to your needs. You can define the following components:

  • Name: The name of the custom report.
  • Dimensions: An array of categories for data, such as city, user type, etc.
  • Metrics: An array of quantitative measurements, such as active users, page views, etc.
  • CohortSpec: (Optional) An object containing specific cohort analysis settings, such as cohort size and date range. More information on this object can be found in the GA4 documentation.
  • Pivots: (Optional) An array of pivot tables for data, such as page views by city, etc. More information on pivots can be found in the GA4 documentation.

A full list of dimensions and metrics supported in the API can be found here. To ensure your dimensions and metrics are compatible for your GA4 property, you can use the GA4 Dimensions & Metrics Explorer.

The following is an example of a basic User Engagement report to track sessions and bounce rate, segmented by city:

[
  {
    "name": "User Engagement Report",
    "dimensions": ["city"],
    "metrics": ["sessions", "bounceRate"]
  }
]
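A custom reports array like the one above can be sanity-checked before it is pasted into the Custom Reports field. The sketch below assumes that `name`, `dimensions`, and `metrics` are all required and non-empty, since they are not marked optional in the component list; the helper `validate_custom_reports` is hypothetical and is not the connector's own validation.

```python
import json


def validate_custom_reports(raw: str) -> list[str]:
    """Return a list of problems in a custom reports JSON string; empty means none found."""
    try:
        reports = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(reports, list):
        return ["custom reports must be a JSON array"]

    errors = []
    for i, report in enumerate(reports):
        if not isinstance(report, dict):
            errors.append(f"report {i}: must be an object")
            continue
        name = report.get("name")
        if not isinstance(name, str) or not name.strip():
            errors.append(f"report {i}: 'name' must be a non-empty string")
        # Dimensions and metrics are arrays of API field names.
        for key in ("dimensions", "metrics"):
            values = report.get(key)
            if not isinstance(values, list) or not values or not all(isinstance(v, str) for v in values):
                errors.append(f"report {i}: '{key}' must be a non-empty array of strings")
    return errors
```

This only checks the shape of the configuration. Whether a given dimension and metric combination is compatible still has to be confirmed with the GA4 Dimensions & Metrics Explorer.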

By specifying a cohort with a 7-day range and pivoting on the city dimension, the report can be further tailored to offer a detailed view of engagement trends within the top 50 cities for the specified date range.

[
  {
    "name": "User Engagement Report",
    "dimensions": ["city"],
    "metrics": ["sessions", "bounceRate"],
    "cohortSpec": {
      "cohorts": [
        {
          "name": "Last 7 Days",
          "dateRange": {
            "startDate": "2023-07-27",
            "endDate": "2023-08-03"
          }
        }
      ],
      "cohortReportSettings": {
        "accumulate": true
      }
    },
    "pivots": [
      {
        "fieldNames": ["city"],
        "limit": 50,
        "metricAggregations": ["TOTAL"]
      }
    ]
  }
]
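The cohort dates in the example above are fixed strings, so they need updating whenever the report is reconfigured. One way to generate the `cohorts` entry is sketched below; the helper `cohort_for_last_days` is hypothetical, and it assumes the start date is simply the end date minus the number of days, which matches the dates used in the example.

```python
from datetime import date, timedelta


def cohort_for_last_days(end: date, days: int) -> dict:
    """Build one entry for cohortSpec.cohorts covering the `days` before `end`."""
    start = end - timedelta(days=days)
    return {
        "name": f"Last {days} Days",
        "dateRange": {
            # The example above uses YYYY-MM-DD strings, which isoformat() produces.
            "startDate": start.isoformat(),
            "endDate": end.isoformat(),
        },
    }
```

Calling `cohort_for_last_days(date(2023, 8, 3), 7)` reproduces the "Last 7 Days" cohort shown in the example.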

Data Sampling and Data Request Intervals

Data sampling in Google Analytics 4 refers to the process of estimating analytics data when the amount of data in an account exceeds Google's predefined compute thresholds. To mitigate the chances of data sampling being applied to the results, the Data Request Interval field allows users to specify the interval used when requesting data from the Google Analytics API.

By setting the interval to 1 day, users can reduce the data processed per request, minimizing the likelihood of data sampling and ensuring more accurate results. While larger time intervals (up to 364 days) can speed up the sync, we recommend choosing a smaller value to prioritize data accuracy unless there is a specific need for faster synchronization at the expense of some potential inaccuracies. Please note that this field does not apply to custom Cohort reports.

Refer to the Google Analytics documentation for more information on data sampling.
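To make the trade-off concrete, the sketch below splits a sync range into request windows of a given interval. It is an illustration under stated assumptions, not the connector's slicing code: the function name `date_slices` is hypothetical, and windows are assumed to be inclusive and non-overlapping.

```python
from datetime import date, timedelta
from typing import Iterator


def date_slices(start: date, end: date, interval_days: int) -> Iterator[tuple[date, date]]:
    """Yield inclusive (slice_start, slice_end) windows of at most interval_days days."""
    if not 1 <= interval_days <= 364:
        raise ValueError("interval_days must be between 1 and 364")
    cursor = start
    while cursor <= end:
        # Each window covers interval_days days, clipped to the overall end date.
        slice_end = min(cursor + timedelta(days=interval_days - 1), end)
        yield cursor, slice_end
        cursor = slice_end + timedelta(days=1)
```

With an interval of 1, a 30-day range produces 30 small requests, each covering little data and therefore less likely to be sampled. With an interval of 364, the same range is covered by a single large request.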

Performance Considerations

The Google Analytics connector is subject to Google Analytics Data API quotas. Please refer to Google's documentation for specific breakdowns on these quotas.
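When a quota is exhausted, a common client-side response is to retry with exponential backoff. The following is a generic sketch of that pattern, not the connector's retry logic: `QuotaExceededError` and `call_with_backoff` are hypothetical names, and the delays are arbitrary example values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExceededError(Exception):
    """Hypothetical error raised by the caller's request function on a quota (429) response."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), doubling the wait after each quota error, up to max_attempts tries."""
    for attempt in range(max_attempts):
        try:
            return request()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the quota error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so that the waiting behavior can be replaced in tests. Hourly or daily quotas may need far longer waits than this sketch uses.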

Data type map

| Integration Type | Airbyte Type |
|------------------|--------------|
| string           | string       |
| number           | number       |
| array            | array        |
| object           | object       |

Reference

Config fields reference

| Type          | Property name             |
|---------------|---------------------------|
| array<string> | property_ids              |
| object        | credentials               |
| string        | date_ranges_start_date    |
| array<object> | custom_reports_array      |
| integer       | window_in_days            |
| integer       | lookback_window           |
| boolean       | keep_empty_rows           |
| boolean       | convert_conversions_event |
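If you build the connector configuration programmatically, the property names and types listed above can be used for a quick type check. This is a minimal sketch: the helper `check_config_types` is hypothetical, it checks only top-level types, and it makes no claim about which fields are required or about the inner structure of `credentials`.

```python
# Top-level Python types corresponding to the types in the reference above.
EXPECTED_TYPES = {
    "property_ids": list,
    "credentials": dict,
    "date_ranges_start_date": str,
    "custom_reports_array": list,
    "window_in_days": int,
    "lookback_window": int,
    "keep_empty_rows": bool,
    "convert_conversions_event": bool,
}


def check_config_types(config: dict) -> list[str]:
    """Return type problems for the fields present in config; empty means none found."""
    errors = []
    for name, value in config.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            errors.append(f"unknown field: {name}")
            continue
        # bool is a subclass of int in Python, so reject booleans for integer fields.
        ok = isinstance(value, expected) and not (expected is int and isinstance(value, bool))
        if not ok:
            errors.append(f"{name}: expected {expected.__name__}, got {type(value).__name__}")
    return errors
```

For example, `check_config_types({"property_ids": ["123456789"], "window_in_days": 1})` returns an empty list, while passing `"1"` for `window_in_days` reports a type mismatch.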

Changelog

| Version | Date       | Pull Request | Subject |
|---------|------------|--------------|---------|
| 2.5.9   | 2024-09-21 | 45773 | Update dependencies |
| 2.5.8   | 2024-09-14 | 45503 | Update dependencies |
| 2.5.7   | 2024-09-07 | 45289 | Update dependencies |
| 2.5.6   | 2024-08-31 | 44980 | Update dependencies |
| 2.5.5   | 2024-08-24 | 44645 | Update dependencies |
| 2.5.4   | 2024-08-17 | 44337 | Update dependencies |
| 2.5.3   | 2024-08-13 | 43929 | Increase streams max_time to backoff |
| 2.5.2   | 2024-08-12 | 43909 | Update dependencies |
| 2.5.1   | 2024-08-10 | 43289 | Update dependencies |
| 2.5.0   | 2024-08-07 | 42841 | Upgrade to CDK 3 |
| 2.4.14  | 2024-07-27 | 42746 | Update dependencies |
| 2.4.13  | 2024-07-20 | 42347 | Update dependencies |
| 2.4.12  | 2024-07-13 | 41801 | Update dependencies |
| 2.4.11  | 2024-07-10 | 41561 | Update dependencies |
| 2.4.10  | 2024-07-09 | 41295 | Update dependencies |
| 2.4.9   | 2024-07-06 | 40935 | Update dependencies |
| 2.4.8   | 2024-06-25 | 40429 | Update dependencies |
| 2.4.7   | 2024-06-22 | 40140 | Update dependencies |
| 2.4.6   | 2024-06-21 | 39916 | Added ability to skip missing stream in the CATALOG |
| 2.4.5   | 2024-06-06 | 38884 | Make lookback window configurable. |
| 2.4.4   | 2024-06-06 | 39209 | [autopull] Upgrade base image to v1.2.2 |
| 2.4.3   | 2024-06-03 | 38865 | Enforce unique property IDs |
| 2.4.2   | 2024-03-20 | 36302 | Don't extract state from the latest record if stream doesn't have a cursor_field |
| 2.4.1   | 2024-02-09 | 35073 | Manage dependencies with Poetry. |
| 2.4.0   | 2024-02-07 | 34951 | Replace the spec parameter from previous version to convert all conversions:* fields |
| 2.3.0   | 2024-02-06 | 34907 | Add new parameter to spec to convert conversions:purchase field to float |
| 2.2.2   | 2024-02-01 | 34708 | Add rounding integer values that may be float |
| 2.2.1   | 2024-01-18 | 34352 | Add incorrect custom reports config handling |
| 2.2.0   | 2024-01-10 | 34176 | Add a report option keepEmptyRows |
| 2.1.1   | 2024-01-08 | 34018 | prepare for airbyte-lib |
| 2.1.0   | 2023-12-28 | 33802 | Add CohortSpec to custom report in specification |
| 2.0.3   | 2023-11-03 | 32149 | Fixed bug with missing metadata when the credentials are not valid |
| 2.0.2   | 2023-11-02 | 32094 | Added handling for JSONDecodeError while checking for api qouta limits |
| 2.0.1   | 2023-10-18 | 31543 | Base image migration: remove Dockerfile and use the python-connector-base image |
| 2.0.0   | 2023-09-29 | 30930 | Use distinct stream naming in case there are multiple properties in the config. |
| 1.6.0   | 2023-09-19 | 30460 | Migrated custom reports from string to array; add FilterExpressions support |
| 1.5.1   | 2023-09-20 | 30608 | Revert : auto replacement name to underscore |
| 1.5.0   | 2023-09-18 | 30421 | Add yearWeek, yearMonth, year dimensions cursor |
| 1.4.1   | 2023-09-17 | 30506 | Fix None type error when metrics or dimensions response does not have name |
| 1.4.0   | 2023-09-15 | 30417 | Change start date to optional; add suggested streams and update errors handling |
| 1.3.1   | 2023-09-14 | 30424 | Fixed duplicated stream issue |
| 1.3.0   | 2023-09-13 | 30152 | Ability to add multiple property ids |
| 1.2.0   | 2023-09-11 | 30290 | Add new preconfigured reports |
| 1.1.3   | 2023-08-04 | 29103 | Update input field descriptions |
| 1.1.2   | 2023-07-03 | 27909 | Limit the page size of custom report streams |
| 1.1.1   | 2023-06-26 | 27718 | Limit the page size when calling check() |
| 1.1.0   | 2023-06-26 | 27738 | License Update: Elv2 |
| 1.0.0   | 2023-06-22 | 26283 | Added primary_key and lookback window |
| 0.2.7   | 2023-06-21 | 27531 | Fix formatting |
| 0.2.6   | 2023-06-09 | 27207 | Improve api rate limit messages |
| 0.2.5   | 2023-06-08 | 27175 | Improve Error Messages |
| 0.2.4   | 2023-06-01 | 26887 | Remove authSpecification from connector spec in favour of advancedAuth |
| 0.2.3   | 2023-05-16 | 26126 | Fix pagination |
| 0.2.2   | 2023-05-12 | 25987 | Categorized Config Errors Accurately |
| 0.2.1   | 2023-05-11 | 26008 | Added handling for 429 - potentiallyThresholdedRequestsPerHour error |
| 0.2.0   | 2023-04-13 | 25179 | Implement support for custom Cohort and Pivot reports |
| 0.1.3   | 2023-03-10 | 23872 | Fix parse + cursor for custom reports |
| 0.1.2   | 2023-03-07 | 23822 | Improve rate limits customer faced error messages and retry logic for 429 |
| 0.1.1   | 2023-01-10 | 21169 | Slicer updated, unit tests added |
| 0.1.0   | 2023-01-08 | 20889 | Improved config validation, SAT |
| 0.0.3   | 2022-08-15 | 15229 | Source Google Analytics Data Api: code refactoring |
| 0.0.2   | 2022-07-27 | 15087 | fix documentationUrl |
| 0.0.1   | 2022-05-09 | 12701 | Introduce Google Analytics Data API source |