externalDataSources
Type: object[]
Parameter syntax
{
  externalDataSources: [
    {
      dataSourceId: 'your_data_source_id',
      type: 'googleanalytics'|'csv',
      // if type is "csv"
      url: 'csv_url',
      // if type is "googleanalytics"
      metrics: ['ga:metric'],
      startDate: 'startdate',
      endDate: 'enddate',
      samplingLevel: 'DEFAULT',
      credentials: {
        type: 'service_account',
        client_email: 'client_email',
        private_key:
          'privatekey',
        viewIds: ['view_id'],
      },
    },
  ],
}

About this parameter

Defines external data sources you want to retrieve during every crawl and make available to your extractor function.

There are two supported data sources: Google Analytics and CSV files.

Once you set up an externalDataSource, it’s exposed in your recordExtractor. You can access it through the dataSources object, which has the following structure:

{
  dataSourceId1: { data1: 'val1', data2: 'val2' },
  dataSourceId2: { data1: 'val1', data2: 'val2' },
}
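
As a minimal sketch of reading this object, the function below looks up one data source by its ID and copies a value into the record. The `{ url, dataSources }` argument shape, the `objectID` field, and the fallback to an empty object are assumptions made for illustration; `dataSourceId1` and `data1` are the placeholder names from the structure above.

```javascript
// Sketch of a recordExtractor that reads external data.
// Assumption: the extractor receives `url` (a URL object) and `dataSources`.
function recordExtractor({ url, dataSources }) {
  // Keys of `dataSources` are the dataSourceId values from the configuration.
  // Fall back to an empty object in case no data exists for this source.
  const external = dataSources.dataSourceId1 || {};

  return [
    {
      objectID: url.href, // hypothetical choice of record identifier
      data1: external.data1,
    },
  ];
}
```

The fallback keeps the extractor from throwing when a data source has nothing to offer for the page being crawled; in that case `data1` is simply left undefined.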

You can add a maximum of 10 sources. Combined, they can provide a maximum of 11 million URLs.
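
A configuration can be checked against these constraints before it is saved. The sketch below is a hypothetical helper, not part of the Crawler API: it checks the 10-source limit, the two supported `type` values, and the requirement that each `dataSourceId` is unique. The URL limit depends on the data itself, so it is not checked here.

```javascript
// Hypothetical pre-flight check; not part of the Crawler API.
const MAX_SOURCES = 10;
const SUPPORTED_TYPES = ['googleanalytics', 'csv'];

function validateExternalDataSources(sources) {
  const errors = [];

  if (sources.length > MAX_SOURCES) {
    errors.push(`Too many sources: ${sources.length} (maximum is ${MAX_SOURCES})`);
  }

  const seen = new Set();
  for (const source of sources) {
    if (!source.dataSourceId) {
      errors.push('A source is missing dataSourceId');
    } else if (seen.has(source.dataSourceId)) {
      errors.push(`Duplicate dataSourceId: ${source.dataSourceId}`);
    } else {
      seen.add(source.dataSourceId);
    }

    if (!SUPPORTED_TYPES.includes(source.type)) {
      errors.push(`Unsupported type: ${source.type}`);
    }
  }

  // An empty array means no problem was found.
  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets you report every issue in one pass.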

Examples

Adding a CSV to your externalDataSources

externalDataSources: [
  {
    dataSourceId: 'myPageviews',
    type: 'csv',
    url: 'http://www.example.com/pageviews.csv',
  },
  {
    dataSourceId: 'myCSV',
    type: 'csv',
    url: 'http://www.example.com/website-data.csv',
  },
],

Adding Google Analytics to your externalDataSources

{
  externalDataSources: [
    {
      dataSourceId: 'myAnalytics',
      type: 'googleanalytics',
      metrics: ['google_analytics_metric1', 'google_analytics_metric2', ...],
      startDate: 'start_date',
      endDate: 'end_date',
      samplingLevel: 'DEFAULT',
      credentials: {
        type: 'account_type',
        client_email: 'example@my-project.iam.gserviceaccount.com',
        private_key:
          'your_google_analytics_private_key',
        viewIds: ['target_view_id1', 'target_view_id2', ...],
      },
    },
  ],
}

Parameters

externalDataSource

An external data source object from the provided list.

externalDataSource ➔ dataSource

dataSourceId
type: string
Required

Each external data source must have a unique identifier dataSourceId that is needed to access the corresponding data from the extractors. Other properties are used to connect to the data source.

type
type: string
Required

Type of data source. We support the following values: 'googleanalytics', 'csv'.

metrics
type: string[]
Required if type is 'googleanalytics'

List of metrics to fetch from Google Analytics for each URL. E.g., 'ga:uniquePageViews'. See the full list of supported metrics on Google Analytics’ API reference.

We make the value of each metric available through the recordExtractor’s dataSources object.

Note: We automatically include the 'ga:uniquePageViews' metric.

startDate
type: string
default: 7daysAgo
Optional

The start date of the period from which the analytics should be fetched. Its format should comply with ISO 8601. Google Analytics also supports values like '365daysAgo'.

endDate
type: string
default: today
Optional

The end date of the period for which the analytics should be fetched. Its format should comply with ISO 8601. Google Analytics also supports values like '365daysAgo'.

samplingLevel
type: string|enum
default: DEFAULT
Optional

The Google Analytics sampling level to use for your analytics data. We support the following values: 'DEFAULT', 'SMALL', 'LARGE'.
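
To make the optional parameters concrete, the sketch below fills in the documented defaults for `startDate`, `endDate`, and `samplingLevel` when they are omitted, and rejects an unsupported sampling level. The helper name `withAnalyticsDefaults` is hypothetical; the crawler applies these defaults itself, so this only illustrates the resulting values.

```javascript
// Hypothetical helper; the defaults mirror the ones documented above.
const ANALYTICS_DEFAULTS = {
  startDate: '7daysAgo',
  endDate: 'today',
  samplingLevel: 'DEFAULT',
};
const SAMPLING_LEVELS = ['DEFAULT', 'SMALL', 'LARGE'];

function withAnalyticsDefaults(source) {
  // Values set explicitly in `source` override the defaults.
  const merged = { ...ANALYTICS_DEFAULTS, ...source };

  if (!SAMPLING_LEVELS.includes(merged.samplingLevel)) {
    throw new Error(`Unsupported samplingLevel: ${merged.samplingLevel}`);
  }
  return merged;
}
```

For example, passing a source that only sets `startDate: '30daysAgo'` yields an object with that start date, `endDate: 'today'`, and `samplingLevel: 'DEFAULT'`.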

credentials
type: Object
default: false
Required if type is 'googleanalytics'

Your Google Analytics credentials.

externalDataSource ➔ credentials

type
type: string
Required

Type of authentication mechanism. So far, service_account is the only supported value for this property.

client_email
type: string
Required

Client email provided after creating a service account. It must have been given read permissions to the Google Analytics view(s).

private_key
type: string
Required

Private key provided after creating a service account.

viewIds
type: string[]
default: all views accessible with the credentials
Optional

List of Google Analytics view identifiers to fetch data from.
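
The JSON key file that Google generates for a service account contains `type`, `client_email`, and `private_key` fields alongside others. Assuming you have already parsed such a file into an object, a hypothetical helper like the one below can build the credentials object from it. The function name and the choice to drop the extra fields are illustrative assumptions, not part of the Crawler API.

```javascript
// Hypothetical helper: builds the `credentials` object from the parsed
// contents of a Google service account JSON key.
function credentialsFromServiceAccountKey(key, viewIds) {
  // service_account is the only supported authentication type.
  if (key.type !== 'service_account') {
    throw new Error(`Unsupported credentials type: ${key.type}`);
  }

  // Copy only the fields documented above; ignore the rest of the key file.
  const credentials = {
    type: key.type,
    client_email: key.client_email,
    private_key: key.private_key,
  };

  // viewIds is optional; omit it to fall back to the documented default.
  if (viewIds && viewIds.length > 0) {
    credentials.viewIds = viewIds;
  }
  return credentials;
}
```

Copying only the needed fields keeps unrelated key-file contents out of the crawler configuration.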
