To use GrowthBook for experimentation, you need to connect to a data source. There are two main ways to configure GrowthBook: via the UI or via a config.yml file. The default is to fill out forms in the GrowthBook UI, which persists your configuration to MongoDB. The other option is to create a config.yml file, which allows for version control and easier management of your configuration. In the Docker container, this file must be placed at /usr/local/src/app/config/config.yml. Below is an example file:
datasources:  
  warehouse:  
    type: postgres  
    name: Main Warehouse  
    # Connection params (different for each type of data source)  
    params:  
      host: localhost  
      port: 5432  
      user: root  
      password: ${POSTGRES_PW} # use env for secrets  
      database: growthbook  
    # How to query the data (same for all SQL sources)  
    settings:  
      userIdTypes:  
        - userIdType: user_id  
          description: Logged-in user id  
        - userIdType: anonymous_id  
          description: Anonymous visitor id  
      queries:  
        exposure:  
          - id: user_id  
            name: Logged-in user experiments  
            userIdType: user_id  
            query: >  
              SELECT  
                user_id,  
                received_at as timestamp,  
                experiment_id,  
                variation_id,  
                context_location_country as country  
              FROM  
                experiment_viewed  
            dimensions:  
              - country  
        identityJoins:  
          - ids: ["user_id", "anonymous_id"]  
            query: SELECT user_id, anonymous_id FROM identifies
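
Since the file must live at /usr/local/src/app/config/config.yml inside the container, one way to supply it is to bind-mount a local copy and pass secrets as environment variables. A sketch (image tag, ports, and paths are assumptions; adjust to your deployment):

```bash
# Mount a local config.yml into the container and provide the database
# password that the file references as ${POSTGRES_PW}.
docker run -d \
  -p 3000:3000 -p 3100:3100 \
  -e POSTGRES_PW="$POSTGRES_PW" \
  -v "$(pwd)/config.yml":/usr/local/src/app/config/config.yml \
  growthbook/growthbook:latest
```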

Data Source Connection Params

The contents of the params field depend on the data source type. As seen in the example above, you can use environment variable interpolation for secrets (e.g. ${POSTGRES_PW}).

Redshift, ClickHouse, Postgres, and MySQL (or MariaDB)

type: postgres # or "redshift" or "mysql" or "clickhouse"  
params:  
  host: localhost  
  port: 5432  
  user: root  
  password: password  
  database: growthbook
Redshift and Postgres also support optional params to force an SSL connection:
type: postgres  
params:  
  ...  
  ssl: true  
  # Omit the below fields to use the default trusted CA from Mozilla  
  caCert: "-----BEGIN CERTIFICATE-----\n..."  
  clientCert: "-----BEGIN CERTIFICATE-----\n..."  
  clientKey: "-----BEGIN PRIVATE KEY-----\n..."

Snowflake

type: snowflake  
params:  
  account: abc123.us-east-1  
  username: user  
  password: password  
  database: GROWTHBOOK  
  schema: PUBLIC  
  role: SYSADMIN  
  warehouse: COMPUTE_WH

BigQuery

You must first create a Service Account in Google with the following roles:
  • Data Viewer
  • Metadata Viewer
  • Job User
If you want GrowthBook to auto-discover credentials from environment variables or GCP metadata, use the following:
type: bigquery  
params:  
  authType: auto
If you prefer to pass in credentials directly, you can use this format instead:
type: bigquery  
params:  
  projectId: my-project  
  clientEmail: growthbook@my-project.iam.gserviceaccount.com  
  privateKey: -----BEGIN PRIVATE KEY-----\nABC123\n-----END PRIVATE KEY-----\n

Presto and TrinoDB

type: presto  
params:  
  engine: presto # or "trino"  
  host: localhost  
  port: 8080  
  username: user  
  password: password  
  catalog: growthbook  
  schema: growthbook

Databricks

type: databricks  
params:  
  host: dbc-123-abc.cloud.databricks.com  
  port: 443  
  path: /sql/1.0/warehouses/abc123  
  token: dapi123abc

AWS Athena

If you want GrowthBook to auto-discover credentials from environment variables or instance metadata, use the following format:
type: athena  
params:  
  authType: auto  
  region: us-east-1  
  database: growthbook  
  bucketUri: aws-athena-query-results-growthbook  
  workGroup: primary
If you prefer to specify access key and secret directly instead, use the following format:
type: athena  
params:  
  accessKeyId: AKIA123  
  secretAccessKey: AB+cdef123  
  region: us-east-1  
  database: growthbook  
  bucketUri: aws-athena-query-results-growthbook  
  workGroup: primary

Data Source Settings

The settings tell GrowthBook how to query your data. There are a couple of queries you need to define, plus an optional Python script for running queries from inside a Jupyter notebook:
type: postgres  
params: ...  
settings:  
  # The different types of supported identifiers  
  userIdTypes:  
    - userIdType: user_id  
      description: Logged-in user id  
    - userIdType: anonymous_id  
      description: Anonymous visitor id  
  queries:  
    # These queries return experiment variation assignment info  
    # One row every time a user was put into an experiment  
    exposure:  
      - id: user_id  
        name: Logged-in user experiments  
        userIdType: user_id  
        query: >  
          SELECT  
            user_id,  
            received_at as timestamp,  
            experiment_id,  
            variation_id,  
            context_location_country as country  
          FROM  
            experiment_viewed  
        # List additional columns you selected in your experimentsQuery  
        # Can use these to drill down into experiment results  
        dimensions:  
          - country  
    # These optional queries map between different types of identifiers  
    identityJoins:  
      - ids: ["user_id", "anonymous_id"]  
        query: SELECT user_id, anonymous_id FROM identifies  
  # Used when exporting experiment results to a Jupyter notebook  
  # Define a `runQuery(sql)` function that returns a pandas data frame  
  notebookRunQuery: >  
    import os  
    import psycopg2  
    import pandas as pd  
    from sqlalchemy import create_engine, text  

    # Use environment variables or similar for passwords!  
    password = os.getenv('POSTGRES_PW')  
    connStr = f'postgresql+psycopg2://user:{password}@localhost'  
    dbConnection = create_engine(connStr).connect()  

    def runQuery(sql):  
      return pd.read_sql(text(sql), dbConnection)
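
The notebookRunQuery contract is simply: define a runQuery(sql) function and return the result rows. A dependency-free sketch of the same contract using Python's built-in sqlite3 (returning plain dicts instead of a pandas DataFrame, purely for illustration; the table and data are made up):

```python
import sqlite3

# GrowthBook expects notebookRunQuery to define runQuery(sql) returning a
# pandas DataFrame; this sketch uses sqlite3 and plain dicts instead,
# just to show the shape of the contract.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE experiment_viewed (user_id TEXT, variation_id INTEGER)")
conn.execute("INSERT INTO experiment_viewed VALUES ('u1', 0), ('u2', 1)")

def runQuery(sql):
    cur = conn.execute(sql)
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

rows = runQuery("SELECT user_id, variation_id FROM experiment_viewed")
```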

Organization Settings

Some organization settings can also be controlled from config.yml. Below are all of the currently supported settings:
organization:  
  settings:  
    # Minimum experiment length (in days) when importing past experiments. Default `6`  
    pastExperimentsMinLength: 3  
    # Number of days of historical data to use when analyzing metrics  
    # (must be between 1 and 400, default `90`)  
    metricAnalysisDays: 90  
    # The min percent of users exposed to multiple variations in an  
    # experiment before we start warning you (between 0 and 1, defaults to `0.01`)  
    multipleExposureMinPercent: 0.01  
    # Whether Regression Adjustment (CUPED) should be on or off by default, and how many  
    # days of pre-exposure data to use. Can be overridden in your metric definitions if you wish.  
    regressionAdjustmentEnabled: true  
    regressionAdjustmentDays: 14  
    # When we should auto-update experiment results  
    updateSchedule:  
      type: stale  
      hours: 6
The updateSchedule setting has 3 types of values:
  • Never update automatically
    updateSchedule:  
      type: never
    
  • Update if data is X hours stale
    updateSchedule:  
      type: stale  
      hours: 6
    
  • Update on a fixed Cron schedule
    updateSchedule:  
      type: cron  
      cron: "0 */6 * * *"
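
The "stale" schedule boils down to a simple age check on the last results run. A minimal sketch (the function name is hypothetical, not part of GrowthBook):

```python
from datetime import datetime, timedelta

# With `type: stale` and `hours: 6`, results are refreshed only when the
# last run is more than 6 hours old.
def needs_update(last_run: datetime, now: datetime, hours: int = 6) -> bool:
    return now - last_run > timedelta(hours=hours)

now = datetime(2024, 1, 1, 12, 0)
fresh = needs_update(datetime(2024, 1, 1, 9, 0), now)  # 3 hours old
stale = needs_update(datetime(2024, 1, 1, 2, 0), now)  # 10 hours old
```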
    

Unit Dimensions

Unit Dimensions let you join additional tables in order to drill down into your experiment results. A dimension has only 4 properties: name, datasource, userIdType, and sql. The SQL query must return two columns: the identifier and value. Example:
name: Country  
# Must match one of the datasources defined in config.yml  
datasource: warehouse  
userIdType: user_id  
sql: SELECT user_id, country as value FROM users
By default, when using config.yml, it’s not possible to create dimensions via the GrowthBook UI. Everything must be done directly in the config.yml file. There is an optional environment variable you can specify to change this behavior:
ALLOW_CREATE_DIMENSIONS=true
This will let you create new dimensions via the UI. Dimensions defined in config.yml will be marked as “Official” and not editable, while ones defined via the UI will be editable.

Metrics (Deprecated)

Legacy metrics can be defined in the config.yml file as well, although this behavior is deprecated. If you want to version control metrics, the recommended approach is to integrate your version control system with our API instead. See a guide here for doing this with GitHub. Below is an example of all the possible settings with comments for legacy metrics defined in config.yml:
name: Revenue per User  
# Required. The data distribution and unit  
type: revenue # or "binomial" or "count" or "duration"  
# Required. Must match one of the datasources defined in config.yml  
datasource: warehouse  
# Description supports full markdown  
description: This metric is **super** important  
# For inverse metrics, the goal is to DECREASE the value (e.g. "page load time")  
inverse: false  
# When ignoring nulls, only users who convert are included in the denominator  
# Setting to true here would change from "Revenue per User" to "Average Order Value"  
ignoreNulls: false  
# Which identifier types are supported for this metric  
userIdTypes:  
  - user_id  
# Any user with a higher metric amount will be capped at this value  
# In this case, if someone bought a $10,000 order, it would only be counted as $100  
# Note: you can also specify `type: percentile` and a `value` between 0 and 1  
# for percentile based capping  
cappingSettings:  
  type: absolute  
  value: 100  
# Control the date window for your metrics  
windowSettings:  
  type: conversion  
  # Ignore all conversions within the first X hours of being put into an experiment.  
  delayHours: 0  
  # After the conversion delay (if any), wait this many hours for a conversion event.  
  windowValue: 72  
  windowUnit: hours  
# Min number of conversions for an experiment variation before we reveal results  
minSampleSize: 150  
# The "suspicious" threshold. If the percent change for a variation is above this,  
#   we hide the result and label it as suspicious.  
# Default 0.5 = 50% change  
maxPercentChange: 0.50  
# The minimum change required for a result to be considered a win or loss. If the percent  
# change for a variation is below this threshold, we will consider an otherwise conclusive  
# test a draw.  
# Default 0.005 = 0.5% change  
minPercentChange: 0.005  
# Overrides for Regression Adjustment (CUPED) at the metric level. To enforce these  
# fields, you must set regressionAdjustmentOverride to true.  
# Leave these settings out of your config file to accept your organization-level  
# settings, or set regressionAdjustmentOverride to false.  
regressionAdjustmentOverride: true  
regressionAdjustmentEnabled: true  
regressionAdjustmentDays: 14  
# Arbitrary tags used to group related metrics  
tags:  
  - revenue  
  - core
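The cappingSettings above (type absolute, value 100) amount to clamping each user's metric value at the cap. An illustrative helper (hypothetical, not GrowthBook code):

```python
# Absolute capping: values above the cap are clamped, so a $10,000 order
# counts as $100 with `value: 100`.
def cap_value(value: float, cap: float = 100.0) -> float:
    return min(value, cap)
```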
In addition to all of those settings, you also need to tell GrowthBook how to query the metric with SQL. Depending on the other settings, the columns you need to select may differ slightly:
  • timestamp - always required
  • value - required unless type is set to “binomial”
Plus, you need to select a column for each identifier type the metric supports. A full example:
type: duration  
userIdTypes:  
  - user_id  
  - anonymous_id  
sql: >  
  SELECT  
    created_at as timestamp,  
    user_id,  
    anonymous_id,  
    duration as value  
  FROM  
    requests
And a simple binomial metric that only supports logged-in users:
type: binomial  
userIdTypes:  
  - user_id  
sql: SELECT user_id, timestamp FROM orders
By default, if a user has more than 1 non-binomial metric row during an experiment, we sum the values together. You can override this behavior with the aggregation setting:
type: duration  
userIdTypes:  
  - user_id  
sql: >  
  SELECT  
    created_at as timestamp,  
    user_id,  
    duration as value  
  FROM  
    requests  
aggregation: MAX(value) # use MAX instead of the default SUM
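To illustrate what the default SUM and an overridden MAX(value) aggregation do per user, here is a plain-Python sketch (not GrowthBook code; the rows are made up):

```python
# Each tuple is one metric row: (user_id, value). By default, multiple
# rows for the same user are summed; `aggregation: MAX(value)` takes the
# max instead.
rows = [("u1", 10), ("u1", 30), ("u2", 5)]

def aggregate(rows, fn=sum):
    per_user = {}
    for uid, value in rows:
        per_user.setdefault(uid, []).append(value)
    return {uid: fn(values) for uid, values in per_user.items()}

summed = aggregate(rows)         # default: SUM(value)
maxed = aggregate(rows, fn=max)  # aggregation: MAX(value)
```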

Allow Creating Metrics in the UI

By default, when using config.yml, it’s not possible to create legacy metrics via the GrowthBook UI. Everything must be done directly in the config.yml file. There is an optional environment variable you can specify to change this behavior:
ALLOW_CREATE_METRICS=true
This will let you create new metrics via the UI. Metrics defined in config.yml will be marked as “Official” and not editable, while ones defined via the UI will be editable.