Configuration Reference

This document provides a detailed reference for all parameters available in the MMM config.yaml file.

Version: 2.5.1

# Data handling
raw_data_granularity: weekly # or daily
train_test_ratio: 0.9 # optional, default 0.9
data_rows: # optional
  total: 156 # optional
  start_date: '2021-01-01' # optional
  end_date: '2023-12-31' # optional

# Column mapping
ignore_cols: # optional
  - unused_column_1
date_col: date # optional, default 'date'
target_col: sales # required
extra_features_cols: # optional
  - competitor_spend
  - is_promo_week
media: # required, list of media channels
  - display_name: TV
    impressions_col: tv_impressions
    spend_col: tv_spend
  - display_name: Search
    impressions_col: search_clicks
    spend_col: search_cost

# Prophet Integration (Required)
prophet:
  include_holidays: true
  holiday_country: 'US'
  yearly_seasonality: true
  trend: true

# Model/Sampler Parameters
tune: 2000 # optional, default 2000
draws: 2000 # optional, default 2000
chains: 4 # optional, default 4
ad_stock_max_lag: 4 # optional, default 4
target_accept: 0.95 # optional, default 0.95
seed: 42 # optional

# Custom Priors (Optional Section)
custom_priors:
  intercept:
    dist: Normal
    kwargs: { mu: 0, sigma: 2 }
  beta_channel:
    dist: HalfNormal
    kwargs: { sigma: 2 }
  alpha: # Adstock decay rate
    dist: Beta
    kwargs: { alpha: 1, beta: 3 }
  lam: # Saturation parameter
    dist: Gamma
    kwargs: { alpha: 3, beta: 1 }
  likelihood:
    dist: Normal
    kwargs:
      sigma:
        dist: HalfNormal
        kwargs: { sigma: 2 }
  gamma_control: # Control variable coefficients
    dist: Normal
    kwargs: { mu: 0, sigma: 2 }

raw_data_granularity

  • Description: The time frequency of the input data rows.
  • Type: string
  • Allowed Values: "daily", "weekly"
  • Required: Yes

train_test_ratio

  • Description: Proportion of the data (ordered by date) to use for training. The rest is used for out-of-sample testing. Ignored if data_rows with start_date/end_date is used.
  • Type: float
  • Range: 0.5 to 1.0
  • Default: 0.9
  • Required: No
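
To make the date-ordered split concrete, here is a minimal sketch in plain Python. The function name `chronological_split` and the round-down rule for the training row count are assumptions for illustration; the actual implementation may round differently.

```python
from datetime import date, timedelta


def chronological_split(rows, train_test_ratio=0.9, date_key="date"):
    """Split rows into (train, test) by date order, without shuffling."""
    if not 0.5 <= train_test_ratio <= 1.0:
        raise ValueError("train_test_ratio must be between 0.5 and 1.0")
    ordered = sorted(rows, key=lambda r: r[date_key])
    # Assumption: the number of training rows is rounded down.
    n_train = int(len(ordered) * train_test_ratio)
    return ordered[:n_train], ordered[n_train:]


# 156 weekly rows of made-up data, purely for illustration.
rows = [
    {"date": date(2021, 1, 4) + timedelta(weeks=i), "sales": 100 + i}
    for i in range(156)
]
train, test = chronological_split(rows, train_test_ratio=0.9)
print(len(train), len(test))
```

Because the split follows date order, every test row is later than every training row, which is what makes the held-out portion a genuine out-of-sample check.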

data_rows

  • Description: Defines the specific subset of data to use based on row count or dates. Overrides train_test_ratio if present.
  • Type: object
  • Required: No
  • Properties:
    • total (Optional int): Total number of rows to use from the start of the dataset.
    • start_date (Optional string): Start date (inclusive, format YYYY-MM-DD).
    • end_date (Optional string): End date (inclusive, format YYYY-MM-DD).
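
The sketch below shows one way the three properties could combine. The order of operations (date window first, then the `total` cap) is an assumption, since the reference does not say how `total` interacts with the dates; `select_rows` is a hypothetical name.

```python
from datetime import date


def select_rows(rows, data_rows=None, date_key="date"):
    """Return the subset of rows described by a data_rows mapping."""
    ordered = sorted(rows, key=lambda r: r[date_key])
    if not data_rows:
        return ordered
    start = data_rows.get("start_date")
    end = data_rows.get("end_date")
    if start is not None:
        start_d = date.fromisoformat(start)  # inclusive lower bound
        ordered = [r for r in ordered if date.fromisoformat(r[date_key]) >= start_d]
    if end is not None:
        end_d = date.fromisoformat(end)  # inclusive upper bound
        ordered = [r for r in ordered if date.fromisoformat(r[date_key]) <= end_d]
    total = data_rows.get("total")
    if total is not None:
        # Assumption: total counts rows from the start of the filtered window.
        ordered = ordered[:total]
    return ordered


rows = [{"date": f"2021-01-{day:02d}", "sales": day} for day in range(1, 11)]
subset = select_rows(
    rows, {"start_date": "2021-01-03", "end_date": "2021-01-08", "total": 4}
)
print([r["date"] for r in subset])
```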

ignore_cols

  • Description: List of column names from the input CSV to exclude from processing.
  • Type: list of string
  • Required: No

date_col

  • Description: Name of the column containing dates. Dates must be in YYYY-MM-DD format.
  • Type: string
  • Default: "date"
  • Required: No

target_col

  • Description: Name of the column representing the target variable (e.g., sales, conversions).
  • Type: string
  • Required: Yes

target_type

  • Description: Specifies the nature of the target variable, influencing naming of key performance metrics.
  • Type: string
  • Allowed Values: "revenue", "conversion"
  • Default: "revenue"
  • Required: No
  • Impact:
    • If "revenue": Outputs use media_performance_roi.csv and media_performance_cost_per_revenue_unit.csv
    • If "conversion": Outputs use media_performance_conversion_efficiency.csv and media_performance_cpa.csv
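
As an illustration of the Impact rules, the lookup below maps `target_type` to the pair of output file names listed above. The function name `performance_files` is hypothetical; only the file names and the "revenue" default come from this reference.

```python
# File names are taken from the Impact list; the structure is illustrative.
OUTPUT_FILES = {
    "revenue": (
        "media_performance_roi.csv",
        "media_performance_cost_per_revenue_unit.csv",
    ),
    "conversion": (
        "media_performance_conversion_efficiency.csv",
        "media_performance_cpa.csv",
    ),
}


def performance_files(config):
    """Return the performance output file names for a parsed config dict."""
    target_type = config.get("target_type", "revenue")  # documented default
    if target_type not in OUTPUT_FILES:
        raise ValueError(
            f"target_type must be one of {sorted(OUTPUT_FILES)}, got {target_type!r}"
        )
    return OUTPUT_FILES[target_type]


print(performance_files({"target_col": "sales"}))
print(performance_files({"target_col": "signups", "target_type": "conversion"}))
```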

extra_features_cols

  • Description: List of column names representing control variables or external factors (e.g., competitor activity, promotions). These are included as linear regressors. Columns must be numeric.
  • Type: list of string
  • Required: No

media

  • Description: List defining each media channel to be included in the model.
  • Type: list of object
  • Required: Yes
  • Object Properties:
    • display_name (string, required): Name used for the channel in reports and plots.
    • impressions_col (string, required): Column name containing the volume metric for the channel (e.g., impressions, clicks, GRPs).
    • spend_col (string, required): Column name containing the cost/spend data for the channel.
    • learned_prior (float, optional): Custom prior value for this channel’s effectiveness. Overrides the default cost-based prior when specified.
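
A quick pre-flight check of the `media` list can catch missing keys or misspelled column names before a long sampling run. The sketch below is an assumption about how such a check might look, not the tool's own validation; `check_media` is a hypothetical helper that takes the parsed `media` list and the CSV header.

```python
REQUIRED_KEYS = ("display_name", "impressions_col", "spend_col")


def check_media(media, csv_columns):
    """Return a list of human-readable problems; an empty list means OK."""
    if not media:
        return ["media must list at least one channel"]
    problems = []
    for i, channel in enumerate(media):
        for key in REQUIRED_KEYS:
            if key not in channel:
                problems.append(f"media[{i}] is missing '{key}'")
        # Column references must exist in the input CSV header.
        for key in ("impressions_col", "spend_col"):
            col = channel.get(key)
            if col is not None and col not in csv_columns:
                problems.append(f"media[{i}].{key} refers to unknown column '{col}'")
    return problems


header = ["date", "sales", "tv_impressions", "tv_spend", "search_clicks", "search_cost"]
media = [
    {"display_name": "TV", "impressions_col": "tv_impressions", "spend_col": "tv_spend"},
    {"display_name": "Search", "impressions_col": "search_clicks"},  # no spend_col
]
for problem in check_media(media, header):
    print(problem)
```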

prophet

  • Description: Configuration for Prophet seasonality and trend decomposition. Prophet integration is required.
  • Type: object
  • Required: Yes
  • Properties:
    • include_holidays (bool): Whether to include holidays in the model.
    • holiday_country (string): Country code for holiday calendar (e.g., ‘US’, ‘GB’).
    • yearly_seasonality (bool): Enable yearly seasonality component.
    • weekly_seasonality (bool): Enable weekly seasonality component.
    • daily_seasonality (bool): Enable daily seasonality component.
    • trend (bool): Enable trend component.

tune

  • Description: Number of tuning (burn-in) steps for the MCMC sampler.
  • Type: int
  • Default: 2000
  • Required: No

draws

  • Description: Number of sampling steps to perform after tuning for each chain.
  • Type: int
  • Default: 2000
  • Required: No

chains

  • Description: Number of independent MCMC chains to run. Multiple chains are essential for assessing convergence.
  • Type: int
  • Default: 4
  • Required: No

ad_stock_max_lag

  • Description: Maximum number of time periods over which advertising effects can carry over.
  • Type: int
  • Range: 1 to 52 (Warning if > 26)
  • Default: 4
  • Required: No
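
To show what the lag limit controls, here is a sketch of a geometric adstock transform. It assumes an unnormalized geometric decay and counts lags 0 through `ad_stock_max_lag` inclusive; the model's actual transform, normalization, and lag convention may differ.

```python
def geometric_adstock(series, alpha, max_lag):
    """Carry over each period's media volume into later periods.

    Assumption: weight alpha**lag for lag = 0..max_lag, with no normalization.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha (decay rate) must be between 0 and 1")
    out = []
    for t in range(len(series)):
        total = 0.0
        for lag in range(max_lag + 1):
            if t - lag < 0:
                break  # no data before the start of the series
            total += (alpha ** lag) * series[t - lag]
        out.append(total)
    return out


# A single burst of 100 impressions decays over the next two periods,
# then drops to zero once it falls outside the lag window.
print(geometric_adstock([100, 0, 0, 0, 0, 0], alpha=0.5, max_lag=2))
```

A larger `ad_stock_max_lag` lets older spend keep contributing, at the cost of a longer window to evaluate per observation.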

target_accept

  • Description: Target acceptance rate for the NUTS sampler. Controls step-size adaptation.
  • Type: float
  • Range: 0.6 to 0.99
  • Default: 0.95
  • Required: No

seed

  • Description: Integer seed for the random number generator, ensuring reproducibility.
  • Type: int
  • Required: No
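
The sampler parameters above (tune through seed) can be summarized as a defaults-plus-range-check step. This is a sketch under stated assumptions: the function name `resolve_sampler_settings` is hypothetical, and treating a missing `seed` as `None` is a guess, since no default is documented for it.

```python
import warnings

# Documented defaults; seed has no documented default, so None is an assumption.
DEFAULTS = {
    "tune": 2000,
    "draws": 2000,
    "chains": 4,
    "ad_stock_max_lag": 4,
    "target_accept": 0.95,
    "seed": None,
}


def resolve_sampler_settings(config):
    """Fill in defaults and enforce the documented ranges."""
    settings = {**DEFAULTS, **{k: v for k, v in config.items() if k in DEFAULTS}}
    lag = settings["ad_stock_max_lag"]
    if not 1 <= lag <= 52:
        raise ValueError("ad_stock_max_lag must be between 1 and 52")
    if lag > 26:
        warnings.warn("ad_stock_max_lag above 26 is unusually long")
    if not 0.6 <= settings["target_accept"] <= 0.99:
        raise ValueError("target_accept must be between 0.6 and 0.99")
    return settings


print(resolve_sampler_settings({"chains": 2, "seed": 42}))
```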

custom_priors

  • Description: Allows overriding default prior distributions for model parameters. This section is optional; if omitted, default priors are used.
  • Type: object
  • Required: No

Properties (Parameter Groups):

  • intercept: Prior for the base intercept term.
  • beta_channel: Prior for the effectiveness coefficients of media channels.
  • alpha: Prior for the decay rate parameter in the adstock transformation (typically Beta distribution).
  • lam (or saturation_beta): Prior for the saturation parameter in the logistic saturation function (typically Gamma or HalfNormal).
  • likelihood: Defines the observation distribution and its parameters (e.g., sigma for Normal likelihood).
  • gamma_control: Prior for the coefficients of control variables.

Structure:

  • dist (string, required): Name of the PyMC distribution (e.g., “Normal”, “HalfNormal”, “Beta”, “Gamma”, “Laplace”, “LogNormal”).
  • kwargs (object, required): Dictionary of keyword arguments for the distribution.
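
Because a `kwargs` value can itself be a `dist`/`kwargs` pair (as with `likelihood.sigma` in the example config), prior specs are naturally processed recursively. The sketch below only renders a spec as text to show the traversal; `describe_prior` is a hypothetical helper, and turning each node into an actual PyMC distribution is not shown here.

```python
def describe_prior(spec):
    """Render a dist/kwargs spec, including nested priors, as a string."""
    if not isinstance(spec, dict) or "dist" not in spec or "kwargs" not in spec:
        raise ValueError("a prior needs both 'dist' and 'kwargs'")
    parts = []
    for name, value in spec["kwargs"].items():
        if isinstance(value, dict):
            # A nested mapping is treated as a prior on this argument.
            value = describe_prior(value)
        parts.append(f"{name}={value}")
    return f"{spec['dist']}({', '.join(parts)})"


likelihood = {
    "dist": "Normal",
    "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}},
}
print(describe_prior(likelihood))
print(describe_prior({"dist": "Beta", "kwargs": {"alpha": 1, "beta": 3}}))
```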

The configuration system supports user-friendly parameter aliases that map to internal mathematical parameter names:

User-Friendly Name   | Internal Name    | Description
saturation_beta      | lam              | Saturation parameter for logistic saturation function
adstock_alpha        | alpha            | Decay rate parameter for adstock transformation
media_coefficients   | beta_channel     | Media channel effectiveness coefficients
control_coefficients | gamma_control    | Control variable coefficients
error_sigma          | likelihood.sigma | Error term standard deviation
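
Alias handling of this kind usually amounts to a lookup before the priors are used. The sketch below assumes a flat rename in which the dotted target `likelihood.sigma` is kept as a plain key; how the real system nests it under `likelihood` is not specified here, and `resolve_aliases` is a hypothetical name.

```python
# Mapping copied from the alias table above.
ALIASES = {
    "saturation_beta": "lam",
    "adstock_alpha": "alpha",
    "media_coefficients": "beta_channel",
    "control_coefficients": "gamma_control",
    "error_sigma": "likelihood.sigma",
}


def resolve_aliases(custom_priors):
    """Return a copy of custom_priors keyed by internal parameter names."""
    resolved = {}
    for name, spec in custom_priors.items():
        internal = ALIASES.get(name, name)
        if internal in resolved:
            # Assumption: giving both an alias and its internal name is an error.
            raise ValueError(f"'{name}' duplicates the prior given for '{internal}'")
        resolved[internal] = spec
    return resolved


priors = {
    "saturation_beta": {"dist": "Gamma", "kwargs": {"alpha": 3, "beta": 1}},
    "alpha": {"dist": "Beta", "kwargs": {"alpha": 1, "beta": 3}},
}
print(sorted(resolve_aliases(priors)))
```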

The system includes automatic validation of configuration parameters:

  1. Parameter Name Validation: Checks that all parameter names are valid or have valid aliases
  2. Typo Detection: Suggests corrections for misspelled parameter names using fuzzy matching
  3. Distribution Validation: Verifies that specified distributions exist in PyMC
  4. Fail-Fast Behaviour: Invalid configurations raise immediate errors

Common Validation Errors:

Invalid Parameter Name:

ValueError: Invalid parameter 'saturaton_beta'. Did you mean 'saturation_beta'?

Invalid Distribution:

ValueError: Invalid distribution 'Norml'. Did you mean 'Normal'?

Missing Required Arguments:

ValueError: Distribution 'Beta' requires arguments: alpha, beta
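
Messages like the ones above can be produced with the standard library's `difflib`. This is a sketch, not the tool's validator: it checks against a fixed list of distribution names taken from this page rather than against PyMC itself, and `REQUIRED_ARGS` holds only the one entry confirmed by the error message above.

```python
import difflib

VALID_PARAMS = [
    "intercept", "beta_channel", "alpha", "lam", "likelihood", "gamma_control",
    "saturation_beta", "adstock_alpha", "media_coefficients",
    "control_coefficients", "error_sigma",
]
# Stand-in for "distributions that exist in PyMC" (assumption for the sketch).
KNOWN_DISTS = ["Normal", "HalfNormal", "Beta", "Gamma", "Laplace", "LogNormal"]
REQUIRED_ARGS = {"Beta": ("alpha", "beta")}  # illustrative, incomplete


def _suggest(word, options):
    """Return a ' Did you mean ...?' suffix when a close match exists."""
    match = difflib.get_close_matches(word, options, n=1)
    return f" Did you mean '{match[0]}'?" if match else ""


def validate_priors(custom_priors):
    """Raise ValueError on the first invalid entry (fail fast)."""
    for name, spec in custom_priors.items():
        if name not in VALID_PARAMS:
            raise ValueError(f"Invalid parameter '{name}'." + _suggest(name, VALID_PARAMS))
        dist = spec.get("dist")
        if dist not in KNOWN_DISTS:
            raise ValueError(f"Invalid distribution '{dist}'." + _suggest(str(dist), KNOWN_DISTS))
        required = REQUIRED_ARGS.get(dist, ())
        kwargs = spec.get("kwargs") or {}
        if any(arg not in kwargs for arg in required):
            raise ValueError(f"Distribution '{dist}' requires arguments: {', '.join(required)}")


try:
    validate_priors({"intercept": {"dist": "Norml", "kwargs": {"mu": 0, "sigma": 2}}})
except ValueError as err:
    print(err)
```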