AMMM User Guide

Version: 2.5.1

AMMM is a Python library for building Bayesian marketing mix models using PyMC. It quantifies marketing channel effectiveness and optimises budget allocation.

Key Features:

  • Bayesian inference with uncertainty quantification
  • Saturation and carryover effect modelling
  • Budget optimization and scenario planning
  • Built-in diagnostics and validation

Requirements:

  • Python 3.10+
  • 8GB RAM minimum (16GB recommended)

Install:

git clone https://github.com/tandpds/ammm.git
cd ammm
pip install -r requirements.txt

Verify:

from src.driver import MMMBaseDriverV2 # The driver class is exported at src.driver for convenience
print("AMMM installed successfully")

Minimal run (CLI):

python -u runme.py [--no-scenarios] [--scenarios "-20,-15,-10,-5,0,5,10,15,20"]

Advanced flow (Python):

from src.driver import MMMBaseDriverV2 # exported for convenience
driver = MMMBaseDriverV2(
    config_filename='demo/demo_config.yml',
    input_filename='demo/demo_data.csv',
    holidays_filename='demo/holidays.xlsx',
)
model = driver.fit_model()
r2 = driver.calculate_train_r_squared()
print(f"Model R²: {r2:.3f}")
driver.init_output()
driver.visualize()

CSV Structure:

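| Column          | Type  | Description                      | Required |
|-----------------|-------|----------------------------------|----------|
| date            | date  | Date of observation (YYYY-MM-DD) | Yes      |
| target          | float | Target variable (sales/revenue)  | Yes      |
| media_channel_* | float | Spend or impressions             | Yes      |
| control_var_*   | float | Control variables                | No       |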

Quality Requirements:

  • Minimum 52 observations (one year of weekly data)
  • No missing values in critical columns
  • Positive values for revenue and media spend
  • Consistent time frequency (daily/weekly/monthly)
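
The checks above can be sketched as a small pre-flight function to run before handing a file to the driver. This is an illustration only, not AMMM's internal validation: the function name `check_quality` and its arguments are hypothetical, and the column names are borrowed from the YAML example in this guide. The frequency check compares raw gaps between dates, so it suits daily or weekly data; monthly data would need a calendar-aware comparison.

```python
from datetime import date, timedelta


def check_quality(rows, date_col, target_col, media_cols, min_obs=52):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if len(rows) < min_obs:
        problems.append(f"only {len(rows)} observations, need at least {min_obs}")

    # Missing values in the critical columns
    for col in [date_col, target_col, *media_cols]:
        missing = sum(1 for r in rows if r.get(col) in (None, ""))
        if missing:
            problems.append(f"{missing} missing value(s) in '{col}'")

    # Positive values for target and media columns (missing values are reported above)
    for col in [target_col, *media_cols]:
        bad = sum(1 for r in rows if r.get(col) not in (None, "") and float(r[col]) <= 0)
        if bad:
            problems.append(f"{bad} non-positive value(s) in '{col}'")

    # Consistent frequency: every gap between consecutive dates must be identical
    dates = sorted(date.fromisoformat(r[date_col]) for r in rows if r.get(date_col))
    gaps = {later - earlier for earlier, later in zip(dates, dates[1:])}
    if len(gaps) > 1:
        problems.append(f"inconsistent time frequency: {len(gaps)} distinct gaps between dates")
    return problems


# Synthetic weekly data: 52 rows, one media column
start = date(2024, 1, 1)
rows = [
    {"date": (start + timedelta(weeks=i)).isoformat(), "revenue": 1000.0 + i, "tv_spend": 200.0}
    for i in range(52)
]
print(check_quality(rows, "date", "revenue", ["tv_spend"]))  # []
```

An empty list means the data meets the four requirements; each string in a non-empty list names one failed check.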

Basic YAML structure:

raw_data_granularity: weekly
date_col: "date"
target_col: "revenue"
media:
  - display_name: "TV"
    impressions_col: "tv_impressions"
    spend_col: "tv_spend"
prophet:
  include_holidays: true
  holiday_country: 'US'
  yearly_seasonality: true
  trend: true
tune: 2000
draws: 2000
chains: 4
ad_stock_max_lag: 8
target_accept: 0.95
seed: 42

See Configuration Reference for all parameters.
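
To give a feel for what ad_stock_max_lag bounds, here is a minimal sketch of geometric adstock (carryover) followed by a simple saturation curve. Both transforms are common textbook forms and are assumptions here: this guide does not state which functional forms AMMM uses, and the names `geometric_adstock`, `saturate`, `alpha` and `half_saturation` are hypothetical. Whether the lag bound is inclusive is also an assumption.

```python
import numpy as np


def geometric_adstock(spend, alpha, max_lag):
    """Carry spend over from up to `max_lag` earlier periods with geometric decay `alpha`."""
    spend = np.asarray(spend, dtype=float)
    # Weight 1 for the current period, then alpha, alpha**2, ... up to alpha**max_lag
    weights = alpha ** np.arange(max_lag + 1)
    # Truncating the full convolution to the input length keeps the result causal
    return np.convolve(spend, weights)[: len(spend)]


def saturate(x, half_saturation):
    """Diminishing returns: response equals 0.5 at `half_saturation` and approaches 1."""
    x = np.asarray(x, dtype=float)
    return x / (x + half_saturation)


spend = np.array([100.0, 0.0, 0.0, 0.0])
carried = geometric_adstock(spend, alpha=0.5, max_lag=2)
print(carried.tolist())  # [100.0, 50.0, 25.0, 0.0]
print(saturate(carried, half_saturation=50.0).round(3).tolist())
```

With max_lag=2 the single burst of spend stops contributing after two further periods, which is the role a larger value such as 8 plays over a longer window.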

Standard fitting:

model = driver.fit_model()

Note: Data and configuration are validated during driver initialisation, as part of internal preprocessing, and validation failures raise exceptions with descriptive messages.

Generate predictions:

import numpy as np

predictions = driver.predict_on_test()
# Mean and 95% interval, reduced over axis 0
mean_pred = predictions.mean(axis=0)
lower_bound = np.percentile(predictions, 2.5, axis=0)
upper_bound = np.percentile(predictions, 97.5, axis=0)
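
The same summary can be tried without a fitted model. The sketch below uses a synthetic array in place of driver.predict_on_test() and assumes the return value is shaped (draws, observations), which the axis=0 reductions above imply but this guide does not state outright. It also computes interval coverage against synthetic "actual" values, a quick sanity check that is not part of the AMMM API.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic stand-in for driver.predict_on_test(): 1,000 draws for 12 test periods
predictions = rng.normal(loc=100.0, scale=10.0, size=(1000, 12))

mean_pred = predictions.mean(axis=0)
# Passing both percentiles at once returns an array of shape (2, n_observations)
lower_bound, upper_bound = np.percentile(predictions, [2.5, 97.5], axis=0)

# Share of synthetic observed values that fall inside the 95% interval
actual = rng.normal(loc=100.0, scale=10.0, size=12)
coverage = np.mean((actual >= lower_bound) & (actual <= upper_bound))
print(mean_pred.shape, lower_bound.shape, f"coverage={coverage:.2f}")
```

Coverage far below 0.95 on held-out data would suggest the intervals are too narrow.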

Calculate performance:

r2 = driver.calculate_train_r_squared()
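
For reference, the standard coefficient of determination can be computed by hand as below. This is the textbook definition; whether calculate_train_r_squared() applies it to the posterior mean prediction or to some other point estimate is an assumption, and `r_squared` is a hypothetical helper name.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total variation
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R²: {r_squared(actual, predicted):.3f}")  # R²: 0.980
```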

Budget scenarios are produced by the pipeline when running via the CLI. Use --scenarios to specify percentage changes and review results/csv/budget_scenario_results.csv for allocations and impacts.

Example:

python -u runme.py --scenarios "-20,-10,0,10,20"

Then inspect the CSV described in the Output Schema for results.
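
As a rough illustration of what the --scenarios string encodes, the sketch below parses it into percentage changes and maps each to a total budget. The base budget of 1,000,000 is hypothetical, the helper names are not part of AMMM, and the sketch assumes the percentages scale the total budget; how the pipeline applies them internally is not described in this guide.

```python
def parse_scenarios(arg):
    """Turn a string such as "-20,-10,0,10,20" into a list of percentage changes."""
    return [float(part) for part in arg.split(",") if part.strip()]


def scenario_budgets(base_budget, pct_changes):
    """Map each percentage change to the total budget it implies."""
    return {pct: base_budget * (100 + pct) / 100 for pct in pct_changes}


changes = parse_scenarios("-20,-10,0,10,20")
budgets = scenario_budgets(1_000_000, changes)
for pct, budget in budgets.items():
    print(f"{pct:+.0f}% -> {budget:,.0f}")
```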

NEW: Plan budgets across multiple time periods (e.g., 13 weeks, 12 months) with automatic seasonality adjustments.

Enable via CLI:

# 13-week planning with seasonality (default)
python runme.py --multiperiod
# Custom planning horizon (26 weeks)
python runme.py --multiperiod --multiperiod-weeks 26
# Without seasonality adjustments
python runme.py --multiperiod --no-seasonality

Via Python API:

import src as ammm
# After model fitting
ammm.optimize_marketing_budget(
    model=driver.model,
    data=driver.processed_data,
    config=driver.config,
    results_dir=driver.results_dir,
    multiperiod_mode=True,
    use_seasonality=True,
    n_time_periods=13,
    frequency='W'
)

Key Features:

  • Time-varying budget allocation based on expected channel effectiveness
  • Prophet seasonality integration (yearly + weekly patterns)
  • Seasonal effectiveness multipliers (1.0 = baseline, >1.0 = more effective, <1.0 = less effective)
  • Support for per-period budget constraints (min/max per period)
  • Backward compatible (existing code unchanged)

Outputs:

  1. CSV: results/csv/multiperiod_optimization_results.csv

    • Columns: period, period_date, channel, budget, contribution, seasonal_multiplier, roi
    • Contains results for every period × channel combination
  2. PNG Visualizations (5 files):

    • multiperiod_budget_heatmap.png - Budget allocation heatmap across periods and channels
    • multiperiod_contribution_over_time.png - Contribution trends with stacked area chart
    • multiperiod_seasonal_patterns.png - Seasonal effectiveness multipliers by channel
    • multiperiod_period_comparison.png - Side-by-side budget vs contribution comparison
    • multiperiod_budget_vs_contribution.png - Dual-axis trends with ROI overlays
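
One way to summarise the results CSV is to total budget and contribution per channel. The sketch below uses the documented column names, but the rows are made up for illustration and read from an in-memory string rather than the real file. It also assumes roi equals contribution divided by budget, which this guide does not state explicitly.

```python
import csv
import io
from collections import defaultdict

# Made-up rows using the documented columns; real values come from
# results/csv/multiperiod_optimization_results.csv
sample = io.StringIO(
    "period,period_date,channel,budget,contribution,seasonal_multiplier,roi\n"
    "1,2025-01-06,TV,500000,900000,1.10,1.8\n"
    "1,2025-01-06,Search,300000,750000,0.95,2.5\n"
    "2,2025-01-13,TV,450000,765000,1.00,1.7\n"
    "2,2025-01-13,Search,350000,840000,1.05,2.4\n"
)

totals = defaultdict(lambda: {"budget": 0.0, "contribution": 0.0})
for row in csv.DictReader(sample):
    totals[row["channel"]]["budget"] += float(row["budget"])
    totals[row["channel"]]["contribution"] += float(row["contribution"])

for channel, t in totals.items():
    # Assumption: ROI over the horizon = total contribution / total budget
    print(f"{channel}: budget={t['budget']:,.0f} roi={t['contribution'] / t['budget']:.2f}")
```

To run it against real output, replace `sample` with an open file handle for the CSV path listed above.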

Advanced Options:

from src.driver.opt import optimize_multiperiod_budget
results_df = optimize_multiperiod_budget(
    model=driver.model,
    data=driver.processed_data,
    config=driver.config,
    results_dir='results',
    n_periods=13,
    total_budget=12_500_000,  # £12.5M across all periods
    use_seasonality=True,
    frequency='W',
    start_date='2025-01-06',
    period_budget_limits=(800_000, 1_200_000)  # £800K-£1.2M per week
)
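
Before calling the optimiser it can help to confirm that the total budget is achievable under the per-period limits. The check below is a sketch, not part of AMMM: `check_period_limits` is a hypothetical helper, and it assumes the limits apply to each period's total across all channels, as the "per week" comment above suggests. Whether optimize_multiperiod_budget performs a similar check itself is not documented here.

```python
def check_period_limits(total_budget, n_periods, period_budget_limits):
    """Return (feasible, even_split) for a total budget spread over n_periods."""
    low, high = period_budget_limits
    # The total must lie between "every period at its minimum" and "every period at its maximum"
    feasible = n_periods * low <= total_budget <= n_periods * high
    return feasible, total_budget / n_periods


feasible, even_split = check_period_limits(12_500_000, 13, (800_000, 1_200_000))
print(feasible, f"{even_split:,.0f}")  # True 961,538
```

With the values from the example, the feasible range is £10.4M to £15.6M, so £12.5M leaves room for the optimiser to shift spend between weeks.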

When to Use:

  • Planning budgets for multiple weeks/months ahead
  • Business has significant seasonality (retail, travel, etc.)
  • Need to optimize across a planning horizon (Q1, full year)
  • Want to respect time-varying budget constraints

See Multi-Period Optimization Guide for detailed usage and examples.

Generate all plots:

driver.init_output()
driver.visualize()

Individual plots:

driver.plot_model_trace()
driver.plot_posterior_predictive()
driver.plot_components_contributions()
driver.plot_waterfall_components_decomposition()

Common errors:

# File not found
try:
    driver = MMMBaseDriverV2(config_filename='config.yml', ...)
except FileNotFoundError as e:
    print(f"Error: {e}")

# Data validation
try:
    driver = MMMBaseDriverV2(...)
except DataValidationError as e:
    print(f"Data issue: {e.details}")

# Model not fitted
try:
    results = driver.predict_on_test()
except ModelNotFittedError:
    driver.fit_model()
    results = driver.predict_on_test()  # retry once the model is fitted

Convergence issues:

driver.config['tune'] = 2000
driver.config['target_accept'] = 0.99
model = driver.fit_model()
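
After refitting, convergence is usually judged by R-hat, which compares variation between chains with variation within them; values close to 1 indicate that the chains agree. The sketch below implements the classic Gelman-Rubin form on synthetic chains purely for illustration. ArviZ reports a rank-normalised split variant, so its numbers will differ slightly, and `gelman_rubin` is a hypothetical helper rather than an AMMM function.

```python
import numpy as np


def gelman_rubin(chains):
    """Classic (non-split) R-hat for one parameter; `chains` has shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_draws = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()           # W: mean within-chain variance
    between = n_draws * chains.mean(axis=1).var(ddof=1)  # B: scaled variance of chain means
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))       # four chains sampling the same distribution
stuck = mixed + np.arange(4)[:, None]    # chains centred on different values
print(f"mixed: {gelman_rubin(mixed):.3f}, stuck: {gelman_rubin(stuck):.3f}")
```

The well-mixed chains give a value near 1, while the shifted chains give a clearly larger one, which is the pattern to look for in the model diagnostics.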

Memory issues:

driver.config['chains'] = 2
driver.config['draws'] = 500
model = driver.fit_model()
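
A back-of-envelope estimate shows why these two settings matter: stored posterior draws scale with chains × draws. The sketch below is a rough lower bound only. It ignores sampler and computation-graph overhead, and `n_values` (the number of scalar values kept per draw) is a hypothetical figure, not a property of any particular AMMM model.

```python
def posterior_megabytes(chains, draws, n_values, bytes_per_value=8):
    """Rough size of stored posterior draws, ignoring sampler and graph overhead."""
    return chains * draws * n_values * bytes_per_value / 1e6


# n_values is hypothetical: scalar values stored per draw (parameters plus deterministics)
before = posterior_megabytes(chains=4, draws=2000, n_values=5000)
after = posterior_megabytes(chains=2, draws=500, n_values=5000)
print(f"{before:.0f} MB -> {after:.0f} MB ({before / after:.0f}x smaller)")  # 320 MB -> 40 MB (8x smaller)
```

Fewer draws also mean noisier posterior summaries, so treat the reduced settings as a way to get a run through rather than as final values.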

Cache health (optional): Use the cache monitor utilities to inspect or tidy the PyTensor cache if you switch model shapes frequently.

from src.utils.cache_monitor import CacheMonitor
cm = CacheMonitor()
info = cm.get_cache_info()
print(info)
# Optimise or clear cache when needed
cm.optimize_cache()
cm.clear_cache(confirm=True) # irreversible, deletes compiled functions

Save and reload a fitted model:

model_path = 'saved_model.nc'
driver.model.save(model_path)
driver_new = MMMBaseDriverV2(...)
loaded_model = driver_new.fit_model(model_filename=model_path)

Batch processing:

import pandas as pd

configs = ['config1.yml', 'config2.yml', 'config3.yml']
results = []
for config_file in configs:
    driver = MMMBaseDriverV2(
        config_filename=config_file,
        input_filename='data.csv',
        holidays_filename='holidays.xlsx'
    )
    model = driver.fit_model()
    r2 = driver.calculate_train_r_squared()
    results.append({'config': config_file, 'r2': r2})
print(pd.DataFrame(results))

You can run the full pipeline via the runner or convenience scripts:

CLI (recommended):

python -u runme.py [--no-scenarios] [--scenarios "-20,-15,-10,-5,0,5,10,15,20"]

Scripts:

  • Linux: ./run_pipeline_linux.sh [--no-scenarios] [--scenarios "-10,-5,0,5,10"]
  • Windows: run_pipeline_windows.bat [--no-scenarios] [--scenarios "-10,-5,0,5,10"]

Notes:

  • runme.py loads .env at startup, before defaults are applied, so environment variables set there (such as the LLM and JAX settings) apply to the entire run.
  • Use --no-scenarios to skip budget planning.

AMMM supports PyMC’s JAX backend for faster sampling on GPU.

  1. Install CUDA-enabled JAX and NumPyro (CUDA 12 wheels):
pip install -U pip
pip uninstall -y jax jaxlib
pip install -U "jax[cuda12]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install -U numpyro
  2. Set env vars (preferably in .env):
JAX_PLATFORMS=cuda
AMMM_USE_JAX=1
AMMM_JAX_CHAIN_METHOD=vectorized
# Optional
XLA_PYTHON_CLIENT_PREALLOCATE=false
XLA_PYTHON_CLIENT_MEM_FRACTION=0.8
  3. Verify GPU:
python -c "import jax; print(jax.devices())" # Expect [CudaDevice(id=0)]

If JAX or NumPyro is unavailable, AMMM automatically falls back to CPU sampling with pm.sample().
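
The fallback can be pictured as a check of the environment flag plus an import probe. The sketch below is an assumption about the shape of that logic, not AMMM's actual code: `choose_sampler` and its return labels are hypothetical, and only the AMMM_USE_JAX variable name comes from this guide.

```python
import importlib.util
import os


def _is_installed(name):
    """True if the module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def choose_sampler(env=None, has_module=_is_installed):
    """Return "numpyro" when JAX sampling is requested and importable, else "pymc"."""
    env = os.environ if env is None else env
    wants_jax = env.get("AMMM_USE_JAX", "0") == "1"
    if wants_jax and has_module("jax") and has_module("numpyro"):
        return "numpyro"
    return "pymc"  # CPU fallback, i.e. the default pm.sample() path


# Flag set but neither package available: the CPU path is chosen
print(choose_sampler({"AMMM_USE_JAX": "1"}, has_module=lambda name: False))  # pymc
```

Passing `env` and `has_module` explicitly keeps the function easy to test without changing the real environment.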

AMMM generates AI-powered business insights that focus on commercial value and statistical rigour.

Enable in YAML:

agentic_report: true

Set LLM provider in .env (OpenAI preferred if OPENAI_API_KEY is set):

# OpenAI
OPENAI_API_KEY=...
LLM_PROVIDER=openai
LLM_MODEL=gpt-5.2 # e.g., gpt-5.2, gpt-5, gpt-4o
# Gemini (alternative to OpenAI; set only one LLM_PROVIDER)
GEMINI_API_KEY=...
LLM_PROVIDER=gemini

Other useful settings:

LLM_MAX_TOKENS=8000
LLM_TEMPERATURE=0.3
LLM_REASONING_EFFORT=medium
LLM_ENABLE_CACHE=true
LLM_DAILY_LIMIT=10
LLM_MONTHLY_LIMIT=50

Features:

  1. ROI-based Performance Classification

    • Top 2 channels: “Top Performer”
    • ROI > 1: “Solid Performer”
    • Otherwise: “Review Needed”
    • Performance tables display actual ROI values (from media_contribution_per_spend.csv)
  2. Statistical Quality Flags

    Channels are automatically flagged when statistical quality concerns are detected:

    | Flag             | Threshold                   | Meaning                                          | Action                                                   |
    |------------------|-----------------------------|--------------------------------------------------|----------------------------------------------------------|
    | High ROI         | ROI ≥ 10                    | Potential selection bias or data sparsity        | Verify channel targeting and spend levels                |
    | Wide Uncertainty | p95/p5 ≥ 5                  | Unreliable posterior estimates                   | Increase spend or collect more data                      |
    | Low Spend        | <1% of total budget         | Inference unstable due to sparse data            | Scale up or consolidate with similar channels            |
    | Selection Bias   | Propensity-targeted channel | Endogeneity risk (targets high-propensity users) | Interpret ROI cautiously, consider incrementality tests  |

    Propensity-targeted channels (automatically flagged for selection bias):

    • Retail media: amazon, retail_media
    • Remarketing: remarketing, google_rmkt, facebook_rmkt, display_rmkt
    • Owned channels: crm, email, mailer, shop_app

    How flags appear in reports:

    • Channel Performance Summary table includes a “Caveats” column with short tags
    • Flagged channels show [!rank] markers next to their names
    • Detailed explanations appear in “Notes” section below the table
  3. Commercial Insights Focus

    • Recommendations emphasize commercial insights inferred from data
    • Second-order effects highlighted (auction dynamics, spillovers, saturation, seasonality)
    • Each recommendation tied to ROI bands, saturation levels, or contribution share
    • Generic advice (e.g., “Implement A/B testing”) explicitly excluded
  4. Executive-ready Outputs

    • Technical report: results/markdown/ammm_report.md (evidence-backed diagnostics)
    • Business report: results/markdown/business_report.md (executive summary with AI insights)
    • Interpretations JSON: results/json/llm_interpretations.json (structured data for downstream use)
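
The classification rules and flag thresholds listed above can be sketched in a few lines. This is an illustration of the stated rules, not AMMM's implementation: the function names are hypothetical, matching channel names by substring is an assumption, and so is skipping the uncertainty ratio when p5 is not positive. The ROI figures in the usage lines are invented.

```python
PROPENSITY_KEYWORDS = (
    "amazon", "retail_media", "remarketing", "google_rmkt", "facebook_rmkt",
    "display_rmkt", "crm", "email", "mailer", "shop_app",
)


def classify(roi_by_channel):
    """Top two channels by ROI are Top Performers; ROI > 1 is Solid; the rest need review."""
    ranked = sorted(roi_by_channel, key=roi_by_channel.get, reverse=True)
    labels = {}
    for rank, channel in enumerate(ranked):
        if rank < 2:
            labels[channel] = "Top Performer"
        elif roi_by_channel[channel] > 1:
            labels[channel] = "Solid Performer"
        else:
            labels[channel] = "Review Needed"
    return labels


def quality_flags(channel, roi, p5, p95, spend_share):
    """Apply the thresholds from the flag table to a single channel."""
    flags = []
    if roi >= 10:
        flags.append("High ROI")
    if p5 > 0 and p95 / p5 >= 5:  # ratio is undefined for non-positive p5, so it is skipped
        flags.append("Wide Uncertainty")
    if spend_share < 0.01:
        flags.append("Low Spend")
    if any(key in channel.lower() for key in PROPENSITY_KEYWORDS):
        flags.append("Selection Bias")
    return flags


# Invented ROI values for illustration
roi = {"tv": 2.4, "search": 3.1, "display": 1.3, "email": 12.0, "radio": 0.7}
print(classify(roi))
print(quality_flags("email", roi=12.0, p5=2.0, p95=30.0, spend_share=0.005))
```

In this example the email channel ranks first on ROI yet collects all four flags, which is the situation the "Caveats" column is meant to surface.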

Policy:

  • Only two markdown reports generated per run (technical and business)
  • No additional markdown files created by LLM module
  • All claims in reports backed by model outputs with citations to source CSVs

Key outputs under results/:

  • markdown/ammm_report.md, business_report.md
  • json/llm_interpretations.json (if LLM enabled)
  • csv/ — summaries, diagnostics, performance metrics
  • model.nc — saved ArviZ InferenceData (NetCDF)
  • model.dill — optional full model object (binary)

Optional CSVs:

  • media_conversion_efficiency.csv and media_cost_per_conversion.csv are optional. If either file is missing, no warning is emitted at high log levels and report generation continues.

Core modules:

  • src/driver - Exports MMMBaseDriverV2 (implementation in src/driver/base.py)
  • src/core/mmm_model_v2.py - Model implementation
  • src/prepro/ - Data preprocessing
  • src/sketch/ - Visualisation

See inline docstrings for detailed API documentation.

  • Sample code: demo/ directory
  • Common issues: TROUBLESHOOTING.md
  • Bug reports: GitHub issues

See LICENSE.md.