AMMM User Guide
Version: 2.5.1
Overview
AMMM is a Python library for building Bayesian marketing mix models using PyMC. It quantifies marketing channel effectiveness and optimises budget allocation.
Key Features:
- Bayesian inference with uncertainty quantification
- Saturation and carryover effect modelling
- Budget optimization and scenario planning
- Built-in diagnostics and validation
Installation
Requirements:
- Python 3.10+
- 8GB RAM minimum (16GB recommended)
Install:

    git clone https://github.com/tandpds/ammm.git
    cd ammm
    pip install -r requirements.txt

Verify:

    # The driver class is exported at src.driver for convenience
    from src.driver import MMMBaseDriverV2
    print("AMMM installed successfully")

Quick Start
Minimal run (CLI):

    python -u runme.py [--no-scenarios] [--scenarios "-20,-15,-10,-5,0,5,10,15,20"]

Advanced flow (Python):

    from src.driver import MMMBaseDriverV2  # exported for convenience

    driver = MMMBaseDriverV2(
        config_filename='demo/demo_config.yml',
        input_filename='demo/demo_data.csv',
        holidays_filename='demo/holidays.xlsx',
    )

    model = driver.fit_model()
    r2 = driver.calculate_train_r_squared()
    print(f"Model R²: {r2:.3f}")

    driver.init_output()
    driver.visualize()

Data Requirements
CSV Structure:
| Column | Type | Description | Required |
|---|---|---|---|
| date | date | Date of observation (YYYY-MM-DD) | Yes |
| target | float | Target variable (sales/revenue) | Yes |
| media_channel_* | float | Spend or impressions | Yes |
| control_var_* | float | Control variables | No |
Quality Requirements:
- Minimum 52 observations (one year weekly data)
- No missing values in critical columns
- Positive values for revenue and media spend
- Consistent time frequency (daily/weekly/monthly)
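The quality requirements above can be pre-checked before handing data to AMMM. The following is a minimal sketch using only the standard library; `check_rows` and its parameters are hypothetical names made up for illustration, not part of AMMM, and the date-spacing check only suits daily or weekly data (calendar months differ in length).

```python
from datetime import date, timedelta


def check_rows(rows, date_col="date", positive_cols=("target",), min_obs=52):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(rows) < min_obs:
        problems.append(f"only {len(rows)} observations, need at least {min_obs}")
    required = (date_col, *positive_cols)
    if any(not str(row.get(col) or "").strip() for row in rows for col in required):
        problems.append("missing values in required columns")
        return problems  # the remaining checks need complete rows
    for col in positive_cols:
        if any(float(row[col]) <= 0 for row in rows):
            problems.append(f"non-positive values in {col}")
    # A consistent frequency means every gap between sorted dates is identical.
    dates = sorted(date.fromisoformat(row[date_col]) for row in rows)
    gaps = {later - earlier for earlier, later in zip(dates, dates[1:])}
    if len(gaps) > 1:
        problems.append("inconsistent spacing between dates")
    return problems


# Made-up weekly data: 52 rows with strictly positive targets.
start = date(2024, 1, 1)
rows = [
    {"date": (start + timedelta(weeks=i)).isoformat(), "target": str(1000 + i)}
    for i in range(52)
]
print(check_rows(rows))
print(check_rows(rows[:10]))
```

Rows in this shape are what `csv.DictReader` yields, so the same function can be pointed at a real input file; media spend columns can be added to `positive_cols`.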
Configuration
Basic YAML structure:

    raw_data_granularity: weekly
    date_col: "date"
    target_col: "revenue"

    media:
      - display_name: "TV"
        impressions_col: "tv_impressions"
        spend_col: "tv_spend"

    prophet:
      include_holidays: true
      holiday_country: 'US'
      yearly_seasonality: true
      trend: true

    tune: 2000
    draws: 2000
    chains: 4
    ad_stock_max_lag: 8
    target_accept: 0.95
    seed: 42

See Configuration Reference for all parameters.
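To make the sampler settings above concrete, the sketch below works out how many posterior draws a run retains (chains × draws; PyMC discards tuning steps by default) and applies a few generic sanity checks. `check_sampler_settings` is a hypothetical helper written for this illustration, and the checks are assumptions rather than AMMM's own validation.

```python
def check_sampler_settings(config: dict) -> int:
    """Validate basic sampler settings and return the total retained draws."""
    for name in ("tune", "draws", "chains"):
        value = config[name]
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    target_accept = config["target_accept"]
    if not 0.0 < target_accept < 1.0:
        raise ValueError(f"target_accept must lie in (0, 1), got {target_accept!r}")
    # Tuning steps are not kept, so only chains * draws samples remain.
    return config["chains"] * config["draws"]


settings = {"tune": 2000, "draws": 2000, "chains": 4, "target_accept": 0.95}
total_draws = check_sampler_settings(settings)
print(total_draws)
```

With the example configuration this gives 4 × 2000 = 8000 retained draws.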
Model Fitting
Standard fitting:
Note: Data and configuration are validated during driver initialisation (internal preprocessing), so invalid inputs raise descriptive errors before fitting starts. Fit the model with:

    model = driver.fit_model()

Predictions
Generate predictions:

    import numpy as np

    predictions = driver.predict_on_test()
    mean_pred = predictions.mean(axis=0)
    # 95% interval from the 2.5th and 97.5th percentiles along axis 0
    lower_bound = np.percentile(predictions, 2.5, axis=0)
    upper_bound = np.percentile(predictions, 97.5, axis=0)

Calculate training-set performance:

    r2 = driver.calculate_train_r_squared()

Budget Optimisation
Budget scenarios are produced by the pipeline when it is run via the CLI. Use --scenarios to specify percentage changes, then review results/csv/budget_scenario_results.csv for allocations and impacts.
Example:

    python -u runme.py --scenarios "-20,-10,0,10,20"

Then inspect the CSV described in the Output Schema for results.
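As a rough illustration of what the scenario string encodes, the sketch below turns each percentage change into a scaled total budget. It assumes the percentages are applied to a single baseline budget; `scenario_budgets` and the baseline figure are made up for this example and do not describe AMMM's internal logic.

```python
def scenario_budgets(scenarios: str, base_budget: float) -> dict[int, float]:
    """Map each percentage change in the scenario string to a scaled budget."""
    budgets = {}
    for token in scenarios.split(","):
        pct = int(token.strip())
        # Multiply before dividing to keep round percentages exact.
        budgets[pct] = base_budget * (100 + pct) / 100
    return budgets


result = scenario_budgets("-20,-10,0,10,20", base_budget=100_000.0)
for pct, budget in result.items():
    print(f"{pct:+d}% -> {budget:,.0f}")
```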
Multi-Period Budget Optimization
NEW: Plan budgets across multiple time periods (e.g., 13 weeks, 12 months) with automatic seasonality adjustments.
Enable via CLI:

    # 13-week planning with seasonality (default)
    python runme.py --multiperiod

    # Custom planning horizon (26 weeks)
    python runme.py --multiperiod --multiperiod-weeks 26

    # Without seasonality adjustments
    python runme.py --multiperiod --no-seasonality

Via Python API:

    import src as ammm

    # After model fitting
    ammm.optimize_marketing_budget(
        model=driver.model,
        data=driver.processed_data,
        config=driver.config,
        results_dir=driver.results_dir,
        multiperiod_mode=True,
        use_seasonality=True,
        n_time_periods=13,
        frequency='W',
    )

Key Features:
- Time-varying budget allocation based on expected channel effectiveness
- Prophet seasonality integration (yearly + weekly patterns)
- Seasonal effectiveness multipliers (1.0 = baseline, >1.0 = more effective, <1.0 = less effective)
- Support for per-period budget constraints (min/max per period)
- Backward compatible (existing code unchanged)
Outputs:
- CSV: results/csv/multiperiod_optimization_results.csv
  - Columns: period, period_date, channel, budget, contribution, seasonal_multiplier, roi
  - Contains results for every period × channel combination
- PNG Visualizations (5 files):
  - multiperiod_budget_heatmap.png - Budget allocation heatmap across periods and channels
  - multiperiod_contribution_over_time.png - Contribution trends with stacked area chart
  - multiperiod_seasonal_patterns.png - Seasonal effectiveness multipliers by channel
  - multiperiod_period_comparison.png - Side-by-side budget vs contribution comparison
  - multiperiod_budget_vs_contribution.png - Dual-axis trends with ROI overlays
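Because the results CSV holds one row per period × channel combination, a per-period total is a simple group-and-sum. The sketch below shows this with the standard library; the column names come from the list above, while the rows themselves are made-up sample data and `budget_by_period` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that only mimic the documented column layout.
SAMPLE = """period,period_date,channel,budget,contribution,seasonal_multiplier,roi
1,2025-01-06,tv,600000,900000,1.10,1.5
1,2025-01-06,search,400000,800000,0.95,2.0
2,2025-01-13,tv,500000,700000,1.00,1.4
2,2025-01-13,search,450000,810000,0.90,1.8
"""


def budget_by_period(handle) -> dict[int, float]:
    """Sum the budget column across channels for each period."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        totals[int(row["period"])] += float(row["budget"])
    return dict(totals)


totals = budget_by_period(io.StringIO(SAMPLE))
print(totals)
```

To run it against real output, replace the `io.StringIO(SAMPLE)` argument with a file opened via `open("results/csv/multiperiod_optimization_results.csv", newline="")`.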
Advanced Options:

    from src.driver.opt import optimize_multiperiod_budget

    results_df = optimize_multiperiod_budget(
        model=driver.model,
        data=driver.processed_data,
        config=driver.config,
        results_dir='results',
        n_periods=13,
        total_budget=12_500_000,  # £12.5M across all periods
        use_seasonality=True,
        frequency='W',
        start_date='2025-01-06',
        period_budget_limits=(800_000, 1_200_000),  # £800K-£1.2M per week
    )

When to Use:
- Planning budgets for multiple weeks/months ahead
- Business has significant seasonality (retail, travel, etc.)
- Need to optimize across a planning horizon (Q1, full year)
- Want to respect time-varying budget constraints
See Multi-Period Optimization Guide for detailed usage and examples.
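When combining a total budget with per-period limits, as in the advanced example in this section, it helps to confirm that the numbers are compatible before optimising. The check below is a sketch under the assumption that the limits bound each period's total spend across all channels; `budget_is_feasible` is a hypothetical helper, not an AMMM function.

```python
def budget_is_feasible(total_budget: float, n_periods: int, period_limits: tuple) -> bool:
    """True if the total can be split so every period stays within its limits."""
    low, high = period_limits
    return n_periods * low <= total_budget <= n_periods * high


# Figures from the advanced example: 13 periods, 800K to 1.2M per period.
feasible = budget_is_feasible(12_500_000, 13, (800_000, 1_200_000))
print(feasible)  # True, since 10.4M <= 12.5M <= 15.6M
```

A total outside the range `n_periods * low` to `n_periods * high` cannot satisfy the limits in every period, whatever the allocation.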
Visualisation
Generate all plots:

    driver.init_output()
    driver.visualize()

Individual plots:

    driver.plot_model_trace()
    driver.plot_posterior_predictive()
    driver.plot_components_contributions()
    driver.plot_waterfall_components_decomposition()

Error Handling
Common errors:

    # File not found
    try:
        driver = MMMBaseDriverV2(config_filename='config.yml', ...)
    except FileNotFoundError as e:
        print(f"Error: {e}")

    # Data validation
    try:
        driver = MMMBaseDriverV2(...)
    except DataValidationError as e:
        print(f"Data issue: {e.details}")

    # Model not fitted: fit first, then retry the prediction
    try:
        results = driver.predict_on_test()
    except ModelNotFittedError:
        driver.fit_model()
        results = driver.predict_on_test()

Troubleshooting
Convergence issues:

    driver.config['tune'] = 2000
    driver.config['target_accept'] = 0.99
    model = driver.fit_model()

Memory issues:

    driver.config['chains'] = 2
    driver.config['draws'] = 500
    model = driver.fit_model()

Cache health (optional): Use the cache monitor utilities to inspect or tidy the PyTensor cache if you switch model shapes frequently.

    from src.utils.cache_monitor import CacheMonitor

    cm = CacheMonitor()
    info = cm.get_cache_info()
    print(info)

    # Optimise or clear the cache when needed
    cm.optimize_cache()
    cm.clear_cache(confirm=True)  # irreversible, deletes compiled functions

Advanced Usage
Load pre-fitted model:

    model_path = 'saved_model.nc'
    driver.model.save(model_path)

    driver_new = MMMBaseDriverV2(...)
    loaded_model = driver_new.fit_model(model_filename=model_path)

Batch processing:

    import pandas as pd

    configs = ['config1.yml', 'config2.yml', 'config3.yml']
    results = []

    for config_file in configs:
        driver = MMMBaseDriverV2(
            config_filename=config_file,
            input_filename='data.csv',
            holidays_filename='holidays.xlsx',
        )

        model = driver.fit_model()
        r2 = driver.calculate_train_r_squared()
        results.append({'config': config_file, 'r2': r2})

    print(pd.DataFrame(results))

Running the Pipeline
You can run the full pipeline via the runner or the convenience scripts.
CLI (recommended):

    python -u runme.py [--no-scenarios] [--scenarios "-20,-15,-10,-5,0,5,10,15,20"]

Scripts:
- Linux: ./run_pipeline_linux.sh [--no-scenarios] [--scenarios "-10,-5,0,5,10"]
- Windows: run_pipeline_windows.bat [--no-scenarios] [--scenarios "-10,-5,0,5,10"]
Notes:
- runme.py loads .env at startup before defaults, so environment variables set there apply to the entire run (LLM/JAX settings).
- Use --no-scenarios to skip budget planning.
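For readers unfamiliar with .env files, the sketch below shows one simple way such a file could be read into an environment mapping. It is not AMMM's loader: `load_env_file` is a hypothetical helper, it assumes plain `KEY=VALUE` lines, it does not strip trailing inline comments, and the choice to leave already-set variables untouched is an assumption.

```python
import os
import tempfile


def load_env_file(path: str, environ: dict) -> dict:
    """Copy KEY=VALUE pairs into `environ` without overriding existing keys."""
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip().strip("\"'")
            if key not in environ:
                environ[key] = value
                loaded[key] = value
    return loaded


# Demo against a throwaway file and a plain dict instead of os.environ.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as f:
        f.write("# JAX settings\nAMMM_USE_JAX=1\nJAX_PLATFORMS=cuda\n")
    demo_env = {"JAX_PLATFORMS": "cpu"}
    loaded = load_env_file(env_path, demo_env)

print(demo_env)
print(loaded)
```

Passing `os.environ` as the second argument would apply the values to the current process.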
GPU Acceleration (JAX + NumPyro)
AMMM supports PyMC’s JAX backend for faster sampling on GPU.
- Install CUDA-enabled JAX and NumPyro (CUDA 12 wheels):

    pip install -U pip
    pip uninstall -y jax jaxlib
    pip install -U "jax[cuda12]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
    pip install -U numpyro

- Set env vars (preferably in .env):

    JAX_PLATFORMS=cuda
    AMMM_USE_JAX=1
    AMMM_JAX_CHAIN_METHOD=vectorized
    # Optional
    XLA_PYTHON_CLIENT_PREALLOCATE=false
    XLA_PYTHON_CLIENT_MEM_FRACTION=0.8

- Verify GPU:

    python -c "import jax; print(jax.devices())"  # Expect [CudaDevice(id=0)]

If JAX or NumPyro is unavailable, AMMM automatically falls back to CPU pm.sample().
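The fallback behaviour described above can be pictured as a check for whether the JAX stack is importable. The sketch below is an assumption about how such a decision might look, not AMMM's actual code; `choose_sampler` is a hypothetical helper, and reading `AMMM_USE_JAX` as the string "1" is likewise assumed.

```python
import importlib.util
import os


def choose_sampler(environ=None) -> str:
    """Return "jax" only when it is requested and importable, otherwise "cpu"."""
    env = os.environ if environ is None else environ
    wants_jax = env.get("AMMM_USE_JAX", "0") == "1"
    # find_spec locates a top-level package without importing it.
    have_backend = all(
        importlib.util.find_spec(name) is not None for name in ("jax", "numpyro")
    )
    return "jax" if wants_jax and have_backend else "cpu"


print(choose_sampler({"AMMM_USE_JAX": "0"}))
print(choose_sampler({"AMMM_USE_JAX": "1"}))
```

The second call prints "jax" only on machines where both packages are installed.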
LLM-driven Reports
AMMM generates AI-powered business insights that focus on commercial value and statistical rigour.
Enable in YAML:

    agentic_report: true

Set LLM provider in .env (OpenAI preferred if OPENAI_API_KEY is set):

    # OpenAI
    OPENAI_API_KEY=...
    LLM_PROVIDER=openai
    LLM_MODEL=gpt-5.2  # e.g., gpt-5.2, gpt-5, gpt-4o

    # (Optional) Gemini
    GEMINI_API_KEY=...
    LLM_PROVIDER=gemini

Other useful settings:

    LLM_MAX_TOKENS=8000
    LLM_TEMPERATURE=0.3
    LLM_REASONING_EFFORT=medium
    LLM_ENABLE_CACHE=true
    LLM_DAILY_LIMIT=10
    LLM_MONTHLY_LIMIT=50

Features:
- ROI-based Performance Classification
  - Top 2 channels: “Top Performer”
  - ROI > 1: “Solid Performer”
  - Otherwise: “Review Needed”
  - Performance tables display actual ROI values (from media_contribution_per_spend.csv)
- Statistical Quality Flags
  Channels are automatically flagged when statistical quality concerns are detected:

  | Flag | Threshold | Meaning | Action |
  |---|---|---|---|
  | High ROI | ROI ≥ 10 | Potential selection bias or data sparsity | Verify channel targeting and spend levels |
  | Wide Uncertainty | p95/p5 ≥ 5 | Unreliable posterior estimates | Increase spend or collect more data |
  | Low Spend | <1% of total budget | Inference unstable due to sparse data | Scale up or consolidate with similar channels |
  | Selection Bias | Propensity-targeted channel | Endogeneity risk (targets high-propensity users) | Interpret ROI cautiously, consider incrementality tests |

  Propensity-targeted channels (automatically flagged for selection bias):
  - Retail media: amazon, retail_media
  - Remarketing: remarketing, google_rmkt, facebook_rmkt, display_rmkt
  - Owned channels: crm, email, mailer, shop_app

  How flags appear in reports:
  - Channel Performance Summary table includes a “Caveats” column with short tags
  - Flagged channels show [!rank] markers next to their names
  - Detailed explanations appear in “Notes” section below the table
- Commercial Insights Focus
  - Recommendations emphasize commercial insights inferred from data
  - Second-order effects highlighted (auction dynamics, spillovers, saturation, seasonality)
  - Each recommendation tied to ROI bands, saturation levels, or contribution share
  - Generic advice (e.g., “Implement A/B testing”) explicitly excluded
- Executive-ready Outputs
  - Technical report: results/markdown/ammm_report.md (evidence-backed diagnostics)
  - Business report: results/markdown/business_report.md (executive summary with AI insights)
  - Interpretations JSON: results/json/llm_interpretations.json (structured data for downstream use)
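The classification and flag rules listed under Features can be written down compactly. The sketch below is one reading of those rules, not AMMM's implementation: it assumes the "Top 2" are ranked by ROI, that propensity channels are matched by exact name, and that the uncertainty ratio is only evaluated when p5 is positive. All function names and ROI figures are made up for illustration.

```python
PROPENSITY_CHANNELS = {
    "amazon", "retail_media",
    "remarketing", "google_rmkt", "facebook_rmkt", "display_rmkt",
    "crm", "email", "mailer", "shop_app",
}


def classify(roi_by_channel: dict[str, float]) -> dict[str, str]:
    """Label the two highest-ROI channels first, then apply the ROI > 1 rule."""
    ranked = sorted(roi_by_channel, key=roi_by_channel.get, reverse=True)
    top_two = set(ranked[:2])
    labels = {}
    for channel, roi in roi_by_channel.items():
        if channel in top_two:
            labels[channel] = "Top Performer"
        elif roi > 1:
            labels[channel] = "Solid Performer"
        else:
            labels[channel] = "Review Needed"
    return labels


def quality_flags(channel: str, roi: float, p5: float, p95: float,
                  spend_share: float) -> list[str]:
    """Return the flags whose documented thresholds are met."""
    flags = []
    if roi >= 10:
        flags.append("High ROI")
    if p5 > 0 and p95 / p5 >= 5:
        flags.append("Wide Uncertainty")
    if spend_share < 0.01:  # less than 1% of total budget
        flags.append("Low Spend")
    if channel in PROPENSITY_CHANNELS:
        flags.append("Selection Bias")
    return flags


rois = {"tv": 1.4, "search": 3.2, "email": 12.0, "display": 0.7}
labels = classify(rois)
email_flags = quality_flags("email", roi=12.0, p5=2.0, p95=14.0, spend_share=0.005)
print(labels)
print(email_flags)
```

In this made-up example the email channel is a "Top Performer" by ROI yet carries all four flags, which is the situation the Caveats column is meant to surface.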
Policy:
- Only two markdown reports generated per run (technical and business)
- No additional markdown files created by LLM module
- All claims in reports backed by model outputs with citations to source CSVs
Results Structure
Key outputs under results/:
- markdown/ - ammm_report.md, business_report.md
- json/ - llm_interpretations.json (if LLM enabled)
- csv/ - summaries, diagnostics, performance metrics
- model.nc - saved ArviZ InferenceData (NetCDF)
- model.dill - optional full model object (binary)
Optional CSVs:
- media_conversion_efficiency.csv and media_cost_per_conversion.csv are treated as optional. Their absence does not trigger warnings at high log levels and does not block report generation.
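After a run, a quick check that the main outputs exist can save time before opening reports. The sketch below uses the paths listed above; which files to treat as required is an assumption made here (the JSON and model.dill are left out because the text marks them as conditional or optional), and `missing_outputs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

# Assumed "required" set for illustration; adjust to your own run settings.
REQUIRED = ["markdown/ammm_report.md", "markdown/business_report.md", "model.nc"]


def missing_outputs(results_dir: str, required=REQUIRED) -> list[str]:
    """Return the relative paths under results_dir that are not present as files."""
    root = Path(results_dir)
    return [rel for rel in required if not (root / rel).is_file()]


# Demo on a temporary directory holding only the technical report.
with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "markdown" / "ammm_report.md"
    report.parent.mkdir(parents=True)
    report.write_text("# demo\n", encoding="utf-8")
    missing = missing_outputs(tmp)

print(missing)
```

Calling `missing_outputs("results")` from the project root applies the same check to a real run.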
API Reference
Core modules:
- src/driver - Exports MMMBaseDriverV2 (implementation in src/driver/base.py)
- src/core/mmm_model_v2.py - Model implementation
- src/prepro/ - Data preprocessing
- src/sketch/ - Visualisation
See inline docstrings for detailed API documentation.
Support
- Sample code: demo/ directory
- Common issues: TROUBLESHOOTING.md
- Bug reports: GitHub issues
License
See LICENSE.md.