AMMM Pipeline CSV Output Schema
This document describes all CSV files generated by the AMMM (Advanced Marketing Mix Modelling) pipeline in the results/csv/ directory.
Table of Contents
- Overview
- Data Exploration & Diagnostics
- Model Results
- Performance Metrics
- Budget Optimization
- Decomposition & Attribution
media_contribution_per_spend.csv
Purpose: ROI-like metric for revenue-target models: total revenue contribution per unit spend for each media channel.
Generated During: Phase 7 - Post-Analysis (Performance Calculation)
Structure:
channel,total_spend,total_contribution,contribution_per_unit_spend,roi_rank
Column Definitions:
- channel (string): Media channel name
- total_spend (float): Total spend on channel
- total_contribution (float): Total model-attributed revenue
- contribution_per_unit_spend (float): Revenue per unit spend (contribution / spend)
- roi_rank (int): Rank by contribution-per-spend (1 = highest)
Notes:
- Produced when config.target_type is revenue.
- For conversion-target models, see media_conversion_efficiency.csv.
media_cost_per_revenue_unit.csv
Purpose: Cost per revenue unit (inverse of contribution-per-spend) for revenue-target models.
Generated During: Phase 7 - Post-Analysis (Performance Calculation)
Structure:
channel,total_spend,total_contribution,cost_per_revenue_unit,cpr_rank
Column Definitions:
- channel (string): Media channel name
- total_spend (float): Total spend on channel
- total_contribution (float): Total model-attributed revenue
- cost_per_revenue_unit (float): Spend per unit of attributed revenue (spend / contribution)
- cpr_rank (int): Rank by cost efficiency (1 = lowest cost per revenue unit)
Notes:
- Produced when config.target_type is revenue.
- For conversion-target models, see media_cost_per_conversion.csv.
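The two revenue-target metrics are reciprocals of each other, which the following minimal sketch illustrates. The channel totals are invented for illustration (they are not pipeline output), and `rank_channels` is a hypothetical helper, not part of AMMM.

```python
# Illustrative only: the channel totals below are invented, not pipeline output.
totals = {
    "tv_spend": {"total_spend": 400_000.0, "total_contribution": 900_000.0},
    "search_spend": {"total_spend": 300_000.0, "total_contribution": 750_000.0},
    "social_spend": {"total_spend": 250_000.0, "total_contribution": 500_000.0},
}


def rank_channels(values, descending):
    """Hypothetical helper: return {channel: rank}, where rank 1 is the best value."""
    ordered = sorted(values, key=values.get, reverse=descending)
    return {channel: position for position, channel in enumerate(ordered, start=1)}


# contribution_per_unit_spend = contribution / spend (higher is better).
contribution_per_unit_spend = {
    ch: t["total_contribution"] / t["total_spend"] for ch, t in totals.items()
}
# cost_per_revenue_unit = spend / contribution (lower is better).
cost_per_revenue_unit = {
    ch: t["total_spend"] / t["total_contribution"] for ch, t in totals.items()
}

roi_rank = rank_channels(contribution_per_unit_spend, descending=True)
cpr_rank = rank_channels(cost_per_revenue_unit, descending=False)
```

Because one metric is the reciprocal of the other, the two rankings agree whenever every channel has positive spend and positive contribution. A zero in either total would need separate handling.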
Overview
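The AMMM pipeline generates up to 14 CSV files across different phases (media_contribution_per_spend.csv and media_cost_per_revenue_unit.csv are produced only when config.target_type is revenue):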
| Phase | Files |
|---|---|
| Data Exploration | stationarity_summary.csv, vif_summary.csv, transfer_entropy_summary.csv |
| Model Results | model_summary.csv, ELPD_summary.csv |
| Performance | media_performance_effect.csv, media_conversion_efficiency.csv, media_cost_per_conversion.csv, media_contribution_per_spend.csv, media_cost_per_revenue_unit.csv, response_curve_fit_combined.csv |
| Budget Optimization | budget_scenario_results.csv |
| Decomposition | all_decomp.csv, waterfall_decomposition_data.csv |
Data Exploration & Diagnostics
stationarity_summary.csv
Purpose: Tests for stationarity in time series data using Augmented Dickey-Fuller (ADF) tests.
Generated During: Phase 4 - Data Exploration (Pre-diagnostics)
Structure:
variable,adf_statistic,p_value,is_stationary,lags_used,n_observations
Column Definitions:
- variable (string): Name of the time series variable (media channel, control, or target)
- adf_statistic (float): Augmented Dickey-Fuller test statistic
- p_value (float): P-value for the ADF test
- is_stationary (boolean): Whether the series is stationary (True/False)
- lags_used (int): Number of lags used in the ADF test
- n_observations (int): Number of observations in the test
Use Cases:
- Identify non-stationary variables that may require differencing or transformation
- Validate that time series assumptions are met before modelling
- Diagnose potential data quality issues
Example:
variable,adf_statistic,p_value,is_stationary,lags_used,n_observations
tv_spend,-3.21,0.019,True,3,104
vif_summary.csv
Purpose: Variance Inflation Factor (VIF) analysis to detect multicollinearity between features.
Generated During: Phase 4 - Data Exploration (Pre-diagnostics)
Structure:
variable,vif,is_multicollinear
Column Definitions:
- variable (string): Name of the feature (media channel or control variable)
- vif (float): Variance Inflation Factor value
- is_multicollinear (boolean): Whether VIF exceeds threshold (typically VIF > 10)
Use Cases:
- Identify highly correlated features that may cause model instability
- Guide feature engineering decisions
- Validate model assumptions
Interpretation:
- VIF = 1: No correlation
- VIF < 5: Low correlation (acceptable)
- VIF 5-10: Moderate correlation (caution)
- VIF > 10: High multicollinearity (problematic)
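The interpretation bands above can be expressed as a small lookup. This is a sketch only: `classify_vif` and `is_multicollinear` are hypothetical helpers, the band labels are shorthand for the list above, and the treatment of exactly 5 and exactly 10 as "moderate" is an assumption, since the bands do not say which side the boundaries fall on.

```python
VIF_THRESHOLD = 10.0  # "typically VIF > 10" per the column definition; assumed configurable


def classify_vif(vif):
    """Map a VIF value onto the interpretation bands (boundary handling is assumed)."""
    if vif < 5:
        return "low"
    if vif <= 10:
        return "moderate"
    return "high"


def is_multicollinear(vif, threshold=VIF_THRESHOLD):
    """Mirror the is_multicollinear column: True when VIF exceeds the threshold."""
    return vif > threshold


bands = {value: classify_vif(value) for value in (1.0, 4.8, 7.5, 12.3)}
```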
Example:
variable,vif,is_multicollinear
search_spend,4.8,False
transfer_entropy_summary.csv
Purpose: Measures information transfer between variables using transfer entropy.
Generated During: Phase 4 - Data Exploration (Pre-diagnostics)
Structure:
source,target,transfer_entropy,is_significant
Column Definitions:
- source (string): Source variable name
- target (string): Target variable name (typically the dependent variable)
- transfer_entropy (float): Transfer entropy value (bits)
- is_significant (boolean): Whether the transfer is statistically significant
Use Cases:
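- Identify candidate directional relationships between media channels and target (transfer entropy measures information flow and does not by itself establish causation)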
- Understand information flow in the marketing system
- Guide model structure decisions
Example:
source,target,transfer_entropy,is_significant
tv_spend,revenue,0.042,True
Model Results
model_summary.csv
Purpose: Detailed summary of all fitted model parameters with posterior statistics.
Generated During: Phase 5 - Model Fitting (After MCMC sampling)
Structure:
parameter,mean,sd,hdi_5%,hdi_95%,mcse_mean,mcse_sd,ess_bulk,ess_tail,r_hat,median
Column Definitions:
- parameter (string): Parameter name (intercept, beta_channel, alpha, lam, likelihood_sigma)
- mean (float): Posterior mean estimate
- sd (float): Posterior standard deviation
- hdi_5% (float): Lower bound of the 90% Highest Density Interval (HDI)
- hdi_95% (float): Upper bound of the 90% Highest Density Interval (HDI)
- mcse_mean (float): Monte Carlo standard error of the mean
- mcse_sd (float): Monte Carlo standard error of the standard deviation
- ess_bulk (float): Bulk Effective Sample Size
- ess_tail (float): Tail Effective Sample Size
- r_hat (float): Gelman-Rubin convergence diagnostic (should be ≈ 1.0)
- median (float): Posterior median estimate
Parameter Types:
- intercept: Model intercept (baseline effect)
- likelihood_sigma: Noise/error standard deviation
- beta_channel[channel_name]: Channel effectiveness coefficient
- alpha[channel_name]: Adstock retention parameter (0-1)
- lam[channel_name]: Saturation steepness parameter
Use Cases:
- Assess parameter convergence (check r_hat ≈ 1.0)
- Evaluate parameter uncertainty (SD and HDI intervals)
- Identify strongest media channels (high beta values)
- Understand carryover effects (alpha values)
- Export results for reporting
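A convergence check over this file might look like the sketch below. The first data row is the example row for this file; the second row is invented to trigger the check. The 1.01 cut-off is an assumption (the schema only says r_hat should be close to 1.0), and `split_parameter` is a hypothetical helper for separating a label such as beta_channel[tv_spend] into its parameter type and channel.

```python
import csv
import io
import re

# First data row is the example from this document; the second row is invented.
sample = """parameter,mean,sd,hdi_5%,hdi_95%,mcse_mean,mcse_sd,ess_bulk,ess_tail,r_hat,median
beta_channel[tv_spend],0.012,0.003,0.007,0.018,0.0001,0.0001,2500,2200,1.00,0.012
alpha[tv_spend],0.450,0.120,0.260,0.640,0.0040,0.0030,380,410,1.04,0.440
"""

R_HAT_MAX = 1.01  # assumed cut-off; the schema only says r_hat should be close to 1.0
PARAM_PATTERN = re.compile(r"^(?P<name>\w+)(?:\[(?P<channel>[^\]]+)\])?$")


def split_parameter(label):
    """Split 'beta_channel[tv_spend]' into ('beta_channel', 'tv_spend')."""
    match = PARAM_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognised parameter label: {label!r}")
    # Parameters without a channel suffix (e.g. 'intercept') yield channel None.
    return match["name"], match["channel"]


# Collect every parameter whose r_hat exceeds the assumed cut-off.
flagged = [
    split_parameter(row["parameter"])
    for row in csv.DictReader(io.StringIO(sample))
    if float(row["r_hat"]) > R_HAT_MAX
]
```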
Example (excerpt):
parameter,mean,sd,hdi_5%,hdi_95%,mcse_mean,mcse_sd,ess_bulk,ess_tail,r_hat,median
beta_channel[tv_spend],0.012,0.003,0.007,0.018,0.0001,0.0001,2500,2200,1.00,0.012
ELPD_summary.csv
Purpose: Expected Log Pointwise Predictive Density (ELPD) and model diagnostics.
Generated During: Phase 7 - Post-Analysis (Model Diagnostics)
Structure:
metric,value
Column Definitions:
- metric (string): Name of the diagnostic metric
- value (float/int/bool): Metric value
Metrics Included:
- n_samples: Number of posterior samples used
- n_data_points: Number of data points in the model
- good_k: Proportion of good Pareto k values (should be > 0.7)
- elpd_loo: Expected log pointwise predictive density (LOO-CV)
- p_loo: Effective number of parameters
- warning: Whether LOO diagnostic warnings were raised
- r_squared: Model R-squared value
Use Cases:
- Evaluate model fit quality
- Compare different model specifications
- Assess out-of-sample predictive accuracy
- Identify overfitting (if p_loo >> actual parameters)
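Because the value column mixes floats, integers, and booleans, a reader has to coerce each cell. The sketch below shows one way to turn the two-column layout into a typed dictionary. Only the elpd_loo row is taken from this document; the other values are invented, and `coerce` is a hypothetical helper.

```python
import csv
import io

# Only the elpd_loo row comes from this document; the other values are invented.
sample = """metric,value
elpd_loo,-1234.56
p_loo,18.4
n_data_points,104
warning,False
"""


def coerce(text):
    """Convert a CSV cell to bool, int, or float, falling back to the raw string."""
    if text in ("True", "False"):
        return text == "True"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


metrics = {
    row["metric"]: coerce(row["value"])
    for row in csv.DictReader(io.StringIO(sample))
}
```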
Example:
metric,value
elpd_loo,-1234.56
Performance Metrics
media_performance_effect.csv
Purpose: Media channel effectiveness and contribution metrics.
Generated During: Phase 7 - Post-Analysis (Performance Calculation)
Structure:
channel,mean_effect,median_effect,sd_effect,hdi_5%,hdi_95%,total_contribution,pct_of_total
Column Definitions:
- channel (string): Media channel name
- mean_effect (float): Mean contribution per time period
- median_effect (float): Median contribution per time period
- sd_effect (float): Standard deviation of effect
- hdi_5% (float): Lower bound of the 90% Highest Density Interval (HDI)
- hdi_95% (float): Upper bound of the 90% Highest Density Interval (HDI)
- total_contribution (float): Total contribution over entire period
- pct_of_total (float): Share of total media contribution (the example below shows it as a fraction, e.g. 0.27)
Use Cases:
- Rank channels by effectiveness
- Calculate marketing ROI
- Allocate budget across channels
- Identify underperforming channels
Example:
channel,mean_effect,median_effect,sd_effect,hdi_5%,hdi_95%,total_contribution,pct_of_total
tv_spend,112.4,110.2,15.7,85.1,140.3,5845.0,0.27
media_conversion_efficiency.csv
Purpose: Conversion efficiency metrics for each media channel.
Generated During: Phase 7 - Post-Analysis (Performance Calculation)
Structure:
channel,total_spend,total_contribution,conversions_per_unit_spend,efficiency_rank
Column Definitions:
- channel (string): Media channel name
- total_spend (float): Total spend on channel
- total_contribution (float): Total contribution (conversions/revenue)
- conversions_per_unit_spend (float): Efficiency metric (contribution / spend)
- efficiency_rank (int): Rank by efficiency (1 = most efficient)
Use Cases:
- Identify most efficient channels
- Optimize budget allocation
- Calculate marketing efficiency ratios
- Compare channel performance
Example:
channel,total_spend,total_contribution,conversions_per_unit_spend,efficiency_rank
social_spend,250000,620.5,0.00248,2
media_cost_per_conversion.csv
Purpose: Cost per conversion (CPA/CPO) for each media channel.
Generated During: Phase 7 - Post-Analysis (Performance Calculation)
Structure:
channel,total_spend,total_contribution,cost_per_conversion,cpa_rank
Column Definitions:
- channel (string): Media channel name
- total_spend (float): Total spend on channel
- total_contribution (float): Total conversions/outcomes
- cost_per_conversion (float): Cost per conversion (spend / contribution)
- cpa_rank (int): Rank by CPA (1 = lowest CPA)
Use Cases:
- Calculate CPA/CPO metrics
- Compare channel efficiency
- Set performance benchmarks
- Identify cost-effective channels
Example:
channel,total_spend,total_contribution,cost_per_conversion,cpa_rank
search_spend,300000,900.0,333.33,1
response_curve_fit_combined.csv
Purpose: Fitted response curves for all media channels showing diminishing returns.
Generated During: Phase 5 - Model Fitting (Visualisation Phase)
Structure:
channel,spend_level,predicted_contribution,lower_bound,upper_bound,saturation_pct
Column Definitions:
- channel (string): Media channel name
- spend_level (float): Spend level (x-axis)
- predicted_contribution (float): Predicted contribution at this spend level
- lower_bound (float): Lower bound of prediction interval
- upper_bound (float): Upper bound of prediction interval
- saturation_pct (float): Percentage of maximum saturation reached
Use Cases:
- Visualize diminishing returns
- Find optimal spend levels
- Identify saturation points
- Guide budget recommendations
Note: Values represent direct response curves. No additional scaling is required.
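Diminishing returns show up as a falling slope between consecutive points on the curve. The sketch below computes that slope by finite differences over the spend_level and predicted_contribution columns for one channel. The points are invented for illustration, and `marginal_returns` is a hypothetical helper.

```python
# Invented (spend_level, predicted_contribution) points for a single channel.
curve = [(50_000, 100.0), (100_000, 170.0), (150_000, 210.0), (200_000, 230.0)]


def marginal_returns(points):
    """Finite-difference slope between consecutive spend levels."""
    ordered = sorted(points)  # sort by spend_level so differences are well defined
    return [
        (c1 - c0) / (s1 - s0)
        for (s0, c0), (s1, c1) in zip(ordered, ordered[1:])
    ]


slopes = marginal_returns(curve)
# A strictly decreasing sequence of slopes indicates diminishing returns.
is_diminishing = all(a > b for a, b in zip(slopes, slopes[1:]))
```

Each slope is the extra contribution per extra unit of spend over that interval, so comparing it against a target return per unit spend is one way to locate a reasonable spend level.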
Example:
channel,spend_level,predicted_contribution,lower_bound,upper_bound,saturation_pct
tv_spend,150000,210.5,180.2,240.8,64.2
Budget Optimization
budget_scenario_results.csv
Purpose: Results from budget scenario planning across different budget levels.
Generated During: Phase 8 - Budget Optimization
Structure:
scenario,description,channel,budget,contribution,pct_change_from_baseline,total_budget,pct_of_total
Column Definitions:
- scenario (string): Scenario identifier (e.g., ‘baseline’, ‘scenario_-10’, ‘scenario_+20’)
- description (string): Human-readable scenario description
- channel (string): Media channel name or ‘TOTAL’ for aggregates
- budget (float): Allocated budget for this channel in this scenario
- contribution (float): Predicted contribution/conversions
- pct_change_from_baseline (float): Percentage change from baseline scenario
- total_budget (float): Total budget across all channels
- pct_of_total (float): This channel’s percentage of total budget
Use Cases:
- Compare different budget allocation strategies
- Quantify impact of budget changes
- Optimize budget distribution
- Create “what-if” scenarios for planning
Scenario Types:
- baseline: Current spend levels
- scenario_-X: X% decrease in total budget
- scenario_+X: X% increase in total budget
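The scenario identifiers encode the budget change in their name, so they can be parsed back into a signed percentage. The sketch below assumes identifiers follow exactly the three forms listed above; `scenario_pct_change` is a hypothetical helper, and accepting decimal percentages is an assumption.

```python
import re

# Assumed identifier grammar: 'scenario_' + sign + number, e.g. 'scenario_-10'.
SCENARIO_PATTERN = re.compile(r"^scenario_(?P<sign>[+-])(?P<pct>\d+(?:\.\d+)?)$")


def scenario_pct_change(scenario):
    """Return the total-budget change in percent encoded in a scenario identifier."""
    if scenario == "baseline":
        return 0.0
    match = SCENARIO_PATTERN.match(scenario)
    if match is None:
        raise ValueError(f"unrecognised scenario identifier: {scenario!r}")
    magnitude = float(match["pct"])
    return magnitude if match["sign"] == "+" else -magnitude


changes = [scenario_pct_change(s) for s in ("baseline", "scenario_-10", "scenario_+20")]
```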
Example (excerpt):
scenario,description,channel,budget,contribution,pct_change_from_baseline,total_budget,pct_of_total
scenario_+10,+10% total budget,tv_spend,330000,980.2,6.4,1100000,0.30
Decomposition & Attribution
all_decomp.csv
Purpose: Complete time-series decomposition of target variable into components.
Generated During: Phase 7 - Post-Analysis (Decomposition)
Structure:
date,actual,predicted,baseline,media_total,channel_1,channel_2,...,control_1,control_2,...
Column Definitions:
- date (datetime): Time period
- actual (float): Actual observed target value
- predicted (float): Model predicted value
- baseline (float): Baseline contribution (intercept + trend)
- media_total (float): Total media contribution across all channels
- channel_name (float): Individual channel contribution (one column per channel)
- control_name (float): Control variable contribution (one column per control)
Use Cases:
- Understand contribution breakdown over time
- Create waterfall charts
- Validate model fit
- Identify seasonal patterns
- Generate attribution reports
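For waterfall charts and attribution reports, the per-period columns usually need to be collapsed into one total per component. The sketch below sums each component column over all dates. It rests on assumptions: the rows are invented, `component_totals` is a hypothetical helper, and the schema does not confirm that the pipeline's own aggregated output is computed this way.

```python
import csv
import io

# Invented rows that follow the all_decomp.csv column layout.
sample = """date,actual,predicted,baseline,media_total,tv_spend,search_spend,control_temp
2024-06-03,1100.0,1095.0,720.0,365.0,210.0,155.0,10.0
2024-06-10,1150.0,1140.0,730.0,400.0,230.0,170.0,10.0
"""

# Columns that are not individual components; media_total would double count channels.
NON_COMPONENT = {"date", "actual", "predicted", "media_total"}


def component_totals(csv_text):
    """Sum every component column (baseline, channels, controls) over all dates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    totals = {name: 0.0 for name in reader.fieldnames if name not in NON_COMPONENT}
    for row in reader:
        for name in totals:
            totals[name] += float(row[name])
    return totals


totals = component_totals(sample)
```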
Example (excerpt):
date,actual,predicted,baseline,media_total,tv_spend,search_spend,control_temp
2024-06-03,1200.0,1185.4,720.1,465.3,210.2,155.1,10.0
waterfall_decomposition_data.csv
Purpose: Aggregated decomposition data for waterfall visualizations.
Generated During: Phase 7 - Post-Analysis (Visualization)
Structure:
component,contribution,component_type,order
Column Definitions:
- component (string): Component name (baseline, channel name, or control variable)
- contribution (float): Total contribution of this component
- component_type (string): Type of component (‘baseline’, ‘media’, ‘control’)
- order (int): Display order for waterfall chart
Use Cases:
- Generate waterfall charts
- Create attribution visualizations
- Present decomposition results
- Report marketing contribution
Example:
component,contribution,component_type,order
baseline,37520.0,baseline,0
tv_spend,11240.0,media,1
Data Types
- All CSV files use UTF-8 encoding
- Numeric values use standard float representation
- Boolean values are represented as True/False
- Dates follow ISO 8601 format (YYYY-MM-DD) or inferred format from config
Missing Values
- Missing numeric values may appear as NaN or empty strings
- Missing categorical values appear as empty strings
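A reader of these files therefore has to treat both NaN and the empty string as missing. The sketch below shows cell-level parsers that follow the conventions above; `parse_float` and `parse_bool` are hypothetical helpers, and returning None for an empty boolean cell is an assumption.

```python
import math


def parse_float(cell):
    """Parse a numeric CSV cell, mapping '' and 'NaN' to float('nan')."""
    text = cell.strip()
    if text == "":
        return math.nan
    return float(text)  # float() itself accepts 'NaN' and 'nan'


def parse_bool(cell):
    """Parse a True/False cell; an empty cell becomes None (assumed convention)."""
    text = cell.strip()
    if text == "":
        return None
    if text in ("True", "False"):
        return text == "True"
    raise ValueError(f"not a boolean cell: {cell!r}")


parsed = [parse_float(cell) for cell in ("4.8", "", "NaN")]
```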
File Generation
- Files are generated in the order of pipeline phases
- Some files may not be generated if specific analyses are disabled
- All files are overwritten on each pipeline run
Version Compatibility
- Schema is valid for AMMM v2.x
- V1 legacy results may have a different schema (see results_legacy/)
Related Documentation
- USER_GUIDE.md: Complete pipeline usage guide
- TROUBLESHOOTING.md: Common issues and solutions
- file_organization.md: Project structure overview
Last updated: Oct 2025
AMMM Version: 2.5.1