
Explanation: Optimisation Concepts

This document explains the core concepts behind the budget optimisation process in ammm, including the response curve models used.

ammm leverages distinct response curve models, derived from the fitted MMM parameters, to capture the relationship between marketing spend and its contribution for the purpose of optimisation. These models exhibit diminishing returns.

The first model is the Michaelis–Menten curve. Originating from enzyme kinetics, it assumes diminishing returns to scale.

  • Formula:
    C(S) = \frac{L \times S}{k + S}
    Where S is spend, L is the maximum contribution (saturation), and k is the spend at half-maximum contribution (lower k means higher efficiency).
  • Characteristics: Positive for S > 0 and increasing, with decreasing marginal return (dC/dS > 0, d^2C/dS^2 < 0). Elasticity E = (dC/dS)(S/C) = k / (k + S).
  • Parameters (derived for optimisation): beta (max effect), lam (shape/steepness).
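
As a minimal sketch of the formula above (not ammm's own implementation), the curve and its elasticity can be written in a few lines of NumPy. The function names and the numeric values of L and k are illustrative assumptions.

```python
import numpy as np

def michaelis_menten(spend, L, k):
    """Contribution C(S) = L * S / (k + S)."""
    spend = np.asarray(spend, dtype=float)
    return L * spend / (k + spend)

def mm_elasticity(spend, k):
    """Elasticity E = (dC/dS) * (S / C) = k / (k + S)."""
    spend = np.asarray(spend, dtype=float)
    return k / (k + spend)

# Illustrative (assumed) parameter values.
L_max, k_half = 100.0, 50.0

# At S = k the curve reaches exactly half of its maximum L.
half_point = michaelis_menten(k_half, L_max, k_half)   # 50.0

# Elasticity falls from 1 towards 0 as spend grows.
elasticities = mm_elasticity(np.array([0.0, 50.0, 450.0]), k_half)  # [1.0, 0.5, 0.1]
```

A lower k moves the half-maximum point to the left, which is why it reads as higher efficiency at low spend.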


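The second model is the sigmoid saturation curve. It is built from the logistic function, whose full graph is the classic S-shaped curve; on non-negative spend only the upper, concave half of that S applies.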

  • Formula:
    C(S) = \frac{\alpha(1 - e^{-\lambda S})}{1 + e^{-\lambda S}}
    Where S is spend, α (alpha) is the saturation parameter representing the maximum achievable contribution as spending approaches infinity, and λ (lambda) is the rate parameter that controls how quickly the curve approaches saturation.
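  • Characteristics: Equal to α·tanh(λS/2). For S ≥ 0 it is increasing and concave (the upper half of the S-shape), so returns diminish from the first unit of spend, and it approaches the maximum value α.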
  • Parameters (derived for optimisation): alpha (max effect/saturation), lam (shape/steepness).
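
The formula can be sketched directly in NumPy. This is an illustration under assumed parameter values, not the library's own function; the name `sigmoid_curve` is hypothetical. The snippet also checks numerically that the expression coincides with α·tanh(λS/2) and that each extra unit of spend adds less than the previous one.

```python
import numpy as np

def sigmoid_curve(spend, alpha, lam):
    """C(S) = alpha * (1 - exp(-lam * S)) / (1 + exp(-lam * S))."""
    spend = np.asarray(spend, dtype=float)
    e = np.exp(-lam * spend)
    return alpha * (1.0 - e) / (1.0 + e)

# Illustrative (assumed) parameters and a unit-spaced spend grid.
alpha, lam = 100.0, 0.03
spend = np.linspace(0.0, 200.0, 201)
curve = sigmoid_curve(spend, alpha, lam)

# The same curve written with the hyperbolic tangent.
same_as_tanh = np.allclose(curve, alpha * np.tanh(lam * spend / 2.0))

# Contribution gained per extra unit of spend; these shrink as spend grows.
increments = np.diff(curve)
```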

(Note: The specific functional forms used internally for optimisation might be simplified representations derived from the full model’s transformations.)

The optimiser needs a per-channel function mapping spend to expected contribution, C_i(S).

In the current AMMM implementation, those response-curve parameters are estimated in a two-step process:

  1. Estimate contributions from the fitted Bayesian MMM

    The MMM produces per-channel contribution time series (on the original target scale) from posterior samples. In practice we often use the posterior mean contributions as a point estimate.

  2. Fit a simple saturating curve to (spend, contribution) pairs

    For each channel i, we fit a parametric function C_i(S) (e.g. sigmoid or Michaelis–Menten) to the relationship between:

    • observed spend S_{i,t} in the training data, and
    • estimated contribution contrib_{i,t} from the MMM.

    Implementation detail: this uses scipy.optimize.curve_fit (see core.utils.estimate_sigmoid_parameters and core.utils.estimate_menten_parameters).
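
The second step can be illustrated with a small sketch. It calls scipy.optimize.curve_fit directly on synthetic data and is not the code in core.utils; the data, starting values, bounds, and variable names are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(spend, L, k):
    """Response curve C(S) = L * S / (k + S)."""
    return L * spend / (k + spend)

# Synthetic stand-ins for one channel: weekly spend and the posterior-mean
# contribution an MMM might report for it (assumed data, not real output).
rng = np.random.default_rng(0)
spend = rng.uniform(5.0, 300.0, size=104)
true_L, true_k = 120.0, 80.0
contrib = michaelis_menten(spend, true_L, true_k) + rng.normal(0.0, 1.0, size=spend.size)

# Data-driven starting point; both parameters are constrained to be non-negative.
p0 = [contrib.max(), spend.mean()]
(fit_L, fit_k), _ = curve_fit(
    michaelis_menten, spend, contrib, p0=p0, bounds=(0.0, np.inf)
)
```

Fitting the sigmoid form works the same way with a different model function. The fitted pair (fit_L, fit_k) is what an optimiser would then treat as fixed curve parameters.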

This is an approximation of the full MMM transformations (adstock → saturation → coefficient) but it makes the constrained optimisation problem tractable and fast.

The optimisation itself is typically run on a single set of fitted curve parameters (e.g. from posterior means), so uncertainty is not fully propagated through the optimiser. However, the library can also fit curves to posterior quantiles (e.g. 5% and 95%) to produce uncertainty bands for scenario plots.

The goal is to find the spend allocation (S_1, S_2, ..., S_n) across n channels that maximises the total expected contribution, subject to constraints.

  • Objective Function:
    maximize \sum_{i=1}^{n} C_i(S_i)
    Where C_i(S_i) is the contribution predicted by the estimated response curve for channel i given spend S_i.

Because the total budget B is fixed (see constraints below), “maximise total contribution” is equivalent to maximising overall ROI (total_contribution / total_spend) for that fixed budget.

The optimiser’s intuition can be understood in terms of marginal ROI:

\text{mROI}_i(S) = \frac{dC_i(S)}{dS}

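At an interior optimum with a fixed total budget (no channel sitting on one of its bounds), the KKT conditions require the mROI values to be equal across channels: each equals the Lagrange multiplier on the budget constraint. A channel held at a bound can end up with a different mROI from the rest (no higher at a lower bound, no lower at an upper bound).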

For the implemented response curves:

  • Michaelis–Menten:

    C(S) = \frac{L S}{k + S}
    \qquad\Rightarrow\qquad
    \frac{dC}{dS} = \frac{L k}{(k + S)^2}
  • Sigmoid saturation (as implemented in core.utils.sigmoid_saturation):

    C(S) = \alpha \cdot \frac{1 - e^{-\lambda S}}{1 + e^{-\lambda S}}
    \qquad\Rightarrow\qquad
    \frac{dC}{dS} = \frac{\alpha\lambda}{2}\operatorname{sech}^2\left(\frac{\lambda S}{2}\right)

The optimisation is subject to the following constraints:

  • Total Budget: The sum of allocated spends must equal the total available budget (B).
    \sum_{i=1}^{n} S_i = B
  • Non-Negativity: Spend for each channel must be non-negative.
    S_i \ge 0
  • Optional Bounds: User-defined minimum (S_i^{min}) and maximum (S_i^{max}) spend limits can be applied to individual channels.
    S_i^{min} \leq S_i \leq S_i^{max}
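
The closed-form marginal ROI expressions for the two response curves can be checked against finite differences. The sketch below does this with assumed parameter values; all function names are hypothetical helpers written for the example.

```python
import numpy as np

def mm_curve(S, L, k):
    return L * S / (k + S)

def mm_mroi(S, L, k):
    """dC/dS = L * k / (k + S)^2."""
    return L * k / (k + S) ** 2

def sigmoid_curve(S, alpha, lam):
    e = np.exp(-lam * S)
    return alpha * (1.0 - e) / (1.0 + e)

def sigmoid_mroi(S, alpha, lam):
    """dC/dS = (alpha * lam / 2) * sech^2(lam * S / 2)."""
    return 0.5 * alpha * lam / np.cosh(0.5 * lam * S) ** 2

def central_diff(f, S, h=1e-4):
    """Numerical derivative used only as a cross-check."""
    return (f(S + h) - f(S - h)) / (2.0 * h)

S = np.array([10.0, 50.0, 150.0])
mm_ok = np.allclose(
    mm_mroi(S, 100.0, 50.0),
    central_diff(lambda s: mm_curve(s, 100.0, 50.0), S),
    rtol=1e-6,
)
sigmoid_ok = np.allclose(
    sigmoid_mroi(S, 100.0, 0.03),
    central_diff(lambda s: sigmoid_curve(s, 100.0, 0.03), S),
    rtol=1e-6,
)
```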

For the standard single-period optimiser, AMMM uses scipy.optimize.minimize(..., method="SLSQP"). SLSQP is well-suited for:

  • smooth non-linear objectives (our saturating response curves),
  • a linear equality constraint (Σ S_i = B), and
  • box constraints (per-channel min/max bounds).

At a high level, SLSQP works by:

  1. Starting with an initial guess.
  2. Calculating gradients (rates of change) of the objective and constraints.
  3. Determining a search direction to improve the objective while respecting constraints.
  4. Calculating a step size.
  5. Updating the allocation.
  6. Repeating until convergence criteria (optimality and feasibility tolerances) are met.
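
A minimal single-period sketch of this setup, assuming three channels with Michaelis–Menten curves and made-up parameters, could look as follows. It calls scipy.optimize.minimize with method="SLSQP" directly and is not AMMM's optimiser code.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed curve parameters for three channels and an assumed total budget.
L = np.array([100.0, 80.0, 60.0])
k = np.array([50.0, 30.0, 20.0])
budget = 150.0

def neg_total(spend):
    """Negative total contribution (minimize() minimises)."""
    return -np.sum(L * spend / (k + spend))

def neg_grad(spend):
    """Gradient of neg_total: minus the per-channel marginal ROI."""
    return -(L * k / (k + spend) ** 2)

constraints = [{
    "type": "eq",
    "fun": lambda s: np.sum(s) - budget,   # sum of spends equals the budget
    "jac": lambda s: np.ones_like(s),
}]
bounds = [(0.0, budget)] * len(L)          # non-negativity plus a loose upper bound
x0 = np.full(len(L), budget / len(L))      # start from an equal split

result = minimize(
    neg_total, x0, jac=neg_grad, method="SLSQP",
    bounds=bounds, constraints=constraints,
    options={"ftol": 1e-12, "maxiter": 500},
)
optimal_spend = result.x
mroi_at_optimum = L * k / (k + optimal_spend) ** 2
```

With these particular values no bound is active, so the entries of mroi_at_optimum come out approximately equal, which matches the marginal ROI intuition.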

Multi-period optimisation (simultaneous vs sequential)


Multi-period optimisation allocates budgets across periods and channels:

\max_{\{S_{p,i}\}} \sum_{p=1}^{P}\sum_{i=1}^{n} w_{p,i}\,C_i(S_{p,i})

Where:

  • p indexes periods (weeks/months),
  • w_{p,i} is an optional seasonal multiplier (defaults to 1), and
  • constraints include the overall budget, per-channel bounds per period, and optional ramp limits between periods.

AMMM supports two approaches:

  • Simultaneous optimisation (all periods at once) using scipy.optimize.minimize:

    • default method is SLSQP for moderate-sized problems
    • for large problems (high number of variables), it can switch to trust-constr for robustness
  • Sequential optimisation (greedy, period-by-period) for numerical stability on larger problems:

    • optimises each period separately while enforcing ramp constraints against the previous period
    • not guaranteed to be globally optimal, but typically much more robust in practice
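
The simultaneous formulation can be sketched by flattening the period-by-channel spend matrix into one vector. The example below assumes two periods, two channels, Michaelis–Menten curves, and invented seasonal multipliers. For brevity it leaves out ramp limits and per-period bounds, and it does not reproduce AMMM's own multi-period code.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed curve parameters (one pair per channel).
L = np.array([100.0, 80.0])
k = np.array([50.0, 30.0])

# Assumed seasonal multipliers w[p, i]: period 2 is 50% more responsive.
weights = np.array([[1.0, 1.0],
                    [1.5, 1.5]])
P, n = weights.shape
total_budget = 200.0

def neg_objective(flat_spend):
    """Negative weighted contribution summed over periods and channels."""
    spend = flat_spend.reshape(P, n)
    return -np.sum(weights * L * spend / (k + spend))

constraints = [{"type": "eq", "fun": lambda flat: flat.sum() - total_budget}]
bounds = [(0.0, total_budget)] * (P * n)
x0 = np.full(P * n, total_budget / (P * n))   # equal split as a starting point

res = minimize(neg_objective, x0, method="SLSQP",
               bounds=bounds, constraints=constraints)
plan = res.x.reshape(P, n)   # rows are periods, columns are channels
```

In this toy setup the higher multipliers pull more budget into the second period for both channels.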

Adstock-aware multi-period objective (advanced)


The codebase includes an adstock-aware multi-period objective that maintains a per-channel state:

s_{p,i} = a_i\, s_{p-1,i} + S_{p,i}
\qquad\text{and}\qquad
\text{objective uses } C_i(s_{p,i})

This captures carryover effects in the optimisation problem, but it increases non-linearity and the number of effective constraints. Treat it as an advanced/experimental option: start with few periods and relaxed constraints.
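
The state recursion can be sketched as a simple loop over periods. The example assumes a zero initial state and Michaelis–Menten curves; the carryover rates, spend plan, and function names are illustrative and not taken from the codebase.

```python
import numpy as np

def adstock_state(spend, carryover):
    """Return s with s[p] = carryover * s[p-1] + spend[p], starting from zero.

    spend has shape (P, n); carryover has shape (n,).
    """
    state = np.zeros_like(spend, dtype=float)
    prev = np.zeros(spend.shape[1])
    for p in range(spend.shape[0]):
        prev = carryover * prev + spend[p]
        state[p] = prev
    return state

def adstock_objective(spend, carryover, L, k):
    """Total contribution with curves evaluated on the adstocked state."""
    s = adstock_state(spend, carryover)
    return float(np.sum(L * s / (k + s)))

# Three periods, two channels; channel 0 carries over 50%, channel 1 nothing.
spend = np.array([[10.0, 20.0],
                  [10.0, 0.0],
                  [0.0, 0.0]])
carryover = np.array([0.5, 0.0])

state = adstock_state(spend, carryover)
# state is [[10.0, 20.0], [15.0, 0.0], [7.5, 0.0]]

total = adstock_objective(spend, carryover,
                          L=np.array([100.0, 80.0]), k=np.array([50.0, 30.0]))
```

Because each period's state depends on all earlier spends, the objective couples the periods together, which is where the extra non-linearity comes from.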

Once the optimal budget S_i^* for each channel is found, the expected contribution for each channel, C_i(S_i^*), and the total contribution, \sum_{i=1}^{n} C_i(S_i^*), are calculated from the fitted response curves. Because the optimiser typically works from a single set of curve parameters, uncertainty around these forecasts has to be added separately, for example by evaluating the same allocation on curves fitted to posterior quantiles (e.g. 5% and 95%).