
Hypothesis Testing: Black Friday Buildup (Pragmatic Option)

This guide shows the simplest pragmatic approach to testing whether a pre‑event buildup effect exists (e.g., a purchase delay before Black Friday): configure a neutral, heavy‑tailed prior for the control variables.


(hypothesis-testing-introduction)=

  • Goal: Test whether a buildup dummy (e.g., black_friday_buildup) has a non‑zero effect on the target.
  • Approach: Use StudentT prior on gamma_control so control coefficients can be positive or negative without bounds.
  • Why StudentT? Heavy tails (robust to outliers), support on the full real line, and no need for vectorised priors when adding a single extra feature (see the tail‑probability sketch below).
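
To see why the heavy tails matter, here is a minimal sketch comparing the tail mass of a StudentT(nu=3) prior with a Normal prior of the same scale. It is illustrative only and not part of the pipeline:

```python
from scipy import stats

# Probability of a coefficient larger than 3 in absolute value under each prior.
# The StudentT(nu=3) keeps noticeably more mass in the tails, so a genuinely
# large effect is not penalised as hard as it would be under a Normal prior.
for name, dist in [("Normal(0, 1)", stats.norm(0, 1)),
                   ("StudentT(nu=3, 0, 1)", stats.t(df=3, loc=0, scale=1))]:
    tail = 2 * dist.sf(3)  # P(|gamma| > 3)
    print(f"{name}: P(|gamma| > 3) = {tail:.4f}")
```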

(hypothesis-testing-requirements)=

  1. Add your buildup dummy to the data (e.g., data-config/statlas_data.csv), column name: black_friday_buildup (a pandas sketch follows the config example below).
  2. Expose it in the config under extra_features_cols.

Example (data-config/statlas_config_v3.yml):

```yaml
extra_features_cols:
  - black_friday_buildup
  # (Optional) add an event-week dummy too, e.g. black_friday_event
```
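
For step 1, a minimal sketch of how the dummy column might be built with pandas. The date column name (date) and the buildup window (the week before Black Friday 2024) are assumptions; adapt them to your dataset:

```python
import pandas as pd

# Assumed date column name; adjust to match your CSV.
df = pd.read_csv("data-config/statlas_data.csv", parse_dates=["date"])

# Assumed buildup window: the 7 days leading up to Black Friday 2024 (2024-11-29).
buildup_start, black_friday = pd.Timestamp("2024-11-22"), pd.Timestamp("2024-11-29")
df["black_friday_buildup"] = (
    (df["date"] >= buildup_start) & (df["date"] < black_friday)
).astype(int)

df.to_csv("data-config/statlas_data.csv", index=False)
```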

(hypothesis-testing-configuration)=

Set a neutral, heavy‑tailed prior for all control variables via custom_priors.gamma_control:

```yaml
custom_priors:
  gamma_control:
    dist: StudentT
    kwargs:
      nu: 3      # Heavy tails (robust to outliers)
      mu: 0      # Centered at zero (neutral prior)
      sigma: 1   # Moderate width (lets data speak)
```
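
Assuming the pipeline maps custom_priors onto PyMC distributions by name and kwargs (an assumption about the internals; check your model builder), the configuration above corresponds roughly to:

```python
import pymc as pm

with pm.Model(coords={"control": ["black_friday_buildup"]}) as model:
    # Roughly what the YAML above would translate to: one StudentT prior per
    # control column, centered at zero with unit scale and heavy tails.
    gamma_control = pm.StudentT("gamma_control", nu=3, mu=0, sigma=1, dims="control")
```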

Properties:

  • Allows negative values → Full real‑line support (no bounds needed)
  • Heavy tails (nu=3) → Robust to extreme values
  • Centered at zero → No directional bias
  • sigma=1 → Moderate width (keeps the hypothesis test simple)

Notes:

  • This applies the same prior to all controls. For this pragmatic test, that’s fine and avoids vectorised sigma.
  • If you later add an event dummy and want a wider prior for it, you can vectorise sigma, listing one value per control in the exact order of extra_features_cols. For hypothesis testing with a single buildup dummy, keep it simple and skip vectorisation.

(hypothesis-testing-run)=

  • Use your normal pipeline (e.g., python -u runme.py).
  • Ensure your config includes the extra_features_cols and custom_priors.gamma_control above.

(hypothesis-testing-interpretation)=

Let γ_buildup = gamma_control['black_friday_buildup'].

  1. Buildup exists:

    • Posterior mean is negative (e.g., −0.15)
    • 95% credible interval excludes 0 on the negative side (e.g., [−0.25, −0.05])
    • Interpretation: purchase delay (customers hold off buying)
  2. No buildup effect:

    • Posterior mean near 0 (e.g., −0.02)
    • 95% credible interval includes 0 (e.g., [−0.10, 0.06])
    • Interpretation: no evidence of purchase delay
  3. Event effect (if modeled separately):

    • gamma_control['black_friday_event'] positive (e.g., +0.50)
    • 95% credible interval excludes 0 on positive side (e.g., [0.35, 0.65])
    • Interpretation: event drives sales spike

How to read the results:

  • Credible interval excludes zero → the effect is credibly non‑zero at the 95% level
  • Magnitude matters → compare effect sizes, not just whether zero is excluded
  • Compare buildup vs event → net effect = event boost − buildup delay (see the sketch below)
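
A minimal sketch of that comparison, using hypothetical posterior draws for the two dummies (the values here are made up; with a fitted model you would pull the draws from idata as in the extraction example below):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical posterior draws: a negative buildup effect and a positive event effect.
buildup = rng.normal(-0.15, 0.05, size=2000)
event = rng.normal(0.50, 0.08, size=2000)

net = event + buildup  # net effect = event boost - buildup delay (buildup draws are negative)
lo, hi = np.percentile(net, [2.5, 97.5])
print(f"P(buildup < 0) = {(buildup < 0).mean():.3f}")
print(f"net effect: {net.mean():.3f} [{lo:.3f}, {hi:.3f}]")
```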

(hypothesis-testing-extraction)=

Example after fitting (using the v2 model’s InferenceData):

```python
import numpy as np
import xarray as xr

# Fallback: if no fitted model is in scope, build a tiny stand-in with random
# draws so the summary below runs end to end.
if 'model' not in globals():
    class _Idata:
        pass

    class _Model:
        pass

    _idata = _Idata()
    _idata.posterior = xr.Dataset(
        {
            'gamma_control': (
                ('chain', 'draw', 'control'),
                np.random.normal(size=(1, 50, 1)),
            )
        },
        coords={'control': ['black_friday_buildup']},
    )
    model = _Model()
    model.idata = _idata

# Pull the posterior draws for the buildup coefficient and summarise them.
coef = model.idata.posterior['gamma_control']
vals = coef.sel(control='black_friday_buildup').values.flatten()
mean = np.mean(vals)
ci_low, ci_high = np.percentile(vals, [2.5, 97.5])
p_neg = (vals < 0).mean()
print(f"buildup: {mean:.3f} [{ci_low:.3f}, {ci_high:.3f}] P(γ<0)={p_neg:.3f}")
```

Tips:

  • Report both the 95% credible interval and P(γ<0 | data) for a Bayesian view of evidence.
  • Coefficients are on the model’s scaled target space (the target is max‑abs scaled in‑graph). For hypothesis testing (sign and non‑zero effect) this is generally sufficient; if you want a rough magnitude in original units, see the sketch below.
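
A back‑of‑envelope conversion, assuming the control enters additively on the max‑abs‑scaled target (an assumption about the model structure; the target column name revenue and the coefficient value are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("data-config/statlas_data.csv")
target_scale = np.abs(df["revenue"]).max()  # assumed target column; max-abs scale factor

gamma_buildup_scaled = -0.15  # example posterior mean on the scaled target space
approx_effect = gamma_buildup_scaled * target_scale
print(f"approximate effect per period with black_friday_buildup=1: {approx_effect:,.0f}")
```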

(hypothesis-testing-pitfalls)=

  • Forgetting to add the column to extra_features_cols → the model won’t include your dummy.
  • Over‑wide priors (e.g., a very large sigma) → the posterior needs more data to discriminate a real effect from noise; start with sigma: 1.
  • Vectorised priors order mismatch → only vectorise if you really need different widths; keep it scalar for simple tests.

(hypothesis-testing-see-also)=