Hypothesis Testing: Black Friday Buildup (Pragmatic Option)
This guide shows a simple, pragmatic way to test whether a pre‑event buildup effect exists (e.g., purchase delay before Black Friday) by configuring a neutral, heavy‑tailed prior for control variables.
(hypothesis-testing-introduction)=
Overview
- Goal: Test whether a buildup dummy (e.g., black_friday_buildup) has a non‑zero effect on the target.
- Approach: Use a StudentT prior on gamma_control so control coefficients can be positive or negative without bounds.
- Why StudentT? Heavy tails (robust), full real line support, avoids vectorised priors when adding one extra feature.
(hypothesis-testing-requirements)=
Requirements
- Add your buildup dummy to the data (e.g., data-config/statlas_data.csv), column name: black_friday_buildup.
- Expose it in the config under extra_features_cols.
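The source does not say how the buildup dummy is defined, so the following is only a sketch of one way the column might be built. It assumes weekly data keyed by Monday week starts and a buildup window of the two full weeks before the Black Friday week; the helper names `black_friday` and `buildup_dummy` and the `n_weeks` parameter are hypothetical.

```python
from datetime import date, timedelta


def black_friday(year: int) -> date:
    """Day after US Thanksgiving (the fourth Thursday of November)."""
    nov1 = date(year, 11, 1)
    # weekday(): Monday=0 ... Thursday=3
    first_thursday = nov1 + timedelta(days=(3 - nov1.weekday()) % 7)
    return first_thursday + timedelta(weeks=3, days=1)


def buildup_dummy(week_start: date, n_weeks: int = 2) -> int:
    """1 if the week is one of the n_weeks weeks before the Black Friday week.

    Assumption: week_start is the Monday of a weekly observation.
    """
    bf = black_friday(week_start.year)
    bf_week_start = bf - timedelta(days=bf.weekday())  # Monday of the event week
    weeks_before = (bf_week_start - week_start).days // 7
    return int(1 <= weeks_before <= n_weeks)


# Flag a few Mondays around Black Friday 2024
for monday in (date(2024, 11, 4), date(2024, 11, 11), date(2024, 11, 18), date(2024, 11, 25)):
    print(monday.isoformat(), buildup_dummy(monday))
```

The resulting 0/1 values would go into the black_friday_buildup column of the data file; adjust `n_weeks` to match the buildup window you want to test.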
Example (data-config/statlas_config_v3.yml):

    extra_features_cols:
      - black_friday_buildup
      # (Optional) add an event-week dummy too, e.g. black_friday_event

(hypothesis-testing-configuration)=
Configuration: Prior for Controls
Set a neutral, heavy‑tailed prior for all control variables via custom_priors.gamma_control:

    custom_priors:
      gamma_control:
        dist: StudentT
        kwargs:
          nu: 3     # Heavy tails (robust to outliers)
          mu: 0     # Centered at zero (neutral prior)
          sigma: 1  # Moderate width (lets data speak)

Properties:
- Allows negative values → Full real‑line support (no bounds needed)
- Heavy tails (nu=3) → Robust to extreme values
- Centered at zero → No directional bias
- Sigma=1 → Moderate width (simple hypothesis test)
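To make the "heavy tails" property concrete, the sketch below compares how much prior mass a StudentT(nu=3, mu=0, sigma=1) and a Normal(0, 1) place on large coefficients. It uses the closed-form CDF of the Student's t distribution with 3 degrees of freedom; the function names are illustrative and not part of the pipeline.

```python
import math


def normal_two_sided_tail(x: float, sigma: float = 1.0) -> float:
    """P(|X| > x) for X ~ Normal(0, sigma)."""
    return math.erfc(x / (sigma * math.sqrt(2.0)))


def student_t3_two_sided_tail(x: float, sigma: float = 1.0) -> float:
    """P(|X| > x) for X ~ StudentT(nu=3, mu=0, sigma).

    Uses the closed-form CDF that exists for nu=3:
    F(t) = 1/2 + (u / (1 + u^2) + atan(u)) / pi, with u = t / sqrt(3).
    """
    u = (x / sigma) / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 2.0 * (1.0 - cdf)


for x in (1.0, 3.0, 5.0):
    print(
        f"P(|coef| > {x}): "
        f"Normal={normal_two_sided_tail(x):.4f}  "
        f"StudentT(nu=3)={student_t3_two_sided_tail(x):.4f}"
    )
```

The StudentT prior keeps noticeably more mass far from zero, so a genuinely large coefficient is shrunk less than it would be under a Normal prior of the same scale.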
Notes:
- This applies the same prior to all controls. For this pragmatic test, that’s fine and avoids vectorised sigma.
- If you later add an event dummy and want a wider prior for it, you can vectorise sigma in the exact order of extra_features_cols. For hypothesis testing with a single buildup dummy, keep it simple and skip vectorisation.
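If you do vectorise later, the main risk is an ordering mistake. The helper below is a hypothetical sketch that builds one sigma per control in the order of extra_features_cols; it only produces the list. How the pipeline accepts a vectorised sigma in the config is not shown in this guide, and the value 2.0 is purely illustrative.

```python
def ordered_sigmas(extra_features_cols, sigma_by_feature, default=1.0):
    """Return one sigma per control, in the same order as extra_features_cols."""
    unknown = set(sigma_by_feature) - set(extra_features_cols)
    if unknown:
        # A typo in a feature name would otherwise be silently ignored
        raise KeyError(f"sigma given for unknown features: {sorted(unknown)}")
    return [float(sigma_by_feature.get(col, default)) for col in extra_features_cols]


cols = ["black_friday_buildup", "black_friday_event"]
# Wider prior for the event dummy only; the buildup dummy keeps the default
print(ordered_sigmas(cols, {"black_friday_event": 2.0}))
```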
(hypothesis-testing-run)=
Run the Model
- Use your normal pipeline (e.g., python -u runme.py).
- Ensure your config includes the extra_features_cols and custom_priors.gamma_control above.
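A quick pre-flight check can catch a missing entry before a long run. The sketch below assumes the config has already been loaded into a plain dict and that extra_features_cols and custom_priors are top-level keys, which may not match your file's layout; `check_hypothesis_test_config` is a hypothetical helper, not part of the pipeline.

```python
def check_hypothesis_test_config(cfg: dict, dummy: str = "black_friday_buildup") -> list:
    """Return a list of problems; an empty list means the config looks ready."""
    problems = []
    if dummy not in (cfg.get("extra_features_cols") or []):
        problems.append(f"{dummy!r} is missing from extra_features_cols")
    prior = (cfg.get("custom_priors") or {}).get("gamma_control")
    if not prior:
        problems.append("custom_priors.gamma_control is not set")
    elif prior.get("dist") != "StudentT":
        problems.append(f"gamma_control uses {prior.get('dist')!r}, expected 'StudentT'")
    return problems


# Dict form of the configuration described in this guide
cfg = {
    "extra_features_cols": ["black_friday_buildup"],
    "custom_priors": {
        "gamma_control": {"dist": "StudentT", "kwargs": {"nu": 3, "mu": 0, "sigma": 1}}
    },
}
print(check_hypothesis_test_config(cfg))
```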
(hypothesis-testing-interpretation)=
Hypothesis Testing Interpretation
Let γ_buildup = gamma_control['black_friday_buildup'].
Expected Results
- Buildup exists:
  - Posterior mean is negative (e.g., −0.15)
  - 95% credible interval excludes 0 on the negative side (e.g., [−0.25, −0.05])
  - Interpretation: purchase delay (customers hold off buying)
- No buildup effect:
  - Posterior mean near 0 (e.g., −0.02)
  - 95% credible interval includes 0 (e.g., [−0.10, 0.06])
  - Interpretation: no evidence of purchase delay
- Event effect (if modeled separately):
  - gamma_control['black_friday_event'] positive (e.g., +0.50)
  - 95% credible interval excludes 0 on the positive side (e.g., [0.35, 0.65])
  - Interpretation: event drives sales spike
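The decision rule in the cases above can be written down in a few lines. This is a minimal sketch of that rule applied to an already computed 95% credible interval; `classify_effect` and its labels are hypothetical names, and the intervals are the illustrative examples from the list.

```python
def classify_effect(ci_low: float, ci_high: float) -> str:
    """Classify a coefficient from the bounds of its credible interval."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if ci_high < 0:
        return "negative"      # interval lies entirely below zero
    if ci_low > 0:
        return "positive"      # interval lies entirely above zero
    return "inconclusive"      # interval includes zero


print(classify_effect(-0.25, -0.05))  # buildup example
print(classify_effect(-0.10, 0.06))   # no-buildup example
print(classify_effect(0.35, 0.65))    # event example
```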
Statistical Significance
- Credible interval excludes zero → the effect is significant at the 95% level
- Magnitude matters → compare effect sizes, not just significance
- Compare buildup vs event → net effect = event boost − buildup delay
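For the buildup versus event comparison, it is usually safer to compute the net effect draw by draw than to subtract two posterior means, because that carries the uncertainty through. The sketch below assumes both coefficients act additively on the scaled target and that the two lists hold paired draws from the same posterior samples. The synthetic draws are centered on the example values used earlier with made-up spreads, and `event_weeks` and `buildup_weeks` are hypothetical weights for how many weeks each dummy is active.

```python
import random


def net_effect_draws(event_draws, buildup_draws, event_weeks=1, buildup_weeks=1):
    """Per-draw net effect: event boost plus the (typically negative) buildup term."""
    if len(event_draws) != len(buildup_draws):
        raise ValueError("draws must be paired samples from the same posterior")
    return [
        event_weeks * e + buildup_weeks * b
        for e, b in zip(event_draws, buildup_draws)
    ]


# Synthetic draws for illustration only; real draws come from the fitted posterior.
rng = random.Random(0)
event = [rng.gauss(0.50, 0.08) for _ in range(2000)]
buildup = [rng.gauss(-0.15, 0.05) for _ in range(2000)]

net = net_effect_draws(event, buildup, buildup_weeks=2)
p_positive = sum(n > 0 for n in net) / len(net)
print(f"mean net = {sum(net) / len(net):.3f}, P(net > 0) = {p_positive:.3f}")
```

Because the buildup coefficient is negative when customers delay purchases, adding it to the event coefficient is the same as "event boost minus buildup delay".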
(hypothesis-testing-extraction)=
Extracting Posterior Coefficients
Example after fitting (using the v2 model’s InferenceData):

    import numpy as np
    import xarray as xr

    # Stand-in for a fitted model so the snippet runs on its own;
    # skipped when a real `model` already exists.
    if 'model' not in globals():
        class _Idata:
            pass

        class _Model:
            pass

        _idata = _Idata()
        _idata.posterior = xr.Dataset(
            {
                'gamma_control': (
                    ('chain', 'draw', 'control'),
                    np.random.normal(size=(1, 50, 1)),
                )
            },
            coords={'control': ['black_friday_buildup']},
        )
        model = _Model()
        model.idata = _idata

    # Pool all chains and draws for the buildup coefficient
    coef = model.idata.posterior['gamma_control']
    vals = coef.sel(control='black_friday_buildup').values.flatten()
    mean = np.mean(vals)
    ci_low, ci_high = np.percentile(vals, [2.5, 97.5])  # equal-tailed 95% interval
    p_neg = (vals < 0).mean()  # posterior probability of a negative effect
    print(f"buildup: {mean:.3f} [{ci_low:.3f}, {ci_high:.3f}] P(γ<0)={p_neg:.3f}")

Tips:
- Report both the 95% credible interval and P(γ<0 | data) for a Bayesian view of evidence.
- Coefficients are on the model’s scaled target space (target is max‑abs scaled in‑graph). For hypothesis testing (sign and non‑zero effect), this is generally sufficient.
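If you also want a rough effect size in the target's original units, one option is to undo the scaling. The sketch below assumes that "max‑abs scaled" means the target is divided by its maximum absolute value and that the control enters the model additively on that scale; neither is confirmed by this guide, so treat the result as an approximation. `to_original_units` is a hypothetical helper and the numbers are illustrative.

```python
def to_original_units(coef_scaled, target_values):
    """Rescale coefficient draws from the max-abs scaled target to original units.

    Assumption: y_scaled = y / max(|y|), with the control added linearly to y_scaled.
    """
    scale = max(abs(y) for y in target_values)
    return [c * scale for c in coef_scaled]


# Illustrative numbers only: a scaled effect of -0.15 with a target peaking at 1000
print(to_original_units([-0.15], [200.0, -50.0, 1000.0]))
```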
(hypothesis-testing-pitfalls)=
Common Pitfalls
- Forgetting to add the column to extra_features_cols → the model won’t include your dummy.
- Over‑wide priors (e.g., very large sigma) → slower discrimination; start with sigma: 1.
- Vectorised priors order mismatch → only vectorise if you really need different widths; keep it scalar for simple tests.
(hypothesis-testing-see-also)=
See Also
- event_buildup_modeling.md: Detailed buildup modeling strategies.
- prior_calibration.md: How to choose informative priors.
- USER_GUIDE.md: End‑to‑end usage.