Prior Predictive Check (`src.diagnostics.prior_predictive`)
Overview
`run_prior_predictive_check` samples from the prior predictive distribution and checks whether the simulated target values are plausible relative to the scale of the observed data. Outputs are written to `10_pre_diagnostics/`.
Function Signature
```python
from src.diagnostics.prior_predictive import run_prior_predictive_check

run_prior_predictive_check(
    model: Any,
    X_train: pd.DataFrame,
    y_train: pd.Series | np.ndarray,
    config: dict[str, Any],
    results_dir: str,
    samples: int = 500,
) -> dict[str, Any]
```
Parameters
| Parameter | Description |
|---|---|
| `model` | ModelBuilder-compatible model with `sample_prior_predictive(...)`. |
| `X_train` | Training feature matrix used for prior predictive simulation. |
| `y_train` | Observed training target values (reference distribution). |
| `config` | Run configuration dictionary. |
| `results_dir` | Run root directory; outputs are saved in `10_pre_diagnostics/`. |
| `samples` | Number of prior predictive draws (default `500`). |
Artefacts Produced
| Filename | Stage folder | Description |
|---|---|---|
| `prior_predictive_summary.csv` | `10_pre_diagnostics/` | Summary stats for prior predictive vs observed (mean, sd, min, max, p05, p95). |
| `prior_predictive_check.png` | `10_pre_diagnostics/` | Prior predictive histogram with observed mean and observed range overlay. |
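To make the summary artefact concrete, the following is a minimal sketch of how the six listed statistics could be computed for both distributions. The helper names `summarise` and `build_summary`, the row labels, and the use of the sample standard deviation (`ddof=1`) are assumptions for illustration; the actual layout of `prior_predictive_summary.csv` may differ.

```python
import numpy as np
import pandas as pd


def summarise(values):
    """Return mean, sd, min, max, p05 and p95 for a flattened array (hypothetical helper)."""
    v = np.asarray(values, dtype=float).ravel()
    return {
        "mean": v.mean(),
        "sd": v.std(ddof=1),  # assumption: sample standard deviation
        "min": v.min(),
        "max": v.max(),
        "p05": np.percentile(v, 5),
        "p95": np.percentile(v, 95),
    }


def build_summary(prior_draws, y_obs):
    """One row per distribution, one column per statistic (assumed layout)."""
    return pd.DataFrame(
        [summarise(prior_draws), summarise(y_obs)],
        index=["prior_predictive", "observed"],
    )


# Synthetic stand-ins for prior predictive draws and observed targets.
rng = np.random.default_rng(0)
y_obs = rng.normal(loc=100.0, scale=10.0, size=200)
prior_draws = rng.normal(loc=100.0, scale=30.0, size=(500, 200))

summary = build_summary(prior_draws, y_obs)
print(summary.round(2))
```

Comparing the two rows side by side shows at a glance whether the prior predictive spread is of the same order as the observed spread.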
Plausibility Metric
The key metric is:

`plausibility_ratio` = fraction of prior draws inside `[obs_min - 2*obs_sd, obs_max + 2*obs_sd]`.

Current warning rule in code:

`warning = (plausibility_ratio < 0.5)`

Interpretation:

- `>= 0.5`: priors generate data broadly consistent with the observed scale.
- `< 0.5`: priors may be too diffuse, too concentrated, or mis-centred for the current target.
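The metric and warning rule above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the module's implementation: the function name `plausibility_ratio` is hypothetical, all draws are pooled across observations, and `obs_sd` is taken as the sample standard deviation.

```python
import numpy as np


def plausibility_ratio(prior_draws, y_obs, k=2.0):
    """Fraction of prior draws inside [obs_min - k*obs_sd, obs_max + k*obs_sd]."""
    draws = np.asarray(prior_draws, dtype=float).ravel()  # assumption: pool all draws
    y = np.asarray(y_obs, dtype=float).ravel()
    obs_sd = y.std(ddof=1)  # assumption: sample standard deviation
    lower = y.min() - k * obs_sd
    upper = y.max() + k * obs_sd
    inside = (draws >= lower) & (draws <= upper)
    return float(inside.mean())


# Synthetic example: one prior on the right scale, one far too diffuse.
rng = np.random.default_rng(0)
y_obs = rng.normal(loc=100.0, scale=10.0, size=200)
well_scaled = rng.normal(loc=100.0, scale=15.0, size=(500, 200))
too_diffuse = rng.normal(loc=0.0, scale=1000.0, size=(500, 200))

for name, draws in [("well_scaled", well_scaled), ("too_diffuse", too_diffuse)]:
    ratio = plausibility_ratio(draws, y_obs)
    warning = ratio < 0.5  # same rule as quoted above
    print(name, round(ratio, 3), "warning:", warning)
```

With these synthetic inputs the well-scaled prior stays above the 0.5 threshold, while the diffuse prior falls well below it and would raise the warning.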
Scale Handling
If target standardisation is applied in-graph, the function attempts an inverse transform before summarising and plotting prior predictive draws, so that they are compared with the observed data on the original target scale.
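As an illustration of the inverse transform, the sketch below assumes the in-graph standardisation is a plain z-score, `(y - mean) / sd`. The source does not state which transform is used, so the helper `inverse_standardise` and its arguments are hypothetical.

```python
import numpy as np


def inverse_standardise(z_draws, y_mean, y_sd):
    """Map standardised draws back to the original target scale (assumes z-scoring)."""
    return np.asarray(z_draws, dtype=float) * y_sd + y_mean


# Toy target and its standardised version.
y_train = np.array([80.0, 100.0, 120.0])
y_mean, y_sd = y_train.mean(), y_train.std()
z_train = (y_train - y_mean) / y_sd

# Round trip: standardise, then invert, recovers the original values.
recovered = inverse_standardise(z_train, y_mean, y_sd)
print(recovered)
```

Without this step, draws on the standardised scale would be compared against an observed range in original units, and the plausibility ratio would be meaningless.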
Usage Example
```python
from src.diagnostics.prior_predictive import run_prior_predictive_check

pp_result = run_prior_predictive_check(
    model=driver.model,
    X_train=driver.X_train,
    y_train=driver.y_train,
    config=driver.config,
    results_dir=driver.results_dir,
    samples=500,
)
print(pp_result["plausibility_ratio"], pp_result["warning"])
```
Relationship to Workflow Stages and Gates
- Stage: `10_pre_diagnostics/`.
- Feeds prior predictive gate `g1` before posterior interpretation.
- A warning does not terminate execution on its own, but it indicates that prior revision should be considered before relying on downstream decisions.
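Because the warning is advisory, calling code has to decide what to do with it. The sketch below shows one possible pattern: log the problem and return a flag instead of raising. The helper `review_prior_predictive` and the logger name are hypothetical and not part of the module; only the result keys `plausibility_ratio` and `warning` come from the usage example.

```python
import logging

logger = logging.getLogger("prior_predictive_gate")  # hypothetical logger name


def review_prior_predictive(pp_result, threshold=0.5):
    """Return True when priors look plausible; log a warning instead of raising."""
    ratio = float(pp_result["plausibility_ratio"])
    # Prefer the flag computed by the check; fall back to the threshold rule.
    flagged = bool(pp_result.get("warning", ratio < threshold))
    if flagged:
        logger.warning(
            "plausibility_ratio=%.2f is below %.2f; consider revising priors "
            "before interpreting the posterior.",
            ratio,
            threshold,
        )
    return not flagged


# Example with hand-written result dictionaries.
print(review_prior_predictive({"plausibility_ratio": 0.82, "warning": False}))
print(review_prior_predictive({"plausibility_ratio": 0.31, "warning": True}))
```

Returning a boolean keeps the run going while still giving downstream steps a way to record that the prior predictive check was flagged.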