Convergence Diagnostics (`src.diagnostics.convergence`)

Overview

`run_convergence_diagnostics` runs structured posterior convergence checks (R-hat, bulk and tail ESS, divergences) and writes machine-readable outputs to `50_diagnostics/`. The `converged` field of the report it returns is the primary convergence gate consumed downstream.

Function Signature

from src.diagnostics.convergence import run_convergence_diagnostics

run_convergence_diagnostics(
    model: Any,
    config: dict[str, Any],
    results_dir: str,
) -> dict[str, Any]

Parameters

Parameter	Description
`model`	Fitted model object exposing `idata` (or `fit_result`) with posterior and sample stats.
`config`	Run config dictionary (currently not used for thresholds).
`results_dir`	Run root directory; artefacts are saved under `50_diagnostics/`.

Artefacts Produced

Filename	Stage folder	Description
`convergence_report.json`	`50_diagnostics/`	Structured report with per-check outcomes and top-level `converged`.
`convergence_report.csv`	`50_diagnostics/`	Flat check table (`check`, `value`, `threshold`, `ok`).
`rank_trace.png`	`50_diagnostics/`	Rank-normalised trace plot (`az.plot_trace(..., kind="rank_vlines")`).
`energy_diagnostic.png`	`50_diagnostics/`	Energy diagnostic overlay (`az.plot_energy`).

Thresholds

Module constants:

Constant	Value	Meaning
`RHAT_WARN`	`1.01`	Preferred upper bound for split rank-normalised R-hat.
`RHAT_FAIL`	`1.05`	Severe R-hat boundary for escalation policy.
`ESS_PER_CHAIN`	`100`	Minimum effective draws per chain; total threshold = `100 * n_chains`.
`DIVERGENCE_WARN`	`1`	Any divergence indicates geometry risk.
`DIVERGENCE_FAIL`	`10`	Severe divergence count threshold.

In the current code, `converged` is `true` only when all three checks pass:

R-hat: every parameter <= `RHAT_WARN`.
ESS (bulk and tail): every parameter >= `ESS_PER_CHAIN * n_chains`.
Divergences: exactly 0.
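
The three conditions can be sketched as a single predicate. This is a minimal illustration, not the module's actual implementation: the function name `is_converged` and its per-parameter mapping arguments are hypothetical, and only the constant values are taken from the thresholds table.

```python
from typing import Mapping

# Values copied from the thresholds table; the real module may define more.
RHAT_WARN = 1.01
ESS_PER_CHAIN = 100


def is_converged(
    rhat: Mapping[str, float],
    ess_bulk: Mapping[str, float],
    ess_tail: Mapping[str, float],
    n_divergences: int,
    n_chains: int,
) -> bool:
    """Hypothetical helper: True only if R-hat, ESS and divergence checks all pass."""
    # Total ESS threshold scales with the number of chains.
    ess_threshold = ESS_PER_CHAIN * n_chains

    # Every parameter must be at or below the preferred R-hat bound.
    rhat_ok = all(value <= RHAT_WARN for value in rhat.values())

    # Bulk and tail ESS are both held to the same total threshold.
    ess_ok = all(
        value >= ess_threshold
        for value in (*ess_bulk.values(), *ess_tail.values())
    )

    # Any divergence at all fails the gate.
    return rhat_ok and ess_ok and n_divergences == 0
```

With four chains the ESS threshold is 400, so a parameter with a bulk ESS of 399 would fail the gate even if its R-hat were acceptable.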

JSON Report Schema

{
  "converged": true,
  "n_chains": 4,
  "rhat": {
    "max": 1.003,
    "ok": true,
    "problematic_params": []
  },
  "ess_bulk": {
    "min": 910,
    "ok": true,
    "threshold": 400,
    "problematic_params": []
  },
  "ess_tail": {
    "min": 855,
    "ok": true,
    "threshold": 400,
    "problematic_params": []
  },
  "divergences": {
    "count": 0,
    "pct": 0.0,
    "ok": true
  },
  "recommendation": "Model has converged. Results are reliable."
}
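
To relate this schema to the flat check table in `convergence_report.csv`, the sketch below flattens a report into rows with the columns `check`, `value`, `threshold` and `ok`. It is an illustration only: the helper `report_to_rows` is hypothetical, and the choice of which field feeds `value` (`max`, `min` or `count`) is an assumption, not confirmed behaviour of the module. A threshold is copied only where the JSON block carries one.

```python
from typing import Any

# Assumed mapping from check name to the field used as the row's `value`.
VALUE_FIELDS = {
    "rhat": "max",
    "ess_bulk": "min",
    "ess_tail": "min",
    "divergences": "count",
}

# Same content as the example report above.
example_report: dict[str, Any] = {
    "converged": True,
    "n_chains": 4,
    "rhat": {"max": 1.003, "ok": True, "problematic_params": []},
    "ess_bulk": {"min": 910, "ok": True, "threshold": 400, "problematic_params": []},
    "ess_tail": {"min": 855, "ok": True, "threshold": 400, "problematic_params": []},
    "divergences": {"count": 0, "pct": 0.0, "ok": True},
    "recommendation": "Model has converged. Results are reliable.",
}


def report_to_rows(report: dict[str, Any]) -> list[dict[str, Any]]:
    """Hypothetical helper: one row per check, in a fixed order."""
    rows = []
    for check, value_field in VALUE_FIELDS.items():
        block = report[check]
        rows.append(
            {
                "check": check,
                "value": block[value_field],
                # Not every block in the JSON example has a threshold.
                "threshold": block.get("threshold"),
                "ok": block["ok"],
            }
        )
    return rows


rows = report_to_rows(example_report)
```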

Interpretation Guidance

`converged = true`: posterior exploration is acceptable for downstream interpretation.
`converged = false`: inspect `problematic_params` and the divergence count before trusting decomposition or optimisation.
The energy and rank plots are supporting diagnostic evidence; they do not replace the machine-readable gate fields.
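
When `converged` is `false`, the fields to inspect can be gathered in one pass. The helper below is a hypothetical sketch that assumes the report follows the JSON schema documented on this page; it is not part of `src.diagnostics.convergence`.

```python
from typing import Any


def failing_checks(report: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: map each failed check to the detail worth inspecting."""
    failures: dict[str, Any] = {}

    # Parameter-level checks report which parameters breached the threshold.
    for name in ("rhat", "ess_bulk", "ess_tail"):
        block = report[name]
        if not block["ok"]:
            failures[name] = list(block["problematic_params"])

    # Divergences are reported as a count rather than per parameter.
    divergences = report["divergences"]
    if not divergences["ok"]:
        failures["divergences"] = divergences["count"]

    return failures
```

An empty result means every individual check passed; otherwise the keys name the checks to look at first.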

Usage Example

from src.diagnostics.convergence import run_convergence_diagnostics

report = run_convergence_diagnostics(
    model=driver.model,
    config=driver.config,
    results_dir=driver.results_dir,
)

if not report["converged"]:
    print(report["recommendation"])

Relationship to Workflow Stages and Gates

Stage: `50_diagnostics/`.
Feeds the convergence gates (g2/g3/g4) through `convergence_report.json`.
`converged` is the key machine-readable status used by the workflow, including strict gating behaviour when configured.
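
As a rough picture of how a downstream gate might consume the report, the sketch below reads `convergence_report.json` and either warns or raises. Everything here apart from the file location and the `converged` and `recommendation` fields is an assumption: the names `check_convergence_gate` and `ConvergenceGateError` and the `strict` flag are hypothetical, and the real g2/g3/g4 gates may behave differently.

```python
import json
import warnings
from pathlib import Path


class ConvergenceGateError(RuntimeError):
    """Hypothetical error raised when a strict gate sees converged = false."""


def check_convergence_gate(results_dir: str, strict: bool = False) -> bool:
    """Hypothetical gate: return True if the saved report says converged."""
    report_path = Path(results_dir) / "50_diagnostics" / "convergence_report.json"
    report = json.loads(report_path.read_text(encoding="utf-8"))

    if report["converged"]:
        return True

    message = report.get("recommendation", "Convergence checks failed.")
    if strict:
        # Strict mode stops the workflow instead of letting it continue.
        raise ConvergenceGateError(message)

    # Non-strict mode records the problem and lets the caller decide.
    warnings.warn(message, stacklevel=2)
    return False
```

Reading the gate status from the saved JSON rather than from in-memory state keeps the check reproducible for anyone holding the run directory.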