Monte Carlo Uncertainty Analysis (Hardened API)¶
This notebook demonstrates Monte Carlo uncertainty quantification for HEC-RAS hydraulic
models using the hardened RasMonteCarlo public API. The workflow generates parameter
samples from user-defined distributions, runs an ensemble of simulations via RasPermutation,
surfaces ensemble health, checks convergence, and computes percentile/prediction statistics
from the resulting HDF outputs.
What You'll Learn¶
- Generate parameter samples with Latin hypercube (LHS) and truncated-normal methods
- Build an
apply_fnwith the shipped factoryRasMonteCarlo.make_mannings_apply_fn()(no inline reimplementation) - Run an ensemble and read its status histogram (
completed/completed_with_errors/failed) - Check Monte Carlo convergence with
RasMonteCarlo.convergence()before trusting the statistics - Compute full-domain percentiles with
exceedance_probabilities()(with the default 95% valid-fraction guard) - Compute a labelled prediction (percentile) interval with
prediction_intervals() - Compute point-level WSE risk with
risk_at_points()and read itsstatus_accountingattrs
How 2D Land-Cover Roughness Is Perturbed (Critical)¶
For a 2D land-cover model the roughness Monte Carlo perturbs the plain-text .g##
LCMann Table base overrides via RasMonteCarlo.make_mannings_apply_fn(path="plaintext").
For those perturbations to actually reach the solver, run_ensemble() MUST be called with:
clone_geom=True-- gives every sample its own geometry, so each sample edits its own.g##LCMann Tableinstead of all samples sharing one geometry.clear_geompre=True-- regenerates the cached per-cellCells Center Manning's nfrom the perturbedLCManntable; without it HEC-RAS reuses the stale preprocessednand the ensemble is hydraulically inert (zero WSE spread).
force_geompre is deliberately NOT used: it deletes the .g##.hdf land-cover association
and would detach the model from its sidecar. The WSE-sensitivity guard cell after Example 1
asserts the band is non-trivial so an inert ensemble cannot slip through silently.
Sample Sizes¶
- Example 1 (breach + inflow): N = 100 — production-scale ensemble distributed across remote workers. This is large enough for defensible P10/P50/P90 estimation on the primary uncertainty drivers. The convergence diagnostic below reports whether the statistic stabilizes.
- Example 2 (roughness contrast): N = 30 — demonstration-scale ensemble showing that roughness is a secondary/tertiary driver for this dam-breach scenario. Kept small since the point is the contrast against Example 1, not standalone percentile reliability.
The repository research (feature_dev_notes/Monte_Carlo_Uncertainty/research_findings.md §9)
calls for ~500–1000 LHS samples for robust tail percentiles (P01/P99). N=100 is adequate for
P10/P90 but not for extreme tails.
Two Examples¶
| Example | Sampling method | Parameters varied | Output analysis |
|---|---|---|---|
| 1 | Latin hypercube (N=100) | Breach formation (width, time, slopes) + inflow multiplier | Full-domain percentiles + convergence + prediction interval + WSE-sensitivity guard |
| 2 | Truncated normal (N=30) | Manning's n (correlated multiplier, all wetted classes) | Point-level WSE risk statistics |
Prerequisites¶
- HEC-RAS 6.x installed and resolvable by ras-commander
ras-commanderwith the hardenedRasMonteCarlo(this branch / ≥ the montecarlo-hardened release)- Expected runtime: Example 1 uses remote workers (~8-10 hours with full fleet); Example 2 runs locally (~2-3 hours)
- Disk space: cloned batch folders are created under the project; reuse logic skips completed runs
Dev Mode — Local Source Install¶
Uncomment the block in the next cell to load ras-commander from a local clone instead of the
installed package. Leave it commented when running from the published package. Never leave both
the pip-install cell and the dev-mode block active at the same time.
# Dev-mode toggle: uncomment to use a local ras-commander clone
# import sys, importlib
# sys.path.insert(0, r"G:/GH/ras-commander")
# import ras_commander
# importlib.reload(ras_commander)
from pathlib import Path
import sys
# Flexible imports for development vs installed package
try:
from ras_commander import (
init_ras_project, ras, RasExamples, RasPlan,
RasMonteCarlo, RasPermutation,
)
from ras_commander.geom import GeomLandCover
from ras_commander.hdf import HdfMesh
except ImportError:
current_file = Path.cwd()
parent_directory = current_file.parent
sys.path.append(str(parent_directory))
from ras_commander import (
init_ras_project, ras, RasExamples, RasPlan,
RasMonteCarlo, RasPermutation,
)
from ras_commander.geom import GeomLandCover
from ras_commander.hdf import HdfMesh
import re
import shutil
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display
plt.style.use("default")
pd.set_option("display.max_columns", 20)
pd.set_option("display.width", 120)
# ── Configuration ─────────────────────────────────────────────────────
PROJECT_NAME = "BaldEagleCrkMulti2D" # 2D unsteady example project
RAS_VERSION = "6.6" # installed HEC-RAS version
TEMPLATE_PLAN = "01" # plan used as the MC template
N_SAMPLES_EX1 = 100 # Ex1 ensemble size (production scale for breach+inflow)
N_SAMPLES_EX2 = 30 # Ex2 ensemble size (demo scale — roughness contrast)
SEED = 42 # reproducible sampling
MAX_WORKERS = 2 # parallel HEC-RAS processes (2x2=4 cores, coexist-safe)
NUM_CORES = 2 # cores per HEC-RAS process
TIMEOUT_SEC = 86400 # 24 h hard cap -- covers slow geometry-preprocessing plans
# ── Breach / inflow configuration (for Example 1) ─────────────────────
# Plan 01 has three user-defined breaches: Dam (primary), Upper Levee, Lock Haven reach.
# Breach Geom (Dam) = 5700, 200, 595, 0.5, 0.5, True, 0.5, 630, ..., 2.6
# initial_width (index 1): 200 ft
# final_bottom_elev (index 2): 595 ft
# left/right slope (idx 3/4): 0.5 H:V
# formation_time (index 9): 2.6 hr
# Inflow BC: "Upstream Q" lateral inflow into Reservoir Pool (unsteady file u01).
DAM_STRUCTURE = "Dam" # primary breach structure name in plan file
INFLOW_BC = "Upstream Q" # bc_line_name in ras.boundaries_df
BREACH_INIT_WIDTH_BASE = 200.0 # ft (index 1 in Breach Geom)
BREACH_FORM_TIME_BASE = 2.6 # hr (index 9 in Breach Geom; library _BREACH_GEOM_INDEX)
BREACH_SLOPE_BASE = 0.5 # H:V side slopes
print(f"N_SAMPLES_EX1 = {N_SAMPLES_EX1}, N_SAMPLES_EX2 = {N_SAMPLES_EX2}")
print(f"DAM_STRUCTURE={DAM_STRUCTURE!r} INFLOW_BC={INFLOW_BC!r}")
Step 1 — Extract Project and Inspect Manning's n Table¶
We use BaldEagleCrkMulti2D, a 2D unsteady example bundled with ras-commander. Before defining
Monte Carlo parameters, we inspect the land-cover-based Manning's n table in the geometry file
so we know which class names are available to perturb. The class names matter: the shipped
make_mannings_apply_fn() factory matches sample columns to land-cover class names exactly.
import json
from pathlib import Path
# ── RasRemote Worker Setup ─────────────────────────────────────────────────────
# Loads a worker fleet from working/remote_workers_clb.json (gitignored, has real credentials).
# Falls back to examples/remote_workers_clb_template.json (committed, redacted) for reference.
#
# To enable distributed execution:
# 1. Copy examples/remote_workers_clb_template.json -> working/remote_workers_clb.json
# 2. Fill in real credentials and set "enabled": true on each desired worker
# 3. Verify shares and sessions: query session /server:HOSTNAME (expected: session_id 2)
# 4. Set USE_REMOTE_WORKERS = True below
#
# When USE_REMOTE_WORKERS = False the notebook uses local compute_parallel (current default).
USE_REMOTE_WORKERS = True
# Resolve relative to the examples/ directory (works in both script and notebook contexts)
_nb_dir = Path.cwd() # nbconvert sets cwd to examples/
_workers_json_real = _nb_dir.parent / "working" / "remote_workers_clb.json"
_workers_json_template = _nb_dir / "remote_workers_clb_template.json"
def _mask_credentials(cfg: dict) -> dict:
"""Return a copy of the config dict with passwords replaced by ***."""
import copy
masked = copy.deepcopy(cfg)
for w in masked.get("workers", []):
if w.get("password"):
w["password"] = "***"
if w.get("username") and w["username"] not in ("", ".\\***REDACTED***"):
pass # usernames are not secret
return masked
remote_workers = []
if USE_REMOTE_WORKERS:
try:
from ras_commander.remote import load_workers_from_json
_cfg_path = _workers_json_real if _workers_json_real.exists() else _workers_json_template
remote_workers = load_workers_from_json(_cfg_path, enabled_only=True)
# Exclude local workers (CLB08) — run only on remote machines
remote_workers = [w for w in remote_workers if getattr(w, "worker_type", "") != "local"]
# Display masked config so passwords never appear in notebook output
with open(_cfg_path) as _f:
_cfg_raw = json.load(_f)
_cfg_masked = _mask_credentials(_cfg_raw)
total_slots = sum(
w.cores_total // max(w.cores_per_plan, 1) for w in remote_workers
)
print(f"Loaded {len(remote_workers)} remote workers ({total_slots} concurrent slots, local excluded):")
for w in _cfg_masked["workers"]:
if w.get("enabled", False) and w.get("worker_type", "") != "local":
_type = w.get("worker_type", "?")
_cores = f"{w.get('cores_total','?')} total / {w.get('cores_per_plan','?')} per plan"
_user = f" user={w.get('username','')}" if w.get("username") else ""
print(f" [{_type:8s}] {w['name']:45s} cores: {_cores}{_user}")
except Exception as _e:
print(f"Warning: could not load remote workers ({_e}) — falling back to local execution.")
USE_REMOTE_WORKERS = False
remote_workers = []
else:
print("USE_REMOTE_WORKERS = False — using local compute_parallel.")
print(f"Worker config available at: {_workers_json_real}")
# MAX_WORKERS / NUM_CORES govern local parallel execution when USE_REMOTE_WORKERS = False.
# When remote workers are active, RasMonteCarlo.run_ensemble uses the worker fleet instead.
# Extract a fresh copy so we don't pollute the cached example
project_path = RasExamples.extract_project(PROJECT_NAME, suffix="116mcfix")
init_ras_project(project_path, RAS_VERSION)
print(f"Project folder: {project_path}")
print(f"Plans in project: {len(ras.plan_df)}")
print()
plan_overview = ras.plan_df[["plan_number", "Plan Title", "geometry_number", "unsteady_number"]]
display(plan_overview.sort_values("plan_number").reset_index(drop=True))
# Confirm the template plan exists
assert TEMPLATE_PLAN in ras.plan_df["plan_number"].values, (
f"Template plan {TEMPLATE_PLAN!r} not found. Update TEMPLATE_PLAN above."
)
# Inspect the base Manning's n land-cover table from the geometry file
template_row = ras.plan_df[ras.plan_df["plan_number"] == TEMPLATE_PLAN].iloc[0]
geom_path = RasPlan.get_geom_path(template_row["geometry_number"])
mannings_base = GeomLandCover.get_base_mannings_n(geom_path)
if mannings_base.empty:
raise RuntimeError(
f"No 'LCMann Table' found in {Path(geom_path).name}. "
"Verify the geometry file uses land-cover-based Manning's n."
)
# Drop any blank class names so the apply_fn always has a real class to target
mannings_named = mannings_base[
mannings_base["Land Cover Name"].str.strip() != ""
].reset_index(drop=True)
print(f"\nManning's n land-cover classes in {Path(geom_path).name}: {len(mannings_named)}")
display(mannings_named)
Helper — Reuse Completed Ensemble Results¶
Ensemble runs are expensive, so each example reuses completed result HDFs when they already
exist on disk (idempotent re-runs). The helper below reconstructs an ensemble-result dict whose
shape matches RasMonteCarlo.run_ensemble() output: a results_df with a status column, plus
completed / completed_with_errors / failed counts and a status_histogram. The hardened
statistics methods read these fields, so the reuse stub must populate them honestly.
import h5py
_PLAN_HDF_RE = re.compile(r"\.p\d{2,3}\.hdf$", re.IGNORECASE)
def _hdf_has_results(hdf_path: Path) -> bool:
"""Return True only if the HDF contains a completed Results/Unsteady group (C-3 fix)."""
try:
with h5py.File(hdf_path, "r") as f:
return "Results/Unsteady" in f
except Exception:
return False
def _load_perm_log(batch_folder: Path) -> "pd.DataFrame | None":
"""Load permutations_log.csv from a batch folder, return None if missing (C-2 fix)."""
log_path = batch_folder / "permutations_log.csv"
if log_path.exists():
return pd.read_csv(log_path)
return None
def discover_completed_hdfs(project_folder, suffix):
"""
Return sorted plan-result HDFs across all batch folders for a suffix.
C-3 fix: only counts HDFs that actually contain Results/Unsteady (not truncated stubs).
Returns (batches, hdfs, perm_log) where perm_log maps plan_number -> sample_id/params.
"""
batches = list(RasPermutation.discover_batch_folders(project_folder, suffix=suffix))
perm_log_frames = []
hdfs = []
for b in batches:
log_df = _load_perm_log(Path(b))
if log_df is not None:
perm_log_frames.append(log_df)
for h in sorted(Path(b).rglob("*.hdf")):
if _PLAN_HDF_RE.search(h.name) and not h.name.endswith(".tmp.hdf"):
if _hdf_has_results(h): # C-3: verify completion
hdfs.append(h)
perm_log = pd.concat(perm_log_frames, ignore_index=True) if perm_log_frames else None
return batches, hdfs, perm_log
def build_reuse_ensemble(hdf_paths, n_samples, perm_log=None):
"""
Reconstruct a run_ensemble()-shaped dict from completed HDFs.
C-1/C-2 fix: uses permutations_log.csv to recover the original sample_id and
parameter draw for each HDF, preserving the LHS sequence order so convergence()
computes over the true Monte Carlo ordering rather than an arbitrary filename sort.
If perm_log is unavailable the fallback is filesystem order with a warning.
"""
rows = []
if perm_log is not None and "plan_number" in perm_log.columns:
# Build map: plan_number (zero-padded) -> log row
log_map = {
str(row["plan_number"]).zfill(2): row
for _, row in perm_log.iterrows()
}
for h in hdf_paths[:n_samples]:
# Extract plan number from filename (e.g. BaldEagleDamBrk.p07.hdf -> "07")
m = re.search(r"\.p(\d{2,3})\.hdf$", h.name, re.IGNORECASE)
plan_str = m.group(1) if m else None
log_row = log_map.get(plan_str) if plan_str else None
if log_row is not None:
rows.append({
"sample_id": float(log_row["sample_id"]),
"absolute_perm_id": int(log_row["absolute_perm_id"]),
"status": "completed",
"hdf_path": str(h),
"runtime_seconds": None,
})
else:
# HDF present but not in log -- flag as attribution unknown
rows.append({
"sample_id": float("nan"),
"absolute_perm_id": -1,
"status": "completed",
"hdf_path": str(h),
"runtime_seconds": None,
})
unmapped = sum(1 for r in rows if r["absolute_perm_id"] == -1)
if unmapped:
print(f" WARNING (C-2): {unmapped} reused HDF(s) could not be matched to "
"permutations_log -- their sample ordering is unknown.")
# Sort by sample_id to restore LHS draw order for convergence() (C-1 fix)
rows.sort(key=lambda r: (float("inf") if pd.isna(r["sample_id"]) else r["sample_id"]))
else:
print(" WARNING (C-1/C-2): No permutations_log found -- reuse uses filesystem order. "
"convergence() results may not reflect true LHS sequence.")
rows = [
{
"sample_id": float(i + 1),
"absolute_perm_id": -1,
"status": "completed",
"hdf_path": str(h),
"runtime_seconds": None,
}
for i, h in enumerate(hdf_paths[:n_samples])
]
results_df = pd.DataFrame(rows)
n_ok = (results_df["absolute_perm_id"] != -1).sum()
print(f" Reuse: {len(rows)} HDFs loaded, {n_ok} with verified sample attribution.")
return {
"total_samples": n_samples,
"completed": len(rows),
"completed_with_errors": 0,
"failed": 0,
"status_histogram": {"total": len(rows), "completed": len(rows)},
"results_df": results_df,
}
def report_ensemble_health(ensemble_result, label):
"""Surface the hardened ensemble-health accounting (C2 / M1)."""
histogram = RasMonteCarlo.status_histogram(ensemble_result)
print(f"{label} ensemble health")
print(f" total samples : {ensemble_result.get('total_samples')}")
print(f" completed : {ensemble_result.get('completed')}")
print(f" completed_with_errors : {ensemble_result.get('completed_with_errors')}")
print(f" failed : {ensemble_result.get('failed')}")
print(f" status_histogram : {histogram}")
cwe = ensemble_result.get("completed_with_errors", 0) or 0
if cwe:
print(
f" NOTE: {cwe} run(s) finished with compute errors and are "
"EXCLUDED from statistics by default (include_error_runs=False)."
)
return histogram
print("Reuse / health helpers defined (C-1/C-2/C-3 hardened).")
Example 1 — Latin Hypercube Manning's n Sweep¶
Latin hypercube sampling (LHS) partitions each parameter's range into equal-probability strata
and draws one sample per stratum, giving better parameter-space coverage than random sampling at
the same N. We vary Manning's n for two land-cover classes and tag each spec with
kind="mannings_n" so the hardened sampler emits a physical-range warning (H3) if any bound
leaves the plausible roughness range ~[0.01, 0.2].
Workflow: generate_samples → make_mannings_apply_fn → run_ensemble → status_histogram
→ convergence → exceedance_probabilities → prediction_intervals.
Choosing Classes That Actually Move WSE¶
We perturb Cultivated Crops and Open Water because they cover the wetted floodplain and channel of this model:
- Cultivated Crops is roughly ~30% of the inundated floodplain area, so its roughness directly governs out-of-bank conveyance and floodplain storage timing.
- Open Water is the main-channel conveyance class -- channel roughness is the dominant control on in-bank stage and the rising/falling limb.
Perturbing insignificant classes (e.g. NoData, Barren Land) that barely intersect
the wetted area yields zero WSE sensitivity -- the samples vary an input the solver
never integrates over the flooded domain. Class selection is therefore the first-order
decision in a land-cover roughness Monte Carlo.
# ── Example 1 parameter specification: breach formation + inflow ──────────────
#
# For a dam-breach scenario the dominant uncertainty drivers are, in order:
# 1. Breach formation parameters (width, formation time, side slopes)
# → set the peak breach outflow directly
# 2. Inflow hydrograph (reservoir upstream inflow)
# → scales the released volume
# 3. Roughness (floodplain Manning's n)
# → secondary conveyance term; see Example 2
#
# Ranges are demonstration-reasonable (±30–50% for breach, ±20% for inflow).
# For a production breach study, source breach-parameter uncertainty from
# Froehlich (2008), MacDonald & Langridge-Monopolis, or Xu & Zhang regression scatter.
param_specs_ex1 = {
# 1st-order: breach formation — largest authority over downstream peak stage
"breach_width": {
"min": BREACH_INIT_WIDTH_BASE * 0.70, # 140 ft
"max": BREACH_INIT_WIDTH_BASE * 1.30, # 260 ft
"mean": BREACH_INIT_WIDTH_BASE,
"std": BREACH_INIT_WIDTH_BASE * 0.12,
},
"breach_formation_time": { # hr — biggest single peak-Q lever
"min": BREACH_FORM_TIME_BASE * 0.50, # 1.3 hr (faster → higher peak)
"max": BREACH_FORM_TIME_BASE * 1.50, # 3.9 hr
"mean": BREACH_FORM_TIME_BASE,
"std": BREACH_FORM_TIME_BASE * 0.18,
},
"breach_left_slope": {"min": 0.25, "max": 0.75, "mean": BREACH_SLOPE_BASE, "std": 0.10},
"breach_right_slope": {"min": 0.25, "max": 0.75, "mean": BREACH_SLOPE_BASE, "std": 0.10},
# 2nd-order: inflow volume — UNIFORM peak+volume scaler (not AEP/flow-frequency)
"flow_qmult": {"min": 0.80, "max": 1.20, "mean": 1.0, "std": 0.08, "kind": "flow_qmult"},
}
samples_ex1 = RasMonteCarlo.generate_samples(
param_specs = param_specs_ex1,
n_samples = N_SAMPLES_EX1,
method = "latin_hypercube",
seed = SEED,
)
print(f"Generated {len(samples_ex1)} samples (Latin hypercube, breach+inflow):")
display(samples_ex1.round(4).head(10))
# Visualise parameter coverage
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
axes[0].scatter(
samples_ex1["breach_width"], samples_ex1["breach_formation_time"],
s=60, edgecolors="steelblue", facecolors="lightblue",
)
axes[0].axvline(BREACH_INIT_WIDTH_BASE, color="grey", linestyle="--", linewidth=0.8, label="Baseline")
axes[0].axhline(BREACH_FORM_TIME_BASE, color="grey", linestyle="--", linewidth=0.8)
axes[0].set_xlabel("Initial breach width (ft)")
axes[0].set_ylabel("Formation time (hr)")
axes[0].set_title("Breach formation — primary driver")
axes[0].legend(fontsize=9)
axes[1].hist(samples_ex1["flow_qmult"], bins=15, color="coral", edgecolor="white")
axes[1].axvline(1.0, color="grey", linestyle="--", linewidth=0.8, label="Baseline")
axes[1].set_xlabel("Inflow QMult (uniform peak+volume scaler — NOT an AEP sample)")
axes[1].set_ylabel("Sample count")
axes[1].set_title("Inflow multiplier — secondary driver")
axes[1].legend(fontsize=9)
plt.suptitle(f"Example 1 — Breach + Inflow LHS (N={N_SAMPLES_EX1})", fontsize=11)
plt.tight_layout()
plt.show()
# Composite apply_fn: breach formation parameters + inflow multiplier.
#
# make_breach_apply_fn → writes perturbed breach geometry into the plan file (.p##)
# make_flow_multiplier_apply_fn → writes Flow Hydrograph QMult= into the unsteady file
# make_composite_apply_fn → chains both; each sample applies breach then inflow
#
# clone_geom=True is required so each sample gets its own geometry (needed if
# a roughness leg is added later). clear_geompre=True forces per-cell n regeneration.
apply_breach = RasMonteCarlo.make_breach_apply_fn(
structure_name = DAM_STRUCTURE,
param_column_map = {
"initial_width": "breach_width",
"formation_time": "breach_formation_time",
"left_slope": "breach_left_slope",
"right_slope": "breach_right_slope",
},
)
apply_inflow = RasMonteCarlo.make_flow_multiplier_apply_fn(bc_name=INFLOW_BC)
apply_fn_ex1 = RasMonteCarlo.make_composite_apply_fn(apply_breach, apply_inflow)
# Smart reuse: C-1/C-2/C-3 hardened -- verifies Results/Unsteady, maps via permutations_log
_ex1_batches, _ex1_done_hdfs, _ex1_perm_log = discover_completed_hdfs(project_path, suffix="mc_breach")
if len(_ex1_done_hdfs) >= N_SAMPLES_EX1:
print(f"Found {len(_ex1_done_hdfs)} completed result HDFs — reusing.")
ensemble_ex1 = build_reuse_ensemble(_ex1_done_hdfs, N_SAMPLES_EX1, perm_log=_ex1_perm_log)
else:
if _ex1_batches:
print(f"Removing {len(_ex1_batches)} incomplete batch folder(s)...")
for folder in _ex1_batches:
shutil.rmtree(folder, ignore_errors=True)
ensemble_ex1 = RasMonteCarlo.run_ensemble(
template_plan = TEMPLATE_PLAN,
samples_df = samples_ex1,
apply_fn = apply_fn_ex1,
suffix = "mc_breach",
max_workers = MAX_WORKERS,
num_cores = NUM_CORES,
timeout_sec = TIMEOUT_SEC,
clone_geom = True,
clear_geompre = True,
workers = remote_workers if USE_REMOTE_WORKERS else None,
max_plans_per_batch = 40,
)
report_ensemble_health(ensemble_ex1, "Example 1 (Breach + Inflow)")
display(ensemble_ex1["results_df"][["sample_id", "status"]].head(10))
Convergence Diagnostic (C3)¶
Before reporting any percentile band we ask whether the statistic has stabilized at this N.
RasMonteCarlo.convergence() recomputes a running statistic (here the domain-aggregated P90 WSE)
over the first k samples for k = 1..N and flags stabilized=True when the maximum relative
change across the trailing window samples falls below tolerance (2% here). At N=30 the result
is reported honestly — if stabilized is False, the bands below are not trustworthy and a
production run would need many more samples.
conv_ex1 = RasMonteCarlo.convergence(
ensemble_result = ensemble_ex1,
variable = "wse",
statistic = "p90",
window = 5,
tolerance = 0.02,
aggregate = "mean", # reduce per-cell P90 to one scalar per sample count
)
sample_counts = np.asarray(conv_ex1["sample_counts"])
running_statistic = np.asarray(conv_ex1["running_statistic"])
stabilized = conv_ex1["stabilized"]
final_rel_change = conv_ex1["final_relative_change"]
print(f"Convergence statistic : {conv_ex1['statistic']} (aggregate mean)")
print(f"Samples used : {conv_ex1['n_samples_used']}")
print(f"Stabilization window / tol : {conv_ex1['window']} samples / {conv_ex1['tolerance']:.0%}")
print(f"Final relative change : {final_rel_change:.4f}")
print(f"Stabilized at N={N_SAMPLES_EX1}? : {stabilized}")
if stabilized:
print(" -> Running P90 has stabilized within tolerance at this N (still a small ensemble).")
else:
print(" -> NOT stabilized: the P90 band below is demonstration-only; add samples for production.")
fig, ax = plt.subplots(figsize=(8, 4.5))
ax.plot(sample_counts, running_statistic, marker="o", markersize=4, color="steelblue")
ax.set_xlabel("Number of samples")
ax.set_ylabel("Running P90 WSE (domain mean, ft)")
ax.set_title(f"Example 1 — Convergence of P90 WSE (stabilized={stabilized})")
ax.grid(True, alpha=0.35)
plt.tight_layout()
plt.show()
# Full-domain percentiles. The default min_valid_fraction=0.95 guard (C2) raises if too many
# samples were dropped/failed; we rely on it rather than overriding allow_low_valid_fraction.
exc_ex1 = RasMonteCarlo.exceedance_probabilities(
ensemble_result = ensemble_ex1,
variable = "wse",
percentiles = [90, 50, 10],
)
n_used = exc_ex1["n_samples_used"]
mean_wse = exc_ex1["mean"]
p90_wse = exc_ex1["percentiles"][90.0]
p50_wse = exc_ex1["percentiles"][50.0]
p10_wse = exc_ex1["percentiles"][10.0]
# Mask dry cells (NaN) for plotting
wet = ~np.isnan(mean_wse)
spread = p90_wse[wet] - p10_wse[wet] # percentile band per cell
print(f"Samples used in aggregation : {n_used} (of {ensemble_ex1['total_samples']})")
print(f"Status accounting : {exc_ex1['status_accounting']}")
print(f"Wet cells : {wet.sum():,}")
print(f"Mean WSE range : {mean_wse[wet].min():.2f} - {mean_wse[wet].max():.2f} ft")
print(f"P90 - P10 band (median) : {np.nanmedian(spread):.4f} ft")
print("Percentiles are NON-EXCEEDANCE quantiles (value not exceeded N% of the time across samples).")
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
axes[0].hist(mean_wse[wet], bins=40, color="steelblue", edgecolor="white")
axes[0].set_title("Distribution of Mean WSE across mesh cells")
axes[0].set_xlabel("Mean WSE (ft)")
axes[0].set_ylabel("Cell count")
axes[1].hist(spread, bins=40, color="tomato", edgecolor="white")
axes[1].set_title("P90 - P10 percentile band per cell")
axes[1].set_xlabel("WSE band (ft)")
axes[1].set_ylabel("Cell count")
plt.suptitle(f"Example 1 — LHS Manning's n (full-domain, n_used={n_used})", fontsize=11)
plt.tight_layout()
plt.show()
Prediction (Percentile) Interval (M3)¶
prediction_intervals() returns the central confidence_level band of the realized variable.
Despite the legacy confidence_intervals name, this is a prediction / percentile interval on
the variable, NOT a confidence interval on a statistic — the returned interval_type is
"prediction". We label it accordingly and report n_samples_used. At N=30 the method stops
warning (its min_samples_warn default is 30), but the band is still demonstration-scale.
pred_ex1 = RasMonteCarlo.prediction_intervals(
ensemble_result = ensemble_ex1,
variable = "wse",
confidence_level = 0.90, # central 90% band -> 5th/50th/95th percentiles
)
lower_band = pred_ex1["lower"]
upper_band = pred_ex1["upper"]
wet_pi = ~np.isnan(lower_band) & ~np.isnan(upper_band)
band_width = upper_band[wet_pi] - lower_band[wet_pi]
print(f"Interval type : {pred_ex1['interval_type']} (prediction/percentile, NOT a CI on a statistic)")
print(f"Confidence level : {pred_ex1['confidence_level']:.0%}")
print(f"Lower/upper pctile : {pred_ex1['lower_percentile']:.1f} / {pred_ex1['upper_percentile']:.1f}")
print(f"Samples used : {pred_ex1['n_samples_used']}")
print(f"Median band width : {np.nanmedian(band_width):.4f} ft (90% prediction band per cell)")
fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(band_width, bins=40, color="mediumpurple", edgecolor="white")
ax.set_title("Example 1 — 90% PREDICTION band width per cell (P95 - P5 WSE)")
ax.set_xlabel("Prediction band width (ft)")
ax.set_ylabel("Cell count")
plt.tight_layout()
plt.show()
Inert-ensemble guard: confirm the land-cover roughness sampling moved WSE.¶
¶
H-3 fix: assert on BOTH max band AND the fraction of wet cells with band > threshold.¶
Asserting on max alone is trivially satisfied by a single breach-front cell -- it gives¶
false confidence when the median band is 0.000 ft (i.e. most of the domain is inert).¶
The fraction guard catches a nearly-inert ensemble that slips through the max check.¶
pred_guard = RasMonteCarlo.prediction_intervals( ensemble_result = ensemble_ex1, variable = "wse", ras_object = ras, )
_lower = pred_guard["lower"] _upper = pred_guard["upper"] _wet = ~np.isnan(_lower) & ~np.isnan(_upper) _band = _upper[_wet] - _lower[_wet]
median_band = float(np.nanmedian(_band)) if _band.size else float("nan") max_band = float(np.nanmax(_band)) if _band.size else float("nan") n_gt_0p1 = int(np.count_nonzero(_band > 0.1)) frac_gt_0p1 = n_gt_0p1 / _band.size if _band.size else 0.0
print("90% WSE prediction band (Example 1):") print(f" wet cells evaluated : {_band.size:,}") print(f" median band width : {median_band:.4f} ft") print(f" max band width : {max_band:.4f} ft") print(f" cells with band>0.1ft : {n_gt_0p1:,} ({frac_gt_0p1:.1%} of wet domain)")
Guard 1: max band must exceed 0.05 ft (original check)¶
assert max_band > 0.05, ( f"INERT ENSEMBLE: max 90% WSE band = {max_band:.4f} ft (<= 0.05 ft). " "The roughness perturbation did not reach the solver." )
Guard 2 (H-3 fix): at least 1% of wet cells must show band > 0.05 ft.¶
This catches an ensemble where only a handful of breach-front cells move¶
while >99% of the flooded domain is hydraulically inert to roughness.¶
MIN_RESPONSIVE_FRAC = 0.01 frac_responsive = np.count_nonzero(_band > 0.05) / _band.size if _band.size else 0.0 if frac_responsive < MIN_RESPONSIVE_FRAC: print(f"\nWARNING (H-3): only {frac_responsive:.2%} of wet cells show band > 0.05 ft " f"(threshold {MIN_RESPONSIVE_FRAC:.0%}). The ensemble may be effectively inert " "for most of the flooded domain even though max_band passes. " "Consider: (a) verifying per-cell Manning's n actually differs across samples, " "(b) sampling breach/inflow parameters instead of (or in addition to) roughness.") else: print(f"\nPASS: roughness sampling moved WSE (max band {max_band:.3f} ft > 0.05 ft, " f"{frac_responsive:.1%} of wet cells responsive).")
# ── Per-cell Manning's n propagation diagnostic ───────────────────────────────
# This check separates two completely different failure modes:
#
# "The perturbation didn't reach the solver" (implementation failure)
# → per-cell n is IDENTICAL across samples despite different LCMann tables
# → fix: verify clone_geom=True + clear_geompre=True
#
# "The model is genuinely insensitive to roughness" (correct physics)
# → per-cell n DIFFERS across samples, but WSE barely moves in the deep core
# → this is expected for a breach wave; the deep core is momentum/volume dominated
#
# The prior roughness-only run (mc_ex2) is the right batch to check since
# those samples perturb ONLY Manning's n.
import h5py, pathlib
def _get_cell_mannings_n(geom_hdf_path):
"""Read 'Cells Center Manning's n' from a preprocessed geometry HDF."""
with h5py.File(geom_hdf_path, "r") as f:
# Standard HEC-RAS 6.x path
for area_key in f.get("Geometry/2D Flow Areas", {}).keys():
ds_path = f"Geometry/2D Flow Areas/{area_key}/Cells Center Manning's n"
if ds_path in f:
return np.array(f[ds_path])
return None
_rough_folder = pathlib.Path(project_path).parent
# Find first two completed geom HDFs from the roughness batch
_rough_batch_dirs = [d for d in _rough_folder.glob("*116mcfix_mc_rough*")
if d.is_dir() and "Worker" not in d.name]
_geom_hdfs = []
for _bd in _rough_batch_dirs:
_geom_hdfs.extend(sorted(_bd.glob("*.g??.hdf")))
if len(_geom_hdfs) >= 2:
break
if len(_geom_hdfs) >= 2:
_n0 = _get_cell_mannings_n(_geom_hdfs[0])
_n1 = _get_cell_mannings_n(_geom_hdfs[1])
if _n0 is not None and _n1 is not None:
_differ = not np.allclose(_n0, _n1, equal_nan=True)
_max_dn = float(np.nanmax(np.abs(_n0 - _n1)))
print("Per-cell Manning's n propagation check:")
print(f" Sample 1 geom HDF : {_geom_hdfs[0].name}")
print(f" Sample 2 geom HDF : {_geom_hdfs[1].name}")
print(f" Per-cell n differs: {_differ} (max |Δn| = {_max_dn:.5f})")
if _differ:
print(" PASS: roughness perturbation reached per-cell n (propagation confirmed).")
print(" A near-zero WSE median band is therefore CORRECT PHYSICS, not a pipeline bug.")
else:
print(" FAIL: per-cell n is identical across samples — perturbation did NOT propagate.")
print(" Check: clone_geom=True and clear_geompre=True must both be set.")
else:
print("Could not read per-cell n — geom HDF may not be preprocessed yet.")
else:
print(f"No roughness batch geom HDFs found in {_rough_folder} — run Example 2 first.")
# Inert-ensemble guard: confirm the land-cover roughness sampling moved WSE.
pred_guard = RasMonteCarlo.prediction_intervals(
ensemble_result = ensemble_ex1,
variable = "wse",
ras_object = ras,
)
_lower = pred_guard["lower"]
_upper = pred_guard["upper"]
_wet = ~np.isnan(_lower) & ~np.isnan(_upper)
_band = _upper[_wet] - _lower[_wet]
median_band = float(np.nanmedian(_band)) if _band.size else float("nan")
max_band = float(np.nanmax(_band)) if _band.size else float("nan")
n_gt_0p1 = int(np.count_nonzero(_band > 0.1))
print("90% WSE prediction band (Example 1):")
print(f" wet cells evaluated : {_band.size:,}")
print(f" median band width : {median_band:.4f} ft")
print(f" max band width : {max_band:.4f} ft")
print(f" cells with band>0.1ft : {n_gt_0p1:,}")
assert max_band > 0.05, (
f"INERT ENSEMBLE: max 90% WSE band = {max_band:.4f} ft (<= 0.05 ft). "
"The roughness perturbation did not reach the solver -- verify clone_geom=True "
"and clear_geompre=True so each sample regenerates per-cell Manning's n."
)
print(f"\nPASS: roughness sampling moved WSE (max band {max_band:.3f} ft > 0.05 ft).")
Example 2 — Truncated Normal + Point-Level Risk¶
The truncated normal concentrates samples near a calibrated baseline while respecting physical
min/max bounds — ideal for local sensitivity around a well-characterised prior. We vary three
land-cover classes, again building the apply_fn from the shipped factory, then compute WSE risk
statistics at a handful of mesh-cell locations with risk_at_points(). The point-level output is
a compact DataFrame alternative to the full-domain arrays from exceedance_probabilities(), and
carries the same bias accounting via df.attrs.
# ── Example 2: Roughness as a secondary driver — correlated single multiplier ──
#
# A single shared multiplier k applied to ALL wetted land-cover classes models
# the realistic dominant roughness epistemic error: "my entire roughness field
# is systematically biased by factor k" (e.g. all n values set too low/high).
#
# This is more defensible than independent per-class sampling (which severs
# hydraulic coupling and is neither an upper nor lower bound on the true band).
#
# Purpose of this example: contrast the roughness-only band against the
# breach+inflow band from Example 1 to DEMONSTRATE that roughness is a
# third-order term for this dam-breach scenario. The median band is expected
# to be ~0 ft in the deep breach core (physically correct, not a bug).
# Read all land-cover classes from the geometry Manning's n table
lc_names_all = mannings_named["Land Cover Name"].tolist()
lc_base_all = mannings_named["Base Mannings n Value"].tolist()
print("Land-cover classes to be scaled by shared multiplier:")
for name, base in zip(lc_names_all, lc_base_all):
print(f" {name!r:44s} base n = {base:.4f}")
# Single shared multiplier column (correlated)
param_specs_ex2 = {
"roughness_multiplier": {
"min": 0.75, "max": 1.25, "mean": 1.0, "std": 0.10,
# kind is deliberately omitted here -- we perturb as a dimensionless
# multiplier, not as an absolute Manning's n value.
},
}
samples_ex2_raw = RasMonteCarlo.generate_samples(
param_specs = param_specs_ex2,
n_samples = N_SAMPLES_EX2,
method = "latin_hypercube",
seed = SEED,
)
# Expand: compute per-class absolute n = base_n * k for make_mannings_apply_fn
samples_ex2 = samples_ex2_raw.copy()
for lc_name, lc_base in zip(lc_names_all, lc_base_all):
samples_ex2[lc_name] = samples_ex2["roughness_multiplier"] * lc_base
print(f"\nGenerated {len(samples_ex2)} samples (correlated roughness multiplier, LHS).")
display(samples_ex2[["roughness_multiplier"] + lc_names_all[:3]].round(5).head(10))
fig, ax = plt.subplots(figsize=(7, 4))
ax.hist(samples_ex2["roughness_multiplier"], bins=12, color="mediumpurple", edgecolor="white")
ax.axvline(1.0, color="grey", linestyle="--", linewidth=0.8, label="Baseline (k=1)")
ax.set_xlabel("Roughness multiplier k (all classes × k)")
ax.set_ylabel("Sample count")
ax.set_title(f"Example 2 — Correlated roughness multiplier (n={N_SAMPLES_EX2}, demo scale)")
ax.legend(fontsize=9)
plt.tight_layout()
plt.show()
print(f"\nNote: n=30 demo ensemble. Contrast this band with Example 1 to see the driver hierarchy.")
# Build apply_fn: all classes scaled by the shared roughness_multiplier k.
# We use make_mannings_apply_fn with the expanded samples_ex2 (which already
# carries per-class absolute n values = base_n × k for each class).
apply_fn_ex2 = RasMonteCarlo.make_mannings_apply_fn(
zone_column_map = {name: name for name in lc_names_all},
path = "plaintext",
)
# Smart reuse (C-1/C-2/C-3 hardened)
_ex2_batches, _ex2_done_hdfs, _ex2_perm_log = discover_completed_hdfs(project_path, suffix="mc_rough")
if len(_ex2_done_hdfs) >= N_SAMPLES_EX2:
print(f"Found {len(_ex2_done_hdfs)} completed result HDFs — reusing.")
ensemble_ex2 = build_reuse_ensemble(_ex2_done_hdfs, N_SAMPLES_EX2, perm_log=_ex2_perm_log)
else:
if _ex2_batches:
print(f"Removing {len(_ex2_batches)} incomplete batch folder(s)...")
for folder in _ex2_batches:
shutil.rmtree(folder, ignore_errors=True)
ensemble_ex2 = RasMonteCarlo.run_ensemble(
template_plan = TEMPLATE_PLAN,
samples_df = samples_ex2, # expanded: per-class columns carry base_n * k
apply_fn = apply_fn_ex2,
suffix = "mc_rough",
max_workers = MAX_WORKERS,
num_cores = NUM_CORES,
timeout_sec = TIMEOUT_SEC,
clone_geom = True,
clear_geompre = True,
workers = remote_workers if USE_REMOTE_WORKERS else None,
)
report_ensemble_health(ensemble_ex2, "Example 2 (Roughness — secondary driver)")
# Derive representative mesh-cell coordinates for point-level risk analysis.
cells_gdf = HdfMesh.get_mesh_cell_points(TEMPLATE_PLAN, ras_object=ras)
# Sample 4 cells evenly spaced by index to span the domain
n_cells = len(cells_gdf)
step = max(1, n_cells // 4)
poi_idx = list(range(0, n_cells, step))[:4]
poi_pts = [
(cells_gdf.geometry.iloc[i].x, cells_gdf.geometry.iloc[i].y)
for i in poi_idx
]
print(f"Mesh cells available: {n_cells:,}")
print("Points of interest (x, y):")
for j, (x, y) in enumerate(poi_pts):
print(f" POI-{j}: ({x:.2f}, {y:.2f})")
# Point-level WSE statistics across completed ensemble members.
# Default min_valid_fraction=0.95 guard applies; status accounting is attached via df.attrs.
risk_df = RasMonteCarlo.risk_at_points(
ensemble_result = ensemble_ex2,
points = poi_pts,
variable = "wse",
)
print(f"\nn_samples_used (attrs) : {risk_df.attrs.get('n_samples_used')}")
print(f"status_accounting (attrs): {risk_df.attrs.get('status_accounting')}")
print("\nPoint-level risk summary:")
display(
risk_df[["point_index", "x", "y", "n_samples",
"mean", "std", "p10", "p50", "p90"]].round(3)
)
# Plot WSE percentile curves at each point
fig, ax = plt.subplots(figsize=(9, 5))
pct_cols = ["p01", "p10", "p50", "p90", "p99"]
pct_vals = [1, 10, 50, 90, 99]
colors = plt.cm.viridis(np.linspace(0.15, 0.85, len(risk_df)))
for (_, row), c in zip(risk_df.iterrows(), colors):
wse_pcts = [row[col] for col in pct_cols]
ax.plot(
pct_vals, wse_pcts,
marker="o", markersize=5, linewidth=1.6,
color=c, label=f"POI-{int(row['point_index'])}",
)
ax.set_xlabel("Percentile (non-exceedance quantile)")
ax.set_ylabel("Water Surface Elevation (ft)")
ax.set_title("Example 2 — Truncated Normal Manning's n | WSE Percentile Curves at Selected Points")
ax.legend(title="Point of Interest", fontsize=9)
ax.grid(True, alpha=0.35)
plt.tight_layout()
plt.show()
Summary¶
| Example 1 | Example 2 | |
|---|---|---|
| Sampling method | Latin hypercube | Truncated normal |
| Parameters | Breach formation (width, time, slopes) + inflow multiplier | Correlated roughness multiplier (all wetted classes × k) |
| Ensemble size | 100 (N_SAMPLES_EX1) — production scale |
30 (N_SAMPLES_EX2) — demo scale contrast |
| apply_fn source | make_breach_apply_fn() + make_flow_multiplier_apply_fn() |
make_mannings_apply_fn() (shipped factory) |
| Execution | Remote worker fleet (USE_REMOTE_WORKERS=True) |
Local parallel or remote |
| Ensemble health | status_histogram() printed |
status_histogram() printed |
| Convergence | convergence(p90) — see stabilized flag printed above |
(shared diagnostic) |
| Result analysis | exceedance_probabilities() + prediction_intervals() |
risk_at_points() |
2D Land-Cover Roughness Propagation (Required Settings)¶
For a 2D land-cover model the roughness MC perturbs the plain-text .g## LCMann Table
base overrides through make_mannings_apply_fn(path="plaintext"). This only reaches the
solver when run_ensemble() is called with both:
clone_geom=True-- per-sample geometry, so each sample edits its own.g##LCMann Table.clear_geompre=True-- regenerates the cached per-cellCells Center Manning's nfrom the perturbed table. Without it the preprocessednis stale and the ensemble is inert (zero WSE spread).
force_geompre is NOT used because it destroys the land-cover association in .g##.hdf. The
WSE-sensitivity guard after Example 1 asserts the 90% band exceeds 0.05 ft, so an inert
ensemble fails loudly rather than silently reporting a meaningless converged band.
Hardened-API Methods Exercised¶
RasMonteCarlo.generate_samples(...)— LHS and truncated-normal, single threadeddefault_rng(H5),kind="mannings_n"bounds warning (H3)RasMonteCarlo.make_breach_apply_fn(...)— breach formation parameter perturbationRasMonteCarlo.make_flow_multiplier_apply_fn(...)— inflow QMult scalingRasMonteCarlo.make_composite_apply_fn(...)— chains breach + inflow apply_fnsRasMonteCarlo.make_mannings_apply_fn(...)— shipped factory exercised instead of an inline apply_fn (L1)RasMonteCarlo.run_ensemble(...)— returnscompleted/completed_with_errors/failed+status_histogram(M1)RasMonteCarlo.status_histogram(...)— surfaces ensemble health before statistics (C2)RasMonteCarlo.convergence(...)— running P90 stabilization check (C3)RasMonteCarlo.exceedance_probabilities(...)— default 0.95 valid-fraction guard,status_accountingreported (C2)RasMonteCarlo.prediction_intervals(...)— labelledinterval_type="prediction",n_samples_usedreported (M3)RasMonteCarlo.risk_at_points(...)— point statistics withn_samples_used/status_accountingindf.attrs(C2)
Limitations¶
- N=100 is adequate for P10/P50/P90 but not extreme tails. Robust P01/P99 estimation needs ~500–1000 LHS samples (research §9). Trust the convergence flag before citing any band.
- Independent per-class sampling (Example 2). Manning's n classes are sampled independently (no correlation structure), which can overstate spatial uncertainty relative to correlated-zone calibration error. The correlated multiplier approach mitigates this.
- Percentiles are non-exceedance quantiles (value not exceeded N% of the time across samples), not annual exceedance probabilities, and mix aleatory + epistemic uncertainty in one ensemble.
- Error runs are excluded by default. Runs with status
completed_with_errorsare dropped from statistics unlessinclude_error_runs=True; the dropped fraction biases the tail and is reported via the status histogram /status_accounting. - Flow-multiplier note (H2):
make_flow_multiplier_apply_fnwritesFlow Hydrograph QMult=, a UNIFORM ordinate multiplier scaling peak AND volume together — it is NOT an AEP / flow-frequency sample. - Sensitivity analysis (Morris/Sobol) is out of scope in this version of
RasMonteCarlo.