HRRR Precipitation Forecast Download¶

This notebook demonstrates how to download and process HRRR (High-Resolution Rapid Refresh) precipitation forecast data using the PrecipHrrr class in ras-commander.

HRRR is NOAA's operational numerical weather prediction model with:

- 3km horizontal resolution over CONUS
- Hourly forecast cycles (00z through 23z)
- 18-hour forecast horizon for standard cycles
- 48-hour forecast horizon for extended cycles (00z, 06z, 12z, 18z)
- Data source: NOAA NOMADS server (publicly available, no account required)

HRRR is particularly valuable for:

- Short-range flood forecasting
- Real-time operational HEC-RAS model forcing
- Ensemble precipitation inputs for hydraulic models

Setup and Imports¶

Python

# =============================================================================
# DEVELOPMENT MODE TOGGLE
# =============================================================================
# Set USE_LOCAL_SOURCE based on your setup:
#   True  = Use local source code (for developers editing ras-commander)
#   False = Use pip-installed package (for users)
# =============================================================================

USE_LOCAL_SOURCE = True  # <-- TOGGLE THIS

# -----------------------------------------------------------------------------
if USE_LOCAL_SOURCE:
    import sys
    from pathlib import Path
    local_path = str(Path.cwd().parent)  # Parent of examples/ = repo root
    if local_path not in sys.path:
        sys.path.insert(0, local_path)  # Insert at position 0 = highest priority
    print(f"LOCAL SOURCE MODE: Loading from {local_path}/ras_commander")
else:
    print("PIP PACKAGE MODE: Loading installed ras-commander")

# Verify which version loaded
import ras_commander
print(f"Loaded: {ras_commander.__file__}")

Python

from pathlib import Path
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

from ras_commander.precip import PrecipHrrr

print("Imports complete.")

What You'll Learn¶

How to query HRRR data availability on NOAA NOMADS
How to download the latest HRRR forecast cycle automatically
How to download a specific date and cycle
How to extract precipitation data from GRIB2 files to xarray
How to compute basin-average precipitation time series
HRRR cycle timing and forecast horizon reference

HRRR Overview¶

The High-Resolution Rapid Refresh (HRRR) model runs every hour at 3km resolution. Understanding cycle timing and data latency is important for operational use:

Cycle	Type	Forecast Horizon	Typical Availability
00z, 06z, 12z, 18z	Extended	48 hours	~2 hours after cycle
All other hours	Standard	18 hours	~2 hours after cycle

File format: GRIB2 (.grib2)

Key precipitation variable: APCP (Accumulated Precipitation, kg/m²)

Data server: NOAA NOMADS

Retention: NOMADS retains approximately 48 hours of recent HRRR output. For archived HRRR data, use the NOAA HRRR archive on AWS.

Python

# Get HRRR system information
info = PrecipHrrr.get_info()
print("HRRR System Information:")
print(f"  Data source: {info.get('base_url', 'NOAA NOMADS')}")
print(f"  Resolution: {info.get('spatial_resolution', '3km')}")
print(f"  Standard forecast: {info.get('forecast_horizon_standard', '18 hours')}")
print(f"  Extended forecast: {info.get('forecast_horizon_extended', '48 hours')}")
print(f"  Update frequency: {info.get('update_frequency', 'Hourly')}")
print(f"  CONUS bounds: {info.get('bounds', 'Full CONUS')}")

Example 1: Single Forecast Cycle¶

In this example we use get_latest_forecast() to automatically find and download the most recent available HRRR cycle. This is the recommended approach for operational workflows.

Check Availability¶

Before downloading, verify that the desired HRRR cycle is available on NOMADS. This is useful for operational monitoring and retry logic.

Python

# Check if the latest HRRR cycle is available
try:
    available = PrecipHrrr.check_availability()
    print(f"Latest HRRR cycle available: {available}")

    # Check a specific date/cycle
    yesterday = datetime.now() - timedelta(days=1)
    available_12z = PrecipHrrr.check_availability(
        date=yesterday.strftime("%Y-%m-%d"),
        cycle=12
    )
    print(f"Yesterday's 12z cycle available: {available_12z}")
    print(f"  (Checked date: {yesterday.strftime('%Y-%m-%d')})")
except Exception as e:
    print(f"Availability check requires internet: {e}")
    print("\nExpected output:")
    print("  Latest HRRR cycle available: True")
    print("  Yesterday's 12z cycle available: True")
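
The retry logic mentioned above could look something like the following. This is a minimal sketch, not part of ras-commander: `wait_until_available` is a hypothetical helper, and the `check` argument stands in for a call such as `PrecipHrrr.check_availability`. The `sleep` parameter is injectable so the loop can be exercised without actually waiting.

```python
import time
from typing import Callable


def wait_until_available(
    check: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it returns True or the attempts run out.

    Hypothetical helper for illustration; `check` is any zero-argument
    callable that reports whether the desired cycle is published.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return True
        except Exception as exc:  # e.g. a transient network error
            print(f"Attempt {attempt}: check failed ({exc})")
        if attempt < attempts:
            sleep(delay_seconds)  # wait before polling again
    return False


# Stand-in for the availability check: False twice, then True
responses = iter([False, False, True])
waits = []
ready = wait_until_available(
    lambda: next(responses), attempts=5, delay_seconds=1.0, sleep=waits.append
)
print(ready, waits)
```

In an operational script the stand-in lambda would be replaced by the real availability call, with a delay of a few minutes between polls.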

Download Latest HRRR Forecast¶

get_latest_forecast() searches backwards through recent cycles (controlled by max_lookback_hours) to find the most recently published HRRR data. This handles the ~2-hour latency between cycle time and data availability.

Python

# Download latest HRRR cycle (18 hours ahead)
output_dir = Path("example_data/hrrr_single")
output_dir.mkdir(parents=True, exist_ok=True)

try:
    grib_files = PrecipHrrr.get_latest_forecast(
        output_dir=output_dir,
        hours=18,
        max_lookback_hours=6  # Look back up to 6 hours for available data
    )
    print(f"Downloaded {len(grib_files)} GRIB2 files")
    print(f"\nFirst 5 files:")
    for f in grib_files[:5]:
        size_mb = f.stat().st_size / (1024 * 1024)
        print(f"  {f.name} ({size_mb:.0f} MB)")
except Exception as e:
    print(f"Download requires internet access to NOAA NOMADS: {e}")
    print("\nExpected output:")
    print("  Downloaded 18 GRIB2 files")
    print("  hrrr.t12z.wrfsubhf01.grib2 (~100-150 MB)")
    print("  hrrr.t12z.wrfsubhf02.grib2 (~100-150 MB)")
    print("  ...")
    grib_files = []
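
To make the backward search concrete, the sketch below lists the cycles that such a lookback would consider, newest first. It is an illustration under assumptions, not the library's actual implementation: `candidate_cycles` is a hypothetical helper, and a real search would test each candidate for availability and stop at the first one that is published.

```python
from datetime import datetime, timedelta, timezone


def candidate_cycles(now_utc, max_lookback_hours=6):
    """Return (date_string, cycle_hour) pairs from newest to oldest.

    Hypothetical helper: starts at the current UTC hour and steps back
    one hour at a time, rolling over day boundaries as needed.
    """
    top_of_hour = now_utc.replace(minute=0, second=0, microsecond=0)
    candidates = []
    for hours_back in range(max_lookback_hours + 1):
        cycle_time = top_of_hour - timedelta(hours=hours_back)
        candidates.append((cycle_time.strftime("%Y-%m-%d"), cycle_time.hour))
    return candidates


# Fixed timestamp so the result is reproducible
now = datetime(2024, 5, 2, 1, 40, tzinfo=timezone.utc)
for date_str, cycle in candidate_cycles(now, max_lookback_hours=3):
    print(f"{date_str} {cycle:02d}z")
```

With the ~2-hour latency noted above, the first one or two candidates would usually not be published yet, which is why a lookback of several hours is a sensible default.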

Extract Precipitation Data¶

extract_precipitation() reads the GRIB2 files and returns an xarray.Dataset containing the precipitation variable(s). An optional bounds parameter (west, south, east, north) clips the data to a region of interest, which significantly reduces memory usage.

Python

if grib_files:
    try:
        # Extract precipitation for a specific region (Bald Eagle Creek, PA)
        bounds = (-77.9, 40.8, -77.3, 41.1)  # west, south, east, north

        precip_ds = PrecipHrrr.extract_precipitation(
            grib_files=grib_files,
            bounds=bounds
        )
        print("Extracted precipitation dataset:")
        print(f"  Variables: {list(precip_ds.data_vars)}")
        # Dataset.sizes maps dimension names to lengths (preferred over Dataset.dims)
        print(f"  Dimensions: {dict(precip_ds.sizes)}")
        time_dim = precip_ds.sizes.get('time', len(grib_files))
        print(f"  Time steps: {time_dim}")
        print(f"  Dataset:\n{precip_ds}")
    except Exception as e:
        print(f"Extraction requires cfgrib/xarray: {e}")
        print("\nInstall with: pip install cfgrib eccodes")
else:
    print("Skipping extraction (no GRIB files downloaded)")
    print("\nExpected output:")
    print("  Variables: ['APCP']")
    print("  Dimensions: {'time': 18, 'latitude': 36, 'longitude': 72}")
    print("  APCP units: kg m-2 (equivalent to mm)")
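
The effect of the bounds tuple can be illustrated without any GRIB data. The sketch below is an assumption-laden stand-in: `clip_to_bounds` is a hypothetical function, and the regular lon/lat lattice is synthetic and used only for illustration (the real HRRR grid is not a regular lon/lat lattice). It shows how few points survive a clip to the Bald Eagle Creek bounds used above.

```python
def clip_to_bounds(points, bounds):
    """Keep (lon, lat, value) points inside (west, south, east, north).

    Hypothetical helper for illustration only.
    """
    west, south, east, north = bounds
    if not (west < east and south < north):
        raise ValueError("bounds must be (west, south, east, north)")
    return [
        (lon, lat, value)
        for lon, lat, value in points
        if west <= lon <= east and south <= lat <= north
    ]


# Synthetic 0.1-degree lattice from 80W to 75W and 39N to 42N.
# Integer tenths avoid floating-point drift in the coordinates.
grid = [
    (lon_tenths / 10, lat_tenths / 10, 0.0)
    for lon_tenths in range(-800, -749)
    for lat_tenths in range(390, 421)
]

bounds = (-77.9, 40.8, -77.3, 41.1)  # west, south, east, north
kept = clip_to_bounds(grid, bounds)
print(f"{len(kept)} of {len(grid)} points kept")
```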

Basin-Average Precipitation¶

get_basin_average() spatially averages the gridded HRRR precipitation over a watershed geometry and returns a pandas.DataFrame with hourly precipitation and cumulative totals. The geometry can be any Shapely geometry (polygon, multipolygon, etc.).

The returned DataFrame has columns:

- forecast_hour - Hours from cycle start (1 through N)
- precip_mm - Incremental precipitation in millimeters
- precip_inches - Incremental precipitation in inches
- cumulative_mm - Cumulative precipitation in millimeters
- cumulative_inches - Cumulative precipitation in inches

Python

if grib_files:
    try:
        from shapely.geometry import box
        basin_geom = box(-77.9, 40.8, -77.3, 41.1)

        avg_precip = PrecipHrrr.get_basin_average(
            grib_files=grib_files,
            geometry=basin_geom
        )
        print("Basin-average precipitation time series:")
        print(avg_precip.to_string(index=False))
        print(f"\nTotal forecast precipitation: {avg_precip['cumulative_inches'].iloc[-1]:.3f} inches")
        print(f"Peak hourly precipitation:    {avg_precip['precip_inches'].max():.3f} inches/hr")
    except Exception as e:
        print(f"Basin average calculation failed: {e}")
else:
    print("Skipping basin average (no GRIB files)")
    print("\nExpected output (DataFrame columns):")
    print("  forecast_hour | precip_mm | precip_inches | cumulative_mm | cumulative_inches")
    print("  1             | 0.5       | 0.02          | 0.5           | 0.02")
    print("  2             | 1.2       | 0.05          | 1.7           | 0.07")
    print("  ...")
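
The relationship between the five columns can be sketched from a list of incremental depths. This is an illustration only: `build_hyetograph_rows` is a hypothetical helper and is not how get_basin_average() is implemented. It uses the exact conversion of 25.4 mm per inch and a running sum for the cumulative columns.

```python
from itertools import accumulate

MM_PER_INCH = 25.4


def build_hyetograph_rows(incremental_mm):
    """Build rows with the column names described above.

    Hypothetical helper: takes per-hour depths in millimeters and adds
    inch conversions and running totals.
    """
    cumulative_mm = list(accumulate(incremental_mm))
    rows = []
    for hour, (inc, cum) in enumerate(zip(incremental_mm, cumulative_mm), start=1):
        rows.append(
            {
                "forecast_hour": hour,
                "precip_mm": inc,
                "precip_inches": inc / MM_PER_INCH,
                "cumulative_mm": cum,
                "cumulative_inches": cum / MM_PER_INCH,
            }
        )
    return rows


rows = build_hyetograph_rows([0.5, 1.2, 0.0])
for row in rows:
    print(row)
```

Passing `rows` to `pandas.DataFrame(rows)` would produce a table with the same layout as the expected output shown in the cell above.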

Example 2: Specific Date/Cycle Download¶

When you need a specific historical cycle (within the NOMADS retention window, typically ~48 hours), use download_forecast() with explicit date and cycle parameters. This is useful for:

- Comparing model forecasts to observations after an event
- Reproducible analysis with a known dataset
- Operational post-processing pipelines

Python

# Download a specific date and cycle
output_dir_specific = Path("example_data/hrrr_specific")
output_dir_specific.mkdir(parents=True, exist_ok=True)

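# Use yesterday's 00z cycle with an 18-hour forecast.
# Note: datetime.now() is local time, while HRRR cycles are labeled in UTC,
# so near midnight UTC the local "yesterday" can differ from the UTC date.
yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")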

try:
    grib_files_specific = PrecipHrrr.download_forecast(
        output_dir=output_dir_specific,
        date=yesterday,
        cycle=0,   # 00z cycle
        hours=18,
        overwrite=False  # Skip if already downloaded
    )
    print(f"Downloaded {len(grib_files_specific)} files for {yesterday} 00z")
    if grib_files_specific:
        total_mb = sum(f.stat().st_size for f in grib_files_specific) / (1024 * 1024)
        print(f"Total download size: {total_mb:.1f} MB")
except Exception as e:
    print(f"Download requires internet: {e}")
    print(f"\nWould download 18 GRIB2 files for {yesterday} 00z cycle")
    print("Typical file size: ~100-150 MB per GRIB2 file (full CONUS)")
    print("Total download: ~2-3 GB for full 18-hour cycle")
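
Before requesting a specific cycle, it can help to check two things from the overview table: whether the cycle is still inside the approximate 48-hour retention window, and whether the requested number of hours exceeds the cycle's forecast horizon. The sketch below does both with the standard library. `plan_request` is a hypothetical helper, not a ras-commander function, and it ignores the ~2-hour publication latency.

```python
from datetime import datetime, timedelta, timezone

EXTENDED_CYCLES = {0, 6, 12, 18}  # 48-hour cycles, per the overview table
RETENTION_HOURS = 48              # approximate NOMADS retention


def plan_request(date_str, cycle, hours, now_utc):
    """Clamp `hours` to the cycle horizon and flag cycles outside retention.

    Hypothetical helper for illustration only.
    """
    if not 0 <= cycle <= 23:
        raise ValueError("cycle must be between 0 and 23")
    horizon = 48 if cycle in EXTENDED_CYCLES else 18
    cycle_time = datetime.strptime(date_str, "%Y-%m-%d").replace(
        hour=cycle, tzinfo=timezone.utc
    )
    age = now_utc - cycle_time
    return {
        "hours": min(hours, horizon),
        "on_nomads": timedelta(0) <= age <= timedelta(hours=RETENTION_HOURS),
    }


now = datetime(2024, 5, 3, 15, 0, tzinfo=timezone.utc)
print(plan_request("2024-05-02", cycle=0, hours=18, now_utc=now))
print(plan_request("2024-05-02", cycle=7, hours=24, now_utc=now))
print(plan_request("2024-04-30", cycle=12, hours=48, now_utc=now))
```

The second call shows a standard cycle being clamped to 18 hours, and the third shows a cycle that is old enough to require an archive rather than NOMADS.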

HRRR Cycle Timing Reference¶

Understanding HRRR cycle timing is critical for operational use. The following cell provides a reference table for all 24 daily cycles.

Python

# HRRR cycles and forecast horizons (all 24 hourly cycles)
print("HRRR Forecast Cycles:")
print("=" * 65)
print(f"{'Cycle':<10} {'Type':<12} {'Forecast Hours':<20} {'Approx Availability (UTC)'}")
print("-" * 65)

EXTENDED_CYCLES = [0, 6, 12, 18]

for cycle in range(0, 24):
    if cycle in EXTENDED_CYCLES:
        cycle_type = "Extended"
        hours = "48 hours"
    else:
        cycle_type = "Standard"
        hours = "18 hours"
    avail_hour = (cycle + 2) % 24
    availability = f"{avail_hour:02d}:00 - {avail_hour:02d}:30 UTC"
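    # Pad the cycle label to the same width as the 'Cycle' header column
    cycle_label = f"{cycle:02d}z"
    print(f"{cycle_label:<10} {cycle_type:<12} {hours:<20} {availability}")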

print()
print("Notes:")
print("  - HRRR runs every hour (24 cycles per day, 00z through 23z)")
print("  - Data typically available ~2 hours after cycle start time")
print("  - Example: 12z cycle data available ~14:00 UTC")
print("  - Extended cycles (00/06/12/18z) produce 48-hour forecasts")
print("  - All other cycles produce 18-hour forecasts")
print("  - NOMADS retains ~48 hours of recent cycles")
print("  - For older data (Google Cloud archive): https://storage.googleapis.com/high-resolution-rapid-refresh/")
print()
print("File size reference (full CONUS wrfsubhf):")
print("  ~100-150 MB per GRIB2 file (single forecast hour)")
print("  ~2-3 GB for complete 18-hour standard cycle")
print("  ~5-7 GB for complete 48-hour extended cycle")
print("  Spatial subsetting via 'bounds' parameter reduces size significantly")

Key Takeaways¶

HRRR Data Characteristics:

- 3km resolution, hourly cycles, publicly available from NOAA NOMADS
- Standard cycles produce 18-hour forecasts; extended cycles (00/06/12/18z) produce 48-hour forecasts
- NOMADS retains approximately 48 hours of recent output; use the HRRR archive on AWS for older data

Workflow Summary:

1. Use check_availability() to verify data is on NOMADS before downloading
2. Use get_latest_forecast() for operational workflows that always need current data
3. Use download_forecast(date=..., cycle=...) when you need a specific historical cycle
4. Use extract_precipitation(bounds=...) to clip to your region of interest (reduces memory)
5. Use get_basin_average() to compute watershed-averaged precipitation for HEC-RAS boundary conditions

Common Pitfalls:

- HRRR precipitation (APCP) is accumulated from the start of the cycle, so convert it to incremental depths before summing
- The overwrite=False default prevents redundant downloads in operational pipelines
- For production use, consider caching GRIB2 files locally; full CONUS files are ~100-150 MB each (~2-3 GB per 18-hour cycle)
- cfgrib and eccodes are required for extract_precipitation(): pip install cfgrib eccodes
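
The accumulation pitfall comes down to differencing a running total. The sketch below is a minimal illustration with made-up values: `accumulated_to_incremental` is a hypothetical helper, and whether a given file already holds incremental values should be confirmed from its metadata before applying it.

```python
def accumulated_to_incremental(accumulated_mm):
    """Difference a running total into per-step depths.

    Hypothetical helper: the first step is measured from zero, and small
    negative differences (for example from rounding) are clamped to 0.
    """
    incremental = []
    previous = 0.0
    for total in accumulated_mm:
        incremental.append(max(total - previous, 0.0))
        previous = total
    return incremental


running_total = [0.5, 1.75, 1.75, 4.0]  # mm since cycle start (made-up values)
incremental = accumulated_to_incremental(running_total)
print(incremental)
```

Summing the running totals directly would give 8.0 mm here, while the actual event total is the last accumulated value, 4.0 mm, which is also the sum of the incremental depths.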

Integration with HEC-RAS:

- Use get_basin_average() output as upstream boundary condition hyetographs
- Combine with RasUnsteady.set_precipitation_hyetograph() to write directly to unsteady flow files
- For spatially distributed rainfall, export extract_precipitation() output to DSS using RasDss

Cleanup¶

Python

import shutil

for d in ["example_data/hrrr_single", "example_data/hrrr_specific"]:
    p = Path(d)
    if p.exists():
        shutil.rmtree(p)
        print(f"Cleaned up: {d}")
    else:
        print(f"Not found (nothing to clean): {d}")

print("Done!")