Sovereign AI Weather Forecasting for Paraguay

Phase 3 v3.1 ensemble (richer EMOS variance link) · 60-date hindcast (2024-10 → 2025-04) · Built 2026-05-07

Headline statistics (calibrated)

Kill metric (RMSE skill vs GFS, full × ERA5): +25.7%, 95% CI [+17.5%, +34.1%]
CRPS (calibrated): 2.56 mm; SSR = 1.00 (1.0 = perfectly calibrated)
FSS @ 5 mm, 140 km: 0.46 (scale 0 to 1; 1 = perfect spatial match)
twCRPS @ 25 mm: 0.45 mm (heavy-rain detection skill)
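
For readers unfamiliar with the Fractions Skill Score, the sketch below shows one common way to compute it on a gridded field. It is an illustration under stated assumptions, not the report's scoring code: the function names, the square window, and the edge handling (windows truncated at the grid boundary) are all hypothetical choices. On a 25 km grid, a 140 km neighbourhood corresponds to a window of roughly five or six cells; the exact window used for the headline figure is not stated.

```python
def neighborhood_fraction(binary, half_width):
    """Fraction of exceeding cells in a square window around each cell.

    Windows are truncated at the grid edge (an assumption; zero padding
    is another common choice).
    """
    rows, cols = len(binary), len(binary[0])
    fractions = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = count = 0
            for ii in range(max(0, i - half_width), min(rows, i + half_width + 1)):
                for jj in range(max(0, j - half_width), min(cols, j + half_width + 1)):
                    total += binary[ii][jj]
                    count += 1
            fractions[i][j] = total / count
    return fractions


def fss(forecast, observed, threshold_mm, half_width):
    """Fractions Skill Score: 1 = perfect match, 0 = no overlap at this scale."""
    f_bin = [[1 if v >= threshold_mm else 0 for v in row] for row in forecast]
    o_bin = [[1 if v >= threshold_mm else 0 for v in row] for row in observed]
    f_frac = neighborhood_fraction(f_bin, half_width)
    o_frac = neighborhood_fraction(o_bin, half_width)
    num = den = 0.0
    for f_row, o_row in zip(f_frac, o_frac):
        for f, o in zip(f_row, o_row):
            num += (f - o) ** 2
            den += f * f + o * o
    # Undefined when neither field exceeds the threshold anywhere.
    return float("nan") if den == 0 else 1.0 - num / den


# Toy 4x4 fields (mm): the forecast puts the rain one cell east of the truth.
obs = [[0, 0, 0, 0], [0, 10, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
fcst = [[0, 0, 0, 0], [0, 0, 10, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(fss(fcst, obs, 5.0, 0))  # cell-by-cell comparison, no neighbourhood
print(fss(fcst, obs, 5.0, 1))  # 3x3 neighbourhood forgives the displacement
```

The toy case shows why a neighbourhood score is used for precipitation: a one-cell displacement scores zero at grid scale but earns partial credit once the window is widened.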

Full per-view scorecard (4 evaluation views, 60 dates each)

View        | RMSE (mm) | vs GFS (95% CI)         | FSS @ 5 mm, 140 km | CRPS (mm) | SSR
full_era5   | 6.88      | +25.7% [+17.5%, +34.1%] | 0.46               | 2.56      | 1.00
east_era5   | 7.50      | +23.9% [+12.8%, +35.7%] | 0.44               | 2.85      | 1.00
full_chirps | 8.03      | +17.4% [+9.1%, +25.8%]  | 0.35               | 3.46      | 1.00
east_chirps | 9.09      | +13.4% [+4.7%, +23.0%]  | 0.31               | 3.82      | 1.00
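
The "vs GFS (95% CI)" column pairs a relative RMSE improvement with a bootstrap interval. The sketch below shows one plausible way to produce such a pair. It assumes per-date, latitude-weighted MSEs, a square root taken after averaging over dates, and a percentile bootstrap that resamples dates with replacement. The function names, the resample count, and the synthetic inputs are hypothetical and are not taken from the report's pipeline.

```python
import math
import random


def lat_weighted_mse(forecast, truth, lats_deg):
    """Cosine-latitude-weighted MSE for one date.

    forecast and truth are 2-D lists indexed [lat][lon];
    lats_deg holds one latitude per row.
    """
    num = den = 0.0
    for f_row, t_row, lat in zip(forecast, truth, lats_deg):
        w = math.cos(math.radians(lat))
        for f, t in zip(f_row, t_row):
            num += w * (f - t) ** 2
            den += w
    return num / den


def rmse_over_dates(mse_per_date):
    """Square root taken after the mean over dates."""
    return math.sqrt(sum(mse_per_date) / len(mse_per_date))


def skill_pct(mse_model, mse_baseline):
    """Percent RMSE improvement of the model over the baseline."""
    return 100.0 * (1.0 - rmse_over_dates(mse_model) / rmse_over_dates(mse_baseline))


def bootstrap_skill_ci(mse_model, mse_baseline, n_boot=2000, seed=0):
    """Percentile 95% CI, resampling dates with replacement (paired)."""
    rng = random.Random(seed)
    n = len(mse_model)
    samples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(skill_pct([mse_model[i] for i in idx],
                                 [mse_baseline[i] for i in idx]))
    samples.sort()
    return samples[int(0.025 * n_boot)], samples[int(0.975 * n_boot) - 1]


# Synthetic per-date MSEs for 60 dates (illustrative only, not report data).
rng = random.Random(42)
mse_gfs = [rng.uniform(20.0, 120.0) for _ in range(60)]
mse_ens = [m * rng.uniform(0.4, 0.8) for m in mse_gfs]
point = skill_pct(mse_ens, mse_gfs)
low, high = bootstrap_skill_ci(mse_ens, mse_gfs)
print(f"skill = {point:+.1f}%  95% CI [{low:+.1f}%, {high:+.1f}%]")
```

Resampling the model and baseline errors with the same date indices keeps the comparison paired, which matters because both systems tend to do badly on the same difficult days.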

Methodology

The ensemble has three members: two global AI weather models (FCN3 and GraphCast) plus raw GFS, each producing 24-hour precipitation forecasts at 25 km resolution. Per-member quantile mapping corrects each member's dry/wet bias against the ERA5 reanalysis distribution, which serves as truth (leave-one-out across the 60 dates). EMOS-NGR (Non-homogeneous Gaussian Regression, Gneiting et al. 2005) calibrates the predictive distribution by minimum-CRPS estimation, producing μ and σ per cell. Post-hoc variance inflation then rescales the predictive spread so that the spread-skill ratio equals 1. RAINFARM (Rebora et al. 2006) provides stochastic spatial disaggregation from 25 km to 5 km, preserving coarse aggregates and matching the CHIRPS climatological spectrum. All scoring uses the WeatherBench 2 canonical RMSE (latitude-weighted, square root taken after the time mean) and bootstrap 95% CIs.
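
Two pieces of the calibration step can be written down compactly: the closed-form CRPS of a Gaussian forecast, which is the quantity minimum-CRPS estimation minimises, and a single multiplicative spread inflation that brings the spread-skill ratio to 1. The sketch below is a simplified illustration. The function names and sample values are hypothetical, and the report's actual EMOS link functions and inflation procedure are not specified here.

```python
import math


def gaussian_crps(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast against observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def spread_skill_ratio(mus, sigmas, obs):
    """sqrt(mean predictive variance) divided by RMSE of the predictive mean."""
    mean_var = sum(s * s for s in sigmas) / len(sigmas)
    mse = sum((m - y) ** 2 for m, y in zip(mus, obs)) / len(obs)
    return math.sqrt(mean_var / mse)


def inflate_to_unit_ssr(mus, sigmas, obs):
    """Scale every sigma by one global factor so that SSR becomes 1.

    Assumption: a single multiplicative factor; the report may use a
    different inflation scheme.
    """
    factor = 1.0 / spread_skill_ratio(mus, sigmas, obs)
    return [s * factor for s in sigmas]


# Illustrative values in mm (not report data).
mus = [2.0, 5.0, 9.0, 0.5]
sigmas = [1.0, 2.0, 3.0, 0.5]
obs = [4.0, 1.0, 15.0, 0.0]

print(spread_skill_ratio(mus, sigmas, obs))            # below 1: under-dispersed
inflated = inflate_to_unit_ssr(mus, sigmas, obs)
print(spread_skill_ratio(mus, inflated, obs))          # approximately 1
print(sum(gaussian_crps(m, s, y) for m, s, y in zip(mus, inflated, obs)) / len(obs))
```

In a real calibration the inflation factor would be estimated on training dates and applied to held-out dates; fitting and evaluating on the same sample, as this toy does, guarantees SSR = 1 by construction.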

Stage A — Gauge validation (the credibility test)

The headline +25.7% kill metric (v3.1) was computed against ERA5 reanalysis — a model-truth source, not ground observations. Stage A tests whether the headline survives validation against actual gauge measurements from NOAA GHCN-Daily and Brazilian INMET archives.

Coverage gap (read this first)

GHCN-Daily contains no Paraguay stations: Paraguay's DMH operates the country's gauge network but does not contribute to NOAA's archive. Validation therefore rests on 20 border stations (Argentina and Brazil, within 1° of the Paraguay border), of which 10 stations (258 records) fall within the cropped Paraguay forecast grid. Stage B (DMH archive + Itaipu hydroelectric network via Fran) is required for representative Paraguay-interior coverage.

Pooled skill at gauges: -1.7% vs GFS (all 258 records)
Stations beating GFS: 4 / 8 (50% of stations)
Ensemble RMSE vs gauge: 15.0 mm (GFS: 14.7 mm; ERA5: 13.7 mm, floor)
Dates covered: 60 / 60 of the 60-date hindcast body

The geographic signal — regime matters

The pooled number hides a clear pattern: the AI ensemble wins in transitional climate zones (north Argentina, west Paraguay border, where smooth-mean predictions match observations) and loses in heavy-convection valleys (eastern Paraná state, where the documented dry-bias of FCN3+AFNO and GraphCast+AFNO is most penalized). This is consistent with the threshold-skill diagnostics and with member-bias analyses; it is not random sampling noise.

Station                      | Country | Lat, Lon       | N  | Ens RMSE (mm) | GFS RMSE (mm) | Skill % vs GFS | Verdict
FORMOSA                      | AR      | -26.21, -58.23 | 24 | 11.5          | 16.0          | +28.0%         | Strong win
CATARATAS INTL               | BR      | -25.60, -54.49 | 46 | 9.6           | 11.3          | +15.1%         | Strong win
PRESIDENCIA ROQUE SAENZ PENA | AR      | -26.73, -60.48 | 12 | 16.0          | 16.9          | +5.1%          | Win
LAS LOMITAS                  | AR      | -24.70, -60.58 | 18 | 16.1          | 16.5          | +2.7%          | Tie
RESISTENCIA AERO             | AR      | -27.45, -59.05 | 22 | 16.8          | 15.9          | -5.7%          | Loss
POSADAS                      | AR      | -27.39, -55.97 | 25 | 31.2          | 28.5          | -9.5%          | Loss
PLANALTO                     | BR      | -25.72, -53.75 | 60 | 11.8          | 9.7           | -20.8%         | Strong loss
MAL. CANDIDO RONDON          | BR      | -24.53, -54.02 | 44 | 8.6           | 6.9           | -23.8%         | Strong loss
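
The per-station skill column appears to follow the usual relative-RMSE definition, skill = 1 − RMSE_ens / RMSE_GFS. The short sketch below recomputes it from the rounded RMSE values in the station table. The helper name is hypothetical, and because the inputs are rounded to one decimal, the recomputed percentages can differ from the tabulated ones by up to about one point.

```python
# (station, ensemble RMSE in mm, GFS RMSE in mm), copied from the station table.
stations = [
    ("FORMOSA", 11.5, 16.0),
    ("CATARATAS INTL", 9.6, 11.3),
    ("PRESIDENCIA ROQUE SAENZ PENA", 16.0, 16.9),
    ("LAS LOMITAS", 16.1, 16.5),
    ("RESISTENCIA AERO", 16.8, 15.9),
    ("POSADAS", 31.2, 28.5),
    ("PLANALTO", 11.8, 9.7),
    ("MAL. CANDIDO RONDON", 8.6, 6.9),
]


def station_skill_pct(ens_rmse, gfs_rmse):
    """Percent RMSE improvement of the ensemble over GFS (positive = better)."""
    return 100.0 * (1.0 - ens_rmse / gfs_rmse)


skills = {name: station_skill_pct(ens, gfs) for name, ens, gfs in stations}
wins = sum(1 for s in skills.values() if s > 0)

for name, s in skills.items():
    print(f"{name:30s} {s:+6.1f}%")
print(f"stations with positive skill: {wins} / {len(stations)}")
```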

Honest product implication

Showcase events

Five events from the 60-date body, spanning weather regimes and skill levels. The selection is sampled to demonstrate range, not to flatter. Across all 60 dates, 29 show STRONG skill (> 30%), 9 GOOD (15% to 30%), 20 TIE (-15% to +15%), and 2 WORSE (< -15%). Of the five events below, two are labeled STRONG, one GOOD, and two TIE, with one intentionally a borderline case to show honest behavior.
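
The category thresholds can be expressed as a small classifier. This is a sketch, and the handling of values that fall exactly on a boundary (15% or -15%) is an assumption, since the text does not say which side they belong to. The function name is hypothetical. The five skill values are the ones shown in the event headings on this page.

```python
def skill_category(skill_pct):
    """Map a percent skill vs GFS to the report's four labels.

    Boundary handling is assumed: exactly 15 counts as GOOD,
    exactly -15 counts as TIE.
    """
    if skill_pct > 30.0:
        return "STRONG"
    if skill_pct >= 15.0:
        return "GOOD"
    if skill_pct >= -15.0:
        return "TIE"
    return "WORSE"


# Skill values from the five showcase event headings.
showcase = {
    "2024-11-01": 45.6,
    "2025-03-15": 29.4,
    "2024-12-20": 10.5,
    "2024-11-12": -11.1,
    "2024-10-19": 69.7,
}
labels = {date: skill_category(s) for date, s in showcase.items()}
print(labels)

# Category counts quoted for the full hindcast should cover all 60 dates.
counts = {"STRONG": 29, "GOOD": 9, "TIE": 20, "WORSE": 2}
print(sum(counts.values()))
```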

2024-11-01 HEAVY STRONG

Skill vs GFS: +45.6%

Heavy precipitation event (8.1 mm domain mean, peak 98 mm). Ensemble beat GFS by 46% — the kind of event where AI adds the most value over the operational baseline.

Figure (scorecard 4-panel for 2024-11-01): forecast μ, calibrated uncertainty σ, observed truth (ERA5), and error map (forecast − truth).
Figure (GFS delta for 2024-11-01): AI vs GFS; green = AI ensemble closer to truth, brown = GFS closer.
Figure (fine-grid forecast for 2024-11-01, 5 km): RAINFARM spectral disaggregation from coarse 25 km ensemble.
Figure (fine-grid P(>25 mm/24h) for 2024-11-01, 5 km): probabilistic heavy-rain risk per fine-grid cell.

Department-level forecast (top 8 by mean precipitation)

Department       | Mean μ (mm) | P10 / P90 (mm) | P>5mm | P>25mm
Alto Paraguay    | 14.3        | 6.6 / 19.8     | 62%   | 27%
Boquerón         | 14.1        | 5.9 / 28.4     | 58%   | 28%
Presidente Hayes | 8.5         | 2.7 / 15.5     | 52%   | 17%
Concepción       | 6.8         | 4.9 / 9.7      | 53%   | 12%
Amambay          | 4.5         | 3.6 / 5.6      | 48%   | 2%
Canindeyú        | 4.3         | 2.4 / 7.1      | 44%   | 1%
Alto Paraná      | 3.0         | 2.6 / 3.3      | 41%   | 1%
San Pedro        | 3.0         | 1.3 / 5.3      | 36%   | 1%

Demo farm locations (centroids of major soybean-belt departments)

Farm location             | μ (mm) | σ (mm) | P>5mm | P>25mm | GFS (mm) | Truth (mm)
Itapúa centroid           | 2.5    | 8.8    | 39%   | 1%     | 0.3      | 2.0
Alto Paraná centroid      | 3.2    | 9.4    | 42%   | 1%     | 0.9      | 3.9
Canindeyú centroid        | 3.9    | 7.3    | 44%   | 0%     | 6.1      | 17.1
Caaguazú centroid         | 1.7    | 5.9    | 29%   | 0%     | 0.6      | 1.3
Asunción metro            | 0.7    | 3.1    | 9%    | 0%     | 1.2      | 1.5
Concepción centroid       | 7.1    | 17.8   | 55%   | 16%    | 21.1     | 59.8
Boquerón (Chaco) centroid | 9.4    | 22.4   | 58%   | 24%    | 28.0     | 4.3
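
The P>5mm and P>25mm columns look consistent with the upper-tail probability of a Gaussian with the tabulated μ and σ, which is what an NGR predictive distribution would give. The sketch below reproduces a few rows under that assumption. The function name is hypothetical, and because μ and σ are rounded to one decimal, the recomputed percentages can be off by about one point from the table.

```python
import math


def prob_exceed(mu, sigma, threshold_mm):
    """P(X > threshold) for X ~ N(mu, sigma^2), via the complementary error function."""
    z = (threshold_mm - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# (location, mu in mm, sigma in mm) copied from the 2024-11-01 farm table.
rows = [
    ("Itapúa centroid", 2.5, 8.8),
    ("Concepción centroid", 7.1, 17.8),
    ("Boquerón (Chaco) centroid", 9.4, 22.4),
]
for name, mu, sigma in rows:
    p5 = 100.0 * prob_exceed(mu, sigma, 5.0)
    p25 = 100.0 * prob_exceed(mu, sigma, 25.0)
    print(f"{name:28s} P>5mm = {p5:4.1f}%   P>25mm = {p25:4.1f}%")
```

One consequence of the Gaussian form is that it assigns some probability to negative rainfall, which is why slightly negative μ or P10 values can appear in low-rain cells.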

2025-03-15 MODERATE GOOD

Skill vs GFS: +29.4%

Moderate precipitation (1.8 mm domain mean). Ensemble beat GFS by 29% — representative of the system's day-to-day operational behavior.

Figure (scorecard 4-panel for 2025-03-15): forecast μ, calibrated uncertainty σ, observed truth (ERA5), and error map (forecast − truth).
Figure (GFS delta for 2025-03-15): AI vs GFS; green = AI ensemble closer to truth, brown = GFS closer.
Figure (fine-grid forecast for 2025-03-15, 5 km): RAINFARM spectral disaggregation from coarse 25 km ensemble.
Figure (fine-grid P(>25 mm/24h) for 2025-03-15, 5 km): probabilistic heavy-rain risk per fine-grid cell.

Department-level forecast (top 8 by mean precipitation)

Department    | Mean μ (mm) | P10 / P90 (mm) | P>5mm | P>25mm
Boquerón      | 2.2         | 0.4 / 4.0      | 31%   | 2%
Alto Paraguay | 1.9         | 0.5 / 3.4      | 27%   | 0%
Amambay       | 1.2         | 0.7 / 1.6      | 21%   | 0%
Canindeyú     | 1.0         | 0.5 / 1.5      | 18%   | 0%
Alto Paraná   | 0.8         | 0.4 / 1.3      | 13%   | 0%
Central       | 0.6         | 0.3 / 0.8      | 7%    | 0%
Paraguarí     | 0.5         | 0.2 / 0.9      | 5%    | 0%
Concepción    | 0.5         | 0.3 / 0.8      | 6%    | 0%

Demo farm locations (centroids of major soybean-belt departments)

Farm location             | μ (mm) | σ (mm) | P>5mm | P>25mm | GFS (mm) | Truth (mm)
Itapúa centroid           | 0.1    | 1.0    | 0%    | 0%     | 0.2      | 1.4
Alto Paraná centroid      | 0.9    | 4.0    | 15%   | 0%     | 2.2      | 0.3
Canindeyú centroid        | 1.2    | 5.1    | 23%   | 0%     | 1.5      | 0.9
Caaguazú centroid         | 0.5    | 2.7    | 5%    | 0%     | 1.5      | 0.3
Asunción metro            | 0.2    | 1.7    | 0%    | 0%     | 0.8      | 1.6
Concepción centroid       | 0.3    | 2.1    | 1%    | 0%     | 0.5      | 0.0
Boquerón (Chaco) centroid | 0.5    | 3.1    | 7%    | 0%     | 0.7      | 0.2

2024-12-20 HEAVY TIE

Skill vs GFS: +10.5%

Heavy event with modest skill (+10% vs GFS). The ensemble called the regime correctly but didn't crush GFS — honest example of where the system delivers value without over-claiming.

Figure (scorecard 4-panel for 2024-12-20): forecast μ, calibrated uncertainty σ, observed truth (ERA5), and error map (forecast − truth).
Figure (GFS delta for 2024-12-20): AI vs GFS; green = AI ensemble closer to truth, brown = GFS closer.
Figure (fine-grid forecast for 2024-12-20, 5 km): RAINFARM spectral disaggregation from coarse 25 km ensemble.
Figure (fine-grid P(>25 mm/24h) for 2024-12-20, 5 km): probabilistic heavy-rain risk per fine-grid cell.

Department-level forecast (top 8 by mean precipitation)

Department       | Mean μ (mm) | P10 / P90 (mm) | P>5mm | P>25mm
Boquerón         | 9.2         | 3.8 / 16.4     | 56%   | 18%
Alto Paraguay    | 8.7         | 4.0 / 15.0     | 56%   | 15%
Presidente Hayes | 2.9         | -0.1 / 6.5     | 28%   | 4%
Concepción       | 2.2         | 0.9 / 3.4      | 32%   | 0%
Amambay          | 0.6         | 0.2 / 1.4      | 14%   | 0%
Canindeyú        | 0.5         | 0.1 / 1.0      | 10%   | 0%
San Pedro        | 0.3         | -0.1 / 1.2     | 6%    | 0%
Alto Paraná      | 0.3         | 0.0 / 0.8      | 7%    | 0%

Demo farm locations (centroids of major soybean-belt departments)

Farm location             | μ (mm) | σ (mm) | P>5mm | P>25mm | GFS (mm) | Truth (mm)
Itapúa centroid           | -0.1   | 1.5    | 0%    | 0%     | 0.0      | 0.2
Alto Paraná centroid      | 0.2    | 3.2    | 7%    | 0%     | 0.4      | 0.0
Canindeyú centroid        | 0.4    | 3.3    | 8%    | 0%     | 0.8      | 0.4
Caaguazú centroid         | -0.0   | 2.1    | 1%    | 0%     | 0.0      | 0.0
Asunción metro            | -0.1   | 1.2    | 0%    | 0%     | 0.0      | 0.0
Concepción centroid       | 3.9    | 9.9    | 46%   | 2%     | 7.0      | 0.0
Boquerón (Chaco) centroid | 7.9    | 17.2   | 57%   | 16%    | 13.5     | 5.0

2024-11-12 DRY TIE

Skill vs GFS: -11.1%

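Dry day correctly forecast (truth 0.00 mm, ensemble 0.01 mm). The -11.1% relative skill is computed from near-zero errors on both sides, so it likely reflects a negligible absolute difference rather than a meaningful miss. The case demonstrates that the system does not false-alarm on dry days, which is important for irrigation and harvest scheduling.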

Figure (scorecard 4-panel for 2024-11-12): forecast μ, calibrated uncertainty σ, observed truth (ERA5), and error map (forecast − truth).
Figure (GFS delta for 2024-11-12): AI vs GFS; green = AI ensemble closer to truth, brown = GFS closer.
Figure (fine-grid forecast for 2024-11-12, 5 km): RAINFARM spectral disaggregation from coarse 25 km ensemble.
Figure (fine-grid P(>25 mm/24h) for 2024-11-12, 5 km): probabilistic heavy-rain risk per fine-grid cell.

Department-level forecast (top 8 by mean precipitation)

Department       | Mean μ (mm) | P10 / P90 (mm) | P>5mm | P>25mm
Alto Paraná      | 0.0         | 0.0 / 0.0      | 0%    | 0%
Canindeyú        | 0.0         | 0.0 / 0.0      | 0%    | 0%
Itapúa           | 0.0         | 0.0 / 0.0      | 0%    | 0%
Caaguazú         | 0.0         | 0.0 / 0.0      | 0%    | 0%
Boquerón         | 0.0         | 0.0 / 0.0      | 0%    | 0%
Alto Paraguay    | 0.0         | 0.0 / 0.0      | 0%    | 0%
Presidente Hayes | 0.0         | 0.0 / 0.0      | 0%    | 0%
Misiones         | 0.0         | 0.0 / 0.0      | 0%    | 0%

Demo farm locations (centroids of major soybean-belt departments)

Farm location             | μ (mm) | σ (mm) | P>5mm | P>25mm | GFS (mm) | Truth (mm)
Itapúa centroid           | 0.0    | 0.1    | 0%    | 0%     | 0.0      | 0.0
Alto Paraná centroid      | 0.0    | 0.4    | 0%    | 0%     | 0.0      | 0.0
Canindeyú centroid        | 0.0    | 0.1    | 0%    | 0%     | 0.0      | 0.0
Caaguazú centroid         | 0.0    | 0.1    | 0%    | 0%     | 0.0      | 0.0
Asunción metro            | 0.0    | 0.1    | 0%    | 0%     | 0.0      | 0.0
Concepción centroid       | 0.0    | 0.1    | 0%    | 0%     | 0.0      | 0.0
Boquerón (Chaco) centroid | 0.0    | 0.1    | 0%    | 0%     | 0.0      | 0.0

2024-10-19 DRY STRONG

Skill vs GFS: +69.7%

Case study: ensemble and GFS diverged most strongly (+70% skill, truth 0.2 mm). Useful as a meteorological discussion case.

Figure (scorecard 4-panel for 2024-10-19): forecast μ, calibrated uncertainty σ, observed truth (ERA5), and error map (forecast − truth).
Figure (GFS delta for 2024-10-19): AI vs GFS; green = AI ensemble closer to truth, brown = GFS closer.
Figure (fine-grid forecast for 2024-10-19, 5 km): RAINFARM spectral disaggregation from coarse 25 km ensemble.
Figure (fine-grid P(>25 mm/24h) for 2024-10-19, 5 km): probabilistic heavy-rain risk per fine-grid cell.

Department-level forecast (top 8 by mean precipitation)

Department    | Mean μ (mm) | P10 / P90 (mm) | P>5mm | P>25mm
Alto Paraguay | 1.5         | 0.4 / 2.6      | 23%   | 1%
Boquerón      | 0.5         | 0.1 / 1.5      | 9%    | 0%
Amambay       | 0.5         | 0.3 / 0.7      | 7%    | 0%
Concepción    | 0.4         | 0.1 / 0.8      | 4%    | 0%
Caazapá       | 0.3         | 0.2 / 0.3      | 3%    | 0%
Alto Paraná   | 0.3         | 0.2 / 0.3      | 3%    | 0%
Itapúa        | 0.3         | 0.2 / 0.3      | 3%    | 0%
Guairá        | 0.2         | 0.2 / 0.3      | 2%    | 0%

Demo farm locations (centroids of major soybean-belt departments)

Farm location             | μ (mm) | σ (mm) | P>5mm | P>25mm | GFS (mm) | Truth (mm)
Itapúa centroid           | 0.3    | 2.5    | 3%    | 0%     | 0.0      | 0.5
Alto Paraná centroid      | 0.3    | 2.5    | 3%    | 0%     | 0.0      | 0.1
Canindeyú centroid        | 0.2    | 2.2    | 2%    | 0%     | 0.0      | 0.3
Caaguazú centroid         | 0.3    | 2.4    | 2%    | 0%     | 0.0      | 0.1
Asunción metro            | 0.2    | 1.7    | 0%    | 0%     | 0.0      | 0.1
Concepción centroid       | 0.2    | 1.5    | 0%    | 0%     | 0.0      | 0.2
Boquerón (Chaco) centroid | 0.3    | 1.9    | 1%    | 0%     | 0.4      | 0.1

Honest disclosures