Automated Weather Prediction Markets
I identified systematic mispricings in weather prediction markets and built a fully automated system to exploit them — from data ingestion to live execution on Kalshi.
Context
Weather prediction markets on Kalshi let you trade on whether temperature in a city will exceed a threshold. Most participants use gut feel or basic weather apps. Public ensemble forecast data from NWS and ECMWF contains systematic calibration errors — creating exploitable mispricings for anyone willing to do the quantitative work.
System Architecture
Data Ingestion: NWS & ECMWF ensemble APIs
Forecast Calibration: seasonal bias correction + isotonic regression
Edge Detection: model probability vs. market price
Position Sizing: fractional Kelly criterion with caps
Execution: automated orders via Kalshi API
Risk Monitoring: 3σ filters & drawdown limits
DATA → CALIBRATE → DETECT → SIZE → EXECUTE → MONITOR
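One cycle of the stages above could be wired together roughly like this; every name, threshold, and formula here is an illustrative sketch under my own assumptions, not the production system:

```python
from dataclasses import dataclass

@dataclass
class Market:
    price: float  # current yes-price in [0, 1]

def run_cycle(market, calibrated_p, error_sigma, kelly_mult=0.25):
    """DETECT -> SIZE for one market; ingestion/calibration assumed done upstream.

    Returns the bankroll fraction to commit (0.0 means no trade).
    All parameters here are hypothetical placeholders.
    """
    edge = calibrated_p - market.price        # DETECT: model vs. market
    if abs(edge) <= 3 * error_sigma:          # 3-sigma entry filter
        return 0.0
    if edge <= 0:
        return 0.0                            # this sketch only sizes yes-side trades
    # SIZE: fractional Kelly for a binary contract at price q: f* = (p - q)/(1 - q)
    return kelly_mult * edge / (1.0 - market.price)

print(run_cycle(Market(0.50), calibrated_p=0.60, error_sigma=0.02))
```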
Key Decisions
The technical choices that shaped the system — and why I made them.
Why isotonic regression for calibration?
Raw weather forecasts have non-linear calibration errors — they're overconfident in some temperature ranges and underconfident in others. I chose isotonic regression over logistic recalibration because it handles this non-linearity without parametric assumptions. Result: 26% Brier score improvement.
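The core of isotonic regression is the pool-adjacent-violators algorithm, which can be sketched in pure Python; the per-bin hit rates below are invented for illustration:

```python
def pava(y):
    """Pool Adjacent Violators: least-squares non-decreasing fit to a sequence.

    Feeding it observed hit rates ordered by raw forecast probability yields
    an isotonic calibration curve (what sklearn's IsotonicRegression computes).
    """
    stack = []  # blocks of (mean, count)
    for v in y:
        stack.append((float(v), 1))
        # merge adjacent blocks whenever monotonicity is violated
        while len(stack) > 1 and stack[-2][0] > stack[-1][0]:
            m2, c2 = stack.pop()
            m1, c1 = stack.pop()
            stack.append(((m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2))
    out = []
    for m, c in stack:
        out.extend([m] * c)
    return out

# Hypothetical observed hit rates per raw-probability bin (not real data):
# the 0.4 bin over-performed the 0.5 bin, so PAVA pools them together.
hit_rates = [0.05, 0.20, 0.45, 0.40, 0.70, 0.90]
print([round(v, 3) for v in pava(hit_rates)])  # [0.05, 0.2, 0.425, 0.425, 0.7, 0.9]
```

Because the fit is just "monotone and least-squares", it tracks whatever overconfident/underconfident shape the raw forecasts have without committing to a sigmoid.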
Why fractional Kelly with caps?
Full Kelly criterion is mathematically optimal but assumes perfect edge estimation — which no model has. I used fractional Kelly (conservative multiplier) plus a hard 30% market divergence cap to protect against model overconfidence. The principle: start conservative, increase sizing only as evidence accumulates.
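For a binary contract priced at q that pays 1 if the event occurs, the Kelly fraction is f* = (p − q)/(1 − q). A sketch of the sizing rule follows; I read the 30% figure as a model-vs-market divergence filter (an assumption), and the multiplier and function names are illustrative:

```python
def position_size(model_p, market_q, bankroll,
                  kelly_mult=0.25, divergence_cap=0.30):
    """Fractional-Kelly stake for buying a yes-contract at price market_q.

    kelly_mult and divergence_cap are illustrative values, not the production
    parameters. Kelly for this payoff structure: f* = (p - q) / (1 - q).
    """
    edge = model_p - market_q
    if edge <= 0:
        return 0.0  # no edge on the yes side
    if edge > divergence_cap:
        return 0.0  # model diverges too far from the market: distrust the model
    f = kelly_mult * edge / (1.0 - market_q)  # shrink full Kelly toward zero
    return bankroll * f

print(position_size(0.62, 0.50, 1_000))  # modest edge -> small stake
print(position_size(0.95, 0.50, 1_000))  # extreme divergence -> no trade
```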
Why 3σ thresholds for trade entry?
The system only trades when the detected edge exceeds three standard deviations of historical model error. This dramatically reduces false positives at the cost of trade frequency — a deliberate tradeoff favoring precision over volume. In practice, it meant fewer but higher-conviction trades.
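A minimal version of such an entry filter, with the error history invented for illustration:

```python
import statistics

def passes_entry_filter(edge, historical_errors, k=3.0):
    """Enter only when |edge| exceeds k standard deviations of past model error."""
    sigma = statistics.pstdev(historical_errors)
    return abs(edge) > k * sigma

# Hypothetical history of (model probability - realized frequency) errors:
errors = [0.010, -0.020, 0.015, -0.005]
print(passes_entry_filter(0.05, errors))  # large edge clears the bar
print(passes_entry_filter(0.03, errors))  # smaller edge is filtered out
```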
Results
Performance Over Time
Reflection
The most important lesson: the model wasn’t the hard part — the execution layer was. I spent two weeks trying to improve forecast accuracy before realizing the edge was being lost to position sizing and entry timing. Rebuilding the execution layer without touching the model turned a losing system into a profitable one.