Residential Electricity Demand Forecasting from Weather

End-to-end pipeline. DSGrid demand profiles plus ERA5 weather via Open-Meteo for NYC.

time series
regression
clustering
energy
end-to-end
Published

December 10, 2025

Summary. Solo end-to-end project. I built a pipeline that joins DSGrid synthetic residential demand profiles with ERA5 daily weather data (retrieved through the Open-Meteo API) for New York City, producing a multi-year aligned dataset. I benchmarked supervised regression and classification baselines (linear, logistic, gradient boosting) alongside unsupervised methods (PCA, t-SNE, K-means, DBSCAN, hierarchical clustering), and published the full reproducible workflow.

Note

DSAN 5000, Data Science & Analytics, Fall 2025. My first end-to-end project at Georgetown.

Why this project

[TODO one paragraph on the practical motivation. Utility planning, demand response, the tension between weather-driven peaks and grid stability. Make it about a real-world question, not just “I wanted to learn the pipeline.”]

Data engineering

The unglamorous half of this project was getting two data sources with very different shapes to align cleanly. The synthetic demand profiles come at one temporal resolution and the ERA5 weather comes at another (daily), and both had to be keyed to NYC.
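
As a rough illustration of that alignment step, here is a minimal sketch in pandas. It is not the project's actual code. It assumes the demand series is hourly with UTC timestamps and that the weather table is indexed by local calendar date; the function name, the column names, and the 23-hour completeness threshold are all hypothetical choices made for the example.

```python
import pandas as pd


def align_demand_to_weather(demand, weather, tz="America/New_York", min_hours=23):
    """Aggregate hourly demand to local calendar days and join daily weather.

    demand  : Series of hourly load with a tz-aware (UTC) DatetimeIndex (assumed).
    weather : DataFrame of daily weather with a naive DatetimeIndex of local dates (assumed).
    """
    # Shift to local time first so each daily total follows the NYC calendar day,
    # not the UTC day.
    local = demand.tz_convert(tz)

    daily_sum = local.resample("D").sum().rename("demand_kwh")
    hours_per_day = local.resample("D").count()

    # Keep only days with (almost) full coverage. 23 allows for the short
    # spring-forward DST day; this threshold is an assumption for the sketch.
    daily_sum = daily_sum[hours_per_day >= min_hours]

    # Drop the time zone so the index matches the naive daily weather dates.
    daily_sum.index = daily_sum.index.tz_localize(None)

    # Inner join: keep only dates present in both sources.
    return weather.join(daily_sum, how="inner")


# Toy data standing in for the real sources (values are made up).
demand = pd.Series(
    1.0,
    index=pd.date_range("2024-01-01", periods=72, freq="h", tz="UTC"),
)
weather = pd.DataFrame(
    {"temp_mean_c": [2.0, 1.5, 0.5, 3.0]},
    index=pd.date_range("2023-12-31", periods=4, freq="D"),
)

aligned = align_demand_to_weather(demand, weather)
print(aligned)
```

Converting to local time before resampling matters here: 72 hours of UTC data cover only two complete New York calendar days, so the partial days at each edge are dropped rather than reported as artificially low totals.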

[TODO quick paragraph on the joining strategy, time-zone handling, and missing-data treatment.]

Models

[TODO brief tour through the supervised baselines, then the unsupervised clustering, and what each was for. The interesting story is usually the contrast between supervised performance and what the clusters revealed about the residuals.]

Findings

[TODO 2 to 3 concrete results. Lead with effect sizes, not p-values.]

What this project taught me

[TODO honest reflection. This was your first big end-to-end project. What did you do right, what would you do differently now that you know more?]

Code

Repository on GitHub