Week 12: Household forecast models

ainergyy
Dec 23, 2021
Updated: Dec 27, 2021

Forecasting the energy consumption of a single household is difficult, since the result is highly sensitive to even slight variations in the household's habits. For example, if a household uses the oven for a special dinner one night, that alone can throw off an otherwise good prediction. The team was already aware of this issue, but was still interested in creating models that could simulate a household of the community. The interest came from the fact that household-level models make it easier and faster to test various machine learning techniques and to get familiar with the dataset. There was also the possibility that this approach could lead to surprising results.

To achieve this week's goal, the team experimented with different machine learning algorithms and feature engineering techniques, then analysed the results to understand which were a better fit for the data.

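The first approach used Support Vector Regression (SVR), set up to predict the consumption of a period from a fixed number of previous periods, which form the input feature vector. (Strictly speaking, the "support vectors" in SVR are the training samples that end up defining the fitted function, not the lagged inputs themselves.) From this experiment the most promising result was the following: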

Although the team had high hopes for this model, and it does perform well on most periods, it falters badly on peak periods.
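The idea of predicting one period from a fixed number of previous periods can be sketched as a sliding window over the consumption series. This is only a minimal illustration, not the team's actual code: the function name `make_lagged_samples`, the window size and the sample readings are all assumptions made for the example.

```python
def make_lagged_samples(series, n_lags):
    """Turn a consumption series into (features, target) pairs.

    Each feature row holds the n_lags readings that come right
    before the reading used as the target.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features, targets = [], []
    for i in range(n_lags, len(series)):
        features.append(list(series[i - n_lags:i]))
        targets.append(series[i])
    return features, targets


# Hypothetical readings (kWh per period), made up for illustration only.
readings = [0.4, 0.5, 0.3, 1.8, 0.6, 0.4]
X, y = make_lagged_samples(readings, n_lags=3)
```

Rows built this way could then be handed to a regressor's `fit(X, y)` method, for example scikit-learn's `SVR` or `RandomForestRegressor`, assuming a library of that kind is used.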

Other approaches were also tested, namely gradient descent, neural networks, k-nearest neighbors and random forest, of which the last stands out for consumption forecasting. As seen below, this technique coped better with peaks.

It should be noted that this specific household is the best one, with an R² score of 90%. At the time of this post not every household had been tested, but the majority of households have R² scores ranging from 65% to 90%. There are a few households that don't seem to fit the model (with R² scores as low as 15%), and the team will be working on identifying them and finding solutions for these cases.
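For readers unfamiliar with the metric, the score quoted above is the coefficient of determination: 1 minus the ratio of the squared forecast errors to the squared deviations of the actual values from their mean. A score of 1.0 (100%) is a perfect forecast, and 0.0 is no better than always predicting the mean. The sketch below uses made-up numbers, and the function is a plain-Python illustration rather than the implementation the team used.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Squared error of the forecast.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared deviation of the actual values from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the actual values are constant")
    return 1 - ss_res / ss_tot


# Hypothetical values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 3.0]
score = r2_score(actual, predicted)  # 0.8, i.e. 80%
```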

In contrast, the forecast of energy generation yielded promising results with most of the algorithms that were tested. The chart below shows the results of using the random forest algorithm, which was once again the most effective.

This time, most households tested had similar R² scores, hovering around 90%. It is also worth mentioning that when the algorithm did miss, it mostly underestimated, i.e. the actual energy generation was greater than the forecast.
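One simple way to check which direction the misses go is to count how many of them fall below the actual value. The sketch below is an assumption about how such a check could look, not the team's method: the function name, the `tolerance` parameter that decides what counts as a miss, and the sample values are all hypothetical.

```python
def underforecast_share(actual, forecast, tolerance=0.0):
    """Share of misses where the forecast was below the actual value.

    A period counts as a miss when the absolute error exceeds
    `tolerance`. Returns None if there are no misses at all.
    """
    misses = [a - f for a, f in zip(actual, forecast) if abs(a - f) > tolerance]
    if not misses:
        return None
    under = sum(1 for m in misses if m > 0)
    return under / len(misses)


# Hypothetical generation values (kWh), for illustration only.
actual = [5.0, 6.0, 4.0, 7.0]
forecast = [4.0, 6.0, 4.5, 6.0]
share = underforecast_share(actual, forecast, tolerance=0.1)
```

Here three periods count as misses and two of them are underestimates, so the share is two thirds. A value well above 0.5 would match the pattern described above.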

In the following weeks, the team will tweak the current models and experiment with clustering and community-wide forecast models.
