Getting the data to forecast electricity prices

7 min read · Jul 2, 2020

On Octopus Energy’s Agile tariff, I can save money by using electricity at the cheapest times of the day. Often this just means shifting consumption within a day, which I can easily do because each afternoon they notify customers of the prices for each half hour of the following day. But increasingly I’m looking to save even more by shifting demand from one day to another, which is more difficult because I don’t know what the prices will be on the later day.

There are good companies out there that offer hourly or half-hourly price forecasts — often making use of proprietary weather and electricity systems data. However, I was interested to see if I could do a reasonably good job of forecasting UK electricity prices based on the data that is freely available.

Which prices to forecast?

Firstly, it is necessary to distinguish between the different prices we might be trying to predict. The UK market has day-ahead hourly prices published at 11am, day-ahead half-hourly prices published at 3:45pm, intraday prices that you have to pay to view, and final imbalance prices that are published afterwards. Octopus’s retail prices, which I ultimately pay, are a function of the day-ahead half-hourly prices.

The final imbalance prices are strongly influenced by short-term operating factors that are difficult to model, and which won’t ultimately feed into my retail prices. Intraday prices and the day-ahead half-hourly prices are less traded, so sometimes display unpredictable noise. I therefore decided to focus on forecasting the day-ahead hourly prices — these are a lot easier to model, and are probably a pretty good reflection of what the half-hourly day-ahead, intraday and balancing prices are expected to be.

The basic idea

My basic intuition (which needs testing) is that most of the variation in hourly day-ahead prices can be explained by variation in ‘demand minus wind generation minus solar generation’. Other factors that play a part will be: how much dispatchable generation capacity (eg gas, nuclear, biomass) is available, what gas prices are, and how cheap power from France, Belgium and the Netherlands is.

For my first attempt, I was hoping to estimate what the day-ahead demand, wind generation and solar generation would be up to a week ahead, and from these and the gas price, what the day-ahead price would be. This would require me to calibrate a model to predict day-ahead prices as a function of day-ahead demand, wind generation, solar generation and the gas price.
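To make the calibration step concrete, here is a minimal sketch in Python. It assumes a straight-line relationship between the price and ‘demand minus wind minus solar’, fitted by ordinary least squares. The linear form, the function names and the numbers are all illustrative assumptions rather than real market data, and the gas price is left out to keep the example short.

```python
def residual_demand(demand_mw, wind_mw, solar_mw):
    """Demand left over for dispatchable plant and imports to meet."""
    return demand_mw - wind_mw - solar_mw


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up hourly observations: (demand MW, wind MW, solar MW, price GBP/MWh).
history = [
    (30000, 8000, 2000, 30.0),
    (35000, 5000, 0, 50.0),
    (25000, 10000, 3000, 14.0),
    (28000, 3000, 1000, 38.0),
]

xs = [residual_demand(d, w, s) for d, w, s, _ in history]
ys = [price for _, _, _, price in history]
slope, intercept = fit_line(xs, ys)

# Price estimate for a hypothetical hour with forecast demand, wind and solar.
predicted = intercept + slope * residual_demand(32000, 6000, 1000)
```

In practice the relationship is unlikely to be linear across the whole range (prices tend to rise steeply when residual demand is high), so a straight line is only a starting point for testing the intuition.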

Data Sources

The main source I have used is Elexon’s BMRS (https://www.bmreports.com/bmrs/), which has a lot of good openly available data, and an API. A second source I have used is National Grid ESO’s data portal (https://data.nationalgrideso.com/data-groups/), which is currently in beta (the original source was https://demandforecast.nationalgrid.com/efs_demand_forecast/faces/DataExplorer ).

Defining demand

Before we get to estimating demand, we have to be clear on what we mean by it. For example, if a household uses 3 kWh from their rooftop solar panels, was their demand 3 kWh (the gross demand) or 0 kWh (the net demand)? Ideally we would measure gross demand and generation separately. Unfortunately, though, we don’t have any data about how much electricity was generated from rooftop PV each hour.

Even beyond this, some solar and wind capacity that is separately measured gets reported by National Grid / BMRS as reduction in demand rather than generation. I haven’t quite worked out the difference between different demand metrics, but it seems that some are net of any generation that is connected to the distribution network, while others are net of any generation that is below a certain size.

For the purpose of my analysis, I am going to assume that all solar generation is included as a reduction in demand, while all wind generation is separate, but this may have to change if I get a better breakdown.

Estimating wind generation

Wind generation varies a lot. The biggest driver is how windy it is going to be in the locations where we have wind turbines (onshore and offshore). There will be occasions when we don’t generate as much as we could, because the transmission network can’t get that power to where it is needed. However, I think this has much less impact on day-ahead prices than on imbalance prices.

I am taking two reports of wind generation forecasts from BMRS. Firstly, I look at a report called ‘Wind Generation Forecast’, or ‘WINDFOR’. This provides hourly wind generation forecasts for the next 48 hours. The second report I look at is the 2–14 day usable output forecast, or ‘FOU2T14D’, which provides the daily maximum wind forecast for each of the 13 days starting the day after tomorrow.

Ideally I would get the 2–14 day usable output forecasts at hourly granularity. Unfortunately BMRS do not publish this, so I have to do my own smoothing. For example, if a day’s maximum is above those of the day before or after, I assume it occurs in the middle of the day. Otherwise, I assume it occurs at the start or end of the day, depending on which of the previous or next day is higher. While it would be nice if BMRS published 2–14 day forecasts at hourly granularity, I suspect predicting the timing of peaks and troughs has limited accuracy 2 or more days ahead.
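The smoothing rule described above can be sketched roughly as follows. The placement of each daily maximum follows the rule in the text. The linear interpolation between those points, the handling of the first and last day, and the function names are assumptions made for illustration. Note that joining up daily maxima gives an upper envelope rather than a true hourly forecast.

```python
def peak_hour(prev_max, day_max, next_max):
    """Hour within the day (0 to 24) at which the daily maximum is placed."""
    if day_max > prev_max and day_max > next_max:
        return 12  # local peak: assume it falls in the middle of the day
    # Otherwise lean towards whichever neighbouring day is higher.
    return 0 if prev_max > next_max else 24


def hourly_from_daily_maxima(daily_max):
    """Spread a list of daily maxima (MW) into an hourly series."""
    last = len(daily_max) - 1
    anchors = {}  # hour offset from the start of day 0 -> MW
    for i, value in enumerate(daily_max):
        # Assumption: a missing neighbour is treated as equal to the day itself.
        prev_v = daily_max[i - 1] if i > 0 else value
        next_v = daily_max[i + 1] if i < last else value
        anchors[24 * i + peak_hour(prev_v, value, next_v)] = value
    points = sorted(anchors.items())

    hourly = []
    for hour in range(24 * len(daily_max)):
        if hour <= points[0][0]:
            hourly.append(float(points[0][1]))  # flat before the first anchor
        elif hour >= points[-1][0]:
            hourly.append(float(points[-1][1]))  # flat after the last anchor
        else:
            for (t0, v0), (t1, v1) in zip(points, points[1:]):
                if t0 <= hour <= t1:
                    # Linear interpolation between neighbouring anchors.
                    hourly.append(v0 + (v1 - v0) * (hour - t0) / (t1 - t0))
                    break
    return hourly


# Three made-up daily maxima in MW.
hourly = hourly_from_daily_maxima([4000, 6000, 5000])
```

Here the middle day is a local peak, so its maximum sits at midday, while the first day's maximum is pushed to the end of the day and the last day's to the start.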

A further limitation of the BMRS data is the lack of historical forecasts. Ideally, for example, I would be able to see how much the forecast for a given day was likely to change between 7 days ahead and the day ahead. I would need to do my own work of scraping and storing this data to see this. Similarly, there is no way to know what the day-ahead wind generation forecasts were at the time of the day-ahead auctions.
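One way to build up that missing history is to save each forecast together with the time it was retrieved, so that successive forecasts for the same target hour can be compared later. The sketch below uses Python's built-in sqlite3 module. The table layout, function names and sample values are hypothetical, and the step that actually downloads the forecast from BMRS is deliberately left out.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small database of forecast snapshots."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS wind_forecast (
               retrieved_at TEXT NOT NULL,  -- when we fetched the forecast
               target_time  TEXT NOT NULL,  -- the hour being forecast
               forecast_mw  REAL NOT NULL,
               PRIMARY KEY (retrieved_at, target_time)
           )"""
    )
    return conn


def save_snapshot(conn, retrieved_at, rows):
    """Store one retrieved forecast; rows is a list of (target_time, mw)."""
    conn.executemany(
        "INSERT OR REPLACE INTO wind_forecast VALUES (?, ?, ?)",
        [(retrieved_at, target, mw) for target, mw in rows],
    )
    conn.commit()


def forecast_history(conn, target_time):
    """All stored forecasts for one target hour, oldest retrieval first."""
    cur = conn.execute(
        "SELECT retrieved_at, forecast_mw FROM wind_forecast "
        "WHERE target_time = ? ORDER BY retrieved_at",
        (target_time,),
    )
    return cur.fetchall()


# Made-up snapshots taken on two consecutive mornings.
conn = open_store()
save_snapshot(conn, "2020-07-01T09:00", [("2020-07-03T12:00", 5200.0)])
save_snapshot(conn, "2020-07-02T09:00", [("2020-07-03T12:00", 4800.0)])
history = forecast_history(conn, "2020-07-03T12:00")
```

Timestamps are stored as ISO 8601 strings in a single consistent format, so sorting them as text also sorts them in time order.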

In addition to wind generation forecasts, I could look at wind forecasts and try to estimate how much electricity would be generated. I decided that with the data I could get, I would be unlikely to produce a better estimate than the forecasts produced by National Grid / BMRS.

Estimating solar generation

Because most of the solar generation is included as a reduction in demand, it is very difficult to forecast it directly. I decided to make the assumption that for a given time of year and day, there was a fixed amount of electricity that would be generated on a clear day, and then I could reduce that on cloudy days.

My plan was therefore to estimate how much solar generation would be produced on a sunny day for a given time of year and time of day, and then how much it would be reduced on less sunny days. I could combine this with open weather forecasts (eg wonderground.com) to predict how sunny each of the next 14 days would be.
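As a rough illustration of this approach, the sketch below scales a clear-day generation profile by a factor that depends on forecast cloud cover. The linear scaling, the `max_reduction` value of 0.75 and the sample profile are placeholder assumptions, not calibrated figures.

```python
def solar_estimate(clear_sky_mw, cloud_cover, max_reduction=0.75):
    """Scale a clear-day solar profile (MW per period) for cloudiness.

    cloud_cover runs from 0.0 (clear) to 1.0 (fully overcast).
    max_reduction is the assumed share of output lost when fully overcast.
    """
    if not 0.0 <= cloud_cover <= 1.0:
        raise ValueError("cloud_cover must be between 0 and 1")
    factor = 1.0 - max_reduction * cloud_cover
    return [mw * factor for mw in clear_sky_mw]


# Made-up clear-day profile for three periods (night, morning, midday).
clear_day = [0.0, 2000.0, 4000.0]
half_cloudy = solar_estimate(clear_day, cloud_cover=0.5)
```

With real data, the clear-day profile would need to vary by time of year as well as time of day, and the reduction factor would need to be estimated from observed outturn.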

Unfortunately I had very little data to work with. BMRS do provide what they call a day-ahead solar generation forecast, but I couldn’t get more than the first 2 hours of the following day’s forecast. I couldn’t find any sites that had historical day-ahead weather forecasts, just actual weather outturn. As a result, I expect my solar forecasts to be pretty poor. However, in the UK solar generation has a much smaller impact on net demand than in many other countries.

Estimating demand

Gross demand is primarily driven by the day of the week, and the time of day. Demand in the UK is higher in winter, partly due to the need for heating on cold days. (In hot countries high temperatures can also lead to higher demand, however air conditioning does not currently play a significant role in the UK.) Finally, because we are measuring demand net of solar generation, we can expect net demand to be further reduced during sunny days, especially in summer.

BMRS publish multiple demand forecasts, and it is hard to really understand which is most appropriate for this project. I’ve decided to use National Grid’s National Demand Forecast (NDF). BMRS publish half hourly forecasts for the next 36 hours, and then forecasts of the daily maximum for each day from 2–14 days out. Unfortunately there is again no history of these forecasts although there is lots of data showing actual history.

One big challenge with estimating demand is how to balance the desire to use lots of historical data with the recognition that demand changes over time with economic growth, changes to how people heat their homes, the growing rollout of electric vehicles, and energy efficiency. As an example, the demand shape in Spring 2020 is significantly different from Spring 2019 due to the Covid-19 lockdowns.

At this point I am thinking of taking the last two weeks of historical demand data, removing the effect of sun and temperature, and then applying new sunniness and temperature estimates to form demand forecasts for the next 7 days. It may be that the 2–14 day maximum demand forecasts prove useful, but given that they don’t have a shape they may not.
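A minimal sketch of that idea is shown below. It assumes the weather effect on demand is a simple linear function of temperature and sunniness, removes it from each historical observation, averages what is left by weekday and half-hour period, and then adds back the effect for the forecast weather. The coefficients, the reference temperature and the sample data are placeholders rather than fitted values, and a fuller version would let the solar effect depend on the time of day.

```python
from collections import defaultdict


def weather_effect(temp_c, sunniness, ref_temp_c=15.0,
                   heating_mw_per_degree=500.0, solar_mw_at_full_sun=4000.0):
    """Assumed MW added to net demand by the weather (negative if reduced)."""
    heating = heating_mw_per_degree * max(0.0, ref_temp_c - temp_c)
    solar = solar_mw_at_full_sun * sunniness  # sunniness from 0.0 to 1.0
    return heating - solar


def baseline_profile(history):
    """Average weather-adjusted demand per (weekday, period).

    history rows are (weekday, period, demand_mw, temp_c, sunniness).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for weekday, period, demand_mw, temp_c, sunniness in history:
        entry = totals[(weekday, period)]
        entry[0] += demand_mw - weather_effect(temp_c, sunniness)
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def forecast_demand(profile, weekday, period, temp_c, sunniness):
    """Baseline for this weekday and period plus the forecast weather effect."""
    return profile[(weekday, period)] + weather_effect(temp_c, sunniness)


# Two made-up Mondays (weekday 0), half-hour period 24 (around midday).
history = [
    (0, 24, 30000.0, 15.0, 0.0),
    (0, 24, 28000.0, 10.0, 0.5),
]
profile = baseline_profile(history)
next_monday = forecast_demand(profile, 0, 24, temp_c=15.0, sunniness=1.0)
```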

Concluding Thoughts

I started this exercise with the hope that the BMRS data would include the components necessary to form a reasonable 7 day forecast of hourly wind, solar and demand, sufficient to estimate prices. Unfortunately the data isn’t as useful as I would have hoped, and is particularly lacking in historical forecasts.

My next step is likely to be to collect historical actual demand to see how well I can forecast this, taking into account weather forecasts.

And I will need to start collecting day ahead demand and wind forecasts, in order to see how well I can use these to generate price forecasts.