GB Electricity Generation Data
As I was writing my previous post on carbon intensity data, I realised I was making a number of assertions based on when I thought different generators would be running, and I hadn’t looked at any actual generation data. I know that some of the people I talk to are experienced in retrieving GB generation data and performing analysis, but I’m sure some people would be interested in knowing how to do it. If so, this is the post for you.
Please note, I’ve done my best to work out how these things work. However, it is entirely likely that I’ve oversimplified things, or misunderstood things. So please check anything I’ve written before relying on it to make important decisions, and please let me know if anything I’ve written looks wrong so I can correct it.
Introduction to GB’s Generation Market
National Grid ESO operates Great Britain’s electricity system, and Elexon administers balancing and settlement (I refer throughout this note to Great Britain rather than the UK, as Northern Ireland is managed separately). Generation and consumption data are tracked at the level of balancing mechanism units (BMUs), of which several hundred correspond to generation units. For the purpose of my analysis I will look at BMU-level data, but I note that some large generators may include more than one BMU, and some very small generators (e.g. solar panels on roofs) won’t appear in this data at all (they show up only as a reduction in a consumption BMU’s data).
Most generators ‘decide’ how much electricity to generate each half hour, subject to operational constraints, though different generators have different constraints and decision-making factors. For example, most coal and gas generators will consider their fuel costs and expected power prices, as well as the time and cost to ramp up and down. Solar photovoltaic and wind generators will generally generate as much as they can, but will be limited by solar and wind levels. Generators are expected to submit an Initial Physical Notification the day ahead, advising National Grid of how much they are expecting to generate, followed by a Final Physical Notification an hour before the start of each half-hour settlement period.
Clearly it is unlikely that this generation will exactly match consumption. National Grid therefore use a balancing mechanism to increase/decrease generation (it can also decrease/increase consumption, but I’ll ignore that possibility for the moment). This works by certain generators providing National Grid with offers to sell extra electricity, and bids to ‘buy back’ electricity. National Grid will use these to balance the grid, and Elexon will calculate the imbalance price based on the bids/offers accepted (the imbalance price is sometimes called the balancing price or the system price).
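As a rough illustration of how offers might be used to cover a shortfall, here is a toy sketch that accepts the cheapest offers first. The unit names, prices and volumes are made up, and the function `accept_offers` is hypothetical. The real balancing mechanism also has to respect network and plant constraints, and the imbalance price calculation is considerably more involved than this.

```python
def accept_offers(offers, shortfall_mwh):
    """Accept the cheapest offers first until the shortfall is covered.

    offers: list of (unit, price_gbp_per_mwh, volume_mwh) tuples.
    Returns a list of (unit, price, accepted_volume) tuples.
    """
    accepted = []
    remaining = shortfall_mwh
    # Work through the offers from cheapest to most expensive
    for unit, price, volume in sorted(offers, key=lambda o: o[1]):
        if remaining <= 0:
            break
        take = min(volume, remaining)
        accepted.append((unit, price, take))
        remaining -= take
    return accepted


# Hypothetical offers: (unit, price in GBP/MWh, volume in MWh)
offers = [
    ("UNIT_A", 60.0, 50.0),
    ("UNIT_B", 45.0, 30.0),
    ("UNIT_C", 80.0, 100.0),
]
accepted = accept_offers(offers, shortfall_mwh=70.0)
print(accepted)
```

In this toy case the cheapest offer is taken in full and the next cheapest only partly, so the most expensive unit is never called on.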
Generators that supply electricity according to their Final Physical Notification would, in the absence of other trading, receive the imbalance price for their electricity. However, this exposes them to considerable risk: the imbalance price is extremely volatile, and has been as low as -£150/MWh in the past year, meaning that generators would have had to pay for having generated. To avoid this risk, most generators forward sell their expected generation, at least in the day-ahead auctions but most likely months ahead.
The relationship between forward selling and generation is complex. In one sense, they are independent: just because you forward sell doesn’t mean you have to generate. For example, if you expect electricity prices to be below your cost of generation, you might decide not to generate, and your short position will be cashed out at the imbalance price. However, forward selling behaviour does tend to influence generation behaviour: if you forward sell and don’t generate, you risk imbalance prices being high. As a result, many generators generate based on what they have forward sold.
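The trade-off can be made concrete with a little arithmetic. The sketch below is a simplification under stated assumptions: a single imbalance price applies to any difference between what was sold forward and what was generated, and generation has a flat cost per MWh. The function `margin` and all the numbers are hypothetical.

```python
def margin(forward_mwh, forward_price, generated_mwh, imbalance_price, cost_per_mwh):
    """Margin for one period, cashing out any imbalance at the imbalance price."""
    forward_revenue = forward_mwh * forward_price
    # Positive if over-delivering, negative if short against the forward sale
    imbalance_volume = generated_mwh - forward_mwh
    imbalance_cashflow = imbalance_volume * imbalance_price
    generation_cost = generated_mwh * cost_per_mwh
    return forward_revenue + imbalance_cashflow - generation_cost


# Hypothetical generator: sold 100 MWh forward at 50 GBP/MWh, costs 45 GBP/MWh to run
generate = margin(100, 50.0, 100, 40.0, 45.0)       # generates as sold
stay_off_low = margin(100, 50.0, 0, 40.0, 45.0)     # stays off, imbalance price low
stay_off_high = margin(100, 50.0, 0, 120.0, 45.0)   # stays off, imbalance price high
print(generate, stay_off_low, stay_off_high)
```

With these made-up numbers, staying off beats generating when the imbalance price is below the cost of generation, but it becomes very expensive if the imbalance price turns out high.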
Analysis Part 1: Identifying Generating Units
As mentioned in the previous section, all of Elexon’s data is stored at BMU level. Each BMU has an identifier.
To get details of each of the BMUs I registered (for free) with a site called https://netareports.com/. With my username and password, I was then able to retrieve a list of fuel types and a list of generation units using the following Python code.
import requests
import pandas as pd
import xml.etree.ElementTree as ET

# username and password are your netareports.com credentials

# Look-up table of fuel type codes and their descriptions
url = ('https://www.netareports.com/dataService?' +
       'rt=mdd&username={}&password={}&datatype=fueltype')
r = requests.post(url.format(username, password))
e = ET.fromstring(r.text)
fueltypes = [ex.attrib for ex in e]
fueltypes = {f['code']: f['description'] for f in fueltypes}

# List of BM units, one row per unit
url = ('https://www.netareports.com/dataService?' +
       'rt=bmunit&username={}&password={}')
r = requests.post(url.format(username, password))
e = ET.fromstring(r.text)
genunits = [ex.attrib for ex in e]
genunits = pd.DataFrame(genunits)
genunits['fueltypelong'] = genunits.fueltype.map(fueltypes)
This code created a dataframe (table) of 3079 records, for example:
bmunitid      T_DRAXX-1           T_SEAB-1
fueltype      COAL_OPT_IN         CCGT
name          Drax                Seabank 1
ngcbmunitid   DRAXX-1             SEAB-1
fueltypelong  Coal (LCPD Opt-In)  Combined Cycle Gas Turbine
It looks to me like this dataframe also includes consumption and interconnector units; however, this isn’t a problem for my analysis. Also, I wasn’t sure what to make of the ngcbmunitids, which seem to be a way of grouping bmunits. However, as not all bmunits belonged to an ngcbmunit, and most ngcbmunits I looked at contained a single bmunit, I’ve ignored them.
Secondly, I’m not sure how accurate some of the data is. My understanding is that T_DRAXX-1 is actually biomass rather than coal. There is a test report at https://test.bmreports.com/bmrs/?q=foregeneration/capacityperunit which seems to give some more accurate fuel types, but for other bmunits it just lists them as “generation”, which isn’t so helpful. Finally, the test report is missing some of the bmunits, for example the Walney Offshore Windfarm T_WLNYO-4 (the production report is missing even more). As a result, I suspect we’d need to do a bit of data cleansing to get an accurate classification of unit types.
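One simple way to do that cleansing is to keep a small table of manual overrides and apply it on top of the NETA fuel types. The sketch below uses a toy stand-in for the genunits dataframe. The override dictionary, the "Biomass" label and the `fueltypeclean` column name are all my own assumptions rather than anything supplied by the data source.

```python
import pandas as pd

# Toy stand-in for the genunits dataframe (two rows only)
genunits = pd.DataFrame({
    "bmunitid": ["T_DRAXX-1", "T_SEAB-1"],
    "fueltypelong": ["Coal (LCPD Opt-In)", "Combined Cycle Gas Turbine"],
})

# Hypothetical manual corrections, keyed by BM unit id
fueltype_overrides = {"T_DRAXX-1": "Biomass"}

# Use the override where one exists, otherwise keep the NETA description
genunits["fueltypeclean"] = (
    genunits["bmunitid"].map(fueltype_overrides).fillna(genunits["fueltypelong"])
)
print(genunits)
```

Keeping the corrections in a separate dictionary means the original classification is left untouched, so the two can always be compared.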
Another thing that it would be useful to collect would be typical carbon intensities for each of the generator BMUs — I’m sure that must be available somewhere.
Analysis Part 2: Generation Data
Elexon’s BMReports site contains a great deal of data, most of which can be retrieved programmatically via an API. For the purpose of this analysis I looked at the actual generation by unit (report B1610). If you are using the API you will need to register with the site and get an API key (instructions for this are in their user guide).
I initially created a function which I have used for retrieving quite a lot of the data using the API:
import pandas as pd
import requests
from io import StringIO

# APIKEY is the key issued when you register with BMReports

def load_data(report, date):
    """Download one day of a BMRS report and return it as a dataframe."""
    l1 = 'https://api.bmreports.com/BMRS/'
    l2 = report
    l3 = '/V1?APIKey=' + APIKEY + '&'
    l7 = 'ServiceType=csv'
    d = date.strftime('%Y-%m-%d')
    # Different reports expect different date parameters
    if report in ['DERSYSDATA']:
        l4 = "FromSettlementDate="+d+"&ToSettlementDate="+d+"&"
    elif report in ['FORDAYDEM', 'INDOITSDO']:
        l4 = "FromDate="+d+"&ToDate="+d+"&"
    elif report in ['B1440','B1610', 'B1620', 'B1780', 'B0620']:
        l4 = "SettlementDate=" + d + '&Period=*&'
    else:
        l4 = ""
    r = requests.get(l1+l2+l3+l4+l7)
    # These reports have a longer header block to skip than the others
    if report in ['B1440','B1610','B1620', 'B1780','B0620',
                  'DEMMF2T14D','FOU2T14D']:
        data = pd.read_csv(StringIO(r.text),
                           header=None,
                           skiprows=5)
    else:
        data = pd.read_csv(StringIO(r.text),
                           header=None,
                           skiprows=1)

    # Drop the final row (a footer, not data)
    data = data.iloc[:-1]
    return data
I then used this function to retrieve a week’s worth of generation data:
data = []
for date in pd.date_range('2019-03-01', '2019-03-07'):
    data.append(load_data('B1610', date)[[7,8,12,4]])
d = pd.concat(data)
d = d.rename(columns={7: 'date', 8: 'period', 12: 'bmunitid', 4: 'volume'})
d = d.groupby(['date','period','bmunitid']).sum()
d = d.unstack(level=-1, fill_value=0)
This generates a dataframe with 229 columns (one per bmunit that generated during the period) and 336 rows (one for each of the 48 half-hour periods in each of the 7 days). Note that period 1 is the half hour from midnight to 12:30am (prevailing UK time). I have intentionally avoided any days with daylight saving clock changes (which would have 46 or 50 periods).
                   2__PSTAT001  2__PSTAT002  ...  T_WLNYW-1
date       period                            ...
2019-03-01 1.0           0.000        0.000  ...      7.340
           2.0           0.000        0.000  ...      6.860
           3.0           0.000        0.000  ...      9.520
           4.0           0.000        0.000  ...      5.300
...                        ...          ...  ...
2019-03-07 45.0         10.214       73.848  ...     99.440
           46.0          8.308       72.244  ...    122.220
           47.0          6.274       60.046  ...     86.520
           48.0          2.138       50.364  ...     29.400
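If you extend the date range, it may be worth checking how many periods each day actually has, so that clock-change days don’t slip in unnoticed. Here is a minimal sketch on toy data with the same `date` and `period` column names as above; the dataframe `records` is a made-up stand-in for the long-format data before the unstack.

```python
import pandas as pd

# Toy long-format data: one normal day and one spring clock-change day
records = pd.DataFrame({
    "date": ["2019-03-01"] * 48 + ["2019-03-31"] * 46,
    "period": list(range(1, 49)) + list(range(1, 47)),
})

# Count the distinct settlement periods reported for each day
periods_per_day = records.groupby("date")["period"].nunique()

# Any day without exactly 48 periods needs special handling
unusual_days = periods_per_day[periods_per_day != 48]
print(unusual_days)
```

On the wide dataframe built above, something like `d.groupby(level='date').size()` should give the same per-day counts, since each row is one period.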
I then wanted to look at which generators produced the most, and also include the full name and fuel type (as specified by the NETA data — which I mentioned above may not be entirely accurate). I used the following python code:
# Total volume per bmunit over the week
summary = pd.DataFrame(d.volume.sum(), columns=['volume'])
# Attach the name and fuel type from the NETA unit list
summary = summary.merge(genunits, left_index=True,
                        right_on='bmunitid', how='left')
# Each unit's share of total generation (a fraction, not a percentage)
summary['percvolume'] = summary['volume'] / summary['volume'].sum()
summary = summary.sort_values('percvolume', ascending=False)
summary = summary[['bmunitid','name','fueltypelong','percvolume']]
The first 20 rows of the data are as follows:
I next plotted charts of the generation over the week for four bmunits: T_SPLN-1 is Spalding (a CCGT), T_COTPS-2 is Cottam (coal), T_TORN-1 is Torness Generator 1 (nuclear) and T_WLNYO-4 is the Walney Offshore Windfarm Extension.
You can plot these in python, for example:
d['volume']['T_SPLN-1'].plot(ylim=(0,900), title='T_SPLN-1')
d['volume']['T_COTPS-2'].plot(ylim=(0,900), title='T_COTPS-2')
d['volume']['T_TORN-1'].plot(ylim=(0,900), title='T_TORN-1')
d['volume']['T_WLNYO-4'].plot(ylim=(0,900), title='T_WLNYO-4')
We can look at these production volumes against day ahead hourly prices (which I downloaded in Excel format from N2EX).
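Because the generation data is half-hourly and the prices are hourly, the two need lining up before they can be compared. Here is a minimal sketch on toy data, assuming each hourly price covers two consecutive settlement periods and that the price file has been reshaped into `date`, `hour` and `price` columns (those column names are hypothetical, and the time convention of the price file is worth checking).

```python
import pandas as pd

# Toy half-hourly generation for one unit (MWh per settlement period)
gen = pd.DataFrame({
    "date": ["2019-03-01"] * 4,
    "period": [1, 2, 3, 4],
    "volume": [10.0, 12.0, 8.0, 6.0],
})

# Toy hourly prices; hour 0 is assumed to cover periods 1 and 2, and so on
prices = pd.DataFrame({
    "date": ["2019-03-01"] * 2,
    "hour": [0, 1],
    "price": [40.0, 35.0],
})

# Map each half-hour period to the hour it falls in, then join on date and hour
gen["hour"] = (gen["period"] - 1) // 2
combined = gen.merge(prices, on=["date", "hour"], how="left")
print(combined)
```

Each hourly price is simply repeated for both half hours it covers, which is enough for plotting volumes against prices.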
A few observations:
- Torness was generating at full capacity the whole time, which makes sense for a nuclear generator.
- Walney (the wind farm) generates the most when prices are lowest. The causality most likely runs the other way: prices are lowest when wind turbines are generating the most.
- Spalding (CCGT Gas) is on most of the time, except the second night (when prices are lowest). Production varies, I’d guess being highest in the hours with the highest prices (though the scale makes it difficult to verify that).
- Cottam (coal) is less flexible than Spalding. It doesn’t come on at all on the 3rd and 4th days, and on the other days it is mostly on for a block of hours during the day.
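Observations like these are hard to verify by eye, so a couple of summary numbers can help. The sketch below uses made-up data and hypothetical column names. It computes the share of periods in which each unit was running and the correlation between each unit’s output and the price.

```python
import pandas as pd

# Made-up prices and outputs for four periods; the real data would have 336 rows
frame = pd.DataFrame({
    "price": [30.0, 40.0, 50.0, 60.0],
    "wind": [400.0, 300.0, 200.0, 100.0],
    "gas": [0.0, 200.0, 400.0, 400.0],
    "coal": [0.0, 0.0, 500.0, 500.0],
})

units = frame.drop(columns="price")

# Fraction of periods in which each unit produced anything at all
running_share = (units > 0).mean()

# Correlation between each unit's output and the price
price_corr = units.corrwith(frame["price"])
print(running_share)
print(price_corr)
```

A negative correlation for wind and a positive one for gas would be consistent with the observations above, though a correlation on its own says nothing about which way the causality runs.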
Analysis Part 3: Balancing Markets
The next thing I’d like to look at is which generators take part in the balancing markets, and possibly how this affects the overall level of production. That will be the subject of my next blog post on analysing electricity prices.