Analysis of Thames Water’s open data on sewage discharges into UK rivers, lakes, groundwater and the sea.
- Over 7k sewage discharges in the first three months of 2023.
- Discharges on every day, even during the exceptionally dry February.
- When the rainfall volume in a given month is at least as much as the long-term average, over 50% of Thames Water’s sewage treatment works spill into the environment.
- There is a moderate statistical correlation between daily rainfall and sewage discharge events, but many features of the data remain unexplained. Thames Water and the Environment Agency therefore need to properly understand the high number of discharges. I suggest several topics for investigation around the operation of the infrastructure.
- There were 7,189 sewage discharge events in total during the period.
- There were discharge events on every day during the period.
- The average number of discharges per day over the whole period was 79 (January: 126; February: 17; March: 89).
- Overall discharge rate was 2.4 events per location per day.
- 72% of all Thames Water locations with monitoring discharged at least once.
- 12 locations discharged for more than four weeks (non-consecutive days) in the period.
- The longest consecutive period of daily discharges was 34 days, by the Marlborough station (in Marlborough, feeding into the River Kennet), between 8 January and 10 February.
- The highest number of discharges on a single day was 369, on 31 March, which was also the day with the highest number of locations discharging.
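The "longest consecutive period" figure can be reproduced from a list of the days on which a location discharged. The following is a minimal sketch, not the code used for this analysis: the function name `longest_daily_run` and the example input are hypothetical, and the input is assumed to be one `date` per day with at least one discharge event.

```python
from datetime import date, timedelta


def longest_daily_run(days):
    """Return (start, end, length) of the longest run of consecutive calendar days.

    `days` is any iterable of `date` objects; duplicates are ignored.
    Returns None when `days` is empty.
    """
    ordered = sorted(set(days))
    if not ordered:
        return None
    best_start = run_start = ordered[0]
    best_len = run_len = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev == timedelta(days=1):
            run_len += 1          # the run continues
        else:
            run_start, run_len = cur, 1   # a gap starts a new run
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    best_end = best_start + timedelta(days=best_len - 1)
    return best_start, best_end, best_len


# Hypothetical input: 34 consecutive days from 8 January, plus one isolated day.
example_days = [date(2023, 1, 8) + timedelta(days=i) for i in range(34)]
example_days.append(date(2023, 3, 1))

start, end, length = longest_daily_run(example_days)
print(start, end, length)  # 2023-01-08 2023-02-10 34
```

As a side check, the example confirms that 8 January to 10 February, counted inclusively, is 34 days.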
Analysis and discussion
The UK has had quite variable rainfall in the first three months of 2023, with January having about the same volume as the long-term average, February being very dry and March being very wet (~155% of long-term average). Notably, the first half of January was very wet and the second half was very dry.
Sewage discharges by water companies have, understandably, been in the news a lot over the past few years. A recent release by the UK’s Environment Agency reports on the discharges during 2022. This article brings public knowledge up to date for discharges by Thames Water (via its open data API).
The first chart (above) shows daily discharge counts as well as the number of monitoring locations discharging per day. It’s worth noting, as Thames Water explains, that the discharge event counts should be considered indicative, due to limitations in the monitoring technology (e.g. monitors can be faulty or over-sensitive). Furthermore, I’m taking the raw count of events rather than the aggregated counting method used for regulatory reporting to the Environment Agency.
The proportion of Thames Water’s estate of sewage treatment locations (with active monitoring) that discharged during the period was often above 20% per day during periods of rainfall:
Looking at the discharges as a rate of events per location per day, and excluding low-event days (fewer than 5 events per day), it can be seen that there is an overall sustained rate of discharge of sewage:
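One way to compute such a rate is sketched below. This is an illustration under stated assumptions rather than the exact method behind the chart: I assume the rate for a day is the number of events divided by the number of locations that discharged on that day, and the function name `daily_rates` and the sample records are made up.

```python
def daily_rates(records, min_events=5):
    """Compute events per discharging location for each day.

    `records` is an iterable of (day, events, locations_discharging).
    Days with fewer than `min_events` events are excluded, as are days
    with no discharging locations (to avoid dividing by zero).
    """
    rates = {}
    for day, events, locations in records:
        if events < min_events or locations == 0:
            continue
        rates[day] = events / locations
    return rates


# Made-up records, purely illustrative: (day label, events, locations discharging).
sample = [("day 1", 12, 5), ("day 2", 3, 2), ("day 3", 30, 12)]
print(daily_rates(sample))  # {'day 1': 2.4, 'day 3': 2.5}
```

Excluding the low-event days keeps a handful of isolated events from dominating the rate on otherwise quiet days.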
Aggregating the data by month shows that there were nearly 4k discharge events in January 2023 alone:
A monthly aggregation also shows a starker view of the proportion of Thames Water’s locations that are discharging. On the current data, over 50% of locations discharged in each month that had at least the long-term average volume of rainfall. Would this relationship hold persistently? Perhaps yes, unless there is technological improvement of the sewage treatment works.
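The monthly figures can be derived from a list of individual discharge events. The sketch below assumes each event is a `(day, location_id)` pair and that the total number of monitored locations is known; the function name `monthly_summary`, the field names and the sample events are hypothetical and do not reflect the structure of the Thames Water API response.

```python
from collections import defaultdict
from datetime import date


def monthly_summary(events, monitored_locations):
    """Summarise discharge events by calendar month.

    `events` is an iterable of (day, location_id) pairs.
    Returns {(year, month): {"events", "locations", "share"}}, where
    "share" is the fraction of monitored locations that discharged
    at least once in that month.
    """
    counts = defaultdict(int)
    sites = defaultdict(set)
    for day, location_id in events:
        key = (day.year, day.month)
        counts[key] += 1
        sites[key].add(location_id)
    return {
        key: {
            "events": counts[key],
            "locations": len(sites[key]),
            "share": len(sites[key]) / monitored_locations,
        }
        for key in sorted(counts)
    }


# Made-up events across four monitored locations.
sample_events = [
    (date(2023, 1, 3), "A"),
    (date(2023, 1, 3), "B"),
    (date(2023, 1, 9), "A"),
    (date(2023, 2, 14), "C"),
]
summary = monthly_summary(sample_events, monitored_locations=4)
print(summary[(2023, 1)])  # {'events': 3, 'locations': 2, 'share': 0.5}
```

Counting distinct locations per month, rather than per day, is what makes the monthly proportion look starker than the daily one: a location that discharges on only one day still counts for the whole month.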
Justification of discharges — rainfall correlation
Comparing the daily discharges with daily rainfall data for the South-East region (as a proxy), the graph below shows that, on the whole, there is visual coincidence of rainfall with discharges.
I’ll come back to the period highlighted in green shortly.
The data appear to correlate with a time lag. Calculating a Pearson correlation with 0- to 5-day lags shows that a 1-day lag gives the strongest correlation, of +0.70. This suggests that there is a relationship between heavy rainfall and sewage spills, as is intuitively obvious (based on a basic understanding of how sewage infrastructure in the UK works) and as is argued by the water companies and the Environment Agency. However, there are several important features seen in the data that are unexplained by this relationship, namely: intensity, time-based accumulation and post-rainfall-event drop-off.
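The lagged correlation described above can be sketched as follows. This is a minimal illustration, assuming two equal-length daily series and that a lag of k days pairs rainfall on day t with discharges on day t + k; the function names and the toy series are hypothetical, not the data behind the +0.70 figure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlations(rainfall, discharges, max_lag=5):
    """Correlate rainfall on day t with discharges on day t + lag, for lag 0..max_lag."""
    results = {}
    for lag in range(max_lag + 1):
        rain = rainfall[: len(rainfall) - lag]   # drop the last `lag` days
        spills = discharges[lag:]                # drop the first `lag` days
        results[lag] = pearson(rain, spills)
    return results


# Toy series: discharges follow the previous day's rainfall exactly.
rainfall = [0, 5, 1, 0, 8, 2, 0, 0, 6, 1, 0, 3]
discharges = [0] + [10 * r for r in rainfall[:-1]]

correlations = lagged_correlations(rainfall, discharges)
best_lag = max(correlations, key=correlations.get)
print(best_lag)  # 1
```

In the toy series the 1-day lag wins by construction. With real data the peak is less clean, and each additional day of lag shortens the overlapping series by one observation.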
The latter is particularly concerning when looking at the green highlighted period in the graph above. In the absence of any significant rainfall, Thames Water’s locations were discharging at a significant rate. I can imagine infrastructure-based explanations for this, but it is the responsibility of the water company to explain why this should be, and of the Environment Agency to determine whether it is acceptable!
Presenting the 1-day lagged discharge data in a scatter graph with the South-East region rainfall data (below) gives a sense of the correlation discussed above. The non-linearity is notable, and presumably a consequence of the features of the infrastructure.
Again, the green region needs explaining: why should there be a significant volume of discharges when there is so little rainfall?
This presentation brings public knowledge up to date (as of the time of publishing) regarding recent sewage spills into the environment made by Thames Water.
There continues to be a worryingly large number of sewage discharges by Thames Water.
Statistical correlation with rainfall data is moderate, which explains some of the discharge events, but not all. There remain many areas of operational activity for Thames Water and the Environment Agency to investigate, in terms of understanding the high number of discharge events at different points in time.
It cannot be left for water companies to argue that high rainfall fully explains the extent of discharges. The data clearly contradict that. I suggest the following areas for the Environment Agency to investigate:
- The suitability of the existing infrastructure to handle long-term average rainfall, let alone periods of high rainfall (as will become more common as climate change develops).
- The operation of infrastructure during low rainfall periods.
- How infrastructure handles time-based rainfall accumulation and intensity spikes.