Accurate wind power forecasts with a human-in-the-loop

One plus one is sometimes more than two – a human being plus a machine can achieve much more than each on its own. While computers excel at processing large volumes of data at high speeds, humans bring a complementary, unparalleled depth of understanding and adaptability to the table. This is a simple definition of ‘human-in-the-loop’ (HITL).

The HITL concept is widely used in engineering and computer science. What does this mean for AI-based generation forecasting for renewable energy? And what are the benefits to quantitative analysts or short-term power traders using this technology?

In this article, I explain how we balance automated processes and human involvement in our Generation forecast for wind power. I zoom in on real-world scenarios where the subtleties of individual assets demand a nuanced, human approach, and I review both the benefits and the challenges of working this way.

A data-driven approach

The underlying philosophy for ‘human-in-the-loop’ at Dexter Energy revolves around a meticulous focus on data quality.

As we develop our machine learning products for renewable energy companies, we navigate multiple external data sources: weather models, market data, and historical production data from asset owners. The latter is what most power forecast models are trained on.

If production data contains inaccurate or unrepresentative measurements, the model learns from them and its predictions become less accurate. And, since it comes from the real world, this data is often imperfect: it reflects the complex operation of a wind park, which can include malfunctions, maintenance, curtailment and re-dispatching, or extreme weather events.

Thus, rather than directly using the historical production data to train our models, we take the time to understand its intricacies. We do so by combining data science capabilities with domain knowledge: a good grasp of statistics and a comprehension of the physical processes behind the data, such as the meteorological phenomena or the engineering behind a wind turbine.

Within this process, we create programs to detect and filter out anomalies automatically. However, although automation handles the bulk of data processing, it cannot remove all noise from the data. Figuring out the underlying reasons for the remaining cases and acting in a way that benefits our customers requires the discerning eye of a human.
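To make the split between automation and human review concrete, here is a minimal sketch, not production code. It assumes a hypothetical `Sample` record with a wind speed and a reported production value, and every threshold in it is illustrative. Clearly implausible values are rejected automatically, while ambiguous ones are routed to a person instead of being silently dropped.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    wind_speed_ms: float  # wind speed at the site (m/s)
    production_mw: float  # production reported by the asset owner (MW)


def triage(samples, capacity_mw, min_wind_ms=6.0, max_wind_ms=25.0):
    """Split sample indices into (keep, auto_reject, needs_review).

    All thresholds are illustrative assumptions, not tuned values.
    """
    keep, auto_reject, needs_review = [], [], []
    for i, s in enumerate(samples):
        if s.production_mw > 1.05 * capacity_mw:
            # Physically implausible: more output than the park can deliver.
            auto_reject.append(i)
        elif (min_wind_ms <= s.wind_speed_ms < max_wind_ms
              and s.production_mw < 0.01 * capacity_mw):
            # Plenty of wind but almost no output: could be maintenance,
            # curtailment or a fault. The data alone cannot tell which.
            needs_review.append(i)
        else:
            keep.append(i)
    return keep, auto_reject, needs_review


# Made-up example data for a 10 MW park.
samples = [Sample(8.0, 5.0), Sample(8.0, 30.0), Sample(12.0, 0.0), Sample(2.0, 0.0)]
keep, auto_reject, needs_review = triage(samples, capacity_mw=10.0)
```

In this example the second sample is rejected automatically, and the third lands in the review queue, where a human decides what it means.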

Asset tuning

From years of experience improving our generation forecast, we learned that per-asset tuning adds immense value for our customers. Here are three common scenarios.

Asset-level changes

In the image below, we see the production level of a fictitious wind park change significantly halfway through the analyzed period:

Wind asset-tuning scenario: Asset-level changes

With this data on hand, what can the wind park be expected to produce over the next period? Which production pattern is representative of the wind park, and which is temporary?

A model learning from these combined signals would naively forecast a level somewhere between the two periods. A human reviewing this graph, however, would intuitively suspect that the wind park shut down some of its turbines, either permanently or for maintenance.

Additionally, a human would reach out to the asset owner to understand which signal would be most useful to them: should we be forecasting the current (maybe temporary) or the previous state? This is not a question the data alone can address.

Generalizing to larger portfolios, powerful models can analyze thousands of assets and flag anomalies in a couple of wind turbines. But it will be a forecast engineer who consults with the renewable energy company to decide on the best way forward for those turbines.
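The level-shift scenario can be illustrated with a small sketch. It uses made-up daily production values and a deliberately simple change-point search (the split that minimises the squared error around each segment's mean), chosen here for illustration rather than as a description of our models. The function name `find_level_shift` and its parameters are hypothetical.

```python
from statistics import mean


def find_level_shift(values, min_segment=3):
    """Return (split_index, mean_before, mean_after) for the split that
    minimises the summed squared error around each segment's own mean."""
    best = None
    for k in range(min_segment, len(values) - min_segment + 1):
        left, right = values[:k], values[k:]
        m_left, m_right = mean(left), mean(right)
        sse = (sum((v - m_left) ** 2 for v in left)
               + sum((v - m_right) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, k, m_left, m_right)
    if best is None:
        raise ValueError("series too short for the chosen min_segment")
    return best[1], best[2], best[3]


# Made-up daily production (MWh): the level halves after the sixth day.
daily_mwh = [100, 104, 98, 102, 96, 100, 52, 48, 50, 49, 51, 50]

naive_level = mean(daily_mwh)  # 75: a level the park never actually produced
split, before, after = find_level_shift(daily_mwh)  # split at index 6: 100 before, 50 after
```

The search locates the break and the two levels, but it cannot say which level the forecast should follow. That remains a question for the asset owner.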

Faulty data

Data from an asset owner might contain measurement errors. A visually striking fourfold discrepancy like the one below is rare, but could happen if a measurement for a specific time step is submitted in kW rather than kWh: at a 15-minute resolution, for example, the average power in kW is four times the energy in kWh.

Wind asset-tuning scenario: Faulty production data

Automated filters are useful in cases like this, but they won’t detect every possible issue. It’s worthwhile to have a human receive suggestions of plausibly faulty data and decide what to remove and what to keep.

The onboarding phase for a new customer is an excellent opportunity to apply a human-in-the-loop approach and tackle errors such as this one. During onboarding, we perform automatic checks on production data and take the time to review and flag possible issues detected by our algorithms. Clear communication during onboarding is key to a good start, and we’re glad to see our customers appreciate it.
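A suspected unit mix-up of this kind can be turned into a suggestion for a reviewer rather than an automatic deletion. The sketch below is illustrative only: it assumes 15-minute data, where a kW value is four times the matching kWh value, and it compares each point with the median of its neighbours. The function name `suspect_unit_errors`, the window size, and the tolerance are all assumptions.

```python
from statistics import median


def suspect_unit_errors(values, factor=4.0, tolerance=0.25, window=2):
    """Return (index, ratio) pairs for points that sit roughly `factor`
    times above, or 1/`factor` times below, the median of their neighbours.

    The result is a list of suggestions for a human, not a verdict.
    """
    suspects = []
    for i, v in enumerate(values):
        neighbours = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        ref = median(neighbours) if neighbours else 0
        if ref <= 0:
            continue  # no usable reference level, leave the point alone
        ratio = v / ref
        too_high = abs(ratio - factor) <= tolerance * factor
        too_low = abs(ratio - 1 / factor) <= tolerance / factor
        if too_high or too_low:
            suspects.append((i, round(ratio, 2)))
    return suspects


# Made-up 15-minute values; the fourth one looks like kW submitted as kWh.
quarter_hour_values = [10.0, 11.0, 10.5, 42.0, 10.0, 9.5, 10.0]
suggestions = suspect_unit_errors(quarter_hour_values)  # [(3, 4.1)]
```

Using the neighbours' median rather than their mean keeps the spike itself from distorting the reference level for the points around it.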

Curtailment

In the mock-up below, production data during moments of high output is polluted with curtailments (highlighted in dark teal): the output is reduced or turned off, which is common in moments of negative balancing prices. A model trained on this data will start to drift and under-forecast.

Wind asset-tuning scenario: Curtailment

The nuance here is more complex. Beyond the historical production data, we look at time series signals on a wind park’s availability to understand whether wind turbines are down for maintenance or turned off due to market conditions.

Using these signals to improve the training data that goes into our models requires tuning. We need to understand what is behind the data, look at the specific setup of the asset owner, and apply specific filtering. There are different ways to handle a small wind park where availability is rarely below 100% versus a large offshore park where a few turbines might often be off.
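As a rough sketch of what such per-asset filtering could look like, the example below assumes a hypothetical `Interval` record with an availability fraction and a curtailment flag. Curtailed intervals are left out of the training data, and the remaining production is scaled up by availability. That linear scaling is a simplifying assumption, since it ignores effects such as wakes between turbines. The `min_availability` threshold is the kind of parameter that would be tuned per asset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    production_mw: float  # measured production (MW)
    availability: float   # fraction of turbines available, between 0 and 1
    curtailed: bool       # True if output was deliberately reduced


def training_target(interval: Interval, min_availability: float = 0.5) -> Optional[float]:
    """Return the value to train on, or None to exclude the interval."""
    if interval.curtailed:
        # Market-driven reduction: not representative of what the wind allows.
        return None
    if interval.availability < min_availability:
        # Too few turbines running to scale up reliably.
        return None
    # Assumed linear correction to full availability.
    return interval.production_mw / interval.availability


# Made-up intervals for illustration.
history = [
    Interval(40.0, 1.0, False),   # normal operation
    Interval(5.0, 1.0, True),     # curtailed
    Interval(30.0, 0.75, False),  # a quarter of the turbines down
    Interval(2.0, 0.1, False),    # almost everything down
]
targets = [training_target(iv) for iv in history]  # [40.0, None, 40.0, None]
```

A small park that is almost always fully available might only need the curtailment filter, while a large offshore park would lean more on the availability correction.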

With the HITL approach, we can correct for historical curtailments to prevent under-forecasting and subsequent lower trade profits.

The customer benefits

Optimizing our models and algorithms through human intervention and contribution allows us to ensure high-quality data, i.e., clean, complete, and consistent. As we’ve seen, if there are errors or fundamental changes in the signal we’re trying to forecast due to availability, we are able to detect those and adjust faster.

As a result, we are able to deliver more accurate forecasts – currently ranked in the top three on the market.

A scaling challenge

Scaling a HITL approach is a formidable challenge, especially when dealing with thousands of wind parks across several countries and continuously onboarding new customers.

To overcome it without compromising on the benefits, we built software tooling for a more efficient human-in-the-loop workflow. We mapped out the jobs required, extracted the ones that took the most time, and turned them into a forecast tuning tool. In this way, we ‘freed’ the humans to focus on higher-value work rather than tedious tasks.

This method is robust because our operations staff have strong technical acumen, allowing them to actively contribute to or drive improvements to our tooling. This speeds up software development and means we can keep working with large volumes of data in a structured way, find issues, and work with our customers to fix them.

One plus one

At Dexter, we’re constantly exploring innovative ways to improve the performance of power forecasting. And, despite claims that total automation might be the endgame for AI in energy trading, we know that we’ll always need and count on human skills, augmented by intelligent tools. Our human-in-the-loop approach ensures we continue to lead the way in providing accurate, reliable forecasts for a cleaner and more affordable energy future.

Could your short-term trading operation use a reliable wind or solar generation forecast? Book a demo and start saving balancing costs!