Using multi-task learning for robust data-driven flow rate models
When production conditions are constantly changing, how can we keep a data-driven virtual flow meter (VFM) robust to this transient behavior? In part 3/5 of this series on multi-task learning (MTL), we delve into the robustness of MTL - a key property for the application of data-driven VFMs in practice.
- Christine Foss Sjulstad
In the previous article of this series, we acknowledged the difficulty of predicting multiphase flow rates when the underlying process is non-stationary. A core assumption in machine learning is that future conditions resemble past ones, so that the future can be predicted from past data points. But what do you do when this assumption doesn't hold? The reservoir - the underlying process in this case - is continuously changing, such that future production ceases to resemble past observations and data-driven VFMs quickly become outdated. In fact, in a previous article on data-driven VFMs, we reported a mean absolute percentage error (MAPE) of the 75% best models between 16 and 24 percent on unseen future data, compared to 8 to 10 percent on historical samples. Such a model can only produce reasonable predictions for a limited time right after it is trained. This is not considered robust enough to commercialize conventional data-driven VFMs. What a buzzkill…
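For readers who want the metric spelled out: MAPE is simply the average of the absolute relative errors, expressed in percent. A minimal sketch (the numbers below are illustrative, not from the study):

```python
# Hypothetical example: MAPE of predicted vs. measured flow rates.
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(y_true) * sum(
        abs((t - p) / t) for t, p in zip(y_true, y_pred)
    )

measured = [100.0, 250.0, 80.0]    # illustrative flow rate measurements
predicted = [110.0, 240.0, 76.0]   # illustrative model predictions
print(round(mape(measured, predicted), 2))  # → 6.33
```

Note that MAPE is scale-free, which is why it is a convenient way to compare model performance across wells producing at very different rates.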
Frequent retraining can address the initial problem, but such a training regime requires substantial infrastructure and continuous monitoring to make it efficient, automatic, and effective - a potentially costly investment. It is therefore just as important to increase the robustness of the model itself as the wells enter previously unseen conditions. As mentioned in the second article, MTL is a potential way to solve this challenge, since the wells learn from each other. Several wells might have undergone similar reservoir conditions. So even though the past data points of the well in question cannot explain its current flow rates, we could have data from other wells that can. Hurray!
In a broad case study of 55 wells from four oil and gas assets and steady-state production data spanning over several years, we explored the robustness of four data-driven models on unseen data samples. Two traditional machine learning models, based on gradient boosted trees and conventional neural networks, were used as baseline models. They are referred to as “STL-GBT” and “STL-ANN”, respectively. They represent so-called single-task models, i.e. models that are trained on a single well’s data. We further introduce two types of MTL models. The first version is trained on wells from the same asset, whereas the second one is trained on all wells regardless of asset. These model types are referred to as “MTL-Asset” and “MTL-Universal”. The full study, including further results and details, can be seen in Sandnes et al. (2021).
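To make the distinction between single-task and multi-task models concrete, here is a minimal hard-parameter-sharing sketch: layers shared across all wells, plus one well-specific output head each. This is an illustrative toy (the class names, the identity-plus-bias trunk, and the linear heads are my own simplifications), not the architecture used in the study:

```python
# Illustrative hard-parameter-sharing MTL sketch (not the study's model).
# The trunk parameters are shared by every well; each well keeps its own head.

class SharedTrunk:
    """Stand-in for layers shared across all wells (here: a learnable bias)."""
    def __init__(self, n_features):
        self.bias = [0.0] * n_features

    def forward(self, x):
        return [xi + b for xi, b in zip(x, self.bias)]


class WellHead:
    """Well-specific linear output layer."""
    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def forward(self, z):
        return sum(wi * zi for wi, zi in zip(self.w, z)) + self.b


class MTLModel:
    """One shared trunk, one head per well (hypothetical 'MTL-Asset' style)."""
    def __init__(self, well_ids, n_features):
        self.trunk = SharedTrunk(n_features)
        self.heads = {w: WellHead(n_features) for w in well_ids}

    def predict(self, well_id, x):
        return self.heads[well_id].forward(self.trunk.forward(x))
```

In this picture, an STL model is the degenerate case with a single head and no sharing; "MTL-Asset" would hold one such model per asset, and "MTL-Universal" a single model whose trunk is trained on every well across all assets.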
We visualize the main findings of the study in the figure above, which illustrates how prediction errors develop with time. Unsurprisingly, the performance worsens with time for all four model types, and they all perform well during the first few weeks. After about six weeks, however, the benefit of MTL becomes apparent: the MTL models preserve fairly low and stable errors while the single-task models degrade considerably. So there you have it. MTL models withstand time much better than conventional data-driven VFMs. Perhaps not that surprising? You tell me.
Coming up is an article on how MTL models predict flow rates that are not only accurate, but also in tune with what we would expect from the underlying physics. You're in for a treat!