On the challenges of modeling data-driven virtual flow meters
A cornerstone of ML applications is available data. Yet, in industrial applications the data may fail to suitably represent the underlying process behavior. We discuss some data challenges faced by data-driven VFMs.
Author: Solution Seeker · 6 min read
Numerous companies are hunting for data-driven solutions that will revolutionize their industry. In the petroleum industry, two decades of steady improvement in the instrumentation of petroleum assets, together with committed investments in digitalization projects to improve data collection systems, have set the stage for machine learning. One of the hottest candidate applications is data-driven virtual flow meters (VFMs), where machine learning promises to reduce maintenance costs without sacrificing accuracy.
This is the second article in a series about data-driven VFMs. Read the introductory post here.
We have several years of experience working with real production data from petroleum assets and in modeling data-driven VFMs. Solution Seeker is the first company that offers the market a commercial data-driven VFM. Our goal is less than 5% error, with 90% less effort. In our R&D work towards this goal, we have had to contend with a lot of data trouble. We have identified four prevalent challenges that degrade the performance of data-driven VFMs:
- Low data volume
- Low data variety
- Poor data quality
- Non-stationary process
In the following, we take a deep dive into these four challenges. Hopefully, this will shed some light on why we don’t see more data-driven VFMs in operation yet.
As many researchers working in AI and ML know, data-driven solutions, especially those based on high-capacity models like neural networks, are data-hungry (Mishra and Datta-Gupta, 2018): they require a substantial volume of data. However, some petroleum assets do not have continuous sensor measurements of the multiphase flow rate, an important measurement for developing VFMs. Instead, new flow rate measurements are obtained during well testing, at most 1-2 times per month (Monteiro et al., 2020). Such assets need years of production before a sufficient volume of data has been acquired. This is unfortunate: VFMs are most useful for assets where data is sparse, yet a data-driven VFM needs a sufficient volume of data before it is accurate enough to be used.
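To get a feel for the numbers, here is a back-of-the-envelope sketch. The test frequency of 1-2 per month comes from the text above; the target of 100 flow rate samples is a purely hypothetical figure chosen for illustration, not a requirement stated anywhere in this article.

```python
def years_to_collect(target_samples: int, tests_per_month: float) -> float:
    """Years of production needed to collect `target_samples` well tests."""
    if tests_per_month <= 0:
        raise ValueError("tests_per_month must be positive")
    return target_samples / (tests_per_month * 12)


# Hypothetical target: 100 flow rate samples for a single well.
for rate in (1, 2):
    print(f"{rate} test(s) per month: {years_to_collect(100, rate):.1f} years")
```

Even under these optimistic assumptions, a single well needs several years of well tests before it has a three-digit number of flow rate samples.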
Even for assets that do have an appropriate data volume, the data variety is often inadequate. It is well known that many data-driven solutions extrapolate poorly, so they struggle to make meaningful and acceptable predictions in previously unseen operating conditions. For a petroleum asset, data variety is largely decided by the operational practices of the operator. Operators are often concerned with maintaining stable production rates, and may not be aware that perturbing the system is beneficial to model learning. Take Figure 1 as an example. It shows the choke openings observed so far for three of the wells Solution Seeker is working with. The X-axis shows the choke opening, from fully closed (0%) to fully open (100%). Notice that Well Z has data samples in almost all of the operating region, whereas Wells X and Y lack data above 40% choke opening. For a data-driven VFM that learns its behaviour only from patterns in data, how can we expect it to make good predictions in the unseen operating domain above 40% opening?
![Figure 1: Choke openings observed so far for Wells X, Y and Z](https://solutionseeker2021.imgix.net/images/old/figure-2.png?auto=compress%2Cformat&crop=focalpoint&cs=srgb&fit=crop&fp-x=0.5&fp-y=0.5&h=795&q=90&w=1200&s=527fb19eed5b07762f20bb7307b4f886)
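The coverage argument above can be made concrete with a small sketch. The function names, the 10% bin width and the sample values below are all assumptions made for illustration; the numbers are synthetic and are not the data behind Figure 1.

```python
def choke_coverage(openings, bin_width=10.0):
    """Fraction of choke-opening bins (0-100 %) that contain at least one sample."""
    n_bins = int(100 / bin_width)
    seen = set()
    for u in openings:
        if not 0.0 <= u <= 100.0:
            continue  # skip physically impossible readings
        # A fully open choke (100 %) is counted in the last bin.
        seen.add(min(int(u // bin_width), n_bins - 1))
    return len(seen) / n_bins


def is_extrapolating(u, openings):
    """True if choke opening `u` lies outside the range seen in the data."""
    return u < min(openings) or u > max(openings)


# Synthetic choke openings (%), loosely mimicking a narrow and a wide operating range.
narrow_well = [5, 12, 18, 25, 33, 38]
wide_well = [3, 15, 27, 36, 44, 58, 61, 79, 85, 97]

print(choke_coverage(narrow_well), is_extrapolating(60, narrow_well))
print(choke_coverage(wide_well), is_extrapolating(60, wide_well))
```

A range check like this is only a necessary condition: a well can still have gaps inside its observed range, and choke opening is just one of several inputs to a VFM.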
Improved well testing procedures can partially address the first two challenges. By performing well tests more frequently and prioritizing wells with uncertain predictions, the data volume can be increased for the wells where it counts the most. The way testing is performed can also affect data variety. For example, ensuring sufficient test time and performing multi-rate well tests can increase the amount of information gathered from well testing. Solution Seeker offers applications for scheduling and optimizing well tests to improve data volume and variety.
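As a rough illustration of what prioritizing wells with uncertain predictions could look like, the sketch below ranks wells by a simple score. The `WellStatus` fields, the linear scoring rule and its weights are hypothetical choices for this example; this is not a description of Solution Seeker's scheduling applications.

```python
from dataclasses import dataclass


@dataclass
class WellStatus:
    name: str
    predictive_std: float   # relative uncertainty of the VFM prediction (assumed available)
    days_since_test: int    # days since the last well test


def rank_wells_for_testing(wells, std_weight=1.0, age_weight=0.01):
    """Sort wells so the most uncertain and longest-untested come first."""
    def score(w):
        return std_weight * w.predictive_std + age_weight * w.days_since_test
    return sorted(wells, key=score, reverse=True)


# Synthetic example values.
wells = [
    WellStatus("A", predictive_std=0.05, days_since_test=10),
    WellStatus("B", predictive_std=0.20, days_since_test=30),
    WellStatus("C", predictive_std=0.10, days_since_test=60),
]
print([w.name for w in rank_wells_for_testing(wells)])
```

The weights trade off model uncertainty against test age; in practice they would have to be tuned to the asset and to the cost of testing.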
The third challenge is related to the measurements that are already available. Sadly, it is not uncommon for these to be noisy and biased, and they may even drop out for long periods of time. Figure 2 illustrates the latter issue and shows the huge number of missing values in some of the data streams we work with at Solution Seeker. On the Y-axis we have 100 data streams, and the black markings indicate the samples where the measurement value is missing. Luckily, poor-quality data can be handled to a certain extent, for example with preprocessing and data reconciliation. Here at Solution Seeker, we have a patented and proprietary data squashing algorithm (Grimstad et al., 2016) that automatically handles some of these issues as data arrives in real time.
![Figure 2: Missing values (black markings) across 100 data streams](https://solutionseeker2021.imgix.net/images/old/figure-1.png?auto=compress%2Cformat&crop=focalpoint&cs=srgb&fit=crop&fp-x=0.5&fp-y=0.5&h=845&q=90&w=1200&s=69d3e3065dde5e334ac74c1c84d25ee2)
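A first step in assessing this kind of data quality problem is simply to quantify it. The sketch below computes, for one data stream, the fraction of missing samples and the longest outage. It assumes missing samples are stored as `None`; it is a generic diagnostic and has nothing to do with the proprietary data squashing algorithm mentioned above.

```python
def missing_stats(stream):
    """Return (fraction missing, longest run of consecutive missing samples).

    Missing samples are represented as None (an assumption of this sketch).
    """
    if not stream:
        return 0.0, 0
    missing = 0
    longest = current = 0
    for value in stream:
        if value is None:
            missing += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a valid sample ends the current outage
    return missing / len(stream), longest


# Synthetic pressure readings with two outages.
pressure = [52.1, None, None, 51.8, None, 51.5, 51.4]
fraction, longest_gap = missing_stats(pressure)
print(f"{fraction:.0%} missing, longest gap: {longest_gap} samples")
```

Streams with a high missing fraction or long outages can then be flagged for imputation, reconciliation against related sensors, or exclusion from training.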
Now, imagine that you have a dataset that does have an appropriate data volume and variety, and that you have done everything in your power to reduce poor quality data.
You will then get a slap in the face because there is no way of escaping challenge four.
Unfortunately, the available production data is generated by a non-stationary process, because the reservoir is being depleted. The process conditions may be stationary over short periods of time, but as the reservoir drains, the characteristics of the process slowly change. In practice, this means that the models will always have to extrapolate to previously unseen process conditions. This is illustrated in Figure 3, which shows the choke opening (%) and the wellhead pressure (bar). The wellhead pressure decreases while the choke opening increases, pushing the process into previously unseen conditions. Therefore, VFMs will have to be recalibrated over time. How often naturally depends on the process.
![Figure 3: Choke opening (%) and wellhead pressure (bar)](https://solutionseeker2021.imgix.net/images/old/figure-3-1.png?auto=compress%2Cformat&crop=focalpoint&cs=srgb&fit=crop&fp-x=0.5&fp-y=0.5&h=795&q=90&w=1200&s=9aafe7fd9c8b8da32e519201d463c662)
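One simple way to decide when to recalibrate is to monitor the VFM error on incoming well tests. The sketch below is a minimal example of that idea, assuming a rolling mean of the relative error; the class name, the window length and the 10% threshold are hypothetical and not taken from the article.

```python
from collections import deque


class RecalibrationMonitor:
    """Advises recalibration when the rolling mean relative error gets too high."""

    def __init__(self, window=5, threshold=0.10):
        self.errors = deque(maxlen=window)  # keeps only the latest `window` errors
        self.threshold = threshold

    def update(self, predicted, measured):
        """Register one well test; return True if recalibration is advised."""
        if measured == 0:
            raise ValueError("measured flow rate must be non-zero")
        self.errors.append(abs(predicted - measured) / abs(measured))
        window_full = len(self.errors) == self.errors.maxlen
        mean_error = sum(self.errors) / len(self.errors)
        return window_full and mean_error > self.threshold


# Synthetic (predicted, measured) flow rates; the prediction drifts away over time.
monitor = RecalibrationMonitor(window=3, threshold=0.10)
for predicted, measured in [(100, 100), (100, 98), (100, 90), (100, 80), (100, 70)]:
    print(monitor.update(predicted, measured))
```

Waiting for a full window avoids reacting to a single noisy well test, at the price of a delayed response when the process changes quickly.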
In a recent article by Solution Seeker’s research lab (Grimstad et al., 2021), we developed data-driven VFMs for 60 petroleum wells across 5 assets. We did this with the traditional or “vanilla” approach. The results were diverse. For some of the well models we achieved excellent performance, with an average test error as low as 4%, whereas for others the error was well above 20%. This underlines that robustness is a key issue, and that cherry-picking and showcasing results from the best models can be misleading.
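To avoid cherry-picking, it helps to report the spread of errors across all wells rather than the best result alone. The sketch below assumes the error is measured as a mean absolute percentage error, which is an assumption of this example since the text above does not name the metric. The per-well errors are synthetic and are not the results of the cited study.

```python
import statistics


def mape(predicted, measured):
    """Mean absolute percentage error (in %) over paired samples."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


def summarize(errors_by_well):
    """Report best, median and worst per-well error instead of only the best."""
    values = sorted(errors_by_well.values())
    return {
        "best": values[0],
        "median": statistics.median(values),
        "worst": values[-1],
    }


# Synthetic per-well test errors in percent.
errors = {"well_1": 4.0, "well_2": 9.0, "well_3": 22.0}
print(summarize(errors))
```

Reporting the median and the worst case next to the best case makes it visible when a modeling approach works well on some wells but fails on others.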
We believe that the vanilla approach to data-driven VFMs is not robust enough to be rolled out and scaled in operations. Our findings motivate approaches that utilize all available knowledge to a greater extent, for instance probabilistic modeling, gray-box modeling and learning across wells. We believe that this is necessary to provide high-accuracy, easily maintained virtual flow meters.
Keep an eye out for the remaining articles in this series, where we will deep-dive into these results and discuss the above-mentioned alternative approaches to VFM modeling.
References
- B. Grimstad, V. Gunnerud, A. Sandnes, S. Shamlou, I. S. Skrondal, V. Uglane, S. Ursin-Holm, B. Foss, A Simple Data-Driven Approach to Production Estimation and Optimization, in: SPE Intelligent Energy International Conference and Exhibition, Society of Petroleum Engineers, 2016.
- B. Grimstad, M. Hotvedt, A. T. Sandnes, O. Kolbjørnsen, L. S. Imsland, Bayesian neural networks for virtual flow metering: An empirical study, arXiv:2102.01391, 2021.
- S. Mishra, A. Datta-Gupta, Applied Statistical Modeling and Data Analytics - A Practical Guide for the Petroleum Geosciences, Elsevier, 2018.
- D. D. Monteiro, M. M. Duque, G. S. Chaves, V. M. F. Filho, J. S. Baioco, Using data analytics to quantify the impact of production test uncertainty on oil flow rate forecast, IFP Energies Nouvelles 75 (2020).