Data-driven at core, now sprinkled with physics
Neural networks, multi-task learning, generative models, 7 million (!) data points. Safe to say, we're fully committed to the data-driven philosophy. But we're also maturing our thinking on how physics can fit into this. Why ignore hundreds of years of wisdom?
- Author: Solution Seeker
In all seriousness though, our excellent research team has worked hard to develop an in-house physics-based simulator called ManyWells. Being able to artificially generate an abundance of physics-compliant wells and scenarios is a great leap for us. It opens a lot of doors, letting us use synthetic data on top of the real production data our clients have given us access to.
People
- Bjarne Grimstad: Bjarne leads the company’s research activities. He holds a position as associate adjunct professor at Engineering Cybernetics at NTNU, teaching and supervising students on the topic of AI and machine learning. Bjarne has previously worked as a Senior Engineer at FMC Technologies. He loves climbing, watching Japanese cartoons, and trying out new restaurants in Oslo and abroad.
- Kristoffer Nesland: Kristoffer is part of our data science team. He holds an MSc in Mechanical Engineering from NTNU in Trondheim, specializing in Robotics and Automation, and has taken additional computer science courses. Kristoffer usually keeps a high pace, and enjoys activities like mountain biking, orienteering and cross-country skiing.
- Erlend Lundby
- Kristian Løvland: Kristian is part of the Solution Seeker research team, and is pursuing his PhD on machine learning for control. He holds an MSc in Cybernetics and Robotics from NTNU, specializing in process control. He enjoys reading, cooking, and attending quiz events around Oslo.
Physicists have studied the typical behaviour of wells for a long time. In addition to established first-principles equations, several empirical ones shape our expectations of how a well should behave. The well's properties, such as depth and fluid densities, obviously matter. However, this doesn't mean that a well is going to do exactly what these equations predict. Even the most complex equation sets and simulators rest on many simplifying, idealized assumptions. The true behaviour of a well is extremely difficult to capture, often because of the multiphase flow (oil, gas and water) streaming through the pipes.
Yet, with time, the governing equations have grown to cover a great many scenarios, for example different flow regimes. A simulator is perhaps never going to capture all physical phenomena perfectly, but in very many cases its results are better proxies than blind assumptions, especially when the historical production data isn't representative of the current phenomena.
But we are not here to gatekeep! True to our strong commitment to research and innovation, we're sharing our simulator and synthetic data with the community. With the exception of the 3W dataset by Petrobras and the MRST reservoir simulator by SINTEF, there are almost no public resources in this space. We're proud that we can help change that!
- GitHub repo (for the simulator code)
- HuggingFace repo (for the simulated datasets)
Now, if we take a sneak peek through those famous open doors, there are quite a few ways ManyWells can guide us in the right direction. The main intended uses are listed below.
1
Hybrid AI has been the talk of the town for a while. It simply means combining physics and a data-driven methodology so that each compensates for the other's weaknesses and builds on the other's strengths. It's no secret that the main pillar of machine learning is that what you've seen (or trained on) before should be similar to what you experience now. In general, a well system lacks this property due to the non-stationarity of the underlying reservoir. As the reservoir changes unpredictably, the well changes accordingly. Machine learning models have the inherent flaw that their predictive performance often drops significantly when they are applied to data that doesn't resemble the past. Previously, we've explained how multi-task learning can drastically improve the predictive performance for virtual flow metering. Though it doesn't have to stop there, does it?
For example, we can now generate synthetic data and practice a hybrid training routine where our data-driven virtual flow meter (VFM) trains on both simulated and true production data. Our VFM could then see far more scenarios than before, ensuring a more physics-compliant and robust behaviour.
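To make the hybrid training idea concrete, here is a minimal sketch in Python with NumPy. It is not the ManyWells API or our production VFM: `true_rate`, `simulated_rate`, the cubic polynomial model and the sample weights are all hypothetical stand-ins chosen for illustration. It fits the same small model twice, once on a handful of "real" measurements from a narrow operating range and once on those measurements plus many simulated points, then compares both on operating points the real data never covered.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_rate(u):
    """Hypothetical 'real' well: flow rate as a function of choke opening u."""
    return 10.0 * u / (0.3 + u)

def simulated_rate(u):
    """Hypothetical simulator: right shape, but biased 5 % low."""
    return 0.95 * true_rate(u)

def features(u, degree=3):
    """Polynomial features [1, u, u^2, u^3] for a toy data-driven model."""
    return np.vander(u, degree + 1, increasing=True)

def fit_weighted(u, q, w):
    """Weighted least squares: minimise sum_i w_i * (q_i - features(u_i) @ theta)^2."""
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(features(u) * sw[:, None], q * sw, rcond=None)
    return theta

def rmse(theta, u, q):
    return float(np.sqrt(np.mean((features(u) @ theta - q) ** 2)))

# "Real" data: few noisy measurements from a narrow operating range.
u_real = rng.uniform(0.4, 0.6, size=20)
q_real = true_rate(u_real) + rng.normal(0.0, 0.2, size=20)

# Synthetic data: many simulator samples covering the full operating range.
u_sim = rng.uniform(0.05, 1.0, size=500)
q_sim = simulated_rate(u_sim)

# Model 1: trained on real data only.
theta_real = fit_weighted(u_real, q_real, np.ones_like(u_real))

# Model 2: hybrid training, with real samples weighted higher (assumed weight).
u_all = np.concatenate([u_real, u_sim])
q_all = np.concatenate([q_real, q_sim])
w_all = np.concatenate([np.full(u_real.size, 5.0), np.full(u_sim.size, 1.0)])
theta_hybrid = fit_weighted(u_all, q_all, w_all)

# Evaluate both against the true behaviour over the full operating range.
u_test = np.linspace(0.05, 1.0, 200)
q_test = true_rate(u_test)
rmse_real = rmse(theta_real, u_test, q_test)
rmse_hybrid = rmse(theta_hybrid, u_test, q_test)
print(f"real-only RMSE: {rmse_real:.2f}")
print(f"hybrid RMSE:    {rmse_hybrid:.2f}")
```

The weight of 5.0 on real samples is an arbitrary choice for this toy case. How much to trust simulated versus measured data is a tuning decision, and it depends on how biased the simulator is for the well in question.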
2
When developing an algorithm or model, especially in the oil & gas industry, you're not always blessed with a whole lot of representative data to actually test it on. This means that the evaluation phase becomes a tedious and long-lasting trial-and-error period without structure or clear results, often with the client's gut feeling as the target. Although their gut feeling can be quite correct in many cases, nobody is immune to biases, and one can hardly say that this qualifies as an objective, quantitative performance report.
With a simulator, we can create test data to quality assure an algorithm before we roll it out. We can use it to understand what "should" happen in certain scenarios to 1) enhance our understanding of our client's true production data, 2) evaluate how well our models or algorithms respond to the data and 3) adjust accordingly.
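As a rough illustration of what such a quality-assurance loop could look like, the Python sketch below scores an algorithm against simulated ground truth, one scenario at a time. Everything in it is an assumption made for the example: `simulate` is a toy stand-in rather than ManyWells, the choke equation is heavily simplified, and the scenario names and parameters are invented.

```python
import math
import random
from statistics import mean

def simulate(scenario, n, seed):
    """Hypothetical simulator stand-in: returns ((u, dp), true_rate) samples."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        u = rng.uniform(*scenario["choke_range"])    # choke opening, 0 to 1
        dp = rng.uniform(*scenario["dp_range"])      # pressure drop over the choke
        q_true = scenario["cv"] * u * math.sqrt(dp)  # toy choke equation
        samples.append(((u, dp), q_true))
    return samples

def algorithm_under_test(u, dp, cv_assumed=10.0):
    """The model being quality assured; it assumes a fixed valve coefficient."""
    return cv_assumed * u * math.sqrt(dp)

def mape(samples, predict):
    """Mean absolute percentage error against the simulated ground truth."""
    return 100.0 * mean(abs(predict(*x) - q) / q for x, q in samples)

# Invented scenarios: the second one changes the valve coefficient,
# which the algorithm under test does not know about.
scenarios = {
    "nominal": {"choke_range": (0.3, 0.8), "dp_range": (5.0, 30.0), "cv": 10.0},
    "changed_choke": {"choke_range": (0.3, 0.8), "dp_range": (5.0, 30.0), "cv": 11.0},
}

report = {
    name: mape(simulate(scenario, n=200, seed=0), algorithm_under_test)
    for name, scenario in scenarios.items()
}
for name, error in report.items():
    print(f"{name}: MAPE = {error:.2f} %")
```

Because the simulator knows the true flow rate in every scenario, the report is an objective number per scenario rather than a gut feeling, and a scenario where the error jumps points directly at an assumption the algorithm gets wrong.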
3
Due to the confidential nature of our clients' data, it's seldom straightforward for us to give a demo of our products. Certain features, even whole applications, depend heavily on the specific traits of a client's data and on the correlations in that data being realistic. Hence, you can't just scramble the data to anonymize it. In a demo, where the point is to demonstrate the application's value, it's underwhelming and undermining if you can't tell the story with the data backing it.
The simulator is then a crucial step towards being able to demonstrate an application with realistic data, without our clients knocking on our door with a subpoena.
- Manywells: Simulation of Multiphase Flow in Thousands of Wells (the journal paper)