Addressing data challenges with multi-task learning
When training machine learning models, we are often faced with challenges related to insufficient data. Multi-task learning addresses this issue by sharing knowledge between related problems. Part 1/5 of the MTL series.
- Author: Anders Sandnes
- 3 min read
Related to multi-task learning, transfer learning refers to the concept of applying knowledge gained in one domain to other related domains. Humans excel at this, recognizing similarities between new problems and past experiences with ease. For instance, when children learn to walk, they combine their experience with crawling and their observations of parents or other children walking. Before you know it, your kid is running at the speed of light (at least it feels like it).
In the context of machine learning, many problems have insufficient data to learn a satisfactory model, and it is desirable to leverage knowledge from related problems to improve performance.
Take image classification as an example. Imagine that you would like to classify a lynx. Being an incredibly shy felid, it appears in only a limited number of pictures. On the other hand, there are numerous cat pictures online (haven't we all been watching cute cat videos in the dead of night?). Why not use an image classifier trained on cats as a starting point for classifying lynx? Transfer learning is a long-standing research problem within the machine learning community, with initial work dating back to the 1980s (Silver 2008). In recent years the focus on knowledge transfer has grown, with Andrew Ng predicting at NIPS 2016 that transfer learning or multi-task learning will be the next driver of value in machine learning (Ng 2016).
There are several ways to approach knowledge sharing, and they all attempt to make the most of the data available to "data-hungry" machine learning models (Adadi 2021). Multi-task learning (Caruana 1997) differs from transfer learning by learning multiple related tasks in parallel, using a shared model representation to transfer knowledge between tasks. It has produced significant results in various domains, including speech (Chen 2015), language (Collobert 2008), and image analysis (Wang 2009).
Let us use the human ability for speech analysis as an example. Consider hearing a person talk. From the sound alone, we can usually tell the person's sex, where they are from, the mood they are in, and the meaning of their words. This is what multi-task learning is all about: with one model, we can answer several questions from a single sequence of inputs, as the figure and code sketch below illustrate.
![Figure: multi-task learning, a shared model answering several questions from a single input](https://solutionseeker2021.imgix.net/images/MTL-2.1-Figure2.png)
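To make the idea concrete, here is a minimal sketch of hard parameter sharing, a common multi-task architecture: a shared encoder feeds several task-specific heads, so a single input yields predictions for all tasks at once. The framework (PyTorch), the task names, and the layer sizes are our own illustrative assumptions, not details from this series.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Hard parameter sharing: one shared encoder, one head per task."""

    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        # Shared representation, shaped by gradients from every task.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Task-specific heads (hypothetical tasks from the speech example).
        self.heads = nn.ModuleDict({
            "sex": nn.Linear(hidden, 2),       # two classes
            "dialect": nn.Linear(hidden, 10),  # e.g. ten dialect classes
            "mood": nn.Linear(hidden, 3),      # negative / neutral / positive
        })

    def forward(self, x):
        z = self.encoder(x)  # shared features, computed once per input
        return {task: head(z) for task, head in self.heads.items()}
```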
A diverse set of model architectures has been explored for multi-task learning (Zhang 2021), ranging from simple linear models to large convolutional networks. Their strength lies in capturing aspects that are universal across problems while simultaneously performing well on each individual task.
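Training such a model typically minimizes a weighted sum of the per-task losses, so every task contributes gradients to the shared encoder. Below is a minimal sketch continuing the hypothetical model above; the task weights, learning rate, and loss choice are placeholder assumptions.

```python
model = MultiTaskNet(n_features=40)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
task_weights = {"sex": 1.0, "dialect": 1.0, "mood": 0.5}  # hypothetical weighting

def train_step(x, labels):
    """One joint update; `labels` maps task name -> class-index tensor."""
    optimizer.zero_grad()
    outputs = model(x)
    # Weighted sum of per-task losses: all tasks pull on the shared encoder.
    loss = sum(task_weights[t] * loss_fn(outputs[t], labels[t]) for t in outputs)
    loss.backward()
    optimizer.step()
    return loss.item()
```

How to weight the tasks is itself a design choice; uniform weights are a common starting point, with more elaborate balancing schemes discussed in the survey literature (Zhang 2021).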
What's next?
In the upcoming articles of this series (stay tuned!) we will present multi-task learning for virtual flow metering (VFM). We will highlight how the method addresses challenges faced by conventional data-driven VFMs: adhering better to physical expectations, improving predictive performance, and reducing model maintenance requirements.
References

- Silver, D. L., Bennett, K. P. Guest editor's introduction: special issue on inductive transfer learning. Machine Learning 73, 215 (2008).
- Ng, A. NIPS 2016 tutorial: "Nuts and bolts of building AI applications using Deep Learning" (2016).
- Adadi, A. A survey on data-efficient algorithms in big data era. Journal of Big Data 8, 24 (2021).
- Caruana, R. Multitask learning. Machine Learning 28, 41–75 (1997).
- Zhang, Y., Yang, Q. A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering (2021).
- Chen, Z., Watanabe, S., Erdogan, H., Hershey, J. R. Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks. Sixteenth Annual Conference of the International Speech Communication Association (2015).
- Collobert, R., Weston, J. A unified architecture for natural language processing: deep neural networks with multitask learning. Proceedings of the 25th International Conference on Machine Learning (2008).
- Wang, X., Zhang, C., Zhang, Z. Boosted multi-task learning for face verification with applications to web image and video search. IEEE Conference on Computer Vision and Pattern Recognition (2009).