Addressing data challenges with multi-task learning

When training machine learning models we are often faced with challenges related to insufficient data. Multi-task learning addresses this issue by sharing knowledge between related problems. Part 1/5 of the MTL series.

Anders Sandnes

Transfer learning, a close relative of multi-task learning, is the practice of applying knowledge gained in one domain to other, related domains. Humans excel at this, recognizing similarities between new problems and past experiences with ease. For instance, when children learn to walk, they combine their experience of crawling with what they have seen from watching their parents or other children walk. Before you know it, your kid is running at the speed of light (at least it feels like it).

In the context of machine learning, many problems have too little data to learn a satisfactory model, and it is desirable to leverage knowledge from other related problems to improve performance.

Take image classification as an example. Imagine that you would like to classify a lynx. Because the lynx is an incredibly shy felid, only a limited number of pictures of it exist. On the other hand, there are numerous cat pictures online (haven't we all been watching cute cat videos in the dead of night?). Why not use an image classification algorithm trained on cats as a starting point for classifying lynx? Transfer learning is a long-standing problem within the machine learning community, with initial work dating back to the 1980s (Silver 2008). In recent years the focus on knowledge transfer has grown, with Andrew Ng predicting at NIPS 2016 that transfer learning or multi-task learning will be the next driver of value in machine learning (Ng 2016).
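To make the "starting point" idea concrete, here is a minimal sketch in plain Python. It is not an image classifier: as an assumption for illustration, "cats" and "lynx" are replaced by two related one-dimensional regression tasks, and all names (`fit_line`, `cat_x`, `lynx_x`, and so on) are hypothetical. The model is first trained on the data-rich source task, then briefly fine-tuned on three target samples, and compared with a model trained from scratch on the same small budget.

```python
# Toy illustration of transfer learning as a warm start (not an image model).
# Assumption: "cats" and "lynx" stand in for two related 1-D regression tasks
# whose true lines are similar but not identical.

def fit_line(xs, ys, w=0.0, b=0.0, lr=0.1, steps=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(xs, ys, w, b):
    """Mean squared error of the line (w, b) on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Data-rich source task ("cats"): 41 points on y = 2.0 * x + 1.0.
cat_x = [i / 10 for i in range(-20, 21)]
cat_y = [2.0 * x + 1.0 for x in cat_x]

# Data-poor target task ("lynx"): only 3 points on y = 2.2 * x + 0.8.
lynx_x = [-1.0, 0.5, 1.5]
lynx_y = [2.2 * x + 0.8 for x in lynx_x]

# Held-out lynx points, used only for evaluation.
test_x = cat_x
test_y = [2.2 * x + 0.8 for x in test_x]

# Step 1: learn the source task thoroughly.
w_cat, b_cat = fit_line(cat_x, cat_y)

# Step 2: spend the same small budget (5 steps) on the target task,
# once starting from zero and once starting from the source solution.
w_scratch, b_scratch = fit_line(lynx_x, lynx_y, steps=5)
w_warm, b_warm = fit_line(lynx_x, lynx_y, w=w_cat, b=b_cat, steps=5)

err_scratch = mse(test_x, test_y, w_scratch, b_scratch)
err_warm = mse(test_x, test_y, w_warm, b_warm)
print(f"from scratch: {err_scratch:.4f}  warm start: {err_warm:.4f}")
```

In this toy setting the warm-started model ends up with the lower held-out error because it begins close to the target solution. How much a real classifier benefits depends on how related the source and target domains actually are.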

There are several ways to approach knowledge sharing, and they all aim to make fuller use of the data used to train "data hungry" machine learning models (Adadi 2021). Multi-task learning (Caruana 1997) differs from transfer learning in that it learns multiple related tasks in parallel, using a shared model representation to transfer knowledge between tasks. It has produced significant results in various domains, including speech (Chen 2015), language (Collobert 2008), and image analysis (Wang 2009).

Let us use the human ability to analyze speech as an example. Consider hearing a person talk. From the sound alone, we can usually tell the person's sex, where they are from, the mood they are in, and the meaning of the words. This is what multi-task learning is all about: with one model, we can answer several questions from a single sequence of inputs.

A diverse set of model architectures has been explored for multi-task learning (Zhang 2021), ranging from simple linear models to large convolutional networks. The strength lies in model architectures that are able to capture universal aspects across different problems, while simultaneously performing well on each individual task.
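At the simple linear end of that range, the idea of a shared representation can be sketched in a few lines of plain Python. The setup below is an assumption made for illustration, not a model from the cited papers: every task follows `y = w * x + b_task`, where the slope `w` is shared across tasks and the offset `b_task` is task-specific. All names and numbers are hypothetical.

```python
# Toy multi-task regression with a shared parameter (illustrative only).
# Assumed setup: every task follows y = w * x + b_task, with one slope w
# shared by all tasks and one offset b_task per task.

TRUE_SLOPE = 1.5
TRUE_OFFSETS = {"A": 1.0, "B": -2.0, "C": 0.5}

# Task C has a single observation, so on its own it could not determine
# both a slope and an offset.
TASK_INPUTS = {"A": [0.0, 1.0, 2.0], "B": [-1.0, 1.0], "C": [3.0]}

def make_samples():
    """Return a flat list of (task, x, y) triples covering all tasks."""
    samples = []
    for task, xs in TASK_INPUTS.items():
        for x in xs:
            samples.append((task, x, TRUE_SLOPE * x + TRUE_OFFSETS[task]))
    return samples

def train_multitask(samples, lr=0.1, steps=5000):
    """Jointly fit the shared slope and per-task offsets by gradient descent."""
    w = 0.0
    offsets = {task: 0.0 for task, _, _ in samples}
    n = len(samples)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = {task: 0.0 for task in offsets}
        for task, x, y in samples:
            err = w * x + offsets[task] - y
            grad_w += 2.0 * err * x / n    # every sample updates the shared slope
            grad_b[task] += 2.0 * err / n  # a sample only updates its own task's offset
        w -= lr * grad_w
        for task in offsets:
            offsets[task] -= lr * grad_b[task]
    return w, offsets

shared_slope, task_offsets = train_multitask(make_samples())
print(round(shared_slope, 3), {t: round(b, 3) for t, b in task_offsets.items()})
```

Because the slope is learned from all tasks together, task C gets a usable model from a single data point. That is the kind of knowledge transfer a shared representation is meant to provide, here reduced to its simplest form.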

What's next?

In an upcoming series of articles (stay tuned!) we will present multi-task learning for virtual flow metering (VFM). We will highlight how the method addresses challenges faced by conventional data-driven VFMs: adhering better to physical expectations, improving predictive performance, and reducing model maintenance requirements.

Guest editor’s introduction: special issue on inductive transfer learning.

Silver, D.L., Bennett, K.P. Mach Learn 73, 215 (2008).

Nuts and bolts of building AI applications using Deep Learning.

Ng, A. NIPS 2016 tutorial (2016).

A survey on data-efficient algorithms in big data era.

Adadi, A. J Big Data 8, 24 (2021).

Multitask Learning.

Caruana, R. Machine Learning 28, 41–75 (1997).

A survey on multi-task learning.

Zhang, Yu, and Qiang Yang. IEEE Transactions on Knowledge and Data Engineering (2021).

Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks.

Chen, Z., Watanabe, S., Erdogan, H., & Hershey, J. R. Sixteenth Annual Conference of the International Speech Communication Association. (2015).

A unified architecture for natural language processing: Deep neural networks with multitask learning.

Collobert, R., Weston, J. Proceedings of the 25th international conference on Machine learning (2008).

Boosted multi-task learning for face verification with applications to web image and video search.

Wang, X., Zhang, C., Zhang, Z. IEEE Conference on Computer Vision and Pattern Recognition (2009).