Developing, deploying and maintaining software are challenging tasks. Adding tons of live data and advanced machine learning models to the mix does not simplify the process. Our database consists of close to 100 billion data points!
Machine learning algorithms use data to build code according to specifications - without human intervention in the actual building process. Code alone is therefore not sufficient to manage and version machine learning models - the models must be associated with their corresponding data and metadata. Due to the probabilistic nature of machine learning models, testing is also non-trivial. Assessing model performance is just one of many test cases for machine learning models, and additional, more complex methods are often needed. These are some of the challenges that arise when operationalizing machine learning systems.
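To make the versioning point concrete, here is a minimal sketch of what tying a model to its data and metadata could look like. It is an illustration only, not a description of our actual tooling: the `ModelRecord` class, the `fingerprint` helper, the field names and the sample values are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


def fingerprint(rows):
    """Return a SHA-256 hex digest of the training rows (order-sensitive)."""
    digest = hashlib.sha256()
    for row in rows:
        # Serialize each row with sorted keys so equal rows hash identically.
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical record linking a trained model to code, data and metadata."""
    model_name: str
    code_version: str        # e.g. a commit hash (assumed convention)
    data_fingerprint: str    # digest of the exact training data
    hyperparameters: dict = field(default_factory=dict)

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)


# Made-up sample data, for illustration only.
training_rows = [
    {"pressure": 101.3, "flow": 12.0},
    {"pressure": 99.8, "flow": 11.4},
]

record = ModelRecord(
    model_name="example-model",
    code_version="abc123",
    data_fingerprint=fingerprint(training_rows),
    hyperparameters={"learning_rate": 0.01},
)
print(record.to_json())
```

With a record like this, two models trained from the same code but different data end up with different fingerprints, which is exactly the information that versioning the code alone would lose.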
Identifying with Tuborg’s slogan “Stilstand er tilbagegang”, which translates to “standing still is the same as going backwards”, we at Solution Seeker prioritize exploring new technologies, techniques and ways of working in order to stay at the forefront of our industry. A plethora of technologies aim to streamline machine learning related operations. The journey does not start with choosing technologies, however, but with looking inwards and identifying our own mechanics, so that we can make the best decisions.
The disciplines involved in operationalizing machine learning systems are too diverse and comprehensive for a single type of engineer to handle. As a result, cross-disciplinary teams and clear, concise communication are essential. At Solution Seeker we have separate research, data science and development groups. Our researchers strive to develop new and more accurate machine learning models. Our data scientists tailor, adjust and integrate the models to meet our customers’ needs. Our developers manage, structure and transform incoming and outgoing data, and build the infrastructure on which the models run. They also build surrounding infrastructure, such as interfaces where customers can interact with model predictions. Developers must be well aware of the data scientists’ needs in order to enable fast iterations on re-training, designing new applications, and presenting analyses and computations, all of which ultimately bring value to our customers. Data scientists provide researchers with insight into customer needs, which helps steer the direction of our research. Researchers, in turn, develop and test new models through experiments to cover these needs.
Employing good MLOps technologies can streamline our workflow significantly and make development much more robust and efficient. Of particular importance are technologies that help us eliminate bottlenecks in our workflows and abstract away “manual labour”. However, there is a fine balance between exploiting existing technologies and developing our own solutions. We don’t want to waste time reinventing the wheel, but at the same time we don’t want to rely on technologies that impose restrictions on us and force us to do things in a certain way. Frequent iterations and assessments are often a valuable way to evaluate whether a method or technology is worth adopting. Our Seekers are encouraged to unleash their creativity and spend the time they feel is necessary to make clever and sustainable technology choices.
Future journal posts will discuss some of the technologies we have adopted at Solution Seeker and why, as well as how MLOps relates to DevOps. Stay tuned!