Getting Started with Sparse Modeling Using spm-image

Abstract

The black box problem has become a common concern when applying machine learning in domains such as medical systems, where users are expected to understand the behavior of the system. Collecting large amounts of training data is another headache, especially when you are building a system from scratch. In this talk, I introduce a data analysis approach called "sparse modeling," which can produce good results even when the amount of data is small. The Event Horizon Telescope project, which captured an image of a black hole, is one good example of this. Sparse modeling is also regarded as explainable, since it can tell you which input features have a strong impact on the result generated by a model. Along with an overview of the method, I'll show concrete code examples for common use cases such as image analysis, using a Python library named spm-image.

Description

The hottest topic in machine learning nowadays is undoubtedly deep learning. Though it is a very powerful tool in applications like image recognition and machine translation, you may run into issues when applying it to your own business problem. The most typical issue is the so-called black box problem: a deep learning model does not, by itself, give you the reason for its result. Another issue is that it requires large amounts of training data; otherwise its performance may not reach the desired level.

For the former issue, [LIME, SHAP and other model-agnostic methods](https://christophm.github.io/interpretable-ml-book/) have been proposed to make a model interpretable, and methods specific to deep learning, such as [Grad-CAM](https://arxiv.org/abs/1610.02391), have also been developed. They are really helpful for understanding the behavior of a model, but the model itself remains complicated. For the latter issue, data augmentation or transfer learning is a common solution. However, that does not mean a deep learning model can be built from a small amount of data.

Sparse modeling is completely different. It is not an algorithm but a way of modeling with a sparsity constraint. Because it is not a concrete algorithm, you apply it by taking an existing machine learning algorithm and making a small modification to it. This requires a deep understanding of the business process that generates the data. Though that takes considerable effort, once you figure out the nature of the data you gain substantial benefits, such as an explainable model and the ability to build a model from a small amount of data. Basic knowledge of sparse modeling will help you see data from a different perspective, and it will be useful when you face issues with deep learning and other advanced machine learning algorithms.
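
To make "an existing algorithm with a small modification" concrete, here is a minimal sketch that compares ordinary linear regression with Lasso, which is the same linear model plus an L1 sparsity penalty. It uses scikit-learn only. The synthetic data, the indices of the relevant features, and the value of `alpha` are assumptions chosen for illustration, not taken from the talk.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)

# Assumed toy setting: fewer samples (50) than features (100),
# and only three features actually influence the target.
n_samples, n_features = 50, 100
X = rng.standard_normal((n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[[3, 27, 64]] = [2.0, -3.0, 1.5]
y = X @ true_coef + 0.01 * rng.standard_normal(n_samples)

# Plain least squares: no sparsity constraint, so the fit spreads
# weight over essentially every feature.
ols = LinearRegression().fit(X, y)
ols_nonzero = int(np.sum(np.abs(ols.coef_) > 1e-6))

# Lasso: the same linear model with an L1 penalty that drives
# most coefficients to exactly zero.
lasso = Lasso(alpha=0.1).fit(X, y)
lasso_selected = np.flatnonzero(lasso.coef_)

print("non-zero coefficients (least squares):", ols_nonzero)
print("features selected by Lasso:", lasso_selected.tolist())
```

The non-zero Lasso coefficients point directly at the features that drive the prediction, which is the sense in which a sparse model is explainable.
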
Along with an introduction to popular sparse modeling algorithms such as Lasso and sparse coding, this talk shows concretely how to use them, so that the audience can get started with sparse modeling immediately. The code examples use [spm-image](https://github.com/hacarus/spm-image), an open source sparse modeling library developed by my team, and [scikit-learn](https://scikit-learn.org/).
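
As a rough illustration of sparse coding on images, the sketch below learns a small dictionary from scikit-learn's bundled 8x8 digit images and represents each image with at most five dictionary atoms. It relies on scikit-learn's `DictionaryLearning` rather than spm-image, and the number of atoms, the sparsity level, and the iteration count are assumptions picked to keep the example fast.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import DictionaryLearning

# 200 small grayscale images, flattened to 64 pixels and scaled to [0, 1].
X = load_digits().data[:200] / 16.0

# Learn 32 dictionary atoms; at transform time, encode each image with
# orthogonal matching pursuit (OMP) using at most 5 non-zero coefficients.
dico = DictionaryLearning(
    n_components=32,
    alpha=0.5,
    max_iter=20,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=5,
    random_state=0,
)
codes = dico.fit(X).transform(X)

# Each image is approximated as a sparse combination of the atoms.
reconstruction = codes @ dico.components_
relative_error = np.linalg.norm(X - reconstruction) / np.linalg.norm(X)

print("code matrix shape:", codes.shape)
print("max non-zeros per image:", int(np.count_nonzero(codes, axis=1).max()))
print("relative reconstruction error:", round(float(relative_error), 3))
```

Each row of `codes` shows which few atoms were used to rebuild the corresponding image, so the representation stays compact and easy to inspect.
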

Slides

https://speakerdeck.com/hacarus/getting-started-with-sparse-modeling-with-spm-image

Speaker

Takashi Someda

After receiving his master’s degree in informatics from the graduate school of Kyoto University, he began his career as an engineer at Sun Microsystems.

Over about 20 years in the software industry, he has held several roles, including software developer, technical evangelist, and data scientist.

Now, as CTO of Hacarus, he is responsible for technical direction, with a strong passion for building a creative, self-organized team like Pixar.