Recognising people in videos using a pre-trained deep learning model

R0
Day 2, 16:30‑17:15
English talk
Data Analysis
Intermediate

In this talk, I will present how to recognise speakers and their speaking sessions in a filmed meeting using deep learning, specifically deep convolutional neural networks (CNNs). The project takes a "transfer learning" approach: pre-trained CNN models, such as the VGG face descriptor used here, let anyone analyse photos or videos without training their own CNN. I will explain how to use a pre-trained model to extract face features, and how to apply clustering methods to those features to tell different people apart without knowing their identities in advance. Results from a real case will be shown, and the applications and limitations of the method will be discussed.
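As a rough illustration of the "speaking sessions" idea, the sketch below turns a sequence of per-frame cluster labels into time intervals. It is not taken from the project: the function name `speaking_sessions`, the `min_frames` parameter, the convention that `None` marks a frame with no detected face, and the assumption of one label per frame are all hypothetical.

```python
from itertools import groupby


def speaking_sessions(frame_labels, fps, min_frames=1):
    """Group consecutive frames with the same label into sessions.

    frame_labels: one cluster label per frame (None = no face found);
                  this one-label-per-frame layout is an assumption.
    fps:          frames per second of the sampled video.
    min_frames:   runs shorter than this are dropped as noise.
    Returns a list of (label, start_seconds, end_seconds) tuples.
    """
    sessions = []
    start = 0  # index of the first frame of the current run
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label is not None and length >= min_frames:
            sessions.append((label, start / fps, (start + length) / fps))
        start += length
    return sessions


# Hypothetical labels for eight frames sampled at 2 frames per second.
labels = [0, 0, 0, 1, 1, None, 0, 0]
sessions = speaking_sessions(labels, fps=2.0)
for person, begin, end in sessions:
    print(f"person {person}: {begin:.1f}s to {end:.1f}s")
```

Raising `min_frames` is one simple way to suppress single-frame misclassifications, at the cost of missing very short appearances.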

Talk Detail

Project page and GitHub repository:

* Website: http://www.ccfux.url.tw/UVA.html
* GitHub: https://github.com/chiachun/UVA

Packages used in this project:

- opencv for video reading and image pre-processing http://opencv.org
- caffe for running CNN models http://caffe.berkeleyvision.org
- pre-trained VGG face descriptor http://www.robots.ox.ac.uk/~vgg/research/very_deep/
- sklearn for clustering http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
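The clustering step can be sketched with `sklearn.cluster.KMeans`, the class the last link points to. This is a minimal sketch under stated assumptions, not the project's code: the feature vectors here are synthetic stand-ins for CNN face descriptors, the helper name `cluster_faces` is hypothetical, and the number of people is assumed to be known or guessed, since KMeans requires it up front.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_faces(features, n_people, seed=0):
    """Assign each face feature vector to one of n_people clusters.

    features: array of shape (n_faces, n_dims), one row per detected face.
    n_people: assumed number of distinct people (KMeans needs it in advance).
    """
    km = KMeans(n_clusters=n_people, n_init=10, random_state=seed)
    return km.fit_predict(features)


# Synthetic stand-in for face descriptors: three well-separated "people",
# twenty noisy 64-dimensional samples each. Real descriptors would come
# from the pre-trained CNN and would be higher-dimensional.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 64)) * 10.0
features = np.vstack([c + rng.normal(size=(20, 64)) for c in centers])

labels = cluster_faces(features, n_people=3)
print(labels)
```

The cluster indices are arbitrary: they say which faces belong together, not who each person is, which matches the goal of separating people without knowing their identities.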

Speaker Information

Chia-Chun Lu

Chia-Chun is a scientist and pianist. She is interested in various areas including astrophysics, data science, software engineering, and music. She currently works as an algorithm engineer at OnePlus.