Let's Play with Particle Physics Experiments Using Python: An Example of Using Deep Learning to Correct Energy Measurement in the Belle II Experiment

Abstract

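Experimental particle physics is the science of exploring the smallest constituents of matter and the laws governing their interactions. As the field has evolved, today's experiments must handle data on the scale of tens of millions of records and meet the demand for automated instrumentation. Over the past decade or so, more and more physicists have adopted Python, from traditional data analysis to machine-learning-based analysis to instrument monitoring interfaces, and they have also made many contributions to the Python community.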

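The Belle II experiment, located in Tsukuba, Japan, is about to start operation this spring. Its goal is to investigate the puzzle of the asymmetry between matter and antimatter, and it is also the first large particle physics experiment to adopt Python throughout as its main language for development and analysis.
This talk presents the current state of data analysis with Python in the Belle II experiment and, through an example that uses a deep learning CNN method to correct gamma-ray energy measurements, demonstrates the process of using the TensorFlow Python API in particle physics data analysis.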

Description

My talk starts with a popular-science introduction to particle physics experiments, then turns to how experimental particle physicists do numerical analysis with Python packages. Finally, an example of CNN regression for correcting observed energy shows how Python machine learning packages can improve numerical analysis in science.

* Introduction to particle physics: the physical world is composed of elementary particles, and experimental particle physicists use accelerators or cosmic rays to study their properties.
* Belle II experiment: one puzzle in particle physics is that most of the matter in the Universe is composed of particles, not antiparticles. However, we believe that at the Big Bang the number of particles should have equaled the number of antiparticles. So there must be some basic difference between particles and antiparticles beyond the sign of their charge. The Belle II experiment plans to use an accelerator to produce ~10^{8} pairs of B mesons and anti-B mesons. The B meson is a special particle: Nobel Prize winners Kobayashi and Masukawa predicted that the lifetime of the B meson differs from that of the anti-B meson. After a B meson decays, the detector measures its daughter particles and uses their information to reconstruct the B meson and measure its lifetime. (Belle II experiment wiki: https://en.wikipedia.org/wiki/Belle_II_experiment, public official site: http://belle2.jp/)
* Data and numerical analysis in the Belle II experiment: the Belle II detector has an onion-like structure consisting of several sub-detectors. Each sub-detector has thousands of pixels that collect information about the daughter particles, such as tracks, rings, and clusters. So we have to reconstruct 0.1 billion B meson events, each composed of signals from about 10^{4} pixels. In addition, background events from other particles are a thousand times more numerous than B meson events. Data production (including simulation production) can reach ~1,000 MB/sec during the coming operation period. We use many statistical methods, such as maximum likelihood and chi-square values, to pick out the B meson events from the numerous background events. Recently, machine learning has started to be used in our field to classify signals or correct measured values.
* Gamma-ray energy measurement problem: the ECL (electromagnetic calorimeter) sub-detector in the Belle II experiment underestimates gamma-ray energy. As a result, we cannot precisely measure B meson decays that produce a gamma-ray daughter. The ECL is a two-dimensional crystal array that measures gamma-ray energy, with each crystal recording part of the gamma ray's energy. The distribution of energy over the two-dimensional array can serve as a hint for correcting the measured energy (the sum of the energies in all crystals). Since we do not know in advance what kind of regression function to use, CNN regression is a suitable choice for this problem. This talk will use the TensorFlow Python API to correct simulated data from a smaller detector. In the near future, I hope this CNN method can be deployed in real Belle II analysis.
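The CNN regression idea for the gamma-ray energy correction can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk or from Belle II software: the 5x5 crystal grid, the Gaussian shower profile (with its width and noise level), the energy range, the hypothetical helper names `make_toy_clusters` and `build_cnn_regressor`, and the network layout. The toy data is built so that the summed crystal energy falls below the true energy, and a small Keras model then learns to predict the true energy from the two-dimensional energy pattern.

```python
import numpy as np
import tensorflow as tf


def make_toy_clusters(n_events, grid=5, seed=0):
    """Toy calorimeter clusters (assumed model, not Belle II simulation)."""
    rng = np.random.default_rng(seed)
    true_energy = rng.uniform(0.1, 2.0, size=n_events)  # GeV, assumed range
    # Impact point inside the central crystal, in units of crystal width.
    x0 = rng.uniform(-0.5, 0.5, size=n_events)
    y0 = rng.uniform(-0.5, 0.5, size=n_events)
    centers = np.arange(grid) - grid // 2
    gx, gy = np.meshgrid(centers, centers, indexing="ij")
    # Gaussian shower profile (assumed width: 1.2 crystals). Part of the
    # shower falls outside the grid, so the summed energy is too low.
    sigma = 1.2
    r2 = (gx[None] - x0[:, None, None]) ** 2 + (gy[None] - y0[:, None, None]) ** 2
    profile = np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    images = true_energy[:, None, None] * profile
    images = images * rng.normal(1.0, 0.02, size=images.shape)  # crystal noise
    # Add a channel axis so the shape is (events, grid, grid, 1).
    return images[..., None].astype("float32"), true_energy.astype("float32")


def build_cnn_regressor(grid=5):
    """Small CNN mapping a crystal energy image to a single energy value."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(grid, grid, 1)),
        tf.keras.layers.Conv2D(8, 3, activation="relu", padding="same"),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),  # linear output for regression
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


images, true_energy = make_toy_clusters(2000)
model = build_cnn_regressor()
model.fit(images, true_energy, epochs=5, batch_size=64,
          validation_split=0.2, verbose=0)

# Compare the uncorrected sum over crystals with the CNN prediction.
raw_sum = images[:5].sum(axis=(1, 2, 3))
corrected = model.predict(images[:5], verbose=0).ravel()
```

Comparing `raw_sum` and `corrected` against `true_energy[:5]` gives a rough feel for whether the network has learned the correction. A real study would need a held-out test set, more training, and realistic simulated showers.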

Slides

https://goo.gl/7AUoBL

Speaker

黃坤賢

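I currently do research in experimental particle physics, flying back and forth between Japan and Taiwan. I like programming languages such as Python, C++, and shell script, and I finished my exhausting doctoral dissertation with Python. I have now fallen into the pit of using machine learning for experimental data analysis and am getting ready to dig another pit, parallel computing, to fall into.
My specialty is oversleeping, and my future direction is to find an alarm clock that can actually wake me up.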