Text-Driven Image Cropping with Deep Learning and Genetic Algorithm
Mazer
I am Mazer (Ting-Yu Chen), not a good enough software engineer, but someone who enjoys moments of insight during the research process. "The sky and the cosmos are one." — The Choir, Bloodborne.
Abstract
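This talk shares my personal side project, "Text2Focus", a tool that automatically crops the key regions of an image based on a text description. It uses Saliency Detection and Zero Shot Object Detection to locate the key regions of the image. Because a "good" crop is hard to express with a single objective function, the tool uses a multi-objective Genetic Algorithm to provide a Pareto Front set of optimized solutions across several objective functions, so that users can pick the crop that meets their needs.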
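
To make the Pareto Front idea concrete, here is a minimal sketch of how non-dominated crops could be selected once each candidate has been scored. It is not taken from Text2Focus itself: the two objectives (fraction of important pixels covered, and negated crop area so that smaller crops score higher), the sample scores, and the helper names `dominates` and `pareto_front` are all assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical candidate crops, scored as
# (importance covered, negated crop area). Values are made up.
candidate_scores = [
    (0.90, -0.80),  # large crop that covers almost everything important
    (0.70, -0.40),  # medium crop
    (0.60, -0.50),  # covers less AND is larger than the medium crop
    (0.40, -0.20),  # tight crop
]

front = pareto_front(candidate_scores)
print(front)
```

The third candidate is dropped because the second one beats it on both objectives. The remaining three are trade-offs that no single objective can rank, which is why the tool leaves the final choice to the user.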