Distributing your pandas ETL job using Ray and Modin

李泓旻 (Andrew)

I currently work as a data engineer in the financial industry. Previously, I served as a one-stop shop for data science in manufacturing, covering data engineering, ETL, modeling, and deployment. I am dedicated to finding the most suitable tool for each need, and I keep contributing to open source projects. LIFE IS SHORT. USE PYTHON.

    Abstract

    Are you using pandas to process data? Do you want to handle large datasets with pandas? Do you want to develop Python code on your laptop and run it in the cloud or on Kubernetes effortlessly? In this talk, I assume you are familiar with pandas, and I will share how to distribute your pandas ETL job by changing a few lines of code (or even just one).

    Location

    R0

    Time

    Day 1 • 10:10-10:40 (GMT+8)

    Language

    Talk in Chinese / Slides in English

    Level

    Intermediate

    Category

    Data Analysis