Distributing your pandas ETL job using Ray and Modin

李泓旻 (Andrew)

I currently work as a data engineer in the financial industry. Previously, I was a one-stop shop for data science in manufacturing, covering data engineering, ETL, modeling, and deployment. I am dedicated to finding the most suitable tool for each need, and I keep contributing to open source projects. LIFE IS SHORT. USE PYTHON.

    Abstract

    Are you using pandas to process data? Do you want to handle large datasets with pandas? Do you want to develop Python code on your laptop and run it in the cloud or on Kubernetes effortlessly? In this talk, I assume you are familiar with pandas, and I will share how to distribute your pandas ETL job by changing a few lines of code (or even just one).
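
    As a rough illustration of the "one line" idea, the sketch below swaps the pandas import for Modin's drop-in module and leaves the rest of the job untouched. It is not taken from the talk: the `run_etl` function, the column names, and the sample records are hypothetical, and the fallback to plain pandas is only there so the snippet still runs where Modin and a Ray engine are not installed.

```python
# Minimal sketch, not the speaker's code. The only Modin-specific line is the import.
try:
    import modin.pandas as pd  # assumes Modin is installed with an engine such as Ray
except ImportError:
    import pandas as pd  # fallback so the sketch still runs without Modin


def run_etl(records):
    """Hypothetical ETL step: drop incomplete rows, derive a column, aggregate."""
    df = pd.DataFrame(records)
    df = df.dropna(subset=["amount"])            # remove rows with a missing amount
    df["amount_with_tax"] = df["amount"] * 1.05  # illustrative derived column
    return df.groupby("region")["amount_with_tax"].sum()


# Made-up sample data for illustration only.
records = [
    {"region": "north", "amount": 100.0},
    {"region": "south", "amount": 200.0},
    {"region": "north", "amount": None},
    {"region": "south", "amount": 50.0},
]

totals = run_etl(records)
print(totals)
```

    Because the rest of the code only uses the `pd` name, the same script can be developed against plain pandas on a laptop and switched to Modin by changing the import.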

    Location

    R0

    Date

    Day 1 • 10:10-10:40 (GMT+8)

    Language

    Chinese talk with English slides

    Level

    Intermediate

    Category

    Data Analysis