Distributing your pandas ETL job using Ray and Modin

李泓旻 (Andrew)

I currently work as a data engineer in the financial industry. Previously, I was a one-stop shop for data science in manufacturing, covering data engineering, ETL, modeling, and deployment. I am dedicated to finding the most suitable tool for each need, and I keep contributing to open source projects. LIFE IS SHORT. USE PYTHON.

Abstract

Are you using pandas to process data? Do you want to handle large datasets with pandas? Do you want to develop Python code on your laptop and run it in the cloud or on Kubernetes effortlessly? In this talk, I assume you are familiar with pandas, and I will share how to distribute your pandas ETL job by changing only a few lines of code (even just one).
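As a rough illustration of the "one line" idea, the sketch below swaps the pandas import for Modin's drop-in `modin.pandas` module and leaves the rest of the ETL code unchanged. The column names and the helper `total_amount_by_customer` are hypothetical and not taken from the talk. The fallback to plain pandas is an assumption added so the snippet still runs where Modin is not installed. Modin picks its execution engine (for example Ray) from its configuration, such as the `MODIN_ENGINE` environment variable.

```python
# The one-line change: import Modin's pandas-compatible module instead of pandas.
# Assumption: fall back to plain pandas when Modin is not installed.
try:
    import modin.pandas as pd
except ImportError:
    import pandas as pd


def total_amount_by_customer(orders):
    """Hypothetical ETL step: drop rows with a missing amount, then sum per customer."""
    cleaned = orders.dropna(subset=["amount"])
    return cleaned.groupby("customer")["amount"].sum()


# Small made-up input; a real job would typically start from pd.read_csv or similar.
orders = pd.DataFrame(
    {
        "customer": ["a", "b", "a", "c"],
        "amount": [10.0, 5.0, None, 2.5],
    }
)

totals = total_amount_by_customer(orders)
print(totals)
```

Because the transformation code only uses the pandas API, the same function works with either import. Moving from a laptop to a cluster is then mostly a matter of how the engine is configured, not of rewriting the job.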

Date

Day 1 • 10:10-10:40 (GMT+8)

Language

Chinese talk with English slides

Level

Intermediate