Challenges in Data Cleaning and Transformation: Mistakes, Confusion, and Solutions

Iris Chen

As a recent graduate entering the professional world, I am enthusiastic about every aspect of data.

    Abstract

    Data cleaning and transformation are critical steps in any data project, whether data analysis, machine learning, or business intelligence. These tasks are challenging because many methods are available and the process lacks standardization. In this talk, I will discuss the specific challenges I faced while cleaning and transforming video metadata. I will also introduce solutions that have helped in my work: a systematic data cleaning process built on dbt (data build tool), and data pipeline quality monitoring with tools such as PipeRider. Combining these tools and techniques improves the efficiency and reliability of data transformation and strengthens the overall data application process. The goal of this talk is to show how dbt and PipeRider can help create a more efficient, scalable, and error-free data transformation pipeline.

    Description

    Location

    R2

    Date

    Day 1 • 11:35-12:05 (GMT+8)

    Language

    Chinese talk with English slides

    Level

    Intermediate

    Category

    Data Analysis