In the era of data-driven decision-making, the performance and scalability of ETL (Extract, Transform, Load) data pipelines are crucial for managing large and complex datasets efficiently. This study explores the design and testing of an ETL data pipeline built with Apache Airflow, Python Pandas, and Pytest. Airflow orchestrates the pipeline's workflows, ensuring that transformation dependencies and scheduling are managed correctly. Pandas handles data manipulation, offering robust tools for efficient transformations. Pytest provides a structured framework for unit testing, helping ensure reliability, performance, and scalability under varying data loads.
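As a minimal sketch of this division of labor (the function, column names, and data are illustrative assumptions, not taken from the study), a Pandas transformation can be written as a plain callable that Airflow would then wrap as a task:

```python
import pandas as pd

def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Example transformation step: drop incomplete records, then aggregate.

    In Airflow, a function like this would typically be wrapped in a
    PythonOperator (or a @task-decorated callable) so the scheduler can
    manage its dependencies, retries, and schedule.
    """
    cleaned = df.dropna(subset=["amount"])          # remove rows missing the metric
    return cleaned.groupby("category", as_index=False)["amount"].sum()

if __name__ == "__main__":
    raw = pd.DataFrame(
        {"category": ["a", "a", "b"], "amount": [1.0, None, 2.0]}
    )
    print(transform(raw))
```

Keeping the transformation as a pure function of a DataFrame makes it straightforward to unit-test with Pytest, independently of the Airflow scheduler.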
The study presents a technique for performance and scalability testing: diverse workloads are simulated to identify bottlenecks and optimize execution time. Metrics such as records per second or data points per second (and, optionally, resource utilization) are evaluated against system performance and scalability expectations, which are not always explicitly provided.
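One way to express such a metric in a Pytest-style check (a hedged sketch; the workload, transformation, and throughput threshold are illustrative, not values from the study) is to time a transformation and assert on records processed per second:

```python
import time
import pandas as pd

def measure_throughput(df: pd.DataFrame, transform) -> float:
    """Return records processed per second for a given transformation."""
    start = time.perf_counter()
    transform(df)
    elapsed = time.perf_counter() - start
    # Guard against a zero-duration reading on very small inputs.
    return len(df) / elapsed if elapsed > 0 else float("inf")

def test_throughput_meets_expectation():
    # Synthetic workload; in practice the expectation would come from
    # the system's stated performance requirements, when available.
    df = pd.DataFrame({"x": range(100_000)})
    rps = measure_throughput(df, lambda d: d.assign(y=d["x"] * 2))
    assert rps > 1_000  # illustrative lower bound, not a real SLA
```

Varying the size of the synthetic DataFrame (e.g. via `pytest.mark.parametrize`) turns the same assertion into a simple scalability test across data loads.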
Key Takeaways:
- Learn how to create an ETL Airflow data pipeline with the Python Pandas library
- Learn how to test the performance and scalability of an ETL data pipeline by designing, executing, and reporting tests with the Python Pytest library
- Become familiar with tips and tricks in the PyCharm Python IDE