Venue: Väike Saal
Friday, May 16
 

10:00 EEST

COFFEE BREAK
Friday May 16, 2025 10:00 - 10:30 EEST
Väike Saal

10:30 EEST

Performance and Scalability Testing of ETL Data Pipeline
Friday May 16, 2025 10:30 - 15:30 EEST
In the era of data-driven decision-making, the performance and scalability of ETL (Extract, Transform, Load) data pipelines are crucial for managing large and complex datasets efficiently. This study explores the design and testing of an ETL data pipeline built with Apache Airflow, Python Pandas, and Pytest. Airflow orchestrates the pipeline workflows, managing scheduling and the dependencies between transformation steps. Pandas handles data manipulation, offering robust tools for efficient transformations. Pytest provides a structured framework for unit tests and for tests that check reliability, performance, and scalability under varying data loads.

The study presents a technique for testing data pipeline performance and scalability by simulating diverse workloads to identify bottlenecks and optimize execution time. Metrics such as records per second or data points per second (and, optionally, resource utilization) are evaluated against the system's performance and scalability expectations, which are not always provided.
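
As a rough illustration of the records-per-second metric described above, the following minimal sketch times one transformation step over synthetic workloads of increasing size. It is not the speakers' implementation: the helpers `make_records`, `transform`, `measure_throughput`, and `throughput_by_workload`, as well as the record schema, are hypothetical, and the plain-Python `transform` is only a stand-in for a Pandas transformation so that the example runs with the standard library alone.

```python
import time
from typing import Callable, Dict, List

Record = Dict[str, float]
Step = Callable[[List[Record]], List[Record]]


def make_records(n: int) -> List[Record]:
    """Generate n synthetic records (hypothetical schema, for illustration)."""
    return [{"id": float(i), "value": float(i % 100)} for i in range(n)]


def transform(records: List[Record]) -> List[Record]:
    """Stand-in for a Pandas transformation: derive one new field per record."""
    return [{**r, "value_squared": r["value"] ** 2} for r in records]


def measure_throughput(step: Step, records: List[Record]) -> float:
    """Run the step once and return its throughput in records per second."""
    start = time.perf_counter()
    result = step(records)
    elapsed = time.perf_counter() - start
    # The step is expected to keep every input record.
    assert len(result) == len(records)
    # Guard against a zero elapsed time on very small inputs.
    return len(records) / max(elapsed, 1e-9)


def throughput_by_workload(step: Step, sizes: List[int]) -> Dict[int, float]:
    """Measure throughput for each workload size to expose scaling behaviour."""
    return {n: measure_throughput(step, make_records(n)) for n in sizes}


if __name__ == "__main__":
    results = throughput_by_workload(transform, [1_000, 10_000, 100_000])
    for size, rps in results.items():
        print(f"{size:>7} records: {rps:,.0f} records/s")
```

If throughput drops sharply as the workload grows, that step is a candidate bottleneck. When an expected minimum throughput is available, a Pytest test could assert that each measured value stays above it.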

Key Takeaways:
  • Learn how to create an ETL data pipeline in Airflow with the Python Pandas library
  • Learn how to test the performance and scalability of an ETL data pipeline by designing, executing, and reporting tests with the Python Pytest library
  • Get familiar with tips and tricks for the PyCharm Python IDE
Speakers

Michal Pilarski

Test Engineer, GISKI
Throughout his career, Michal has always been connected with geospatial data and GIS geoprocessing. He likes to find and overcome challenges in testing Big Data with geometry attributes. He has experience in preparing testing strategies for ETL systems that extract, transform and...

Mateusz Adamczak

With around 7 years of experience in Aviation Software, Mateusz has covered most of the available roles: tester, developer, DevOps engineer, and also scrum master for a little while. This gives him an excellent overview of the software production process that he likes to share...
Väike Saal

12:30 EEST

LUNCH
Friday May 16, 2025 12:30 - 13:30 EEST
Väike Saal

15:30 EEST

COFFEE BREAK
Friday May 16, 2025 15:30 - 16:00 EEST
Väike Saal
 