Data Engineer (Python, Spark, Pandas, SQL, GX Open Source Module)

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer with expertise in Python, Spark, Pandas, SQL, and Great Expectations. The contract spans "X months" at a pay rate of "$X per hour". Remote work is available. Key skills include ETL, data validation, and cloud platforms.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
💰 - Day rate
Unknown
🗓️ - Date discovered
March 29, 2025
🕒 - Project duration
Unknown
🏝️ - Location type
Unknown
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Edinburgh, Scotland, United Kingdom
🧠 - Skills detailed
#Datasets #Spark (Apache Spark) #Big Data #Python #Hadoop #Data Extraction #Data Pipeline #Airflow #Pandas #Git #SQL (Structured Query Language) #Docker #Quality Assurance #Snowflake #Apache Spark #Delta Lake #Data Engineering #Data Manipulation #Data Quality #Version Control #AWS (Amazon Web Services) #GCP (Google Cloud Platform) #Scala #ETL (Extract, Transform, Load) #Data Processing #BI (Business Intelligence) #Azure #SQL Queries #Cloud
Role description

Job Title: Data Engineer (Python, Spark, Pandas, SQL, GX Open Source Module)

Job Summary:

We are looking for a skilled Data Engineer with expertise in Python, Spark, Pandas, SQL, and Great Expectations (GX) Open Source Module to join our dynamic team. The ideal candidate will play a key role in building, optimising, and maintaining data pipelines, ensuring data quality, and supporting business intelligence and analytics needs.

Key Responsibilities:

   • Design, develop, and optimise scalable ETL/ELT pipelines using Python and Apache Spark.

   • Work with Pandas to process, clean, and transform large datasets efficiently.

   • Write and optimise complex SQL queries for data extraction, transformation, and analysis.

   • Implement data validation and quality checks using the Great Expectations (GX) Open Source Module.

   • Monitor and troubleshoot data pipeline performance issues, ensuring high availability and reliability.
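
The cleaning-and-validation pattern these responsibilities describe might look something like the following minimal Pandas sketch (column names are hypothetical; in practice the checks would be expressed as Great Expectations expectations rather than plain assertions):

```python
import pandas as pd

# Hypothetical raw dataset; real pipelines would read from Spark or a warehouse.
raw = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "amount": [10.0, None, 25.5, -3.0, 40.0],
})

# Clean: drop duplicate keys, then drop rows with missing amounts.
clean = raw.drop_duplicates(subset="order_id").dropna(subset=["amount"])

# Validate: GX-style expectations expressed as plain boolean checks.
checks = {
    "order_id_unique": clean["order_id"].is_unique,
    "amount_not_null": clean["amount"].notna().all(),
    "amount_non_negative": (clean["amount"] >= 0).all(),
}
failed = [name for name, ok in checks.items() if not ok]
print(failed)  # ['amount_non_negative'] -- the -3.0 row survives cleaning
```

In Great Expectations these checks would correspond to expectations such as `expect_column_values_to_be_unique`, `expect_column_values_to_not_be_null`, and `expect_column_values_to_be_between`, which also produce the per-check result reports that pipeline monitoring would consume.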

Required Skills & Experience:

   • Strong proficiency in Python, with experience in writing efficient and scalable code for data processing.

   • Hands-on experience with Apache Spark for large-scale data processing.

   • Expertise in Pandas for data manipulation and transformation.

   • Solid understanding of SQL and relational database concepts, with experience in query optimisation.

   • Experience working with Great Expectations (GX) Open Source Module for data validation and quality assurance.

   • Familiarity with cloud-based data platforms (AWS, Azure, GCP) is a plus.

   • Strong problem-solving skills and the ability to work in a fast-paced, collaborative environment.

Preferred Qualifications:

   • Experience with big data technologies such as Hadoop, Delta Lake, or Snowflake.

   • Understanding of CI/CD for data pipelines and version control (Git).

   • Exposure to containerization tools like Docker and orchestration frameworks like Airflow.

If you are passionate about data engineering, quality assurance, and scalable data solutions, we’d love to hear from you! Apply now and be part of an exciting journey in data-driven innovation.