
Spark Developer / Engineer

This role is for a Spark Developer/Engineer on a 6-12 month contract, working remotely during PST hours. Required skills include Apache Spark (PySpark/Scala), Scalding, Hadoop, SQL, and data migration experience. US Citizens and Green Card holders only.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
February 15, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Remote
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Big Data #Python #Apache Spark #Apache Airflow #Data Migration #Scala #Batch #Spark (Apache Spark) #Data Transformations #ETL (Extract, Transform, Load) #PySpark #Data Processing #Hadoop #SQL (Structured Query Language) #Data Pipeline #Migration #Airflow
Role description

Job Title: Spark Developer / Engineer (2 positions)

Location: US Remote, work during PST time zone

Duration: 6-12 Months

Rate: DOE

US Citizens and Green Card holders only. No third-party agencies. Corp-to-Corp.

Workflows are powered by offline batch jobs written in Scalding, a MapReduce-based framework. To enhance scalability and performance, these jobs are being migrated from Scalding to Apache Spark.
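
A minimal sketch of the kind of migration involved, using a toy word-count pipeline: a legacy Scalding job and a roughly equivalent Spark (Scala) job. The class names, paths, and the pipeline itself are hypothetical and not taken from the actual codebase.

```scala
// Legacy side: a toy Scalding (MapReduce-based) job. Hypothetical example only.
import com.twitter.scalding._

class WordCountJob(args: Args) extends Job(args) {
  TypedPipe.from(TextLine(args("input")))
    .flatMap(_.split("\\s+"))                 // tokenize each line
    .groupBy(identity)                        // group identical words
    .size                                     // count occurrences per word
    .write(TypedTsv[(String, Long)](args("output")))
}

// Migrated side: the same logic expressed with the Spark Dataset API.
import org.apache.spark.sql.SparkSession

object WordCountSparkJob {
  def main(cli: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("WordCountSparkJob").getOrCreate()
    import spark.implicits._

    spark.read.textFile(cli(0))
      .flatMap(_.split("\\s+"))               // tokenize each line
      .groupByKey(identity)
      .count()                                // Dataset[(String, Long)], same shape as the Scalding output
      .write.csv(cli(1))

    spark.stop()
  }
}
```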

Key Responsibilities:

Understanding the Existing Scalding Codebase

Analyze the current Scalding-based data pipelines.

Document existing business logic and transformations.

Migrating the Logic to Spark

Convert existing Scalding jobs into Spark (PySpark/Scala) while ensuring optimized performance.

Refactor data transformations and aggregations in Spark.

Optimize Spark jobs for efficiency and scalability.
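
As an illustration of the kind of refactoring and tuning this involves, the sketch below joins a hypothetical large events table to a small users dimension with a broadcast join, controls shuffle partitioning, and caches the intermediate result for reuse. The table names, paths, and partition counts are assumptions.

```scala
// Hypothetical tuning of a migrated Spark job; paths, schemas, and sizes are assumptions.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{broadcast, col}

object EnrichEventsJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("EnrichEventsJob").getOrCreate()

    val events = spark.read.parquet("/data/events") // assumed large fact table
    val users  = spark.read.parquet("/data/users")  // assumed small dimension table

    val enriched = events
      .join(broadcast(users), Seq("user_id"))       // broadcast the small side to avoid a shuffle join
      .repartition(200, col("user_id"))             // size the shuffle for the aggregation below
      .cache()                                      // reuse across multiple actions

    enriched.groupBy("user_id").count()
      .write.mode("overwrite").parquet("/data/events_per_user")

    spark.stop()
  }
}
```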

Ensuring Data Parity & Validation

Develop data parity tests to compare outputs between Scalding and Spark implementations.

Identify and resolve any discrepancies between the two versions.

Work with stakeholders to validate correctness.
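
A minimal sketch of one way such a parity check could be implemented, assuming both the legacy and migrated outputs can be read with identical schemas; the Parquet format and the paths below are assumptions.

```scala
// Sketch of a duplicate-aware parity check between legacy and migrated outputs.
// The output paths and the Parquet format are assumptions.
import org.apache.spark.sql.SparkSession

object ParityCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ParityCheck").getOrCreate()

    val legacy   = spark.read.parquet("/data/output_scalding") // hypothetical legacy output
    val migrated = spark.read.parquet("/data/output_spark")    // hypothetical Spark output

    // Rows present in one output but not in the other, keeping duplicates.
    val missing    = legacy.exceptAll(migrated).count()
    val unexpected = migrated.exceptAll(legacy).count()

    require(missing == 0 && unexpected == 0,
      s"Parity mismatch: $missing rows missing from Spark output, $unexpected unexpected rows")

    spark.stop()
  }
}
```

In practice, a full row-level diff like this is often complemented by key-level aggregate checks (row counts or sums per date or partition) so that any discrepancy can be localized quickly.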

Writing Unit Tests & Improving Code Quality

Implement robust unit and integration tests for Spark jobs.

Ensure code meets engineering best practices (modular, reusable, and well-documented).
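
As a sketch of what such tests could look like, the example below exercises a small aggregation against a local SparkSession, assuming ScalaTest; the test framework and the transformation under test are illustrative assumptions.

```scala
// Unit-test sketch for a Spark aggregation, assuming ScalaTest and a local SparkSession.
import org.apache.spark.sql.SparkSession
import org.scalatest.funsuite.AnyFunSuite

class RevenuePerUserSpec extends AnyFunSuite {
  private lazy val spark =
    SparkSession.builder.master("local[2]").appName("unit-tests").getOrCreate()

  test("sums amounts per user") {
    import spark.implicits._

    val input = Seq(("u1", 10.0), ("u1", 5.0), ("u2", 3.0)).toDF("user_id", "amount")

    val result = input.groupBy("user_id").sum("amount")
      .collect()
      .map(row => row.getString(0) -> row.getDouble(1))
      .toMap

    assert(result == Map("u1" -> 15.0, "u2" -> 3.0))
  }
}
```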

Required Qualifications:

Experience in big data processing with Apache Spark (PySpark or Scala).

Strong experience with data migration from legacy systems to Spark.

Proficiency in Scalding and MapReduce frameworks.

Experience with Hadoop, Hive, and distributed data processing.

Hands-on experience in writing unit tests for Spark pipelines.

Strong SQL and data validation experience.

Proficiency in Python and Scala.

Knowledge of CI/CD pipelines for data jobs.

Familiarity with the Apache Airflow orchestration tool.