
Hybrid Data Engineer | VA MD DC TX LOCALS

This role is for a "Hybrid Data Engineer" based in Reston, VA or Plano, TX, lasting 12 months with a pay rate listed as "Unknown." It requires 5-7 years of software development experience plus expertise in CDC, Apache Spark, Java, Python, and AWS.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
February 16, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Reston, VA
🧠 - Skills detailed
#Spark (Apache Spark) #Batch #Scala #Java #Data Catalog #Data Engineering #Spark SQL #PySpark #Data Lake #Big Data #ETL (Extract, Transform, Load) #S3 (Amazon Simple Storage Service) #Apache Airflow #Python #Data Pipeline #Airflow #Databases #SQL (Structured Query Language) #Lambda (AWS Lambda) #Apache Spark #Computer Science #AWS (Amazon Web Services)
Role description

Position Title: Data Engineer - Developer

Location: Reston, VA (Hybrid)/Plano, TX

Duration: 12 Months (Potential for Extension)

NO H1B / CPT

Interview: The first round is a video interview and the second round is onsite; candidates must bring their own laptop for a design and coding challenge.

Job Description:
• The job consists of setting up Change Data Capture (CDC) for multiple types of databases in order to hydrate a data lake.
• Debezium or other CDC knowledge is required.
• Along with data hydration, the job requires knowledge of ETL transformations using Apache Spark, for both streaming and batch processing of data.
• The engineer needs to know how to work with Apache Spark DataFrames, ETL jobs, and streaming data pipelines that orchestrate raw CDC data and transform it into usable, queryable data for analytics. Big Data concepts, including performance tuning, are a plus.
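As a rough illustration of the CDC-to-data-lake idea described above, here is a minimal pure-Python sketch of folding Debezium-style change events (insert/update/delete keyed by primary key) into a table snapshot. In the actual role this merge would run as a Spark job against lake storage; the simplified event shape and the `apply_cdc_events` helper are hypothetical, chosen only to show the mechanics.

```python
# Minimal sketch of applying CDC (Change Data Capture) events to a table
# snapshot -- the same upsert/delete merge a Spark job performs when
# hydrating a data lake from change records. The event shape
# ("op", "key", "row") is a hypothetical simplification of a Debezium
# envelope, used here for illustration only.

def apply_cdc_events(snapshot, events):
    """Fold create/update/delete events into a dict keyed by primary key."""
    table = dict(snapshot)  # copy so the input snapshot is left untouched
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("c", "u"):      # create or update: upsert the row
            table[key] = event["row"]
        elif op == "d":           # delete: drop the row if present
            table.pop(key, None)
    return table

if __name__ == "__main__":
    snapshot = {1: {"name": "alice"}, 2: {"name": "bob"}}
    events = [
        {"op": "u", "key": 1, "row": {"name": "alicia"}},  # update row 1
        {"op": "d", "key": 2},                             # delete row 2
        {"op": "c", "key": 3, "row": {"name": "carol"}},   # insert row 3
    ]
    print(apply_cdc_events(snapshot, events))
    # {1: {'name': 'alicia'}, 3: {'name': 'carol'}}
```

A Spark version would express the same merge declaratively (for example, a `MERGE INTO` against an Apache Hudi table), but the ordering and idempotency concerns are identical.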

Skill set:
• Java – Mid- to senior-level experience
• Python – Mid-level experience (PySpark)
• Apache Spark – Data Frames, Spark SQL, Spark Streaming and ETL pipelines
• Apache Airflow
• Scala – not required but a plus
• Apache Hudi – not required, but a plus
• Apache Griffin – not required, but a plus

AWS Skillset:
• Extensive knowledge of S3 and S3 operations (CRUD)
• EMR and EMR Serverless
• Glue Data Catalog
• Step Functions
• MWAA (Managed Workflows for Apache Airflow)
• Lambdas (Python)
• AWS Batch
• AWS Deequ – not required, but a plus

Qualification:

Must Have:

FNMA Company Default Category:
• 5 to 7+ years of software development experience (minimum 5 years)
• Bachelor's degree in Computer Science, Information Systems, or a related field
• Post-graduate degree desired
• Professional certification(s) desired
• Strong knowledge of Software Development Lifecycle (SDLC)