
Data Engineer (Python, Scala, AWS) - W2 Only #953368

This role is for a Data Engineer (Python, Scala, AWS) in Chicago, hybrid 3x/week onsite, with a contract length of 1 year+ at $62-$68/hr. Key skills include Scala, Python, PySpark, and AWS services.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
February 12, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
W2 Contractor
🔒 - Security clearance
Unknown
📍 - Location detailed
Chicago, IL
🧠 - Skills detailed
#Data Quality #Data Lake #Airflow #Redshift #Big Data #Data Storage #Apache Spark #Scala #Data Engineering #Data Access #Data Analysis #Data Pipeline #Version Control #Programming #Datasets #ETL (Extract, Transform, Load) #AWS (Amazon Web Services) #S3 (Amazon Simple Storage Service) #SQL (Structured Query Language) #Data Modeling #Databases #Quality Assurance #Data Processing #Storage #Distributed Computing #Data Warehouse #Lambda (AWS Lambda) #AWS EMR (Amazon Elastic MapReduce) #PySpark #Python #Spark (Apache Spark) #GIT
Role description

Data Engineer (Python, Scala, PySpark, AWS) - W2 Only #953368

• No subcontractors or C2C candidates, please

• Location: Chicago (on W. Wacker near all trains for an easy commute) – Hybrid 3x/week onsite

• Duration: 1 year+ with probable extension well into 2026, or conversion to a permanent role

• Compensation: $62-$68/hr plus benefits and PTO

• Benefits: Full-time contract employees receive medical, dental, and vision coverage; health savings accounts; medical and dependent care flexible spending accounts; voluntary life and disability benefits; and a 401(k) plan. PTO and sick time are also included for full-time contractors.

Key Responsibilities:
• Data Pipeline Development: Design, develop, and maintain robust data pipelines using Scala and PySpark on AWS EMR to ingest, clean, transform, and load data from various sources (databases, APIs, flat files) into data lakes (S3) and data warehouses (Redshift); a minimal pipeline sketch follows this list.
• AWS Integration: Utilize AWS services like S3, Glue, Lambda, and Kinesis to manage data storage, data processing, and real-time data streaming applications.
• Data Modeling: Design data models and schemas for data warehousing and data lakes to optimize data access and analysis.
• ETL/ELT Processes: Develop complex ETL/ELT workflows using Python and PySpark to extract, transform, and load data across different systems.
• Performance Optimization: Analyze and optimize data pipelines for performance and scalability, identifying bottlenecks and implementing improvements.
• Data Quality Assurance: Implement data quality checks and validation procedures to ensure the accuracy and consistency of data throughout the pipeline; a validation sketch also follows this list.
• Collaboration: Work closely with data analysts, business stakeholders, and other engineers to understand data requirements and translate them into technical solutions.
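
For illustration, a day-to-day pipeline along these lines might look like the following minimal PySpark sketch: read raw files from S3, clean and type them, then load the result into an S3 data lake path and Redshift. The bucket paths, column names, and Redshift connection details are placeholders, not specifics of this role.

```python
# Minimal PySpark pipeline sketch (hypothetical paths, schema, and connection details).
# On EMR this would typically be packaged and submitted via spark-submit or an EMR step.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Ingest: read raw JSON landed in an S3 bucket (placeholder path).
raw = spark.read.json("s3://example-raw-bucket/orders/2025/02/")

# Clean/transform: drop rows missing keys, normalize types, stamp a load date.
cleaned = (
    raw.dropna(subset=["order_id", "order_ts"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("load_date", F.current_date())
)

# Load: write partitioned Parquet to the S3 data lake...
cleaned.write.mode("append").partitionBy("load_date").parquet(
    "s3://example-lake-bucket/curated/orders/"
)

# ...and append to Redshift over JDBC (placeholder cluster, table, and credentials).
cleaned.write.format("jdbc").options(
    url="jdbc:redshift://example-cluster:5439/analytics",
    dbtable="public.orders",
    driver="com.amazon.redshift.jdbc42.Driver",
    user="example_user",
    password="example_password",
).mode("append").save()

spark.stop()
```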
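The data quality work could be as simple as asserting basic expectations on a curated dataset before publishing it, as in the sketch below; the rules, column names, and thresholds are hypothetical, and dedicated frameworks such as Deequ or Great Expectations are common alternatives.

```python
# Simple PySpark data quality checks (hypothetical rules and thresholds).
from pyspark.sql import DataFrame, functions as F

def validate_orders(df: DataFrame) -> None:
    """Raise if the curated orders dataset violates basic expectations."""
    total = df.count()
    if total == 0:
        raise ValueError("orders dataset is empty")

    # Completeness: required keys must never be null.
    null_keys = df.filter(F.col("order_id").isNull()).count()
    if null_keys > 0:
        raise ValueError(f"{null_keys} rows are missing order_id")

    # Uniqueness: order_id should behave like a primary key.
    distinct_keys = df.select("order_id").distinct().count()
    if distinct_keys != total:
        raise ValueError("duplicate order_id values detected")

    # Validity: at most 1% of rows may carry a negative amount.
    bad_amounts = df.filter(F.col("amount") < 0).count()
    if bad_amounts / total > 0.01:
        raise ValueError("too many negative amounts")
```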

Required Skills:
• Programming Languages: Proficient in Scala, Python, and PySpark
• AWS Expertise: Deep understanding of AWS services including S3, EMR, Glue, Redshift, Kinesis, and Lambda
• Big Data Concepts: Familiarity with distributed computing concepts, data partitioning, and optimization techniques for large datasets
• SQL Proficiency: Strong SQL skills to query and manipulate data in relational databases
• Data Engineering Tools: Experience with data engineering tools like Apache Spark, Hive, and Airflow (a scheduling sketch follows this list)
• Version Control: Proficiency in Git for code management
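
As a sketch of the Airflow experience referenced above, a daily pipeline run might be scheduled with a DAG like the one below; the DAG id, schedule, and task are illustrative only, not taken from this posting.

```python
# Minimal Airflow DAG sketch (hypothetical DAG id, schedule, and task).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_orders_pipeline(**context):
    # In practice this step might trigger a spark-submit job or an EMR step.
    print("running orders pipeline for", context["ds"])


with DAG(
    dag_id="orders_daily",
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(
        task_id="run_orders_pipeline",
        python_callable=run_orders_pipeline,
    )
```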