
Databricks Platform Engineer (PySpark Expert)

This role is for a Databricks Platform Engineer (PySpark Expert) in Dallas, TX, for 6+ months at an unspecified pay rate. It requires 14+ years of IT experience, 10+ years in data engineering, and expertise in Databricks, PySpark, and cloud platforms.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
February 21, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
On-site
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Dallas, TX
🧠 - Skills detailed
#Kubernetes #Data Analysis #HDFS (Hadoop Distributed File System) #Data Engineering #ETL (Extract, Transform, Load) #Azure #PySpark #GIT #Data Pipeline #Cloud #Computer Science #Delta Lake #Version Control #Jenkins #ML (Machine Learning) #Big Data #Data Storage #Data Science #Automation #Python #Security #Databricks #Data Governance #Datasets #Scala #Spark (Apache Spark) #Docker #Compliance #Apache Kafka #Deployment #Libraries #Storage #Data Processing #Kafka (Apache Kafka) #S3 (Amazon Simple Storage Service) #AWS (Amazon Web Services)
Role description

Job Title: Databricks Platform Engineer (PySpark Expert)

Location: Dallas, TX

Workplace type: On-site (minimal scope for remote work)

Duration: 6+ months to start (long-term, multi-year project)

Experience level: Senior engineer with a minimum of 14+ years of IT experience and 10+ years of relevant data engineering experience.

Expected job responsibilities include:
• Lead the integration and deployment of Databricks with other data storage and processing tools (e.g., AWS, Azure, Google Cloud, Delta Lake).
• Design, implement, and maintain scalable data pipelines and workflows in the Databricks environment, ensuring optimal performance and reliability (a minimal PySpark sketch follows this list).
• Collaborate with data scientists, data analysts, and other engineering teams to enable efficient and scalable big data processing using PySpark.
• Build and optimize Spark-based data processing frameworks to handle large datasets, ensuring the highest levels of performance and efficiency.
• Write high-quality, maintainable, and well-documented code, adhering to coding standards and best practices.
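To illustrate the kind of pipeline work these responsibilities describe, here is a minimal PySpark batch-pipeline sketch. It is not part of the posting: the bucket paths, column names, and aggregation logic are hypothetical, and it assumes a Databricks runtime with Delta Lake available.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-daily-pipeline").getOrCreate()

# Read raw JSON events (hypothetical path) and drop rows missing a key field.
raw = spark.read.json("s3://example-bucket/raw/orders/")
clean = (
    raw.filter(F.col("order_id").isNotNull())
       .withColumn("order_date", F.to_date("order_ts"))
)

# Aggregate to a daily summary and persist as a Delta table,
# partitioned by date for efficient downstream reads.
daily = clean.groupBy("order_date").agg(
    F.count("*").alias("order_count"),
    F.sum("amount").alias("total_amount"),
)
(daily.write.format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .save("s3://example-bucket/curated/orders_daily/"))
```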

Required Skills & Qualifications:
• Proven experience with Databricks platform and PySpark in large-scale data engineering environments.
• Solid understanding of Spark architecture, performance tuning, and troubleshooting (a brief tuning sketch follows this list).
• Expertise in Python and PySpark, with the ability to write complex, optimized Spark jobs.
• Hands-on experience with cloud data platforms such as AWS, Azure, or Google Cloud.
• Proficient in working with data storage systems like Delta Lake, HDFS, and S3.
• Strong knowledge of data processing, ETL workflows, and data pipeline automation.
• Familiarity with containerization and orchestration tools such as Docker and Kubernetes is a plus.
• Experience with version control tools like Git and CI/CD tools such as Jenkins.
• Ability to work in a fast-paced, collaborative environment with cross-functional teams.
• Excellent communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
• Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
• Experience with Databricks notebooks, jobs, clusters, and libraries.
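As a brief illustration of the Spark tuning skills listed above, the sketch below shows two common techniques: broadcasting a small dimension table to avoid a shuffle-heavy sort-merge join, and repartitioning by the write key before persisting to avoid small-file and skew problems. The DataFrames, paths, and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("tuning-example").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")    # large fact table
dim_users = spark.read.parquet("s3://example-bucket/users/")  # small dimension

# 1. Broadcast the small dimension table so the join avoids a full shuffle.
joined = events.join(F.broadcast(dim_users), on="user_id", how="left")

# 2. Repartition by the write key before persisting so each output partition
#    is written as a reasonable number of well-sized files.
(joined.repartition("event_date")
       .write.format("delta")
       .mode("overwrite")
       .partitionBy("event_date")
       .save("s3://example-bucket/curated/enriched_events/"))
```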

Preferred but not mandatory:
• Familiarity with Apache Kafka or other stream processing technologies (a streaming sketch follows this list).
• Exposure to machine learning workflows and integration with Databricks.
• Prior experience with data governance, security, and compliance in cloud environments.
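For the streaming item above, here is a minimal Structured Streaming sketch that reads from Apache Kafka and appends to a Delta table. The broker address, topic, and storage paths are hypothetical placeholders, and it assumes Spark's Kafka connector is available on the cluster.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-to-delta").getOrCreate()

# Subscribe to a hypothetical "orders" topic.
stream = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "orders")
         .load()
)

# Kafka delivers key/value as binary; cast the value to string before use.
parsed = stream.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp"),
)

# Append to a Delta table with checkpointing for exactly-once sink semantics.
(parsed.writeStream.format("delta")
       .option("checkpointLocation", "s3://example-bucket/_checkpoints/orders/")
       .outputMode("append")
       .start("s3://example-bucket/streaming/orders/"))
```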