

Databricks Platform Engineer (PySpark Expert)
Job Title: Databricks Platform Engineer (PySpark Expert)
Location: Dallas, TX
Workplace type: On-site (very limited scope for remote work)
Duration: 6+ months to start (long-term, multiyear project)
Experience level: Senior Engineer with 14+ years of IT experience and 10+ years of relevant experience.
Some of the expected job responsibilities:
• Lead the integration and deployment of Databricks with cloud platforms and other data storage and processing tools (e.g., AWS, Azure, Google Cloud, Delta Lake).
• Design, implement, and maintain scalable data pipelines and workflows in the Databricks environment, ensuring optimal performance and reliability.
• Collaborate with data scientists, data analysts, and other engineering teams to enable efficient and scalable big data processing using PySpark.
• Build and optimize Spark-based data processing frameworks to handle large datasets, ensuring the highest levels of performance and efficiency.
• Write high-quality, maintainable, and well-documented code, adhering to coding standards and best practices.
Required Skills & Qualifications:
• Proven experience with the Databricks platform and PySpark in large-scale data engineering environments.
• Solid understanding of Spark architecture, performance tuning, and troubleshooting.
• Expertise in Python and PySpark, with the ability to write complex, optimized Spark jobs.
• Hands-on experience with cloud data platforms such as AWS, Azure, or Google Cloud.
• Proficiency in working with data storage systems such as Delta Lake, HDFS, and S3.
• Strong knowledge of data processing, ETL workflows, and data pipeline automation.
• Familiarity with containerization and orchestration tools such as Docker and Kubernetes is a plus.
• Experience with version control tools like Git and with CI/CD tools and pipelines such as Jenkins.
• Ability to work in a fast-paced, collaborative environment with cross-functional teams.
• Excellent communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
• Bachelor’s or master’s degree in Computer Science, Data Engineering, or a related field.
• Experience with Databricks notebooks, jobs, clusters, and libraries.
Preferred but Not Mandatory:
• Familiarity with Apache Kafka or other stream processing technologies.
• Exposure to machine learning workflows and integration with Databricks.
• Prior experience with data governance, security, and compliance in cloud environments.