Data Engineer - Content Retrieval

This role is for a Data Engineer - Content Retrieval on a contract basis, located in Washington, DC or Philadelphia, PA. Key skills include Python, Databricks, Kubernetes, and AWS. Experience in information retrieval systems and cloud infrastructure is required. Pay rate is unspecified.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

Unknown

🗓️ - Date discovered

April 4, 2025

🕒 - Project duration

Unknown

🏝️ - Location type

On-site

📄 - Contract type

Unknown

🔒 - Security clearance

Unknown

📍 - Location detailed

Washington, DC

🧠 - Skills detailed

#Scala #Infrastructure as Code (IaC) #Deep Learning #Deployment #Istio #Databricks #Logging #Python #Data Ingestion #Data Engineering #Reinforcement Learning #Apache Airflow #Spark (Apache Spark) #Microservices #Kubernetes #ML (Machine Learning) #AI (Artificial Intelligence) #Monitoring #Data Processing #TensorFlow #Apache Spark #Terraform #PyTorch #GitHub #Cloud #AWS (Amazon Web Services) #NLP (Natural Language Processing) #Airflow #Big Data #Data Pipeline #Indexing

Role description

Dice is the leading career destination for tech experts at every stage of their careers. Our client, GCS, is seeking the following. Apply via Dice today!

Job Title: Software Developer III - Information Retrieval

Location: Washington, DC (preferred) or Philadelphia, PA

Employment Type: Contract

Job Description:

We are seeking a skilled Software Developer III - Information Retrieval to contribute to the development and optimization of advanced information retrieval systems. The ideal candidate will work on search algorithms, ranking models, and indexing strategies, while ensuring seamless data ingestion and processing. This role involves collaborating with cross-functional teams to develop scalable, high-performance content retrieval platforms.

Key Responsibilities:

• Develop and implement information retrieval systems, including search algorithms, ranking models, and indexing strategies.

• Design and optimize retrieval-augmented generation (RAG) systems to enhance AI-driven search capabilities.

• Build and maintain microservices infrastructure, ensuring scalability, efficiency, and modularity.

• Develop and integrate RESTful APIs to facilitate seamless communication between software components.

• Design and implement scalable cloud infrastructure using Kubernetes (EKS preferred), ensuring high availability and optimal performance.

• Utilize Databricks and Apache Spark Notebooks to streamline data processing and analytics workflows.

• Orchestrate and manage data pipelines using Apache Airflow to enable efficient workflow scheduling and monitoring.

• Deploy and maintain CI/CD pipelines to automate testing, integration, and deployment using GitHub Actions.

• Optimize search engine architectures and information retrieval processes to enhance search accuracy and user experience.

• Debug, research, and resolve technical issues related to data retrieval, pipelines, and cloud deployments.

• Stay up to date with the latest trends and technologies in AI, ML, and cloud infrastructure, applying best practices to improve internal systems.

Technical Requirements:

• Strong experience with Databricks and Spark Notebooks for big data processing.

• Expertise in Python for backend development and data pipeline management.

• Hands-on experience with Kubernetes for container orchestration (EKS preferred).

• Proficiency in AWS cloud services for deploying and managing infrastructure.

• Deep understanding of content retrieval systems, search engine architectures, and information retrieval optimization.

• Experience deploying and maintaining data pipelines in production environments.

• Familiarity with Infrastructure as Code (IaC) using Terraform or CloudFormation.

• Strong understanding of CI/CD pipelines, with hands-on experience in GitHub Actions for automated deployments.

• Experience working with logging and tracing tools such as ELK Stack, Zipkin, or OpenTracing.

• Familiarity with service mesh technologies such as Istio or Linkerd for managing microservices communications.

Preferred Qualifications:

• Experience with Generative AI technologies, including Natural Language Processing (NLP) and Reinforcement Learning.

• Hands-on experience optimizing LLMs (Large Language Models) in production environments.

• Knowledge of deep learning frameworks such as PyTorch or TensorFlow.

• Familiarity with OpenAI technologies for AI-powered search and recommendation systems.

• Background in Machine Learning (ML) and AI-driven information retrieval.

This role requires a technically savvy individual who is proactive, detail-oriented, and capable of building and maintaining robust information retrieval systems in production environments. It suits candidates who are passionate about search optimization, AI-driven retrieval systems, and scalable cloud infrastructure.