Data Engineer (Vectorization)

This role is for a Data Engineer (Vectorization) in Irvine, California, on a 1-year contract with an undisclosed pay rate. Candidates must have 3+ years in data engineering, expertise in vectorization and Azure AI Studio, and proficiency in Python and SQL.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
Unknown
640
🗓️ - Date discovered
February 20, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
On-site
📄 - Contract type
W2 Contractor
🔒 - Security clearance
Unknown
📍 - Location detailed
Irvine, CA
🧠 - Skills detailed
#Data Architecture #Kubernetes #ETL (Extract, Transform, Load) #Model Deployment #ADF (Azure Data Factory) #Spark (Apache Spark) #Monitoring #Cloud #Azure Data Factory #Python #Storage #Scala #Apache Spark #ML (Machine Learning) #Data Storage #Data Engineering #NLP (Natural Language Processing) #Data Science #Azure Blob Storage #Databases #SQL (Structured Query Language) #Azure #Data Processing #Data Pipeline #Automation #Docker #Synapse #Data Lake #Deployment #Databricks #AI (Artificial Intelligence)
Role description

Job Title: Data Engineer (Vectorization & Azure AI Studio)

Location: On-site in Irvine, California

Job Type: Contract role (1-year with yearly extensions)

We are working with a global investment firm that is known for its expertise in fixed-income strategies and has a long-standing reputation for delivering innovative financial solutions. They have a deep focus on research-driven decision-making in investment products including bonds, equities, real estate, and alternative assets.

Job Overview:

We are seeking an experienced Data Engineer with expertise in vectorization, vector pipelines, and working within Azure AI Studio to help design, build, and maintain scalable data architectures that power AI models and machine learning applications. You will be responsible for implementing efficient vectorization techniques and managing data pipelines to enable seamless integration with AI-powered solutions.

Key Responsibilities:
• Vectorization & Data Transformation: Design and implement data transformation pipelines for vectorization, enabling the efficient conversion of raw structured or unstructured data into vectors for AI and ML models.
• Azure AI Studio Integration: Leverage Azure AI Studio to build and manage machine learning workflows, fine-tune models, and integrate AI capabilities within data pipelines.
• Vector Pipeline Development: Develop, optimize, and maintain vector pipelines for large-scale data processing, ensuring efficient storage, retrieval, and processing of high-dimensional vector data.
• Data Architecture Design: Collaborate with cross-functional teams to define, build, and maintain data architecture that supports AI and machine learning model development and deployment.
• Performance Optimization: Continuously evaluate and enhance pipeline performance, ensuring scalability, reliability, and cost-efficiency in vectorization workflows.
• Collaboration: Work closely with data scientists, AI engineers, and other stakeholders to support the design and implementation of end-to-end machine learning and AI solutions.
• Automation & Monitoring: Automate routine tasks, build monitoring tools, and ensure smooth operation of vectorization pipelines in production environments.

Qualifications:
• Minimum 3 years of experience in data engineering, with a strong focus on vectorization, data pipelines, and Azure-based platforms.
• Proven experience with Azure AI Studio for building and deploying AI/ML models.
• Solid understanding and hands-on experience with vectorization techniques (e.g., converting data into numerical representations such as embeddings for NLP, image recognition, or other domains).
• Expertise in vector data pipelines and working with high-dimensional data at scale.
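
For readers unfamiliar with the term, the vectorization described above can be illustrated with a minimal sketch. It assumes a simple hashing trick in place of a learned embedding model, and the names `vectorize`, `cosine`, and `nearest` are hypothetical helpers for illustration, not part of Azure AI Studio or any tool named in this listing.

```python
import hashlib
import math


def vectorize(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-length, L2-normalized vector via the hashing trick."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # A stable hash sends the same token to the same slot on every run.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Empty input yields the zero vector rather than dividing by zero.
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; inputs are unit length, so a dot product suffices."""
    return sum(x * y for x, y in zip(a, b))


def nearest(query: str, corpus: list[str], dim: int = 64) -> str:
    """Brute-force retrieval: return the document most similar to the query."""
    q = vectorize(query, dim)
    return max(corpus, key=lambda doc: cosine(q, vectorize(doc, dim)))
```

A production pipeline would likely swap the hashing step for a trained embedding model and the brute-force scan for a vector index, but the shape of the work (text in, fixed-length vector out, similarity search over stored vectors) stays the same.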

Technical Skills:
• Proficiency in Python, SQL, and relevant data processing tools (Apache Spark, Databricks, etc.).
• Familiarity with Azure services such as Azure Data Factory, Azure ML, and Azure Synapse.
• Experience with vector databases (e.g., Pinecone, Faiss, or Milvus) is a plus.
• Understanding of machine learning concepts and algorithms, particularly as they relate to vectorized data.

Soft Skills:
• Strong problem-solving and analytical skills.
• Ability to communicate complex technical concepts to non-technical stakeholders.
• Collaborative and self-driven with an enthusiasm for learning new technologies and approaches.

Preferred Qualifications:
• Experience with cloud-native data storage solutions (e.g., Azure Blob Storage, Azure Data Lake).
• Familiarity with containerized environments (Docker, Kubernetes).
• Familiarity with AI model deployment and monitoring frameworks.