IT Data/Analytics Engineer

This is a 6-month IT Data/Analytics Engineer contract at a pay rate of "X" per hour, requiring 3-5 years of AI solution design experience, strong AWS SageMaker skills, and proficiency in Python, SQL, and data visualization tools such as Tableau.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
January 16, 2025
🕒 - Project duration
Unknown
🏝️ - Location type
Unknown
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Data Analysis #GitHub #Jupyter #Storage #Visualization #Bash #Redshift #Data Governance #EC2 #PySpark #AWS SageMaker #JSON (JavaScript Object Notation) #AWS RDS (Amazon Relational Database Service) #Pandas #Tableau #Computer Science #MongoDB #AWS (Amazon Web Services) #CLI (Command-Line Interface) #Athena #Agile #AWS CLI (Amazon Web Services Command Line Interface) #AWS Glue #NumPy #Scala #Spark (Apache Spark) #Complex Queries #Lambda (AWS Lambda) #Data Mining #Data Processing #MIS Systems (Management Information Systems) #AWS Lambda #Cloud #Python #SageMaker #SQL (Structured Query Language) #Programming #AI (Artificial Intelligence) #S3 (Amazon Simple Storage Service) #Data Science #R #ML (Machine Learning) #Libraries #Data Engineering #Requirements Gathering #Amazon EMR (Amazon Elastic MapReduce) #ETL (Extract, Transform, Load) #Data Modeling #Code Reviews #RDS (Amazon Relational Database Service)
Role description

The IT Data/Analytics Engineer position requires experience designing, developing, testing and deploying efficient and scalable AI/ML solutions for life sciences data and analytics.

This position requires strong AWS SageMaker AI/ML skills, including supporting tools such as Dask, Athena, AWS Glue, Redshift, and EMR, to build data and analytical insights into Company data sets.
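
As a rough illustration of that kind of workflow (not part of the posting itself), the sketch below queries an Athena table from a SageMaker notebook using the AWS SDK for pandas (awswrangler); the Glue database and table names are hypothetical.

```python
# Minimal sketch, assuming awswrangler (AWS SDK for pandas) is installed in the
# SageMaker environment; the Glue database and table names are hypothetical.
import awswrangler as wr

df = wr.athena.read_sql_query(
    sql="SELECT subject_id, assay, result_value FROM assay_results LIMIT 1000",
    database="translational_research",
)
print(df.describe())
```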

Focus on data and analytics supporting Translational Research and Chemistry, Manufacturing, and Controls (CMC), including data collection and analysis tools to aggregate research, clinical, and manufacturing data sets.

Work in development environments including Python, Pandas, SQL, Jupyter Notebooks, JupyterLab, MongoDB, and Posit.

Support data science colleagues in provisioning curated data sets for the creation of data visualization dashboards in Tableau.

The position requires enthusiasm, passion, attention to detail and a desire to create new medicines for Company’s patients.

Liaise with data scientists, data engineers, and cloud solution hosting partners to implement data-driven AI/ML solutions for drug discovery analytics and patient cohort development.

Work with large clinical real-world evidence data sets in Parquet or JSON formats.

Create Python scripts and algorithms to clean, transform, and extract data from multiple sources, and provide these outputs to the Data Science Team.
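
A minimal sketch of such a script is shown below, assuming pandas with a Parquet engine (e.g., pyarrow) is available; the file names, columns, and join key are hypothetical.

```python
# Minimal sketch: load Parquet and JSON sources, apply basic cleaning, and
# write a curated data set for the Data Science Team. Paths and column names
# are hypothetical.
import pandas as pd

claims = pd.read_parquet("rwe_claims.parquet")        # real-world evidence extract
labs = pd.read_json("lab_results.json", lines=True)   # newline-delimited JSON

claims = claims.drop_duplicates()
claims.columns = [c.strip().lower() for c in claims.columns]

curated = claims.merge(labs, on="patient_id", how="left")
curated.to_parquet("curated/patient_cohort.parquet", index=False)
```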

Gather computational requirements and collaborate with Company’s AWS cloud provider on sizing solutions and performance tuning of Notebooks, EC2, and other storage solutions.

Conduct regular code reviews with data engineers and data scientists, and manage GitHub code repositories.

Coordinate closely with the Director of Data Science on agile software sprint planning and release management.

Requirements Description

Education/Training: Advanced degree in a quantitative field such as Management Information Systems, Computer Science, or Machine Learning, or equivalent experience.

Experience: 3-5 years designing and building AI solutions.

Licenses

Skills/Abilities: Relational and non-relational database experience with the proven ability to model, design, and optimize data structures.

Expert knowledge of Structured Query Language (SQL) with the proven ability to author and optimize complex queries is required.
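
As one illustration of that kind of query authoring, the sketch below runs a CTE with a window function from Python against an in-memory SQLite database so it stays self-contained (SQLite 3.25+ is assumed for window-function support); the table and data are made up.

```python
# Self-contained sketch of a "complex query" (CTE + window function) using an
# in-memory SQLite database; the table and rows are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lab_results (patient_id TEXT, visit_date TEXT, a1c REAL)")
con.executemany(
    "INSERT INTO lab_results VALUES (?, ?, ?)",
    [("P1", "2024-01-05", 7.2), ("P1", "2024-06-10", 6.8), ("P2", "2024-03-02", 5.9)],
)

query = """
WITH ranked AS (
    SELECT patient_id, visit_date, a1c,
           ROW_NUMBER() OVER (PARTITION BY patient_id ORDER BY visit_date DESC) AS rn
    FROM lab_results
)
SELECT patient_id, visit_date, a1c
FROM ranked
WHERE rn = 1  -- most recent result per patient
"""
for row in con.execute(query):
    print(row)
```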

High competency using data science tools such as AWS SageMaker AI/ML and Jupyter notebooks.

Experience implementing and using one or more managed data services such as AWS Athena, AWS Glue, Amazon S3, AWS RDS, and Amazon EMR.

Python programming expertise with experience using common data analysis libraries such as Dask, Pandas, NumPy, and PySpark, as well as boto3 for AWS integrations.
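
A rough sketch of how a few of those libraries fit together is below, assuming boto3, dask[dataframe], and s3fs are installed; the bucket, prefix, and column names are placeholders.

```python
# Minimal sketch: list objects with boto3, then read a partitioned Parquet data
# set lazily with Dask; bucket, prefix, and columns are hypothetical.
import boto3
import dask.dataframe as dd

s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket="rwe-data-lake", Prefix="claims/2024/")
print(f"{resp.get('KeyCount', 0)} objects under prefix")

# Dask reads the Parquet partitions lazily and computes only on demand
# (reading s3:// paths assumes s3fs is installed).
ddf = dd.read_parquet("s3://rwe-data-lake/claims/2024/", columns=["patient_id", "icd10"])
counts = ddf.groupby("icd10")["patient_id"].nunique().compute()
print(counts.head())
```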

Experience writing shell scripts (e.g., Bash) to automate processes is desirable but not required. Experience with the AWS CLI is a plus.

Experience with distributed data processing and management systems.

Working knowledge of R and Posit IDEs.

Data Governance, Data Modeling, and Data Mining experience is desirable.

Experience working with and implementing cloud computing services such as Amazon EC2, AWS Fargate, and AWS Lambda is desirable.
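
For the serverless piece, a minimal Python Lambda handler is sketched below; it assumes the standard S3 object-created event shape, and the downstream processing is a placeholder.

```python
# Minimal sketch of an AWS Lambda handler reacting to an S3 object-created
# event; the downstream action is a placeholder.
import json
import urllib.parse

def lambda_handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Placeholder: kick off downstream cleaning/curation for the new file.
        print(f"New object landed: s3://{bucket}/{key}")
    return {"statusCode": 200, "body": json.dumps("processed")}
```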

Tableau dashboard visualization experience is desirable.