About the Role
The Data Scientist will generate large synthetic datasets and analyze the gap between simulated and real data to support agentic AI models. They will also evaluate and fine-tune model performance while translating subjective product requirements into objective criteria.
Requirements
Candidates must have a Bachelor's or Master's degree in Computer Science or a related field and at least 2 years of experience working with large datasets. Proficiency in Python and strong statistical analysis skills are required; a PhD and industry experience with multimodal data are preferred.
Full Job Description
Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or experience we deliver is the result of us making each other’s ideas stronger. The diversity of our people and their thinking inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you’ll do more than join something — you’ll add something.
Description
The Special Projects team at Apple is developing novel user-facing conversational features that leverage the multimodal capabilities of state-of-the-art foundation models. A key component of this process is the ability to produce complex simulated scenario data to train and evaluate agentic AI models. We are looking for a skilled Data Scientist to work closely with our Simulation and Machine Learning Evaluations teams to generate large synthetic datasets, analyze the gap between simulated and real data, and evaluate and fine-tune agentic AI model performance on various tasks. A successful candidate has experience managing large multimodal datasets and translating subjective product requirements into objective criteria, and has strong statistical analysis skills.
Minimum Qualifications
Bachelor’s or Master’s degree in Computer Science, Data Science, or related field
2+ years of hands-on experience working with large datasets
Proficiency in Python
Excellent communication skills
Preferred Qualifications
PhD in Computer Science, Data Science, Statistics, or other STEM field
Hands-on industry experience with product-focused statistical analysis
Experience working with large-scale multimodal data and data-annotation pipelines
Experience working with simulations to produce large datasets
Experience with experimental design, A/B testing, and failure analysis
A track record of publications or technical presentations in Data Science
Excellent cross-functional collaboration skills