ML Framework (MetalLM) Engineer, Graphics, Game and ML
Apple
About the Role
The role involves working on high-performance, distributed inference for GenAI applications like LLMs on Private Cloud Compute using custom server hardware. Engineers will focus on building scalable, efficient, and production-grade solutions tailored for high-throughput GPU execution.
Requirements
Candidates must have a minimum of 3 years of programming experience with C/C++/ObjC, along with expertise in GPU kernel development using compute programming models such as Metal or CUDA. Experience with distributed training or inference techniques and with system-level programming is also required.
Full Job Description
Description
Our team is seeking extraordinary machine learning and GPU programming engineers who are passionate about providing robust compute solutions for accelerating machine learning libraries on Apple Silicon. This role offers the opportunity to influence the design of compute and programming models in next-generation GPU architectures.
Minimum Qualifications
3+ years of programming and problem-solving experience with C/C++/ObjC
Experience with GPU kernel development and optimization using compute programming models such as Metal or CUDA
Experience with distributed training or inference techniques
Experience with system-level programming and computer architecture
Preferred Qualifications
Experience with graph compilers such as CuTE, CuTile, Triton, OpenXLA or LLVM is a plus
Good understanding of LLM and diffusion-based model architectures
Compensation
AI Est. Total Comp
$411,500
Details
Location
Cupertino
Work Type
On-site
Seniority
senior level
Experience
2-5 years
Category
ml ai
Quality Score
8.0