Lead AI Cluster Models Architect
Advanced Micro Devices, Inc.
About the Role
The role involves designing state-of-the-art model architectures, data, and parameter sets for large AI/ML training and inferencing systems optimized for hyperscale capabilities, while engaging with customers to align system and model architectures. Key duties also include pioneering system and container networking strategies and developing scalable communication network reference architectures for AMD AI/ML products.
Requirements
The ideal candidate must have in-depth knowledge and extensive real-world experience designing hyperscale computing clusters, coupled with strong analytical and problem-solving skills, and the ability to drive tasks independently. A Master’s or PhD degree in a related computational field is preferred, though equivalent experience is considered.
Full Job Description
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
We are looking for a dynamic, energetic Lead AI Cluster Models Architect to join our growing team. As a key contributor to the success of AMD’s products, you will be part of a leading team that drives and improves AMD’s ability to deliver the highest-quality, industry-leading technologies to market. AMD’s Systems Design Engineering team fosters and encourages continuous technical innovation to showcase successes and facilitate continuous career development.
THE PERSON:
The AI Cluster Models Architect plays a critical role in shaping the future of AI/ML training and inferencing systems as the AI industry transitions into the inference space while still broadening within the AI training market. This individual will collaborate with a broad range of internal and external partners, including System Management, OS, NOS, Compute Libraries, and Software Tools teams, to integrate state-of-the-art technology solutions that pave the way for AMD AI adoption in both inferencing and training.
KEY RESPONSIBILITIES:
- Design state-of-the-art model architectures, data, and parameter sets for large AI/ML training and inferencing systems that can be optimized for hyperscale capabilities
- Engage with AMD’s customer base to align system and model architectures
- Pioneer system and container networking strategies to facilitate seamless operation and scaling of AI clusters
- Develop scalable AI/ML training and inferencing communication network reference architectures for each generation of AMD AI/ML products
- Participate in the design phase of each AMD AI/ML GPU generation by developing cluster computational architectures and requirements
- Collaborate across AMD internal and external partner teams to improve performance for AMD AI/ML clusters
PREFERRED EXPERIENCE:
- In-depth knowledge and experience with AI clusters and topologies
- Extensive real-world experience designing hyperscale computing clusters
- Strong analytical and problem-solving skills and pronounced attention to detail
- Must be a self-starter, able to independently drive tasks to completion
ACADEMIC CREDENTIALS:
- Master’s or PhD degree preferred in Mathematics, Statistics, Electrical Engineering, Computer Engineering, or a related computational field; equivalent experience also considered.
Location: Hybrid or remote
This role is not eligible for visa sponsorship.
Benefits offered are described in “AMD benefits at a glance.”
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.
This posting is for an existing vacancy.
Compensation
AI Est. Total Comp
$295,000
Details
Location
Austin
Work Type
Hybrid
Seniority
senior level
Experience
10+ years
Category
management
Quality Score
7.3