NVIDIA is developing processor and system architectures that accelerate deep learning on edge devices, workstations, and data center GPUs for a variety of applications, including automotive, robotics, large language models, and generative AI models. We are looking for an expert deep learning system performance architect to join our deep learning modeling, performance optimization, projection, and analysis effort. In this position, you will have the chance to optimize deep learning hardware and software architecture and make a significant impact in a dynamic, technology-focused company.
What you’ll be doing:
Analyze performance of various machine learning/deep learning algorithms on different GPU architectures
Identify architecture and software performance bottlenecks and propose optimizations
Explore new features and hardware capabilities for deep learning applications
What we need to see:
BSc, MS, or PhD in a relevant discipline (CS, EE, Math, etc.)
4+ years of relevant work experience is a plus
Familiarity with GPU- or accelerator-based deep learning platforms and software stacks
A strong background in computer architecture
Familiarity with LLM or generative AI deep learning algorithms
Experience in system architecture design and performance optimization
Familiarity with machine learning and deep learning frameworks