Senior Researcher – GPU Performance

Microsoft

Work Mode

Onsite

Employment Type

FULL TIME

Location

India, Karnataka, Bangalore

Application Deadline

July 26, 2026

Design, implement, and optimize GPU kernels for complex computational workloads such as AI inference. Research and develop novel optimization techniques for generating GPU kernels. Document optimization strategies and maintain performance benchmarks.

Responsibilities

Profile and analyze kernel performance using advanced diagnostic tools. Generate automated solutions for kernel optimization and tuning. Collaborate with other researchers to improve model performance. Contribute to the development of internal GPU computing frameworks.

Required Qualifications

Doctorate in a relevant field, or equivalent experience. Solid understanding of GPU architecture, memory hierarchies, parallel computing, and algorithm optimization. Hands-on experience in GPU programming, including performance profiling and optimization tools. Advanced C++ programming skills.

Other Requirements

5+ years of experience in GPU programming and optimization. Expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks. Experience with machine learning frameworks (PyTorch, TensorFlow). Familiarity with compiler optimization techniques and a background in auto-tuning and automated code generation. Publication record in relevant conferences or journals (MLSys, NeurIPS, ICML, ICLR, AISTATS, ACL, EMNLP, NAACL, ISCA, MICRO, ASPLOS, HPCA, SOSP, OSDI, NSDI, etc.).
