Research Intern - AI System Architecture Modeling and Performance
Microsoft
Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.
The Azure Hardware and Systems Infrastructure organization is central to defining Microsoft's first-party Artificial Intelligence (AI) infrastructure architecture and strategy. In this dynamic and fast-paced environment, the organization works in close partnership with sister organizations to help define System on Chip (SoC) designs, interconnect topologies, memory hierarchies, and much more, all in the context of enabling workload-optimized data flows for large-scale AI models. It plays a critical role in roadmap definition all the way from concept to silicon to hyperscale integration.
Responsibilities
Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.
As a Research Intern, you will be at the forefront of hardware/software co-design and have a direct impact on answering critical questions around designing an optimized AI system and evaluating its real-world impact on Azure’s supporting hyperscale infrastructure. In this role, you will evaluate opportunities to co-optimize central processing unit (CPU), graphics processing unit (GPU), and networking infrastructure for the Maia accelerator ecosystem. You will be expected to identify system stress points, propose novel architectural ideas, and create methodologies that combine workload characterization, modeling, and benchmarking to evaluate their effectiveness.
Qualifications
Required Qualifications
- Accepted or currently enrolled in a PhD program in Computer Science or related STEM field.
- At least 1 year of experience with performance analysis tools and methodologies, optimization and modeling.
Other Requirements
- Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
- In addition to the qualifications below, you’ll need to submit a minimum of two reference letters for this position as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and furthermore, that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance, so they will be ready to submit your letter.
Preferred Qualifications
- Proficiency with frameworks such as PyTorch, SGLang, Dynamo, and AI accelerator programming models/compilers such as CUDA and Triton.
- Deep understanding of GPU and AI architectures including memory hierarchies, compute-communication interplay, kernel scheduling and interconnect properties.
- Familiarity with CPU/server architectures including understanding of PCIe topologies and accelerator/NIC/peripheral demand. Solid understanding of CPU involvement in dispatching, scheduling and orchestration of input data pipelines to AI accelerators.
- Hands-on experience with benchmarking, profiling, identifying performance bottlenecks, and performance optimization, including trace generation, event monitoring, and instrumentation.
- Familiarity with roofline performance modeling, detailed performance simulations and awareness of speed vs accuracy tradeoffs in various performance modeling methodologies.
- Ability to apply the appropriate performance analysis methodology including devising new or combinatorial approaches in evaluating complex system architecture what-if scenarios.
- Solid verbal and written communication skills.
The base pay range for this internship is USD $6,710 - $13,270 per month. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $8,760 - $14,360 per month.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-intern-pay
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.