Continuously evolve the platform toward more autonomous and agentic workflows, reducing manual effort while increasing depth and reliability of testing.
Responsibilities
Build and scale AI-driven testing capabilities using LLMs, prompts, and agent-based workflows to validate MAI products across scenarios, geographies, and product surfaces.
Design and optimize prompts, models, and agent behaviors to perform functional, quality, and experience-focused testing at scale.
Collaborate closely with product and engineering teams across MAI and beyond to understand testing needs and translate them into efficient, AI-powered testing workflows.
Develop metrics and evaluation frameworks to measure test quality, coverage, effectiveness, and signal accuracy across AI-driven testing pipelines.
Create actionable outputs and insights (issues, summaries, trends, and recommendations) that product owners can directly consume to fix defects and improve product quality.
Partner with engineers and platform teams to operationalize data science solutions in production, ensuring scalability, reliability, and performance.
Required Qualifications
Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 7+ years of related experience (e.g., statistics, predictive analytics, research)
OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 5+ years of related experience (e.g., statistics, predictive analytics, research)
OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years of related experience (e.g., statistics, predictive analytics, research)
OR equivalent experience.
4+ years of solid experience in Data Science, Applied AI, or Machine Learning, with a track record of building solutions that operate at scale.
Hands-on experience with LLMs, prompt engineering, and/or agentic AI systems.
Solid foundation in statistics, experimentation, and metrics design, especially for evaluating AI system quality.
Experience working with data pipelines, model evaluation, and production systems.
Ability to work across multiple product teams, influence without authority, and translate ambiguous testing needs into concrete AI solutions.
Solid communication skills to explain complex AI outputs clearly to engineering and product stakeholders.
3+ years of experience creating publications (e.g., patents, libraries, peer-reviewed academic papers).
3+ years of experience developing and deploying live production systems as part of a product team.
3+ years of experience developing and deploying products or systems at multiple points in the product cycle, from ideation to shipping.
Original Posting
This role is sourced from Microsoft. Apply on the Microsoft careers page.