Own end-to-end development of features in a cross-geo, cross-org, cross-product setup, from ideation and specification to deployment and iteration. Take full-stack accountability for features, including elegant architecture, efficient code, appropriate RAG, and exhaustive evals, to deliver secure, global products. Design, build, and optimize production-grade code, delivering robust features within a much larger existing architecture. Run experiments to determine how different prompting techniques affect res…
Responsibilities
Design and build evaluation systems that test LLM capabilities in the healthcare domain, and interpret and communicate the results.
Collaborate with AI researchers, product managers, and designers to bring a world-class AI health companion to the world.
Lead by example in software development best practices, leverage AI in day-to-day work, and influence improvements with high momentum and high quality.
Guide peers and contribute to a culture of technical excellence and continuous improvement.
Experience with data engineering: handling text dataset sourcing, curation, and processing tasks at scale.
0-to-1 experience, with a bias toward shipping and learning while maintaining a high quality bar.
Experience in healthcare technology, particularly with regulated medical devices or large consumer products.
Passion for conversational AI and its deployment.
Experience developing and improving evaluation methodologies for assessing the quality of LLM-based products.
Original Posting
This role is sourced from Microsoft. Apply on the Microsoft careers page.