Software Development Manager, AI Inference Technology, Neuron SDK
Seattle, Washington | Remote-Friendly | $166,400 - $287,700
We're working with Annapurna Labs (part of Amazon) on this exciting opportunity.
Join a pioneering team at Annapurna Labs, an AWS company, dedicated to optimizing cutting-edge AI inference technology for cloud-scale machine learning accelerators like Trainium and Inferentia. This role offers the chance to lead expert AI engineers in delivering fundamental inference building blocks and libraries, directly impacting the performance of large language models for global customers. You'll be at the forefront of innovation, navigating dynamic priorities and shaping the future of AI inference.
Key Responsibilities
- Guide AI engineers to build fundamental inference technology building blocks and libraries.
- Optimize LLMs such as Llama and GPT OSS to run efficiently on Trainium and Inferentia devices.
- Develop and optimize attention kernels and deliver them in the Neuronx_Distributed Inference Libraries.
- Define the building blocks for the latest LLMs in collaboration with senior management and technical leaders.
- Manage changing priorities as new models and technologies emerge, adapting the team's work accordingly.
- Dive deep to help the team solve complex technical challenges.
What You'll Need
- 3+ years of engineering team management experience.
- Established background in optimizing LLMs.
- Experience delivering high-performance models using distributed inference libraries.
- Ability to manage demanding, fast-changing priorities.
- Strong technical ability to understand and deliver within a vertically integrated system stack.
- Proficiency with the PyTorch inference library and the Neuron compiler, runtime, and collectives.
Apply via Haystack today!