Work on state-of-the-art runtime systems hosting cutting-edge large language models (LLMs). We work in a fast-paced, dynamic environment to rapidly experiment and deliver scaled runtime solutions based on the latest experiments and research in the LLM space.

Key job responsibilities
- Design, develop, test and deploy inference solutions for high-end LLMs
- Explore emerging inference optimization techniques
- Collaborate with cross-functional teams of engineers and scientists to identify and solve complex problems
- Mentor and guide junior engineers, and contribute to the overall growth and development of the team

BASIC QUALIFICATIONS

- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience in the design or architecture (design patterns, reliability and scaling) of new and existing systems
- Experience programming in at least one programming language

PREFERRED QUALIFICATIONS

- 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.