Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machinelearning accelerators and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This role is responsible for development, enablement and performance tuning of a wide variety of ML model families, including massive scale large language models like GPT2, GPT3 and beyond, as well as stable diffusion, Vision Transformers and many more. The ML Apps team works side by side with chip architects, compiler engineers and runtime engineers to create , build and tune distributed training solutions with Trn1. Experience training these large models using Python is a must. FSDP, Deepspeed and other distributed training libraries are central to this and extending all of this for the Neuron based system is key.Key job responsibilitiesThis role will help lead the efforts building distributed training and inference support into Pytorch, Tensorflow, Jax using XLA and the Neuron compiler and runtime stacks. This role will help tune these models to ensure highest performance and maximize the efficiency of them running on the customer AWS Trainium and Inferentia silicon and the TRn1 , Inf1 servers. Strong software development and ML knowledge are both critical to this role.About the teamAbout the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS- - 5+ years of non-internship professional software development experience- - 5+ years of programming with at least one software programming language experience- - 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience- - 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience- - Experience as a mentor, tech lead or leading an engineering team ...

Manufacturing Engineer - SW Tools/Infrastructure, Annapurna Labs

AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for customers who require specialized security solutions for their cloud services.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.You will work on developing at-scale software solutions to manage the manufacturing environments at board and server level test. You will work together with other engineering teams to unify testing solutions between manufacturing and data center operations groups. You will develop and maintain the test deployment and distribution systems to ensure that our manufacturing partners have access to appropriate versions of our software as soon as it's available. You will respond to new issues raised by our manufacturing partners, analyze logs and failures, and then develop and deploy solutions to those issues. You will develop documentation as well as testing and debug procedures for our manufacturing partners to follow.Key job responsibilities• Develop, validate and deploy test infrastructure mechanisms into manufacturing environments• Manage scaled fleets of custom test equipment and ensure their maintenance• Lead and develop data gathering/parsing solutions to ensure manufacturing data is stored on AWS servers in a structured way.• Support internal lab infrastructure and manufacturing unification effortsAbout the teamAbout the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS• Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science or related field• Experience with Python, BASH or other scripting language• Developing network deployment infrastructure such as PXE• Experience working with system management components (BMC, BIOS, CPLD, etc)• Strong background working in UNIX environments ...

2025 ASIC Physical Design Engineer Intern, Annapurna Labs

Amazon Web Services (AWS) internships are full-time (40 hours/week) for 12 consecutive weeks during summer. By applying to this position, your application will be considered for all locations we hire for in the United States.In Annapurna Labs we are at the forefront of hardware co-design not just in Amazon Web Services (AWS) but across the industry. The work we do is cutting-edge and internet-scale while also being deeply important to our customers. We design and build every component of our hardware and software to come together into products that our customers use for accelerated computing: either Machine Learning acceleration, or FPGA acceleration. We get our hands dirty, from creating our own silicon, pushing the electrons in the right direction, ensuring our hardware is functional and healthy, and managing the full lifecycle of our systems at the huge scale and complexity of AWS. If you're interested in "building a complete product" from inception to delighted customers, Annapurna is a fantastic choice.As a member of the Cloud-Scale Machine Learning Acceleration team you’ll be responsible for the design and optimization of Software and Hardware in our data centers including technologies such as AWS Inferentia, which is a machine learning inference accelerator designed to deliver high performance at low cost.If this sounds exciting to you - come build the future with us!Key job responsibilities• Perform physical design for Amazon's machine learning custom silicon solutions• Participate in various aspects of physical design: full chip floorplanning, circuit analysis, power/clock distribution, timing optimization, place and route, power integrity analysis, and physical verification• Write Tcl or PERL scripts to improve physical design flows and methods• Collaborate with RTL, DFT designers to ensure high quality design implementationBASIC QUALIFICATIONS- Enrolled in a Bachelors’ degree program or higher in Electrical Engineering, Computer Engineering, or a related field with a graduation conferral date between December 2025 and September 2026- Scripting internship/project experience with Python, Perl or equivalent- Strong understanding of VLSI circuit design fundamentals- Have taken at least one or more VLSI circuit design and implementation classes and done lab projects ...

Software Development Manager, AWS Neuron Machine Learning Distributed Training, ML Accuracy

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machinelearning accelerators and the Trn1,2 and Inf1 servers that use them. As the SDM of Software Development for the Machine Learning Distributed Training team, you will be responsible for leading a strong team of engineers to help design and deploy the ML models. You will be responsible for setting up methodologies for accuracy measurement and baselining for the ML models we deliver. Develop generic solutions for training with low precision. Develop accuracy related reliability/scalability features. Responsible for the full development life cycle of our integrations and extensions for inference and training support in Pytorch, XLA, JAX as well as distributed training libraries like FSDP, DDP and others. Lead the way to ensure support for key ML functionality in a combined chip / software platform. Ensure the right thing is being built and delivered to customers.A successful candidate will have an established background in developing Machine Learning products with direct customer-facing experience, a strong technical ability and a motivation to achieve results. Experience in Machine Learning and software development is also a must.Key job responsibilitiesOur engineers collaborate across diverse teams, projects, and environments to have a firsthand impact on our global customer base. You’ll bring a passion for innovation, data, search, analytics, and distributed systems. You’ll also:- Solve challenging technical problems, often ones not solved before, at every layer of the stack. - Design, implement, test, deploy and maintain innovative software solutions to transform service performance, durability, cost, and security.- Build high-quality, highly available, always-on products.- Research implementations that deliver the best possible experiences for customers.A day in the lifeYou will work with the executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers and chips to accelerate at the highest scale.About the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. BASIC QUALIFICATIONS- 3+ years of engineering team management experience- 7+ years of working directly within engineering teams experience- 3+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience- 8+ years of leading the definition and development of multi tier web services experience- Experience partnering with product or program management teams- Knowledge of engineering practices and patterns for the full software/hardware/networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations- 3+ Years of Deep Learning/Machine learning experience ...

Software Dev Engineer - Compiler, Annapurna Labs

Are you excited about Machine Learning, chip acceleration, compilers, storage, systems or EC2? Are you passionate about delivering high quality services that affect hundreds of thousands of users? We are the dubbed the "secret sauce" behind AWS's success with development centers in the U.S. and Israel, Annarpuna is at the forefront of innovation by combining cloud scale with the world’s most talented engineers.The Annapurna team hires for multiple disciplines Software and Hardware engineers including but not limited to complier engineer, machine learning engineer, runtime engineer, performance engineer and ML chip accelerator, ASIC, physical designs, SDE in Test. Because of our teams’ breadth of talent, we’ve been able to improve AWS cloud infrastructure in networking and security with products such as AWS Nitro, Enhanced Network Adapter (ENA), and Elastic Fabric Adapter (EFA), in compute with AWS Graviton and F1 EC2 Instances, in machine learning with AWS Neuron, Inferentia and Trainium ML Accelerators, and in storage with scalable NVMe.Key job responsibilitiesInnovating and delivering creative SW Designs to develop new services, solve operational problems, drive improvements in developer velocity, or positively impact operational safetyWriting requirements capturing documents, design documents, integration test plans, and deployment plansCommunicating status and progress of deliverables to schedule, and sharing learnings/ innovations with your team and stakeholdersBASIC QUALIFICATIONS- Currently enrolled in, or completed a Bachelor’s degree program or higher in Computer Science, Computer Engineering, Electrical Engineering or related field- To qualify, applicants should have earned a Bachelor’s or Master’s degree between May 2023 to September 2025. Possible start dates for this role are between January 2025 to October 2025.- Programming experience in internship or coursework with programming language such as Python and/or C or C++.Candidates with strong interests and academic qualifications/research focus in two of the following:- Knowledge of code generation, compute graph optimization, resource scheduling- Data structure and algorithms - Compiler - Optimizing compilers (internals of LLVM, clang, etc)- Machine Learning - Experience with XLA, TVM, MLIR, LLVM- Deep learning models and algorithms- Tensorflow, PyTorch, or MxNET frameworks ...

Lab Engineer, Annapurnna MLA Hardware

AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for customers who require specialized security solutions for their cloud services.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.Key job responsibilitiesAs a member of the Machine Learning Acceleration team, you’ll manage the organization’s lab activities and develop solutions to improve lab automation.You will support the Hardware and Software Engineering teams with lab setup and maintenance, as well as work with external / internal groups in the organization, including vendors for lab equipment.In this role, you will provide technical support for product development projects in the Machine learning lab including AWS Server integration and maintenance.We are looking for candidates who thrive in a fast-paced, start-up like environment, and who can work independently to deliver multiple projects in parallel across multiple sites. To be successful you need to be hands-on, highly motivated and detailed oriented while meeting the highest standards, time to market, cost and quality goals.About the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS- Associates degree in Electrical Engineering or related field - 5+ years of related experience in engineering labs- Soldering and rework skills (0402, QFNs, etc.)- Experience in reading schematics and PCB layout - Great problem solving and analytical skills, including organization and communication skills- Manage inventory and procurement of components and equipment - Good at prioritizing tasks and keeping detailed logs of rework, revision history etc. ...

Machine Learning - Compiler Engineer II, Annapurna Labs

The Product: AWS Machine Learning accelerators are at the forefront of AWS innovation and one of several AWS tools used for building Generative AI on AWS. The Inferentia chip delivers best-in-class ML inference performance at the lowest cost in cloud. Trainium will deliver the best-in-class ML training performance with the most teraflops (TFLOPS) of compute power for ML in the cloud. This is all enabled by cutting edge software stack, the AWS Neuron Software Development Kit (SDK), which includes an ML compiler, runtime and natively integrates into popular ML frameworks, such as PyTorch, TensorFlow and MxNet. AWS Neuron and Inferentia are used at scale with customers like Snap, Autodesk, Amazon Alexa, Amazon Rekognition and more customers in various other segments.The Team: As a whole, the Amazon Annapurna Labs team is responsible for silicon development at AWS. The team covers multiple disciplines including silicon engineering, hardware design and verification, software and operations.The AWS Neuron team works to optimize the performance of complex neural net models on our custom-built AWS hardware. More specifically, the AWS Neuron team is developing a deep learning compiler stack that takes neural network descriptions created in frameworks such as TensorFlow, PyTorch, and MXNET, and converts them into code suitable for execution. As you might expect, the team is comprised of some of the brightest minds in the engineering, research, and product communities, focused on the ambitious goal of creating a toolchain that will provide a quantum leap in performance.You: Machine Learning Compiler Engineer II on the AWS Neuron team, you will be supporting the ground-up development and scaling of a compiler to handle the world's largest ML workloads. Architecting and implementing business-critical features, publish cutting-edge research, and contributing to a brilliant team of experienced engineers excites and challenges you. You will leverage your technical communications skill as a hands-on partner to AWS ML services teams and you will be involved in pre-silicon design, bringing new products/features to market, and many other exciting projects. A background in Machine Learning and AI accelerators is preferred, but not required.About the teamAbout UsInclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Work/Life BalanceOur team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives. Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded professional and enable them to take on more complex tasks in the future.BASIC QUALIFICATIONS- 3+ years of non-internship professional software development experience- 2+ years of experience architecting and optimizing compilers- Proficiency with 1 or more of the following programming languages: C++ (preferred), C, Python ...

DFT Design Engineer, AWS Machine Learning Acceleration

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of businesses in 190 countries around the world. We have data center locations in the U.S., Europe, Singapore, and Japan, and customers across all industries.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world. As a member of the Silicon Optimization Engineering Team you’ll be responsible for the design and optimization of hardware in our data centers. You’ll provide leadership in the application of new technologies to large scale server deployments in a continuous effort to deliver a world-class customer experience. This is a fast-paced, intellectually challenging position, and you’ll work with thought leaders in multiple technology areas. You’ll have relentlessly high standards for yourself and everyone you work with, and you’ll be constantly looking for ways to improve your products performance, quality and cost. We’re changing an industry, and we want individuals who are ready for this challenge and want to reach beyond what is possible today. Key job responsibilities• Develop, implement and verify state-of-the-art Design for Test (DFT) architectures• Work with block designers to integrate DFT implementations• Work with physical design team to setup and implement DFT insertion flow• Develop high coverage and cost effective DFT methodologies• Perform RTL coding and Verification • Participate in Silicon debug and write scripts to effectively handle ATE related data• Communicate and work with team members across multiple disciplinesA day in the lifeAbout the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS- BS degree in EE, CE, or CS- 3+ years of practical DFT experience with large processor and/or SoC designs- Knowledge about industry standard tools and practices in DFT, including ATPG, JTAG, MBIST and trade-offs between test quality and test time- Experience with automation script development ...

SDM, ML Acceleration, Neuron Frameworks

Utility Computing (UC)AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for customers who require specialized security solutions for their cloud services.AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machinelearning accelerators and the Trn1 and Inf1 servers that use them. As the Software Development Manager for the ML Applications - Framework team, you will be responsible for leading a strong team of engineers to help design and deploy ML applications/usecases on various frameworks such as Pytorch, JAX, Tensorflow. You will be responsible for the full development life cycle of our integrations and extensions for inference and training support in Pytorch, XLA, Tensorflow and JAX. Develop reliability/scalability features and performance updates in the Neuron ML Frameworks as well as contribute to other popular open Frameworks to enable them make Trainium and Inferentia devices as the first-class citizens for ML Acceleration. Lead the way to ensure support for key ML functionality in a combined chip / software platform. Ensure the right thing is being built and delivered to customersA successful candidate will have an established background in developing ML frameworks using Pytorch on XLA devices and corresponding framework technology components such as Torch-XLA, Open-XLA project integrations using PJRT or StableHLO, familarity of OpenXLA compilers. The ideal candidate should have a strong technical ability to work/deliver on a vertically integrated system stack that consists of a combinatorial matrix of hardware, frameworks, and workflows. Deep expertise in Framework integrations and development using C++ is a must along-with direct customer-facing experience and a strong motivation to achieve results. A day in the lifeYou will work with the executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers and chips to accelerate at the highest scale.About the teamAbout AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. BASIC QUALIFICATIONS- 3+ years of engineering team management experience- 7+ years of working directly within engineering teams experience- 3+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience- Experience partnering with product or program management teams ...

2025 ASIC Design Verification Engineer Intern, Annapurna Labs

Amazon Web Services (AWS) internships are full-time (40 hours/week) for 12 consecutive weeks during summer. By applying to this position, your application will be considered for all locations we hire for in the United States.In Annapurna Labs we are at the forefront of hardware co-design not just in Amazon Web Services (AWS) but across the industry. The work we do is cutting-edge and internet-scale while also being deeply important to our customers. We design and build every component of our hardware and software to come together into products that our customers use for accelerated computing through Machine Learning acceleration and FPGA acceleration. If you are interested in "building a complete product" from inception to delighted customers, Annapurna is a fantastic choice.If this sounds exciting to you - come build the future with us!Responsibilities: • Develop a deep understanding of end customer requirements including software applications, use models, system architecture and SoC architecture/micro-architecture solutions.• Participate in logic design activities as part of Amazon's machine learning custom silicon solutions. • Develop and execute design automation mechanisms and flows.• Work with physical design teams to achieve performance and area requirements.Mentorship & Career GrowthOur team is dedicated to supporting new team members in an environment that celebrates knowledge sharing and mentorship. Projects and tasks are assigned in a way that leverages your strengths and helps you further develop your skillset.Inclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences.Work/Life HarmonyOur team puts a high value on work-life harmony. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility and encourage you to find your own balance between your work and personal lives.BASIC QUALIFICATIONS- Enrolled in a Bachelors’ degree program or higher in Electrical Engineering, Computer Engineering, or a related field with a graduation conferral date between December 2025 and September 2026- Programming experience in System Verilog or UVM ...

Physical Design Engineer, Annapurna Labs

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of businesses in 190 countries around the world. We have data center locations in the U.S., Europe, Singapore, and Japan, and customers across all industries.Custom SoCs (System on Chip) live at the heart of AWS Machine Learning servers. As a member of the Cloud-Scale Machine Learning Acceleration team you’ll be responsible for the design and optimization of hardware in our data centers including AWS Inferentia, Trainium Systems (our custom designed machine learning inference and training datacenter servers). Our success depends on our world-class server infrastructure; we’re handling massive scale and rapid integration of emergent technologies. We’re looking for an ASIC Physical Design Engineer to help us trail-blaze new technologies and architectures, while ensuring high design quality and making the right trade-offs.Key job responsibilities- Work with RTL/logic designers to drive architectural feasibility studies, explore power-performance-area tradeoffs for physical design closure- Drive IO/Core block physical implementation through synthesis, floor planning, bus / pin planning, place and route, power/clock distribution, congestion analysis, timing closure, IR drop analysis, physical verification, ECO and sign-off- Develop physical design methodologies- Evaluate 3rd party IP and provide recommendations- Be a highly-valued member of our start-up like team through excellent collaboration and teamwork with other physical design engineers as well as with the RTL/Arch. teamsA day in the lifeAbout the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. BASIC QUALIFICATIONS- BS + 4yrs or MS + 3yrs in EE/CS- 4+ years of experience in ASIC Physical Design from - RTL-to-GDSII in either 7nm, 14/16nm, 20nm, or 28nm- Block Design using EDA tools (examples: Cadence, Mentor Graphics, Synopsys, or Others) including synthesis, equivalency verification, floor planning, bus / pin planning, place and route, power/clock distribution, congestion analysis, timing closure, IR drop analysis, physical verification, and ECO- Deep understanding on sign-off activities (timing, ir/em, physical verification)- Scripting experience with Tcl, Perl or Python ...

System Development Engineer, Annapurna Labs, Machine Learning Accelerator Systems - Fleet Triage

Annapurna Labs was a startup company acquired by AWS in 2015, and is now fully integrated. If AWS is an infrastructure company, then think Annapurna Labs as the infrastructure provider of AWS. Our org covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. AWS Nitro, ENA, EFA, Graviton and F1 EC2 Instances, AWS Neuron, Inferentia and Trainium ML Accelerators, and in storage with scalable NVMe, are some of the products we have delivered, over the last few years.In Annapurna Labs we are at the forefront of hardware/software co-design not just in Amazon Web Services (AWS) but across the industry. The Machine Learning System Operations and Automation Team is looking for candidates interested in writing automation software for our "fleet" of Machine Learning servers deployed around the world. Do you like solving mysteries? Great! Figuring out what that light switch with no obvious function actually does? Me too! Are you wearing a smartwatch to monitor your sleep and activity over time to optimize your routines? You'll fit right in. Does the word exabyte excite you? Let's get to work. Our team writes truly massive scale autonomous software to monitor, optimize, and remediate hardware in the most advanced servers in the world. Come join us!Key job responsibilities- Member of a team responsible for system remediation, operational excellence, and customer experience on bleeding edge ML products- Utilize data to root cause hardware failures and identify live trends on the most complex systems in AWS- Implement and improve system level testing across the product lifecycle- Develop software which can be maintained, improved upon, documented, tested, and reused- Dive deep on issues at the intersection of hardware and softwareA day in the lifeAs you design and code solutions to help our team drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve the root cause of software defects. You’ll also:Build high-impact solutions to deliver to our large customer base.Participate in design discussions, code review, and communicate with internal and external stakeholders.Work cross-functionally to help drive business decisions with your technical input.Work in a startup-like development environment, where you’re always working on the most important stuff.About the teamThe MLA Systems Fleet Triage team is responsible for identifying and responding to the most challenging hardware and software failures from ML optimized servers at scale. We work in tandem with hardware design, firmware, and validation teams to improve test coverage and detection in production environments. Why AWS?Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Hybrid WorkWe value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords engineers options to work in the office every day or in a flexible, hybrid work model near one of our US Amazon offices. Our hybrid models allow you the freedom to work from home whenever in-office collaboration isn’t necessary.BASIC QUALIFICATIONS- 2+ years of non-internship professional software development experience- 1+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience- 3+ years of administrative experience in networking, storage systems, operating systems and hands-on systems engineering experience- Knowledge of systems engineering fundamentals (networking, storage, operating systems)- Experience programming with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, Ruby ...

We show restricted results, but there are more jobs available in our database, use Search to see them