Overview
EFA Network Sr. Software Engineer, EFA ML Software Team
Job ID: 2993295 | Amazon Development Center U.S., Inc.
Want to help make the next generation of Machine Learning in the cloud possible? Do you have a laser focus on performance in your team's code? We want to talk to you! We own the user-space software that makes the Elastic Fabric Adapter (EFA) network card work for Machine Learning (ML) and High-Performance Computing (HPC) customers on AWS. Across multiple projects written in C, our team enables customers to network thousands of GPU and CPU instance types to handle the toughest clustered workloads. Lead a dynamic, fast-paced group that has a big impact every day on the hottest companies doing AI and HPC today.
Responsibilities
Lead a team of networking developers working at a high level in networking.
Write the highest-performing C code for multiple open source projects supporting EFA, such as Libfabric and Open MPI.
Collaborate with multiple teams to invent new APIs for the latest cloud networking concepts.
Analyze how customers perform collectives and messaging at high bandwidth and low latency.
Provide expert-level support to some of the largest AI names in the world.
A day in the life
Start from customer needs and invent ways to reduce the occupancy of the software stack for EFA. Drive peers and leadership to accept well-written designs. Coordinate with the ML Infrastructure team to ensure performance on hundreds to thousands of top-end machine clusters.
About the team
We are a fast-paced team that owns the user-space software stack for EFA. As part of Annapurna Labs in AWS we are nimble and focused on what the AI industry will try next. We emphasize automation and concentrate on the most interesting problems as customers experiment with our network. Our team supports growth and helps you achieve your career goals.
Basic Qualifications
5+ years of non-internship professional software development experience
5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems
5+ years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations
Experience as a mentor, tech lead or leading an engineering team
Preferred Qualifications
Bachelor's degree in computer science or equivalent
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status. Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit the provided accommodations information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300/year in our lowest geographic market up to $261,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Depending on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit the Amazon benefits page.
This position will remain posted until filled. Applicants should apply via our internal or external career site.
Important FAQs for current Government employees: Before proceeding, please review the following FAQs https://amazon.jobs/en/faqs#faqs-for-us-government-employees