The Artificial General Intelligence (AGI) team is looking for passionate, talented, and inventive engineers to play a pivotal role in the development and maintenance of industry‐leading multi‐modal and multi‐lingual large language models (LLMs). The AGI team's mission is to leverage our hyper‐scalable, general‐purpose large model training and inference systems to develop and deploy cutting‐edge sensory AI foundational models that revolutionize machine perception, interpretation, and interaction with humans and the physical world.
We believe in "Work Hard. Have Fun. Make History" and focus on sharing learning experiences from the front line with development teams. Whether you enjoy mastering a domain, juggling multiple tasks, refining processes, or diving into code, there is a role for you here.
You will be required to deeply understand technology landscapes, evaluate new technologies, and influence standards for operational excellence across systems. You will tackle abstract issues that span multiple functional areas and drive improvements that can scale across other teams, services, and platforms.
Responsibilities
- Lead design, automation, and continuous improvement of GenAI training compute infrastructure.
- Guide and mentor other engineers as a force‐multiplier to deliver results.
- Participate in design and code reviews, flagging potential bottlenecks early.
- Identify performance bottlenecks in compute infrastructure and propose solutions.
- Be well‐versed in core AWS services, including EC2, Lambda, and EKS.
- Set up and manage CI/CD pipelines using tools such as AWS CodePipeline, GitHub Actions, or similar platforms.
- Use Infrastructure as Code (IaC) tools such as AWS CloudFormation, Terraform, or the AWS CDK.
- Understand networking concepts such as VPCs, subnets, security groups, load balancers, and Route 53.
- Have hands‐on experience with Kubernetes.
Basic Qualifications
- 6+ years of systems design, software development, operations, automation, and process improvement experience.
- Experience programming in at least one modern language such as Python, Ruby, Golang, Java, C++, C#, or Rust.
- Experience with Linux/Unix.
- Experience with CI/CD pipeline build processes.
Preferred Qualifications
- Experience with distributed systems at scale.
Accommodations
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.