Job Title: Senior Infrastructure Engineer

Company Name: Flex

Job Url: https://boards.greenhouse.io/embed/job_app?token=4666310005&utm_source=jobright&jr_id=69aa08162ebd316bece1c618

Job Description: About the role
Flex is looking for a seasoned Senior Infrastructure Engineer with a passion for performance optimization to join our dynamic Infrastructure Team.

In this role, you will be part of the Infrastructure Engineering team, a small team responsible for creating a sustainable platform that ensures the effectiveness, reliability and scalability of our systems.  You'll play a pivotal role in designing, building, and maintaining our robust and scalable infrastructure. You'll collaborate closely with our service engineering teams to automate processes, streamline operations, and ensure optimal system performance and reliability in our cloud infrastructure on AWS and GCP.

At Flex, we are an AI-first engineering organization. We believe that the future of infrastructure isn't just about managing resources—it’s about building the intelligent, automated systems that manage them for us.  We aren't looking for "task-takers"; we are looking for domain experts who use their deep knowledge of cloud architecture and SRE principles to steer these AI tools effectively.  On this team, your value is defined by your ability to combine your technical mastery with an AI-augmented workflow to deliver world-class reliability at a growth-stage pace.

We are particularly interested in candidates with software engineering experience in languages like Java, Python, or TypeScript. This background will allow you to collaborate effectively with product teams, build tools and automation, and improve the developer experience across our engineering organization. You’ll have the opportunity to influence key infrastructure and architecture decisions while ensuring high reliability and smooth delivery pipelines.  

This remote role requires a minimum of 5 years of cloud infrastructure experience.

What you’ll do

Collaborate with service engineering teams to design, implement, and maintain scalable and resilient infrastructure solutions optimizing for performance, resilience, and cost.
Ensure infrastructure aligns with business requirements and industry standards.
Leverage Terraform to automate infrastructure provisioning and configurations.
Implement SRE principles to improve system reliability and reduce downtime.
Improve developer workflows by creating self-service tools, optimizing CI/CD pipelines, and enhancing deployment processes to remove friction.
Develop and maintain robust monitoring and alerting systems to proactively identify and resolve issues.
Lead incident responses, manage on-call rotations, and facilitate post-incident reviews to drive continuous improvement and resilience.
Automate everything—drive adoption of Infrastructure as Code (IaC) and build automated pipelines for testing, monitoring, and deployments.
Use your excellent written and verbal communication skills to inform teams about upcoming changes and how those changes affect them.

Key qualifications

Proven experience in building, scaling and monitoring cloud infrastructure on AWS, especially EKS, S3, RDS, API Gateway, Load Balancers, VPC, Lambdas, DocumentDB and DynamoDB.
Proven experience using Terraform to update and maintain cloud infrastructure.
Proven experience with containerized applications, Kubernetes, and microservice deployments.
Strong knowledge of GitHub Actions and CI/CD best practices.
Experience with developer productivity tools: designing CI/CD workflows, building internal tools, and creating self-service solutions to streamline software development.
Knowledge of monitoring and observability tools and frameworks, with working knowledge of Datadog being a plus.
Familiarity with networking concepts (DNS, load balancing, firewalls, VPNs).
Strong collaboration skills with the ability to work effectively across teams and communicate technical ideas clearly.
Experience writing and reading code in an industry-standard language such as Java, Python, or TypeScript.
 

Flex takes a market-based approach to pay, and compensation may vary depending on your primary work location. Work locations are categorized into one of three tiers based on a cost of labor index for that geographic area. The successful candidate’s starting pay will be commensurate with their experience, qualifications, and Flex’s internal leveling guidelines and benchmarks.
Tier A (NYC/SF/Seattle): $172,000 - $212,000 USD
Tier B: $154,800 - $193,500 USD
Tier C: $146,200 - $182,740 USD

Life at Flex:

We understand that it takes a diverse team of highly intelligent, curious, determined, empathetic, and self-aware people to grow a successful company. Our HQ is located in New York City, but we have employees located throughout the US, Australia, Canada and South America. We are growing quickly, but deliberately, with a focus on building an inclusive culture. Our dynamic team has incredible perspectives to share, just as we know you do, and we take great pride in being an equal opportunity workplace.

We offer many employee benefits and perks. For full-time U.S.-based positions, we offer:

Competitive medical, dental, and vision available from Day 1
Company equity
401(k) plan with company match (our company match kicks off at the beginning of 2026)
Unlimited paid time off + 13 company paid holidays
Parental leave 
Flex Cares Program
Free Flex subscription
For full-time non-US employees, we offer:

Competitive compensation + company equity
Unlimited PTO