Job Title: Senior DevOps Engineer
Company Name: Avantos
Job Details: Remote, Full Time
Job Url: https://hiring.cafe/viewjob/mh6uv64c8zxyphpr
Job Description:
Posted 2mo ago
United States, Remote, Full Time

Responsibilities: Design infrastructure, build pipelines, own observability
Requirements Summary: 8+ years in DevOps/SRE/infrastructure; strong AWS, Terraform, CI/CD, Python/Bash; cloud security and reliability focus.
Technical Tools Mentioned: Terraform, AWS, ECS Fargate, ALB, Cognito, S3, SQS, CloudWatch, Datadog, GitHub Actions, ArgoCD, Python, Bash, Docker, Kubernetes, PostgreSQL, RDS

Company overview
Avantos is building the industry’s first AI-native operating system for financial services, redefining how firms onboard clients, deliver advice, and manage core servicing workflows. Our platform unifies fragmented data, automates complex processes, and embeds intelligent decision-making across every step of the client lifecycle. We partner with leading financial institutions and are scaling rapidly. We’re an execution-driven, design-obsessed, product-led team composed of founders and leaders from Wharton, MIT, top design programs, and prior unicorn SaaS companies. We move fast, solve deep industry problems, and build technology that puts users back in control of their workflows. If you love client impact, product design, complex problem solving, and bringing AI-enabled change to real-world businesses, Avantos is where you will thrive.

Job summary
We're seeking a Senior DevOps Engineer / Site Reliability Engineer to own and evolve our infrastructure, reliability, and deployment practices. You'll be responsible for building the foundational platform that enables our engineering teams to ship quickly and reliably while maintaining the security and compliance standards required in financial services.

- Design, implement, and maintain our AWS cloud infrastructure using infrastructure-as-code principles with Terraform
- Build and optimize CI/CD pipelines to enable rapid, safe deployments across multiple environments
- Own observability strategy: implement comprehensive monitoring, logging, and alerting systems using Datadog and other tooling
- Architect and manage containerized workloads on ECS Fargate and evaluate migration paths to Kubernetes
- Establish and enforce security best practices, working closely with compliance teams on financial services requirements
- Design and implement disaster recovery, backup, and business continuity strategies
- Optimize system performance, cost efficiency, and resource utilization across AWS services
- Collaborate with engineering teams to improve service reliability, reduce toil, and establish SLOs/SLIs
- Participate in incident response and conduct thorough post-mortems to drive continuous improvement
- Mentor engineers on DevOps practices, cloud architecture patterns, and operational excellence

Your skills will include
- 8+ years of experience in DevOps, SRE, or infrastructure engineering roles
- Expert-level proficiency with AWS services including ECS Fargate, ALB, Cognito, S3, SQS, and related services
- Deep hands-on experience with Terraform for managing complex, multi-account AWS environments
- Strong scripting and automation skills in Python and/or Bash
- Proven experience designing and implementing CI/CD pipelines (GitHub Actions, ArgoCD, or similar)
- Solid understanding of containerization technologies (Docker) and orchestration platforms (Kubernetes/ECS)
- Experience with observability and monitoring tools (Datadog, CloudWatch, or equivalent)
- Deep knowledge of networking, security, and AWS best practices
- Strong problem-solving abilities and experience troubleshooting complex distributed systems
- Excellent communication skills and ability to work cross-functionally with engineering teams

Nice to haves
- Experience in financial services or highly regulated industries
- Familiarity with event-driven architectures and message queue systems (Kafka, SQS)
- Experience with PostgreSQL performance tuning and RDS management
- Knowledge of microservices architecture patterns and service mesh technologies
- Experience with security tooling, vulnerability scanning, and compliance frameworks
- Familiarity with our application stack (Golang, Next.js, PostgreSQL)
- Experience managing AI/ML infrastructure and AWS Bedrock

What we offer:
- Competitive compensation + meaningful equity
- Opportunity to build production infrastructure from the ground up for a rapidly scaling AI platform
- A culture optimized for engineering excellence, focus, deep work, and ownership, not ticket factories
- Remote work flexibility