Job Url: https://ats.rippling.com/pdq/jobs/d1b02e6a-23c1-460d-a894-cf0a9defd02c?jobSite=LinkedIn

Job Description:

What you'll be doing:

Scale & Reliability
- Architect and manage GCP production environments across multiple regions, ensuring resilience, scalability, and security
- Partner with DBAs, SREs, and senior engineers to lead scalability and reliability initiatives for PDQ Connect
- Design for future multi-tenancy, regional data isolation, and compliance-readiness for frameworks like FedRAMP and GDPR

Security & Compliance
- Own secure infrastructure provisioning, secrets management, IAM access controls, and runtime policies
- Harden GCP infrastructure to reduce the blast radius of misconfigurations and bad deploys
- Integrate SAST, SCA, SBOM, and CSPM tooling into the SDLC and production systems in collaboration with InfoSec and GRC

Developer Experience & Productivity
- Improve the local development loop, test harnesses, and environment consistency across environments
- Deliver efficient, reliable CI/CD pipelines across dev, test, and prod
- Integrate and operationalize AI-assisted developer tools to enhance engineering productivity across the SDLC
- Codify infrastructure and service patterns into reusable Terraform modules, shared libraries, and documentation

Observability & Operations
- Enable observability, monitoring, and alerting across all cloud environments to support proactive operations
- Implement cloud cost observability and continuously optimize infrastructure spend as we scale

Technical Leadership & Mentorship
- Mentor engineers through code reviews, architectural guidance, and technical pairing
- Drive internal technical enablement through platform-focused talks, documentation, and hands-on workshops

We're looking for people who have:

Must-Haves
- 10+ years of platform/DevOps engineering experience, including 3+ years in a Staff or leadership role at a SaaS company
- 7+ years of hands-on GCP and Kubernetes experience
- Proven ability to scale and cost-optimize infrastructure in a product-led growth SaaS environment
- Demonstrated ownership of the full lifecycle: architecting, building, testing, deploying, securing, and operating cloud-native products in GCP
- Experience working in distributed teams across multiple time zones
- Track record of adopting and deploying new technologies, frameworks, and architectures
- Familiarity with applying AI in engineering or operational contexts
- Excellent collaboration and communication skills across technical and non-technical audiences
- Comfortable working in agile, kanban, or shape-up delivery models