Company Name: Optimal Dynamics
Job Details: $160-180k+, Equity; Kubernetes, AWS, Python, Terraform, Bazel; Senior and Expert level; New York; Remote from US
Job Url: https://app.welcometothejungle.com/jobs/qFEd1930?theme=favourite-technologies
Job Description:

Role

Who you are
- Experienced: Individual contributor who has led reliability programs at a meaningful scale and owned incident response standards
- Technically Grounded: Deep, hands-on experience with infrastructure at scale, cloud, containerization, and more:
  - AWS (multi-service)
  - ECS and/or Kubernetes containerized workloads
  - CI/CD & IaC (Terraform)
  - Production networking fundamentals
- Python Proficient: You can read/review service code and land operational improvements
- Data Driven: In your approach to SLOs, capacity, performance, and cost efficiency, with strong observability chops
- Influential: Able to shape direction and create simple, durable standards
- Communicative: Excels in both technical and interpersonal communication, with strong written and verbal skills

Desirable
- Aware of FinOps (cost attribution, efficient scaling) and DR/BCP program experience
- Familiar with secure SDLC, threat modeling, and compliance automation in a SOC 2 context
- Experience collaborating with Data Science/ML teams and batch/streaming workloads
- Exposure to monorepo build frameworks such as Bazel or Buck

What the job involves
We're hiring a Senior Software Engineer, Site Reliability to lead reliability across our production platform.
As a Staff-level individual contributor, you will drive strategy and hands-on execution across incident response, SLO/SLI programs, and production readiness, directly owning highly available services in AWS, all while partnering with Platform/Infra to build paved-road tooling in our monorepo.
This is a full-time, remote-friendly role open to candidates across the United States.
For those who prefer an in-office experience, our HQ in New York City offers a collaborative environment.

Reliability (≈50%)
- Own the company-wide incident lifecycle: standards for detection, escalation, incident command, customer comms, and high-quality postmortems with action tracking
- Define and drive SLIs/SLOs for core services; build guardrails and dashboards that make reliability visible and actionable
- Lead production readiness reviews, capacity/performance planning, load testing, disaster recovery exercises, and resilience engineering (failure testing/chaos where appropriate)
- Level-up on-call: right-sizing rotations, paging hygiene, runbooks, auto-remediation, and continuous improvement of MTTA/MTTR

Security (≈30%)
- Embed security into the delivery pipeline: dependency and image scanning, least-privilege/IAM baselines, secrets management, and service-to-service auth
- Partner with Engineering leadership to maintain SOC 2-aligned controls as code; make audit-friendly evidence generation part of everyday engineering
- Drive secure-by-default patterns in the platform (e.g., network posture, data protection, runtime policies) without slowing down developers

Platform & DevEx (≈20%)
- Build and evolve paved roads for deploys, config, and runtime operations in our monorepo (Bazel) and CI/CD (AWS CodePipeline/CodeBuild)
- Partner with product teams to make the "secure, reliable default" the easiest path: templates, tooling, libraries, and automation
- Improve observability end-to-end (traces, logs, metrics, alerts)

Our tech stack includes:
- Backend & AI: Python 3 and Java
- Frontend: JavaScript/TypeScript for our web-based SPA
- Data Stack: Trino, Dagster, dbt, DuckDB, and Preset
- IaC: Terraform and Spacelift
- Cloud: AWS (ECS/RDS/S3/etc.)
- CI/CD: Bazel, GitHub, AWS CodePipeline/CodeBuild

We follow modern development best practices with all code stored on GitHub.
Every pull request undergoes thorough code review, is fully unit tested, and is deployed through our CI/CD pipeline for continuous quality assurance.

Insights
- Top investors
- 1% employee growth in 12 months
- Glassdoor (4.9)

Company

Company benefits
- We offer competitive pay on all positions
- 401(k) with matching
- 100% covered health/dental/vision benefits
- Flexible work from home: We think the office is a tool for meetings, and not required to get work done
- Equity: Every Optimal Dynamics employee is a shareholder

Funding (last 2 of 4 rounds)
- May 2025: $33.3m, Late VC
- Apr 2022: $33m, Series B
Total funding: $88.7m

Our take
Market volatility and overburdening are continual problems in the freight industry. Now that the boom in e-commerce has home deliveries on the rise, costly and inefficient 'last-mile' journeys are increasing too. Optimal Dynamics is looking to revolutionize freight through high-dimensional artificial intelligence.

CORE.ai, Optimal Dynamics' proprietary software-as-a-service product, helps shippers optimize logistics using a flexible, open API protocol to create probabilistic profit-maximizing strategies. This software gives them a material competitive edge in this respect, as it tackles the kind of growing uncertainty and complexity the industry is grappling with.

Optimal Dynamics' plans going forward include continued development of its software, such as an integration partnership with McLeod Software that will help shorten deployment periods, and its 2023 Bid solution for truckload carriers. It is also doubling its workforce to keep up with the rapid growth the company is experiencing.

Kirsty, Company Specialist at Welcome to the Jungle