Job Title: Senior / Staff Software Engineer - Infrastructure
Company Name: Boundless
Job Details: Remote, Full Time, United States
Job Url: https://hiring.cafe/viewjob/hik3o1t05npsq3zm
Job Description:
Posted 18h ago

Responsibilities: build clusters, orchestrate workloads, monitor systems
Requirements Summary: 5+ years in infrastructure/DevOps with 2+ years managing large GPU clusters; Kubernetes, Docker, Terraform, Ansible; CUDA; on-prem and cloud (AWS/GCP/Azure).
Technical Tools Mentioned: Terraform, Ansible, Pulumi, Kubernetes, Docker, CUDA, Linux

The Role

As an Infrastructure Engineer, you'll build and deploy massive computational infrastructure that positions Boundless as the leading decentralized proving network. You'll architect GPU clusters at unprecedented scale, orchestrate proving across every major blockchain, and manage the complex systems that power billions of cycles of ZK proofs daily. This role demands expertise in both bare-metal optimization and cloud-native architectures.

What You'll Do

- Build Massive Proving Clusters: Design and deploy proving infrastructure with 1000s of GPUs across both on-premises data centers and cloud services (AWS, GCP, Azure)
- Orchestrate Multi-Chain Proving: Build infrastructure that coordinates proving workloads across every major blockchain, ensuring optimal resource allocation and throughput
- Optimize Container Topology: Design and refine the topology of complex containerized services, maximizing efficiency and minimizing latency in proof generation
- Bare Metal Engineering: Work at the hardware level, optimizing GPU performance, managing CUDA installations, and tuning kernel parameters for maximum throughput
- Cloud Infrastructure: Architect highly available, auto-scaling cloud infrastructure that can dynamically respond to proving demand across multiple regions
- Release Management: Manage deployment pipelines and release schedules for complex distributed software, ensuring zero-downtime upgrades
- Performance Monitoring: Build comprehensive monitoring and alerting systems to track GPU utilization, proof generation metrics, and system health
- Cost Optimization: Implement strategies to minimize infrastructure costs while maintaining performance, including spot instance management and resource scheduling

Requirements

- 5+ years of infrastructure/DevOps experience with 2+ years managing large-scale GPU clusters
- Experience with both on-premises compute operations and cloud platforms (AWS/GCP/Azure)
- Proficiency in infrastructure-as-code tools (Terraform, Ansible, Pulumi)
- Deep expertise in Kubernetes, Docker, and container orchestration at scale
- Experience with GPU computing infrastructure (CUDA)
- Experience releasing complex software to communities, including building and packaging AMIs, Docker images, binaries, and maintaining distribution channels
- Track record of managing mission-critical, high-throughput systems
- Strong Linux systems administration and bare-metal optimization skills
- Proficiency in Rust and low-level systems programming

Nice to Have

- Familiarity with ZK proof generation or blockchain infrastructure
- Experience operating cryptocurrency mining or ML training infrastructure
- Knowledge of network optimization and topology design
- Experience with multi-region, globally distributed systems

Benefits

At Boundless, we take care of our people, because building the future of decentralized computing starts with an empowered team. Here’s what you can expect when you join us:

- Competitive salary + equity/token allocation
- Health, dental, vision (for U.S. employees; region-adjusted globally)
- Flexible PTO + home-office/equipment stipend
- Professional development and conference travel budget
- Remote-first with regular off-sites and a high-trust, high-velocity team environment