Company Name: CentML

Job Details:
- Salary not provided
- Employee stock options available
- Kubernetes, AWS, GCP, Python, Java, Go, Terraform, C++, Azure, Docker, C
- Mid and Senior level
- San Francisco Bay Area, Toronto, Remote from US

Job Url: https://app.welcometothejungle.com/jobs/ouL7eh60?theme=take-another-look

Job Description:

Role

Who you are
- 4+ years of experience working with containerized deployment systems (e.g., Kubernetes, OpenShift, Terraform)
- A big plus if you have contributed to Kubernetes and have expertise in container runtime technologies like Docker Engine, containerd, or CRI-O
- Experience deploying and managing cloud infrastructure on AWS, GCP, and Azure
- Past experience building GPU clusters for large-scale ML training and inference is desirable
- Knowledge of GPU architecture and NVIDIA GPU virtualization technologies is highly desirable
- Strong coding skills in languages like Python, Java, Go, and/or C/C++

What the job involves
- Join our team in a key role focused on designing, developing, and maintaining the CentML platform, which offers cost-effective infrastructure for serving and training large-scale machine learning models
- Lay out the design of a deployment infrastructure for ML training and inference jobs over GPU clusters that span multiple cloud service providers like AWS, GCP, Azure, CoreWeave, and OCI
- Lead a team of engineers in building a scalable, performant, and reliable platform that enables our customers to seamlessly access and use the comprehensive suite of ML services we offer
- Design and lead the development of the deployment infrastructure of the CentML platform. The deployment infrastructure manages the hardware resources necessary to deploy the ML training and inference applications
- Implement GPU cluster scheduling solutions for large-scale ML training and inference workloads to efficiently utilize the hardware resources in the GPU cluster
- Communicate with our product teams to define new features and goals for improving the CentML platform

Company

Company benefits
- An open and inclusive culture and work environment
- Fully stocked kitchen at the office
- Full health and dental benefits
- Parental leave top-up for 6 months
- Continuous education budget
- Generous vacation - we're not saying unlimited, but if you need extra time to recharge, just ask

Funding (2 rounds)
- Sep 2023: $27.3m (Seed)
- Jun 2022: $3.5m (Seed)
Total funding: $30.8m

Our take

In an increasingly AI and ML-driven world, the demand for these technologies is skyrocketing, alongside their costs, leaving numerous companies without access to tools that could enhance their operations. CentML emerges as a solution, aiming to democratize AI and ML by making them more accessible and cost-effective for all.

Backed by a team with extensive expertise in AI, ML compilers, and ML hardware, CentML possesses a deep understanding of the inefficiencies prevalent in the industry. Among the challenges it addresses is the scarcity of AI chips. By meticulously analyzing clients' AI/ML requirements, CentML advises on suitable hardware options to optimize performance and minimize costs.

With its inception in 2022, CentML has swiftly garnered attention and funding, underscoring the market's appetite for its offerings. Recent funding will enable the company to further refine its product and conduct pivotal research in the field, solidifying its position as a pioneering force in democratizing AI and ML technologies.

Kirsty, Company Specialist at Welcome to the Jungle