Job Title: LLM Operations Engineer
Company Name: Health Business Solutions LLC
Location: Remote
Job URL: https://www.simplyhired.com/job/tN8MtGqGPpQgbsT7L239yREI_wLqO5tRUQlb4ebdMHC_2IMe_xqclw

Full Job Description

We are looking for an LLM Ops Engineer with deep Databricks experience to build, automate, and scale our machine learning delivery pipelines on the Lakehouse. You'll own the model lifecycle end-to-end, from data ingestion and feature engineering through CI/CD, deployment, monitoring, and governance, ensuring our ML systems are reliable, auditable, secure, and cost-efficient. You will partner closely with leadership, data engineers, and subject matter experts to productionize models using Databricks (Delta Lake, Unity Catalog, MLflow, Feature Store, Workflows) and modern DevOps practices across our cloud environments.

Key Responsibilities

Lakehouse & Databricks Platform
- Design and maintain Databricks workspaces, clusters, SQL Warehouses, cluster policies, and workspace governance (RBAC, SCIM, SSO, secret scopes).
- Implement robust data pipelines with Delta Lake (ACID tables, Z-ordering, OPTIMIZE/VACUUM), Delta Live Tables (DAGs, expectations), and Workflows (jobs, task orchestration); see the sketch after this list.
- Set up Unity Catalog for cross-workspace governance: data and model lineage, permissions, catalogs/schemas, data tags, and auditability.
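For illustration, here is a minimal sketch of the kind of Delta Live Tables pipeline this role would own: an Auto Loader ingest step followed by a table with quality expectations. The storage path, table names, and columns (claims_raw, member_id, event_date) are hypothetical placeholders, and `spark` is provided by the DLT runtime; routine table maintenance is shown as commented SQL.

```python
# A minimal sketch, assuming a Delta Live Tables pipeline; paths, table names,
# and columns below are hypothetical placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw claims ingested via Auto Loader (hypothetical source path).")
def claims_raw():
    return (
        spark.readStream.format("cloudFiles")   # `spark` is supplied by the DLT runtime
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/ingest/claims/")   # hypothetical landing zone
    )

@dlt.expect_or_drop("valid_member_id", "member_id IS NOT NULL")  # drop failing rows
@dlt.expect("recent_event", "event_date >= '2020-01-01'")        # record, but keep, violations
@dlt.table(comment="Validated claims with quality expectations enforced.")
def claims_validated():
    return dlt.read_stream("claims_raw").withColumn("ingested_at", F.current_timestamp())

# Periodic maintenance would typically run as a separate scheduled job, e.g.:
#   spark.sql("OPTIMIZE main.clinical.claims_validated ZORDER BY (member_id)")
#   spark.sql("VACUUM main.clinical.claims_validated RETAIN 168 HOURS")
```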
ML Lifecycle & MLOps
- Operationalize ML models using MLflow (tracking, artifacts, metrics, model registry, approvals, stages: Staging/Production); a registry sketch follows this list.
- Build/maintain Feature Store entities and feature pipelines; enforce reproducibility and feature governance.
- Establish model deployment patterns (batch scoring, streaming, microservices) using Model Serving.
- Create scalable CI/CD for notebooks, repos, and jobs using Azure DevOps, including unit/integration tests, data/feature validation, and registry promotions.
- Implement data quality and ML quality controls (e.g., Great Expectations/Delta expectations, statistical tests, drift detection, canary releases).
- Build robust monitoring and alerting for data freshness, pipeline SLAs, model performance, drift, and operational metrics.
- Design, deploy, and operate LLMOps pipelines for Retrieval-Augmented Generation (RAG), including document ingestion, embedding generation, vector storage, retrieval strategies, prompt/version management, and evaluation, using Databricks (Delta Lake, MLflow, Model Serving) to ensure secure, auditable, and production-grade GenAI systems; a minimal retrieval sketch also follows below.
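To make the MLflow lifecycle concrete, here is a minimal sketch of tracking a run, registering the resulting model, and gating a stage promotion on a metric. The model name, threshold, and toy training data are hypothetical, and this uses the classic workspace registry's stage API; Unity Catalog registries favor aliases instead.

```python
# A minimal sketch, assuming the classic MLflow workspace model registry;
# the model name, metric threshold, and toy data are hypothetical.
import mlflow
import mlflow.sklearn
import numpy as np
from mlflow.tracking import MlflowClient
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "claims_risk_model"  # hypothetical registered-model name
client = MlflowClient()

with mlflow.start_run(run_name="train_candidate") as run:
    # Toy training stand-in; a real pipeline would read from the Feature Store.
    X = np.random.rand(100, 4)
    y = (X[:, 0] > 0.5).astype(int)
    model = LogisticRegression().fit(X, y)
    val_auc = 0.91  # stand-in for a real evaluation step
    mlflow.log_metric("val_auc", val_auc)
    mlflow.sklearn.log_model(model, artifact_path="model")

# Register the run's model, then promote only if the quality gate passes.
mv = mlflow.register_model(f"runs:/{run.info.run_id}/model", MODEL_NAME)
if val_auc >= 0.90:  # hypothetical promotion threshold
    client.transition_model_version_stage(
        name=MODEL_NAME, version=mv.version, stage="Staging"
    )
```

In a CI/CD pipeline, a script like this would run as a release gate, with the promotion step requiring an approval before the Production stage.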
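And for the RAG bullet, a minimal retrieval sketch, assuming the sentence-transformers package and an in-memory corpus; the model choice and documents are placeholders, and a production pipeline would persist embeddings to a Delta table and serve retrieval through a vector index.

```python
# A minimal RAG retrieval sketch; the encoder model and corpus are hypothetical.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model choice

documents = [
    "Claims must be submitted within 90 days of service.",
    "Prior authorization is required for imaging procedures.",
    "Members can appeal denied claims within 180 days.",
]

# Embed and L2-normalize so a dot product equals cosine similarity.
doc_vecs = encoder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

context = retrieve("How long do I have to appeal a denial?")
joined = "\n".join(context)
prompt = f"Answer using only this context:\n{joined}\n\nQuestion: ..."
```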
Infrastructure & Security
- Optimize performance and cost (autoscaling, spot instances, DBR runtimes, caching, storage tiers).
- Enforce compliance and security best practices (PII handling, encryption at rest/in transit, network controls, secret management).

Collaboration & Process
- Partner with data engineers and subject matter experts to standardize templates for experiments, pipelines, model packaging, and deployment.
- Document patterns and build internal tooling (CLI utilities, Python packages) to streamline model release and observability.
- Contribute to incident response, post-mortems, and continuous improvement.

Qualifications

Required
- BS/MS in Computer Science, Engineering, Data Science, or equivalent practical experience.
- 3+ years of MLOps/ML Engineering/Platform Engineering experience with Databricks.
- Hands-on expertise with Databricks: Delta Lake, Unity Catalog, MLflow (Tracking/Registry), Feature Store, Workflows/Jobs, Repos, and Model Serving.
- Strong Python engineering skills (packaging, testing, virtual environments); familiarity with Spark (PySpark) and SQL.
- Experience with CI/CD (GitHub Actions, Azure DevOps, or GitLab), artifact registries, and environment management.
- Solid understanding of data and ML pipeline design (batch/streaming), data quality checks, and ML evaluation/monitoring.

Soft Skills
- Excellent communication and organizational abilities.
- Ability to work independently and as part of cross-functional teams.
- Comfortable operating in a fast-paced, changing environment.
- Strong analytical and problem-solving skills, with the ability to interpret data and drive recommendations.

HBiz Approval & Disclaimer

This job description is intended to describe the general nature and level of work performed by individuals assigned to this position. It is not intended to be an exhaustive list of all duties, responsibilities, or qualifications required. Responsibilities may change based on business needs, client requirements, or operational priorities. HBiz reserves the right to modify this job description at any time, with or without notice.

Employment with HBiz is at-will, meaning either the employee or the company may terminate employment at any time, with or without cause or notice, subject to applicable law.

HBiz is an Equal Opportunity Employer and is committed to providing a workplace free from discrimination and harassment. We celebrate diversity and are committed to creating an inclusive environment for all employees.