Job Url: https://www.linkedin.com/jobs/search/?currentJobId=4343709566&distance=25.0&f_TPR=r86400&f_WT=2&geoId=103644278&keywords=software%20engineer&origin=JOB_SEARCH_PAGE_JOB_FILTER&start=200

Job Description: AI/LLM Engineer
Tential Solutions · Tampa, FL (Remote) · Contract

About the job

Senior SDET – AI / LLM Quality Engineering (Shared Services)

About The Team
This role sits within the QA Center of Excellence, as part of a small, highly specialized AI Quality Engineering team consisting of two SDETs and one Data Engineer. The team operates as a shared service across the organization, defining how Large Language Model (LLM)-powered systems are tested, evaluated, observed, and trusted before and after production release. Rather than building customer-facing AI features, this team builds LLM-based testing and evaluation frameworks and partners with product, platform, and data teams to ensure generative AI solutions meet quality, reliability, and compliance standards.

Role Overview
We are seeking a Senior Software Development Engineer in Test (SDET) with a strong automation and systems-testing background to focus on LLM quality, validation, and evaluation.

In This Role, You Will
- Test LLM-powered applications used across the enterprise
- Build LLM-driven testing and evaluation workflows
- Define organization-wide standards for GenAI quality and reliability

This is a hands-on engineering role with significant influence across teams.
Key Responsibilities

LLM Testing & Evaluation
- Design and implement test strategies for LLM-powered systems, including:
  - Prompt and response validation
  - Regression testing across model, prompt, and data changes
  - Evaluation of accuracy, consistency, hallucinations, and safety
- Build and maintain LLM-based evaluation frameworks using tools such as DeepEval, MLflow, Langflow, and LangChain
- Develop synthetic and real-world test datasets in partnership with the Data Engineer
- Define quality thresholds, scoring mechanisms, and pass/fail criteria for GenAI systems (the first sketch after the requirements below illustrates this pattern)

Test Automation & Framework Development
- Build and maintain automated test frameworks for:
  - LLM APIs and services
  - Agentic and RAG workflows
  - Data and inference pipelines
- Integrate testing and evaluation into CI/CD pipelines, enforcing quality gates before production release
- Partner with engineering teams to improve the testability and reliability of AI systems
- Perform root-cause analysis of failures related to model behavior, data quality, or orchestration logic (a regression-testing sketch for non-deterministic outputs appears below)

Observability & Monitoring
- Instrument LLM applications with Datadog LLM Observability to monitor:
  - Latency, token usage, errors, and cost
  - Quality regressions and performance anomalies
- Build dashboards and alerts focused on LLM quality, reliability, and drift
- Use production telemetry to continuously refine test coverage and evaluation strategies (see the instrumentation sketch below)

Shared Services & Collaboration
- Act as a consultative partner to product, platform, and data teams adopting LLM technologies
- Provide guidance on:
  - Test strategies for generative AI
  - Prompt and workflow validation
  - Release readiness and risk assessment
- Contribute to organization-wide standards and best practices for explaining, testing, and monitoring AI systems
- Participate in design and architecture reviews from a quality-first perspective

Engineering Excellence
- Advocate for automation-first testing, infrastructure as code, and continuous monitoring
- Drive adoption of Agile, DevOps, and CI/CD best practices within the AI quality space
- Conduct code reviews and promote secure, maintainable test frameworks
- Continuously improve internal tooling and frameworks used by the QA Center of Excellence

Required Skills & Experience

Core SDET Experience
- 5+ years of experience in SDET, test automation, or quality engineering roles
- Strong Python development skills
- Experience testing backend systems, APIs, or distributed platforms
- Proven experience building and maintaining automation frameworks
- Comfort working with ambiguous, non-deterministic systems

AI / LLM Experience
- Hands-on experience testing or validating ML- or LLM-based systems
- Familiarity with LLM orchestration and evaluation tools such as Langflow, LangChain, DeepEval, and MLflow
- Understanding of the challenges unique to testing generative AI systems

Nice to Have
- Experience with Datadog (especially LLM Observability)
- Exposure to Hugging Face, PyTorch, or TensorFlow (usage-level)
- Experience testing RAG pipelines, VectorDBs, or data-driven platforms
- Background working in platform, shared services, or Center of Excellence teams
- Experience collaborating closely with data engineering or ML platform teams
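To make the "quality thresholds and pass/fail criteria" responsibility concrete, here is a minimal sketch using DeepEval's pytest-style API. It is an illustration, not the team's actual framework: `generate_answer` is a hypothetical stand-in for the application under test, and AnswerRelevancyMetric uses an LLM judge under the hood, so an evaluation-model API key (e.g. OPENAI_API_KEY) is assumed to be configured.

```python
# Minimal sketch: pass/fail evaluation of an LLM answer with DeepEval.
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase


def generate_answer(question: str) -> str:
    # Hypothetical call into the LLM-powered system under test.
    return "We offer a 30-day full refund at no extra cost."


def test_refund_answer_is_relevant():
    question = "What if these shoes don't fit?"
    test_case = LLMTestCase(input=question, actual_output=generate_answer(question))
    # The threshold is the pass/fail quality gate: the test fails if the
    # judged relevancy score falls below 0.7.
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
```

A test like this runs under plain pytest (or `deepeval test run`), which is what makes it straightforward to enforce as a CI/CD quality gate before release.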
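For regression testing across model or prompt changes, exact string comparison does not work on non-deterministic output; one common approach is to score each new answer against a golden reference by embedding similarity. A minimal sketch, assuming sentence-transformers; `golden_set` and `generate_answer` are illustrative names, not part of any specific framework.

```python
# Minimal sketch: embedding-similarity regression check for LLM output.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Golden question/answer pairs, e.g. curated with the team's Data Engineer.
golden_set = [
    ("What is our refund window?", "Customers may return items within 30 days."),
]


def generate_answer(question: str) -> str:
    # Stand-in for the system under test after a model or prompt change.
    return "Items can be returned for a full refund within 30 days."


def test_no_regression_against_golden_answers():
    for question, reference in golden_set:
        got = encoder.encode(generate_answer(question))
        want = encoder.encode(reference)
        similarity = util.cos_sim(got, want).item()
        # 0.8 is an arbitrary illustrative threshold; in practice it would be
        # calibrated per dataset as part of the evaluation framework.
        assert similarity >= 0.8, f"Drifted answer for {question!r}: {similarity:.2f}"
```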
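On the observability side, a minimal instrumentation sketch, assuming Datadog's ddtrace LLM Observability SDK (`LLMObs.enable` plus the `workflow` decorator); `answer_question` is a hypothetical pipeline, and DD_API_KEY/DD_SITE are assumed to be set in the environment.

```python
# Minimal sketch: tracing an LLM workflow with Datadog LLM Observability.
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import workflow

# Names the application in Datadog's LLM Observability dashboards.
LLMObs.enable(ml_app="qa-coe-demo")


@workflow  # emits a workflow span, capturing latency and errors per call
def answer_question(question: str) -> str:
    answer = "stub answer"  # stand-in for the real prompt -> model -> answer chain
    # Attach input/output so quality regressions can be traced back to content.
    LLMObs.annotate(input_data=question, output_data=answer)
    return answer


if __name__ == "__main__":
    print(answer_question("What is our refund window?"))
```

Spans like this feed the latency, error, and drift dashboards the role describes, and the same telemetry can be mined to refine test coverage.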
What This Role Is Not
- Not a pure ML research or model training role
- Not a feature-focused backend engineering role
- Not manual QA

Why This Role Is Unique
- You will define how AI quality is measured across the organization
- You will build LLM-powered testing systems, not just test scripts
- You will influence multiple teams and products, not just one codebase
- You will work at the intersection of AI, automation, and reliability

#Remote