Job Url: https://www.linkedin.com/jobs/search/?currentJobId=4343709566&distance=25.0&f_TPR=r86400&f_WT=2&geoId=103644278&keywords=software%20engineer&origin=JOB_SEARCH_PAGE_JOB_FILTER&start=200

Job Description: AI/LLM Engineer
Tential Solutions · Tampa, FL (Remote) · Contract

About the job

Senior SDET – AI / LLM Quality Engineering (Shared Services)

About The Team
This role sits within the QA Center of Excellence, as part of a small, highly specialized AI Quality Engineering team consisting of two SDETs and one Data Engineer. The team operates as a shared service across the organization, defining how Large Language Model (LLM)-powered systems are tested, evaluated, observed, and trusted before and after production release. Rather than building customer-facing AI features, this team builds LLM-based testing and evaluation frameworks and partners with product, platform, and data teams to ensure generative AI solutions meet quality, reliability, and compliance standards.

Role Overview
We are seeking a Senior Software Development Engineer in Test (SDET) with a strong automation and systems-testing background to focus on LLM quality, validation, and evaluation.

In This Role, You Will
- Test LLM-powered applications used across the enterprise
- Build LLM-driven testing and evaluation workflows
- Define organization-wide standards for GenAI quality and reliability

This is a hands-on engineering role with significant influence across teams.
Key Responsibilities

LLM Testing & Evaluation
- Design and implement test strategies for LLM-powered systems, including:
  - Prompt and response validation
  - Regression testing across model, prompt, and data changes
  - Evaluation of accuracy, consistency, hallucinations, and safety
- Build and maintain LLM-based evaluation frameworks using tools such as DeepEval, MLflow, Langflow, and LangChain
- Develop synthetic and real-world test datasets in partnership with the Data Engineer
- Define quality thresholds, scoring mechanisms, and pass/fail criteria for GenAI systems (the first sketch after the requirements below illustrates this pattern)

Test Automation & Framework Development
- Build and maintain automated test frameworks for:
  - LLM APIs and services
  - Agentic and RAG workflows
  - Data and inference pipelines
- Integrate testing and evaluation into CI/CD pipelines, enforcing quality gates before production release
- Partner with engineering teams to improve the testability and reliability of AI systems
- Perform root-cause analysis of failures related to model behavior, data quality, or orchestration logic (a regression-testing sketch for non-deterministic outputs appears below)

Observability & Monitoring
- Instrument LLM applications with Datadog LLM Observability to monitor:
  - Latency, token usage, errors, and cost
  - Quality regressions and performance anomalies
- Build dashboards and alerts focused on LLM quality, reliability, and drift
- Use production telemetry to continuously refine test coverage and evaluation strategies (see the instrumentation sketch below)

Shared Services & Collaboration
- Act as a consultative partner to product, platform, and data teams adopting LLM technologies
- Provide guidance on:
  - Test strategies for generative AI
  - Prompt and workflow validation
  - Release readiness and risk assessment
- Contribute to organization-wide standards and best practices for explaining, testing, and monitoring AI systems
- Participate in design and architecture reviews from a quality-first perspective

Engineering Excellence
- Advocate for automation-first testing, infrastructure as code, and continuous monitoring
- Drive adoption of Agile, DevOps, and CI/CD best practices within the AI quality space
- Conduct code reviews and promote secure, maintainable test frameworks
- Continuously improve internal tooling and frameworks used by the QA Center of Excellence

Required Skills & Experience

Core SDET Experience
- 5+ years of experience in SDET, test automation, or quality engineering roles
- Strong Python development skills
- Experience testing backend systems, APIs, or distributed platforms
- Proven experience building and maintaining automation frameworks
- Comfort working with ambiguous, non-deterministic systems

AI / LLM Experience
- Hands-on experience testing or validating ML- or LLM-based systems
- Familiarity with LLM orchestration and evaluation tools such as Langflow, LangChain, DeepEval, and MLflow
- Understanding of the challenges unique to testing generative AI systems

Nice to Have
- Experience with Datadog (especially LLM Observability)
- Exposure to Hugging Face, PyTorch, or TensorFlow (usage-level)
- Experience testing RAG pipelines, VectorDBs, or data-driven platforms
- Background working in platform, shared services, or Center of Excellence teams
- Experience collaborating closely with data engineering or ML platform teams
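To make the "quality thresholds and pass/fail criteria" responsibility concrete, here is a minimal sketch using DeepEval's pytest-style API. It is an illustration, not the team's actual framework: `generate_answer` is a hypothetical stand-in for the application under test, and AnswerRelevancyMetric uses an LLM judge under the hood, so an evaluation-model API key (e.g. OPENAI_API_KEY) is assumed to be configured.

```python
# Minimal sketch: pass/fail evaluation of an LLM answer with DeepEval.
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase


def generate_answer(question: str) -> str:
    # Hypothetical call into the LLM-powered system under test.
    return "We offer a 30-day full refund at no extra cost."


def test_refund_answer_is_relevant():
    question = "What if these shoes don't fit?"
    test_case = LLMTestCase(input=question, actual_output=generate_answer(question))
    # The threshold is the pass/fail quality gate: the test fails if the
    # judged relevancy score falls below 0.7.
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
```

A test like this runs under plain pytest (or `deepeval test run`), which is what makes it straightforward to enforce as a CI/CD quality gate before release.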
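For regression testing across model or prompt changes, exact string comparison does not work on non-deterministic output; one common approach is to score each new answer against a golden reference by embedding similarity. A minimal sketch, assuming sentence-transformers; `golden_set` and `generate_answer` are illustrative names, not part of any specific framework.

```python
# Minimal sketch: embedding-similarity regression check for LLM output.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Golden question/answer pairs, e.g. curated with the team's Data Engineer.
golden_set = [
    ("What is our refund window?", "Customers may return items within 30 days."),
]


def generate_answer(question: str) -> str:
    # Stand-in for the system under test after a model or prompt change.
    return "Items can be returned for a full refund within 30 days."


def test_no_regression_against_golden_answers():
    for question, reference in golden_set:
        got = encoder.encode(generate_answer(question))
        want = encoder.encode(reference)
        similarity = util.cos_sim(got, want).item()
        # 0.8 is an arbitrary illustrative threshold; in practice it would be
        # calibrated per dataset as part of the evaluation framework.
        assert similarity >= 0.8, f"Drifted answer for {question!r}: {similarity:.2f}"
```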
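On the observability side, a minimal instrumentation sketch, assuming Datadog's ddtrace LLM Observability SDK (`LLMObs.enable` plus the `workflow` decorator); `answer_question` is a hypothetical pipeline, and DD_API_KEY/DD_SITE are assumed to be set in the environment.

```python
# Minimal sketch: tracing an LLM workflow with Datadog LLM Observability.
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import workflow

# Names the application in Datadog's LLM Observability dashboards.
LLMObs.enable(ml_app="qa-coe-demo")


@workflow  # emits a workflow span, capturing latency and errors per call
def answer_question(question: str) -> str:
    answer = "stub answer"  # stand-in for the real prompt -> model -> answer chain
    # Attach input/output so quality regressions can be traced back to content.
    LLMObs.annotate(input_data=question, output_data=answer)
    return answer


if __name__ == "__main__":
    print(answer_question("What is our refund window?"))
```

Spans like this feed the latency, error, and drift dashboards the role describes, and the same telemetry can be mined to refine test coverage.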
What This Role Is Not
- Not a pure ML research or model training role
- Not a feature-focused backend engineering role
- Not manual QA

Why This Role Is Unique
- You will define how AI quality is measured across the organization
- You will build LLM-powered testing systems, not just test scripts
- You will influence multiple teams and products, not just one codebase
- You will work at the intersection of AI, automation, and reliability

#Remote