Job Title: Senior NLP / ML Researcher (LLM Evaluation & Agentic Systems)

Company Name: Iris.ai

Job Details: Remote, Full Time

Job Url: https://hiring.cafe/viewjob/wgzpato6li9t3nue

Job Description: Posted 21h ago
United States, Remote, Full Time

Responsibilities: Design methods, run experiments, collaborate with engineers

Requirements Summary: PhD in ML/NLP or a related field; 5+ years in industry or applied research; strong background in NLP, LLM evaluation, and RAG; hands-on experience with grants; Python software engineering; publications; alignment to European time zones.

Technical Tools Mentioned: Python, PyTorch, Transformers, TensorFlow, LLMs, Hugging Face, OpenAI, RAG, AWS, Docker, CI/CD, distributed computing, reproducible research workflows

Why Iris.ai
At Iris.ai, we’re building an agentic AI platform that scales expert-level domain knowledge across entire organizations. For more than a decade, we’ve worked at the intersection of scientific research, industrial data, and applied AI, helping researchers, engineers, and business teams reason over complex technical knowledge.

Our products (Neuralith, Axion, and RSpace) span the full GenAI lifecycle:
- Data ingestion across text, tables, figures, and technical formats
- Advanced RAG and indexing pipelines
- Agentic orchestration and reasoning
- Rigorous LLM evaluation and governance

What makes us different: we care deeply about accuracy, evaluation, and responsibility. We don’t optimize for demos and proofs of concept; we optimize for systems that experts trust and use.

The Role
We’re looking for a Senior NLP / ML Researcher who wants to work on hard, unsolved problems in modern language models and see their ideas land in real products used by enterprises and researchers.

This role combines research and applied engineering. You’ll drive novel research directions, build prototypes, run experiments, and help turn the results into production capabilities inside our platform.

You’ll also play a key role in securing research funding by contributing to high-quality proposals for both EU and national grants (EIC, Horizon, etc.).

If you enjoy thinking deeply about why models fail, how to measure intelligence and uncertainty, and when agents should reason vs. act, you’ll feel at home here.

What You’ll Research:
You’ll work on a focused set of high-impact research directions that sit at the core of modern applied NLP and agentic AI.
The exact mix will evolve based on your strengths and interests, but broadly includes:
- LLM evaluation & uncertainty: confidence estimation, answer relevance, and robustness in open-book QA and RAG systems
- Agentic reasoning & control: understanding when models should reason, stop reasoning, or act, including inference-time steering
- Translation & multilingual NLP: evaluation and system design for modern LLM-based translation, including low-resource languages

Your goal will be to turn rigorous research into capabilities that real users can trust and use.

What You’ll Do
- Design and implement novel NLP & ML methods (from theory to code)
- Run end-to-end experiments: data, training, evaluation, ablations
- Translate research insights into prototypes and production features
- Collaborate closely with engineers and product teams
- Publish, present, and engage with the AI research community
- Lead and co-author EU and national research grant proposals
- Write and publish research articles

Our Tech Stack
- Languages: Python (strong OOP practices)
- ML: PyTorch, Transformers, TensorFlow
- LLMs: Hugging Face, OpenAI, custom and fine-tuned models
- Systems: RAG pipelines, multi-agent frameworks, evaluation tools
- Infra: AWS, Docker, distributed computing
- Practices: Git, CI/CD, reproducible research workflows

What We’re Looking For
- PhD in ML, NLP, Computer Science, or a related field
- Strong, hands-on experience with R&D grants and proposal writing (e.g. Horizon Europe, EIC, national or international research funding)
- 5+ years of industry or applied research experience
- Strong background in NLP (transformers, semantic search, RAG)
- Hands-on experience with LLMs and their evaluation
- Solid software engineering skills and experience with Python
- Publications in ML/NLP conferences or journals
- Able to work within European time zones

🌱 Why Join Iris.ai?
If you want to do meaningful NLP work, help secure funding for frontier AI research, and grow in a culture built on trust, rigor, and fairness, let’s talk.

We’re not your typical tech company. We believe in:
- Real transparency: information is shared, context is open, and questions are welcome.
- Fairness by design: policies, opportunities, and growth are aligned across countries and teams.
- Ownership: you are empowered to make decisions without micromanagement.
- Metrics that guide us, but never replace human thinking or responsibility.

Compensation & Ownership

Pay
Compensation that reflects your value. Our salaries are typically 25% above local market averages, ensuring competitive, fair pay across regions and roles. We review them annually.

Equity
We believe salary helps you get by; stock options build wealth. At Iris.ai, all colleagues receive ownership in the company as part of our ESOP pool (3%). When we grow, you grow; that’s what shared success really means. (Just imagine: someone once bought a Tesla option for $1, and it’s worth $400 today.)

Benefits
We’ve built our benefits to reflect how we work: with trust, fairness, and room to grow.
- 30 days paid vacation
- 5 additional days paid vacation for Learning and Development
- Private health insurance (premium coverage) and bi-annual health checks
- Free MultiSport card for your physical well-being
- Remote-first & flexible hours: work where you’re at your best
- Personal annual learning budget for conferences, courses, or certifications
- Personal equipment budget to choose the gear that suits your style
- Charity and volunteer activities
- Seasonal working camps (summer & winter) and team retreats
- Ongoing growth through weekly tech deep dives, mentorship, pair coding, and knowledge-sharing

🚀 Let’s Build the Future of Responsible AI
If you care about building high-quality, ethical AI, guided by data and human judgment, you’ll feel at home at Iris.ai.

👉 Apply now or reach out with questions. We’re transparent by default.