Job Url: https://apply.workable.com/plum-inc/j/22490FB374/

Job Description: Remote, Data Science, Full time

San Francisco, California, United States
Austin, Texas, United States
Atlanta, Georgia, United States
New York, New York, United States

Description
PLUM is a fintech company empowering financial institutions to grow their business through a cutting-edge suite of AI-driven software, purpose-built for lenders and their partners across the financial ecosystem. We are a boutique firm where each person’s contributions and ideas are critical to the growth of the company.

This is a fully remote position, open to candidates anywhere in the U.S. with a reliable internet connection. While we gather in person a few times a year, this role is designed to remain remote long-term. You will have autonomy and flexibility in a flat corporate structure where your input is heard and put into action. You'll collaborate with a high-performing team, including sales professionals, marketers, and financial services experts, who stay connected through Slack, video calls, and regular team and company-wide meetings. We’re a team that knows how to work hard, have fun, and make a meaningful impact, both together and individually.

Job Summary
We are looking for a Senior Data Scientist to lead the development of scalable Generative AI pipelines that process raw data and generate context-aware results to power Plum’s AI-driven products. You will play a central role in shaping our GenAI platform, working across the full ML lifecycle—from ingestion and retrieval to generation, evaluation, and deployment.

This role combines deep expertise in machine learning with hands-on experience in building production-grade systems. You’ll collaborate closely with cross-functional teams and operate in a fast-paced environment where innovation, autonomy, and ownership are key.

Key Responsibilities
Design and architect end-to-end Generative AI pipelines using LLMs to process and generate context-aware results.
Integrate open-source and proprietary LLMs (e.g., GPT, LLaMA) via APIs and custom orchestration.
Build and optimize workflows using frameworks such as LangChain.
Design and implement Retrieval-Augmented Generation (RAG) architectures to inject relevant, contextual data into generation prompts.
Develop robust methods to evaluate and compare LLM outputs based on relevance, personalization, and factual accuracy.
Build automated and scalable LLM evaluation pipelines using embedding-based similarity, scoring metrics, and human-in-the-loop feedback.
Implement monitoring, observability, and logging for GenAI workflows to ensure reliability in production.
Collaborate with cross-functional teams to integrate generative outputs into client-facing applications.

Requirements
Master’s degree in Computer Science, Engineering, Physics, or a related technical field, or equivalent work experience.
3+ years of experience developing and deploying machine learning pipelines in production.
1+ years of experience building Generative AI or LLM-based applications.
Strong programming skills in Python, with hands-on experience in ML/AI frameworks (e.g., LangChain, Transformers, LLM APIs).
Deep understanding of LLM evaluation, prompt engineering, and text generation quality metrics.
Experience designing and implementing RAG architectures.
Hands-on experience with Databricks, MLflow, or similar platforms.