Job Title: Data Scientist

Company Name: CodePath

Job Url: https://boards.greenhouse.io/embed/job_app?token=4998255007&utm_source=jobright

Job Description: CodePath is reprogramming higher education to create the first generation of AI-native engineers, CTOs, and founders. 

We deliver industry-vetted courses and career support centered on the needs of first-generation and low-income students. Our students train with senior engineers, intern at top companies, and rise together to become the tech leaders of tomorrow. 

With 30,000 students and alumni from 700 colleges now working at 2,000 companies, we are reshaping the tech workforce and the industries of the future.

About the Role

Location: Remote, United States 

Role Type: Full-Time 

Reporting to: Lead Data Scientist 

Compensation: $100,000 to $130,000 per year

CodePath is entering a pivotal stage of growth, building the data, analytics, and AI infrastructure that powers our learning platform and supports tens of thousands of students nationwide. We are looking for a highly capable and mission-driven Data Scientist to join our growing Measurement, Evaluation, and Learning (MEL) team and help shape this future.

As a Data Scientist at CodePath, you will work across the analytics and modeling lifecycle, supporting data pipelines, conducting exploratory analysis, helping build statistical and machine learning models, and developing the insights that guide organizational strategy. You will collaborate closely with CodePath's lead data scientist, senior data engineer, and cross-functional partners to ensure our data systems, analyses, and models enable student success, operational efficiency, and accurate impact measurement.

This role is ideal for someone who enjoys solving problems in a fast-paced environment, applies clear analytical thinking, and wants to grow by supporting high-impact projects that advance CodePath’s data and AI work.

 

Key Activities

Impact Measurement: Define, track, and analyze organizational impact, student outcomes, and program performance with MEL leadership 

Data Modeling & ML: Support the development and refinement of statistical and machine learning models for outcomes analyses, forecasting, and decision support

Exploratory Analysis: Conduct exploratory analyses to surface trends, anomalies, and insights across large datasets

Dashboards & BI Tools: Work with the lead data scientist to develop and maintain dashboards, reports, and visualizations (Tableau, Streamlit, or similar) that translate complex results into clear, actionable insights

Feature Engineering & Data Preparation: Partner with data engineering to develop reliable datasets and features that power modeling and analytics workflows

Pipeline Support: Support and validate data pipelines to ensure analytical datasets remain accurate, consistent, and well-structured 

Cross-Functional Work: Collaborate with program, product, and operations teams to understand their data needs and translate them into well-defined analytical questions

Documentation: Document analytical processes, models, and methodologies to ensure clarity, scalability, and reproducibility

Continuous Improvement: Identify opportunities to enhance data quality, improve modeling processes, and expand MEL’s modeling and reporting capabilities

 

Key Success Metrics 

High-quality Analytical Assets: Produces dashboards, Quarto reports, and reusable modules that meet MEL standards, with at least 75% requiring no major revision

Stronger Impact Measurement: Works with lead data scientist to improve outcomes reporting and forecasting through validated datasets, models, and analyses

Improved Data Quality & Reliability: Identifies and resolves data issues across pipelines and transformations in partnership with Data Engineering

Meaningful Modeling Contributions: Supports development of statistical or ML models that improve decision-making through greater accuracy, clarity, or adoption

 

Qualifications

2-3+ years of professional experience in data science, machine learning, analytics, or a related field

Foundation in statistics, probability, and machine learning algorithms

Proficiency with Python (pandas/polars, numpy, scikit-learn, TensorFlow or PyTorch)

SQL skills and experience working with medium- to large-scale datasets

Experience with cloud platforms (GCP, AWS, or Azure) or with deploying or maintaining data-driven applications

Familiarity with data modeling concepts, data engineering workflows, and data pipelines 

Strong communicator able to present complex analyses clearly and accessibly

 

Preferred Qualifications

Experience with impact evaluation, education data, or social science research methods

Exposure to dbt, Airbyte, Fivetran, or similar tooling

Experience with experimental or quasi-experimental methods, causal inference, or A/B testing 

Ability to turn ambiguous problems into structured analytical approaches

Proactive, collaborative mindset with enthusiasm for continuous learning

 

Compensation