Job Title: Data Scientist
Company Name: CodePath
Job Url: https://boards.greenhouse.io/embed/job_app?token=4998255007&utm_source=jobright

Job Description:

CodePath is reprogramming higher education to create the first generation of AI-native engineers, CTOs, and founders. We deliver industry-vetted courses and career support centered on the needs of first-generation and low-income students. Our students train with senior engineers, intern at top companies, and rise together to become the tech leaders of tomorrow. With 30,000 students and alumni from 700 colleges now working at 2,000 companies, we are reshaping the tech workforce and the industries of the future.

About the Role

Location: Remote, United States
Role Type: Full-Time
Reporting to: Lead Data Scientist
Compensation: $100,000 to $130,000 per year

CodePath is entering a pivotal stage of growth, building the data, analytics, and AI infrastructure that powers our learning platform and supports tens of thousands of students nationwide. We are looking for a highly capable and mission-driven Data Scientist to join our growing Measurement, Evaluation, and Learning (MEL) team and help shape this future.

As a Data Scientist at CodePath, you will work across the analytics and modeling lifecycle, supporting data pipelines, conducting exploratory analysis, helping build statistical and machine learning models, and developing the insights that guide organizational strategy. You will collaborate closely with CodePath's lead data scientist, senior data engineer, and cross-functional partners to ensure our data systems, analyses, and models enable student success, operational efficiency, and accurate impact measurement.

This role is ideal for someone who enjoys solving problems in a fast-paced environment, applies clear analytical thinking, and wants to grow by supporting high-impact projects that advance CodePath’s data and AI work.

Key Activities

Impact Measurement: Define, track, and analyze organizational impact, student outcomes, and program performance with MEL leadership
Data Modeling & ML: Support the development and refinement of statistical and machine learning models for outcomes analyses, forecasting, and decision support
Exploratory Analysis: Conduct exploratory analyses to surface trends, anomalies, and insights across large datasets
Dashboards & BI Tools: Work with the lead data scientist to develop and maintain dashboards, reports, and visualizations (Tableau, Streamlit, or similar) that translate complex results into clear, actionable insights
Feature Engineering & Data Preparation: Partner with data engineering to develop reliable datasets and features that power modeling and analytics workflows
Pipeline Support: Support and validate data pipelines to ensure analytical datasets remain accurate, consistent, and well-structured
Cross-Functional Work: Collaborate with program, product, and operations teams to understand their data needs and translate them into well-defined analytical questions
Documentation: Document analytical processes, models, and methodologies to ensure clarity, scalability, and reproducibility
Continuous Improvement: Identify opportunities to enhance data quality, improve modeling processes, and expand MEL’s modeling and reporting capabilities

Key Success Metrics

High-Quality Analytical Assets: Produces dashboards, Quarto reports, and reusable modules that meet MEL standards, with 75%+ requiring no major revision
Stronger Impact Measurement: Works with the lead data scientist to improve outcomes reporting and forecasting through validated datasets, models, and analyses
Improved Data Quality & Reliability: Identifies and resolves data issues across pipelines and transformations in partnership with Data Engineering
Meaningful Modeling Contributions: Supports development of statistical or ML models that improve decision-making through greater accuracy, clarity, or adoption

Qualifications

2-3+ years of professional experience in data science, machine learning, analytics, or a related field
Foundation in statistics, probability, and machine learning algorithms
Proficiency with Python (pandas/polars, numpy, scikit-learn, TensorFlow or PyTorch)
SQL skills and experience working with medium- to large-scale datasets
Experience with cloud platforms (GCP, AWS, or Azure) or with deploying or maintaining data-driven applications
Familiarity with data modeling concepts, data engineering workflows, and data pipelines
Strong communicator able to present complex analyses clearly and accessibly

Preferred Qualifications

Experience with impact evaluation, education data, or social science research methods
Exposure to dbt, Airbyte, Fivetran, or similar tooling
Experience with experimental or quasi-experimental methods, causal inference, or A/B testing
Ability to turn ambiguous problems into structured analytical approaches
Proactive, collaborative mindset with enthusiasm for continuous learning

Compensation