[Remote] New Grad Data Engineer (for Health Tech Startup) 🤓

Remote · USA · Full-time · New today

Note: This is a remote position open to candidates in the USA. 1Phi Health is a health tech startup focused on making healthcare more accessible. It is seeking a New Grad Data Engineer to build and maintain data pipelines, ensure data quality, and collaborate with data scientists and product engineers in the healthcare data domain.

Responsibilities

  • Build and maintain data pipelines that ingest, transform, and validate large-scale Medicare claims data using SQL, Python, and Databricks (Spark). You'll work with patient-level records across billions of claim lines
  • Write and optimize complex SQL: multi-step transformations, window functions, joins across large datasets, aggregations with suppression rules. SQL is the primary language of the work
  • Automate and operationalize recurring data workflows, building reliable, repeatable pipelines that process CMS data extracts, dimension tables, and derived provider metrics
  • Ensure data quality by designing validation checks, reconciling source data against expected schemas, and investigating anomalies when numbers don't add up
  • Collaborate with data scientists and product engineers to define output schemas, deliver clean datasets, and support downstream analytics and application features
  • Work in cloud infrastructure, primarily Databricks on AWS, with exposure to S3, Unity Catalog, and related services
  • Learn the healthcare data domain: you'll develop working knowledge of claims data structures, medical coding systems (ICD-10, HCPCS, DRG), and CMS data programs

Skills

  • You have strong SQL skills, shown through coursework, internships, or projects where you wrote non-trivial queries (joins, CTEs, window functions, aggregations). You can reason about query performance
  • You're comfortable with Python. You've used it for data manipulation (pandas, PySpark, or similar). You don't need to be a software engineer, but you can write clean, functional code
  • You understand data pipeline concepts: ETL/ELT, idempotency, schema management, data validation. Exposure through coursework, capstone projects, or internships counts
  • You're detail-oriented and methodical. Healthcare data has strict rules around suppression, privacy, and accuracy. You care about getting the numbers right
  • You're a fast learner who's comfortable ramping up on unfamiliar domains. You'll be learning Medicare claims data, CMS programs, and healthcare coding systems on the job
  • You have a BS or MS in Computer Science, Data Science, Information Systems, Statistics, or a related field
  • You've worked with Spark, Databricks, or other distributed compute environments (even in a class or personal project)
  • You have exposure to cloud platforms (AWS, GCP, or Azure), for example S3, IAM, or managed database services
  • You've touched healthcare data in any capacity: claims, EHR, public health datasets, MIMIC, CMS public use files
  • You're familiar with version control (Git) and collaborative development workflows
  • You've built a data project end-to-end, from ingestion through delivery, even if it was small

Benefits

  • Health insurance within 3 months of starting
  • Generous vacation policy + company holidays
  • 401(k) + profit share contributions
  • Quarterly evals and performance bonus (~10% at start, ~20% after 4 years)

Company Overview

  • 1Phi Health has a workforce of 2-10 employees. Its website is https://1phi.com/.