Youngbin Kim

LLM Engineer / Data Scientist (Clinical AI)


About

Who I Am

I'm a Senior Data Scientist at NewYork-Presbyterian, where I was the first dedicated LLM hire. I build clinical NLP pipelines within HIPAA-compliant LLM infrastructure (local and cloud) and design agentic workflows that help clinicians make better decisions. I earned my PhD in Biomedical Engineering from Columbia University in 2024, where I focused on machine learning approaches to cardiac signal analysis in disease models built from stem cell-derived engineered heart tissues.

Before NYP and Columbia, I interned at Genentech, where I built ML models, and studied Bioengineering and EECS at UC Berkeley. I published 9 peer-reviewed papers during my PhD and built BeatProfiler, an open-source ML platform for cardiac analysis adopted by external researchers. When I'm not wrangling data, you'll find me singing with the Young New Yorkers' Chorus, biking across the city, or planning my next trip.

What I Do

  • LLM Engineering

    HIPAA-compliant infrastructure, LangGraph pipelines, agentic workflows

  • Clinical NLP

    Medical text extraction, clinical decision support, EHR integration

  • Cardiac Signal Analysis

    ML for ECG-like/calcium transient signals, BeatProfiler platform

  • Data Science

    Python, deep learning, statistical modeling


Experience

2024 – Present

Senior Data Scientist

NewYork-Presbyterian

First dedicated LLM hire. Built clinical NLP pipelines, HIPAA-compliant LLM infrastructure, and agentic workflows.

2019 – 2024

PhD, Biomedical Engineering

Columbia University

ML for cardiac signal analysis. Created BeatProfiler. 9 peer-reviewed publications, including IEEE-published work.

2022

Machine Learning Intern

Genentech

Built multimodal ML models for biomarker discovery from Alzheimer's drug clinical trial data.

2015 – 2019

BS, Bioengineering & EECS

UC Berkeley

Foundation in engineering and computer science.


Projects

BeatProfiler

Open Source

An end-to-end machine learning platform for automated cardiac signal analysis. BeatProfiler takes raw contractile videos or calcium transient / field potential recordings from stem cell-derived cardiomyocytes, processes them through automated pipelines, and classifies disease phenotypes, turning hours of manual analysis into seconds.

500+ downloads · IEEE Published

[Figure: Processing Pipeline. Calcium transient signal, amplitude vs. time]