We are launching a campus-wide initiative to build foundation models that simulate the evolution of tumor ecosystems. You will be the lead engineer contributing to large-scale generative modelling on single-cell, spatial-omics, and clinical data.
Core responsibilities
Design, train and deploy multi-modal foundation models for single-cell and spatial cancer data
Build scalable training pipelines in PyTorch / JAX on GPU clusters and cloud HPC / ADK
Implement data-efficient fine-tuning, adaptive learning workflows and agentic frameworks for reasoning
Collaborate with machine learning experts and computational biologists to build tools for AI agents, e.g., libraries, MCPs and APIs
The position is a full-time appointment jointly housed in Columbia's Irving Institute for Cancer Dynamics and The Fu Foundation School of Engineering & Applied Science. You will collaborate daily with a diverse team of AI / ML researchers, computational biologists, clinicians and bioengineers who share a mission of transforming our understanding of cancer progression and improving its treatment through next-generation AI and experimental platforms.
Required qualifications
B.S. / B.E. (minimum) in Computer Science, Biomedical / Electrical Engineering, Statistics, Bioinformatics, Applied Math, or a related field
6+ years of experience in software engineering
3+ years of hands-on experience training generative AI or large language models at scale
Substantial expertise in training deep learning models and tuning large foundation models
Expertise in developing efficient data loaders for large datasets and optimizing training workflows
Deep knowledge of probabilistic modelling, self-supervised learning and representation learning, and diffusion / VAE / flow-matching / transformer architectures
Strong Python, PyTorch / JAX, containerization & MLOps skills; familiarity with distributed training and modern experiment-tracking stacks
Experience with AI coding tools (e.g., Copilot, Cursor)
Preferred extras
M.S. or other graduate-level degree in a relevant field
Experience with single-cell and spatial genomic or imaging data, and multimodal integration
Expertise in statistical causal discovery and inference
Publications or open-source contributions in generative models
Strong interest in applications and driving impact in cancer biology and immunology
Columbia University is an Equal Opportunity Employer / Disability / Veteran
Pay Transparency Disclosure
The salary of the finalist selected for this role will be set based on a variety of factors, including but not limited to departmental budgets, qualifications, experience, education, licenses, specialty, and training. The above hiring range represents the University's good faith and reasonable estimate of the range of possible compensation at the time of posting.
Staff Associate • Morningside, New York, United States