We are seeking an innovative LLM Engineer to develop and optimize large language model systems for our cutting-edge transcriptome differentially expressed gene (DEG) analysis platform. This role is critical in building the reasoning foundation that will transform biological data analysis from statistical correlation to mechanistic understanding. You will work at the intersection of advanced AI and precision medicine, creating systems that can reason about complex biological relationships and generate actionable insights from petabytes of genomic data.
Key Responsibilities
Core LLM Development
- Design and implement specialized LLM architectures for biological reasoning and causal inference
- Fine-tune foundation models (GPT-4, Claude, Gemma, etc.) for domain-specific transcriptome analysis tasks
- Develop custom prompting strategies that enable complex reasoning about gene regulatory networks
- Create RAG (Retrieval-Augmented Generation) pipelines integrating scientific literature with experimental data
- Implement chain-of-thought (CoT) and tree-of-thoughts (ToT) prompting for multi-step biological reasoning
Model Optimization & Scaling
- Optimize LLM inference for production environments handling 20,000+ gene analyses
- Implement distributed processing using Ray Serve or similar frameworks for sub-second response times
- Design context compression techniques for handling large-scale genomic datasets
- Develop model ensembling strategies to reduce output variability from 30% to
- Create efficient token management strategies for processing lengthy biological contexts
Biological Domain Integration
- Build knowledge graphs connecting genes, pathways, diseases, and literature findings
- Implement causal reasoning capabilities for identifying driver vs. passenger gene mutations
- Develop specialized embeddings for biological entities (genes, proteins, pathways)
- Create explanation generation systems that produce clinician-friendly interpretations
- Design validation frameworks ensuring biological accuracy of LLM outputs
Quality & Reliability
- Implement uncertainty quantification for model predictions
- Develop robust evaluation metrics beyond traditional NLP measures
- Create testing frameworks for biological reasoning accuracy
- Design fallback mechanisms for handling edge cases in genomic data
- Build monitoring systems for production model performance
Required Qualifications
Technical Expertise
- MS/PhD in Computer Science, AI, Computational Biology, or related field
- 3+ years of experience with LLM development and deployment
- Expert proficiency in Python and ML frameworks (PyTorch, TensorFlow, Hugging Face)
- Proven experience with prompt engineering and fine-tuning techniques
- Strong understanding of transformer architectures and attention mechanisms
- Experience with distributed computing frameworks (Ray, Dask, or similar)
Domain Knowledge
- Understanding of biological terminology and genomics concepts
- Experience with scientific text processing and literature mining
- Familiarity with causal inference and reasoning frameworks
- Knowledge of medical/clinical NLP applications is a plus
Production Experience
- Track record of deploying LLM systems at scale
- Experience with model optimization techniques (quantization, pruning, distillation)
- Knowledge of MLOps practices and model versioning
- Experience with API design for AI services
Preferred Qualifications
- Experience with biomedical language models (BioBERT, PubMedBERT, BioGPT)
- Knowledge of transcriptomics and differential expression analysis
- Familiarity with clinical regulatory requirements (FDA/EMA)
- Publications in NLP, computational biology, or related fields
- Experience with multi-modal AI systems
- Understanding of graph neural networks for biological applications
Key Performance Metrics
- Achieve
- Reduce LLM output variability to
- Improve biological reasoning accuracy to >90% on benchmark datasets
- Successfully integrate 1M+ scientific papers into knowledge base
- Deploy production systems handling 10,000+ analyses per day
What We Offer
- Opportunity to work on transformative AI technology with direct patient impact
- Collaboration with leading scientists and AI researchers
- Access to state-of-the-art computational resources and datasets
- Comprehensive benefits and equity participation
- Professional development and conference attendance support
- Remote-first culture with flexible working arrangements
Integration with Team
You will work closely with:
- Agentic AI Engineers to enable autonomous biological discovery systems
- Software Engineers to build scalable, production-ready platforms
- Bioinformaticians to ensure biological accuracy and relevance
- Clinical researchers to translate findings into therapeutic insights