Experience
Scientist.com
San Diego, CA (Remote)
Applied skills: Python, scikit-learn (PCA, K-means, HDBSCAN, KNN), umap-learn, Rust, Torch, TensorFlow, PEFT, Hugging Face, Ruby on Rails, JavaScript, Postgres, Elasticsearch, Kubernetes, Terraform, Docker
Director of Preclinical AI
January 2024 - Present
Built and fine-tuned a custom transformer architecture from an LLM base model, DoRA adapters, and a regression head to predict scientific service pricing and uncertainty from client requests, using a composite loss function (e.g. MAE and negative log-likelihood) and achieving an R² of 0.84
Developed a FastAPI microservice for real-time price predictions using the above-mentioned composite model
Built, trained, and ported to Rust a variational autoencoder (neural network) that maps molecular fingerprints to a latent space for accelerated compound similarity search using K-means clusters, reducing the search space ~10,000-fold
Developed a Rust microservice (Cheminée) for compound structure searches (e.g. exact match, substructure, superstructure, similarity search); currently used under the hood by the company's client-facing Product Hub
Data Scientist
April 2020 - December 2023
Constructed a browser-compatible tumor model search engine (Tumor Model Finder) from backend to frontend, complete with bioinformatic data processing and harmonization between different data sources for 10,000+ tumor models across 20+ providers
Delivered numerous presentations to, and held discussions with, scientific service providers as well as pharma and biotech researchers
PhenoVista Biosciences
San Diego, CA
Applied skills: Python, scikit-learn (PCA, K-means, KNN, Random Forest), custom image analysis, high-content assay development
Scientist II
January 2020 - March 2020
Built a software platform for automated, high-throughput imaging of 3D tissues, opening the company to new clientele
Scientist I
September 2018 - December 2019
Conducted many high-content assays from start to finish for pharma/biotech clients (including culturing of various cell types in 384-well plates, randomized drug treatments, and fixation/staining for high-content imaging)
Constructed comprehensive data analysis pipelines including plate-artifact normalization, feature detection, and implementation of ML algorithms, resulting in a publication and enabling the company to accept intensive quarter-million-dollar contracts
Programmed laboratory robots to automate pipetting for projects with thousands of drug treatments
Wrote numerous Python scripts for image analysis (e.g. using OpenCV)
Delivered many presentations to, and collaborated with, pharma and biotech researchers
Novartis Institutes for BioMedical Research
Cambridge, MA
Applied skills: Python, scikit-learn, FACS, assay development, high-throughput screening, cell culture
Doctoral Intern
September 2017 - December 2017
Developed a robust FACS-based assay for immuno-oncology targets from start to finish, testing 6,000 drug treatments and resulting in a scientific publication
Harvard Medical School
Boston, MA
Applied skills: High-content assay development, proteomics, bioinformatics, data science (Python, scikit-learn), image analysis software development
Doctoral Researcher
September 2013 - August 2018
Studied reverse pharmacology of microtubule-targeting agents; multiple manuscripts published
Created comprehensive Python packages for processing and exploration of proteomic and phosphoproteomic data
Wrote extensive image analysis software in Python for single-cell tracking of EB3 comets in live cells
Collaborated with several students, postdocs, and advisors in the Systems Biology and Systems Pharmacology departments
Dana-Farber Cancer Institute
Boston, MA
Applied skills: Molecular docking and homology modeling (Schrödinger Suite), molecular dynamics (AMBER and NAMD)
Undergraduate Researcher
Summers of 2011 and 2012
Performed molecular dynamics (MD) simulations to determine binding modes for small molecules targeting the DOT1L histone methyltransferase enzyme, resulting in a publication
Built homology models to determine protein-ligand interaction modes with respect to the EZH1/2 histone methyltransferase enzymes, resulting in a publication