I majored in computer science and math and am now doing a PhD in Systems, Synthetic, and Quantitative Biology at Harvard in Michael Desai's lab. I work on many different kinds of problems, but my current interests are protein fitness landscapes, clinical data analysis, and Bayesian statistics.
Drug Discov. Today
The area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR) are commonly used to evaluate models that predict the side effects of drugs from their molecular features. However, the baseline AUC and AUPR depend on the statistical properties of the ground truth. We analyze this dependence and ask: to what degree do models actually benefit from molecular fingerprints?
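As a small illustration of why baselines matter (a sketch of a textbook property, not the analysis from the paper): the simplest uninformative baseline I can assume is a scorer that gives every drug the same score. Its AUC is 0.5 regardless of the labels, while its AUPR equals the fraction of positives in the ground truth. The function names and the synthetic labels below are hypothetical and chosen only for this example.

```python
from itertools import groupby

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve, one step per distinct score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    tp = seen = 0
    ap = 0.0
    for _, group in groupby(ranked, key=lambda t: t[0]):
        group = list(group)
        hits = sum(y for _, y in group)
        tp += hits
        seen += len(group)
        # recall gained at this threshold times precision at this threshold
        ap += (hits / n_pos) * (tp / seen)
    return ap

def constant_baseline(n_items, n_positive):
    """Synthetic ground truth plus an uninformative scorer (assumed baseline)."""
    labels = [1] * n_positive + [0] * (n_items - n_positive)
    scores = [0.0] * n_items  # every item gets the same score
    return roc_auc(labels, scores), average_precision(labels, scores)

if __name__ == "__main__":
    for n_positive in (500, 100, 10):
        auc, aupr = constant_baseline(1000, n_positive)
        print(f"prevalence={n_positive / 1000:.2f}  AUC={auc:.2f}  AUPR={aupr:.2f}")
```

Under this assumed baseline, an AUPR of 0.5 is uninformative for a side effect that half of all drugs cause but would be strong for one that 1% cause, so a raw score only means something relative to the label statistics.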
Bioinformatics
Studies have genotyped many people and measured their gene expression levels. With these data, one can investigate how genotype influences gene expression, which is useful for understanding complex traits and diseases. We model gene expression while accounting for interactions among genetic markers. By doing so, we more accurately predict the expression of a large subset of genes.
Forecasting
Thunderstorms can cause many power outages in a short period. These outages are hard to predict with models that summarize the weather over the entire course of a storm. Instead, we develop a framework for models to learn the dynamics of thunderstorm-caused outages directly from hourly weather forecasts.