I am currently a graduate student in the Language Technologies Institute at Carnegie Mellon University’s School of Computer Science. I am advised by Professor Daniel Neill and work in the Event and Pattern Detection Lab.
My research interests broadly include Machine Learning, Natural Language Processing, AI, and their applications to real-world problems. In particular, I am interested in Probabilistic Graphical Models with Hidden Parameters, Statistical Inference, Un- and Semi-Supervised Learning Algorithms, and Bayesian Models. I like applying these algorithms to difficult real-world problems such as human languages or early detection of novel disease outbreaks. I am fascinated by nontraditional and unconventional natural language datasets, as they pose interesting challenges for computer scientists in terms of robustness, generality, and out-of-vocabulary words (OOVs).
I am currently working on the application of unsupervised graphical models to disease surveillance. In particular, I am looking at the intersection of topic models and scan statistics on unstructured data for anomalous pattern detection. In this setting, low-frequency terms and randomized initializations of probabilities have a much greater impact on downstream task performance than they do in standard applications of either set of methods on standard corpora. This makes it an interesting area of study with relevance to many areas of NLP and ML. Preliminary results earned a best poster award from LTI's Student Research Symposium.
I am also leading a team of students for this year's CoNLL shared task on Grammar Correction. We are looking at a variety of part-of-speech and structured prediction features for novel supervised ML methods. In addition, I am working on the WMT Quality Estimation Task using language modeling.
Prior to CMU, I graduated from Princeton University’s School of Engineering, where I majored in Computer Science. My Senior Thesis at Princeton, “Summarization by Latent Dirichlet Allocation”, was advised by David Blei. It used a randomly selected corpus of 1,000 articles from Wikipedia and generated a 10-sentence summary of each article. Our results were statistically significant, demonstrating better summaries than two other leading methods from the field. A copy of the thesis can be obtained under the Research Interests Section.
I also spent a couple of years in industry before returning to grad school. I worked as an Engineer at Microsoft on Dynamics CRM Online, on the back end of a large-scale cloud service. My projects included the backup system, deployment, and internationalization of the product to 40 languages in 41 countries. I also spent some time working with startups, including the innovative Lighter Capital, who are dying to give away some money to small businesses.