Papers & Code

2017

FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference. Draft on ArXiv, 2017

Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky, Tianyu Wang.

A Practical Risk Score for EEG Seizures in Hospitalized Patients. JAMA Neurology, 2017 (accepted)

Aaron F. Struck, Berk Ustun, Andres Rodriguez Ruiz, Jong Woo Lee, Suzette LaRoche, Lawrence J. Hirsch, Emily J Gilmore, Jan Vlachy, Hiba Arif Haider, Cynthia Rudin, M Brandon Westover.

Bayesian Rule Sets for Interpretable Classification, with Application to Context-Aware Recommender Systems. Journal of Machine Learning Research (JMLR), 2017 (accepted)

Tong Wang, Cynthia Rudin, Finale Doshi, Yimin Liu, Erica Klampfl, and Perry MacNeille.

Certifiably Optimal Rule Lists for Categorical Data. KDD 2017 (oral), Longer version on ArXiv. (code) | (bib)

Elaine Angelino, Nicholas Larus-Stone, Daniel Alabi, Margo Seltzer, and Cynthia Rudin

Scalable Bayesian Rule Lists. ICML 2017, Longer version on ArXiv. (bib) | (preferred code) | (R package)

Hongyu Yang, Cynthia Rudin, and Margo Seltzer
-Winner of Statistical Learning and Data Mining Student Paper Competition, American Statistical Association, 2016.

Learning Cost Effective and Interpretable Treatment Regimes in the Form of Rule Lists. AISTATS, 2017. (bib)

Himabindu Lakkaraju and Cynthia Rudin
-Finalist for 2017 INFORMS Data Mining Best Paper Competition.

Learning Optimized Risk Scores from Large-Scale Datasets. KDD 2017, Longer version on ArXiv. (code)

Berk Ustun and Cynthia Rudin.
-Winner of 2017 INFORMS Computing Society (ICS) Best Student Paper Prize

The World Health Organization Adult Attention-Deficit/Hyperactivity Disorder Self-Report Screening Scale for DSM-5. JAMA Psychiatry, April 2017. (NPR article) | (bib) | (editorial)

Berk Ustun, Lenard A. Adler, Cynthia Rudin, Stephen V. Faraone, Thomas J. Spencer, Patricia Berglund, Michael J. Gruber, Ronald C. Kessler.

2016

CRAFT: ClusteR-specific Assorted Feature selecTion. AISTATS, 2016. (code) | (bib)

Vikas Garg, Cynthia Rudin and Tommi Jaakkola.

Cascaded High Dimensional Histograms: An Approach to Interpretable Density Estimation for Categorical Data.
Working paper on arXiv, 2016. (bib)

Siong Thye Goh and Cynthia Rudin

Bayesian Or's of And's for Interpretable Classification with Application to Context Aware Recommender Systems. ICDM, 2016.
(bib) | (code)

Tong Wang, Cynthia Rudin, Finale Doshi, Yimin Liu, Erica Klampfl, and Perry MacNeille

Prediction Uncertainty and Optimal Experimental Design for Learning Dynamical Systems. Chaos, Volume 26, Number 6, 2016. (bib)

Benjamin Letham, Portia A. Letham, Cynthia Rudin, and Edward Browne.

Regulating Greed Over Time. Working paper on arXiv, 2016.

Stefano Traca and Cynthia Rudin.
-Finalist for 2015 IBM Service Science Best Student Paper Award.
-Winner of best paper award, INFORMS 2016 Data Mining & Decision Analytics (DMDA) Workshop.

The Big Data Newsvendor: Practical Insights from Machine Learning Analysis. Working Paper on SSRN, 2015.

Cynthia Rudin and Gah-Yi Vahn.

A Computational Model of Inhibition of HIV-1 by Interferon-Alpha. PLoS ONE, 2016.

Edward Browne, Benjamin Letham, and Cynthia Rudin.

Analytic Research Foundations for the Next-Generation Electric Grid. The National Academies Press, 2016.

John Guckenheimer, Thomas Overbye (Co-chairs), and committee.

Interpretable Classification Models for Recidivism Prediction. Journal of the Royal Statistical Society, September 2016.

Jiaming Zeng, Berk Ustun, and Cynthia Rudin.
- This paper won the 2015 Undergraduate Statistics Research Project Competition (USRESP) sponsored by the American Statistical Association (ASA) and the Consortium for Advancement of Undergraduate Statistics Education (CAUSE).

Bayesian Inference of Arrival Rate and Substitution Behavior from Sales Transaction Data with Stockouts. KDD, 2016. (YouTube teaser)

Benjamin Letham, Lydia M. Letham, and Cynthia Rudin.

The Factorized Self-Controlled Case Series Method: An Approach for Estimating the Effects of Many Drugs on Many Outcomes. Journal of Machine Learning Research, 2016. (link)

Ramin Moghaddass, Cynthia Rudin, and David Madigan.

2015

Clinical Prediction Models for Sleep Apnea: The Importance of Medical History over Symptoms. Journal of Clinical Sleep Medicine, 2015. (editorial)

Berk Ustun, Brandon Westover, Cynthia Rudin, and Matt Bianchi.

Learning Classification Models of Cognitive Conditions from Subtle Behaviors in the Digital Clock Drawing Test. Machine Learning, 2015. (bib) | (talk)

William Souillard-Mandar, Randall Davis, Cynthia Rudin, Rhoda Au, David J. Libon, Rodney Swenson, Catherine C. Price, Melissa Lamar, Dana L. Penney.
-Accompanies winning entry of the 2016 INFORMS Innovative Applications in Analytics Award.

Supersparse Linear Integer Models for Optimized Medical Scoring Systems. Machine Learning, 2015.
(bib) | (matlab code) | (python code) | (Earlier Version) | (AAAI version with Stefano Traca)

Berk Ustun and Cynthia Rudin
-Accompanies winning entry of the 2016 INFORMS Innovative Applications in Analytics Award.

A Bayesian Approach to Learning Scoring Systems. Big Data, 2015. (bib)

Seyda Ertekin and Cynthia Rudin

Reactive Point Processes: A New Approach to Predicting Power Failures in Underground Electrical Systems. Annals of Applied Statistics, 2015. (supplement) | (bib) | (AAAI late breaking track version)

Seyda Ertekin, Cynthia Rudin, and Tyler McCormick.

Falling Rule Lists. AISTATS, 2015. (bib) | (python code)

Fulton Wang and Cynthia Rudin
-Winner of Statistical Learning and Data Mining Student Paper Competition, American Statistical Association, 2015.
-Finalist for Data Mining Best Student Paper Award, INFORMS 2015.

Building Interpretable Classifiers with Rules using Bayesian Analysis. Annals of Applied Statistics, 2015.
(bib) | (supplement) | (python code)

Benjamin Letham, Cynthia Rudin, Tyler McCormick and David Madigan
-Winner of Data Mining Best Student Paper Award, INFORMS 2013.
-Winner of Statistical Learning and Data Mining Student Paper Competition, American Statistical Association, 2014.
Shorter versions of this have appeared in the AAAI 2013 late breaking track, and at the KDD 2014 workshop on Data Science for Social Good.

Causal Falling Rule Lists. Working Paper, 2015.

Fulton Wang and Cynthia Rudin.

Robust Testing for Causal Inference in Natural Experiments. Working Paper, 2015.

Md. Noor-E-Alam and Cynthia Rudin

Finding Patterns with a Rotten Core: Data Mining for Crime Series with Core Sets. Big Data, 2015. (bib) | (slides)

Tong Wang, Cynthia Rudin, Daniel Wagner, and Rich Sevieri.
-This paper won second place in the “Doing Good with OR” competition at INFORMS, 2015.

Tire Changes, Fresh Air and Yellow Flags: Challenges in Predictive Analytics for Professional Racing. Big Data, 2014. (bib)

Theja Tulabandhula and Cynthia Rudin.

Robust Nonparametric Testing for Causal Inference in Natural Experiments. Working Paper, 2015.

Md. Noor-E-Alam and Cynthia Rudin

2014

Modeling Recovery Curves With Application to Prostatectomy. Working paper, 2014.

Fulton Wang, Tyler McCormick, Cynthia Rudin, and John Gore.
- Winner of Best Poster Competition, Statistical Learning and Data Mining section (SLDM) of the American Statistical Association, 2014.

On Combining Machine Learning with Decision Making. Machine Learning, 2014. (bib) | (code)

Theja Tulabandhula and Cynthia Rudin.

Generalization Bounds for Learning with Linear, Polygonal, Quadratic, and Conic Side Knowledge. Machine Learning, 2014.
(bib) | (ISAIM version)

Theja Tulabandhula and Cynthia Rudin.

Robust Optimization using Machine Learning for Uncertainty Sets. International Symposium on Artificial Intelligence and Mathematics (ISAIM), 2014. (bib) | (longer version - working paper)

Theja Tulabandhula and Cynthia Rudin.

Box Drawings for Learning with Imbalanced Data. KDD, 2014. (bib) | (code)

Siong Thye Goh and Cynthia Rudin

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification. NIPS, 2014. (code) | (bib)

Been Kim, Cynthia Rudin and Julie Shah

Discovery with Data: Leveraging Statistics with Computer Science to Transform Science and Society.
American Statistical Association, July 2, 2014. (bib)

Cynthia Rudin, David Dunson, Rafael Irizarry, Hongkai Ji, Eric Laber, Jeffrey Leek, Tyler McCormick, Sherri Rose, Chad Schafer, Mark van der Laan, Larry Wasserman, Lingzhou Xue.

A Statistical Learning Theory Framework for Supervised Pattern Discovery. SIAM Conference on Data Mining (SDM) 2014. (bib)

Jonathan Huggins and Cynthia Rudin.

Learning About Meetings. Data Mining and Knowledge Discovery, 2014. (bib) (AAAI Late Breaking Track version)

Been Kim and Cynthia Rudin.

Approximating the Crowd. Data Mining and Knowledge Discovery, 2014. (bib) | (appendix)

Seyda Ertekin, Cynthia Rudin, Haym Hirsh.
Shorter versions of this have appeared at the NIPS Workshop on Computational Social Science and the Wisdom of Crowds (paper, bib, Haym's Slides), and Collective Intelligence (paper and bib)

The Latent State Hazard Model, with Application to Wind Turbine Reliability. Annals of Applied Statistics, 2015.

Ramin Moghaddass and Cynthia Rudin.

Analytics for Power Grid Distribution Reliability in New York City. Interfaces, 2014. (bib)

Cynthia Rudin, Seyda Ertekin, Rebecca Passonneau, Axinia Radeva, Ashish Tomar, Boyi Xie, Stanley Lewis, Mark Riddle, Debbie Pangsrivinij, Tyler McCormick.
- Accompanies winning entry of the 2013 INFORMS Innovative Applications in Analytics Award.

Modeling Weather Impact on a Secondary Electrical Grid. International Conference on Sustainable Energy Information Technology (SEIT-2014), 2014.

Dingquan Wang, Rebecca J. Passonneau, Michael Collins, Cynthia Rudin.

2013

Growing a List. Data Mining and Knowledge Discovery, 2013. (bib) | (python code)

Benjamin Letham, Cynthia Rudin and Katherine Heller.
-Featured on Boston Public Radio (WGBH): “A New Way To Google”

Learning Theory Analysis for Association Rules and Sequential Event Prediction. Journal of Machine Learning Research, 2013. (bib) (COLT 2011 version and its bib)

Cynthia Rudin, Benjamin Letham and David Madigan

Sequential Event Prediction. Machine Learning, 2013. (bib)

Benjamin Letham, Cynthia Rudin, and David Madigan.

Machine Learning with Operational Costs. Journal of Machine Learning Research, 2013.
(bib) | (ISAIM version) and its (bib)

Theja Tulabandhula and Cynthia Rudin.

Machine Learning for Science and Society. Machine Learning, 2013.

Cynthia Rudin and Kiri L. Wagstaff.

Learning to Detect Patterns of Crime. ECML-PKDD, 2013. (bib) | (AAAI late breaking track version and its bib)

Tong Wang, Cynthia Rudin, Daniel Wagner, and Rich Sevieri.

The Rate of Convergence of AdaBoost. Journal of Machine Learning Research, 2013. (bib) | (COLT 2011 version and its bib)

Indraneel Mukherjee, Cynthia Rudin, and Robert Schapire.
- Solved published open problem in COLT (Computational Learning Theory).

2012

Machine Learning for the New York City Power Grid. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012. (bib)

Cynthia Rudin, David Waltz, Roger N. Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Maggie Chow, Haimonti Dutta, Philip Gross, Bert Huang, Steve Ierome, Delfina Isaac, Arthur Kressner, Rebecca J. Passonneau, Axinia Radeva, Leon Wu.
-Spotlight Paper for the February 2012 Issue.

A Hierarchical Model for Association Rule Mining of Sequential Events: An Approach to Automated Medical Symptom Prediction.
Annals of Applied Statistics, 2012. (bib)

Tyler McCormick, Cynthia Rudin, David Madigan

How to Reverse-Engineer Quality Rankings. Machine Learning, 2012. (bib) | (blog)

Allison Chang, Cynthia Rudin, Michael Cavaretta, Robert Thomas and Gloria Chou.
-Featured in Businessweek: How to Improve Product Rankings

Does AdaBoost Always Cycle? (COLT Open Problem) | (bib) | (talk)

Cynthia Rudin, Ingrid Daubechies, Robert E. Schapire.

Teaching “Prediction: Machine Learning and Statistics.” ICML Workshop on Teaching ML, 2012. (bib)

Cynthia Rudin.

Progressive Clustering with Learned Seeds: An Event Categorization System for Power Grid.
International Conference on Software Engineering & Knowledge Engineering (SEKE), 2012. (bib)

Boyi Xie, Rebecca J. Passonneau, Haimonti Dutta, Jing-Yeu Miaw, Axinia Radeva, Ashish Tomar, and Cynthia Rudin.

2011

On Equivalence Relationships Between Classification and Ranking Algorithms. Journal of Machine Learning Research, 2011. (bib)

Seyda Ertekin and Cynthia Rudin

21st-Century Data Miners Meet 19th-Century Electrical Cables. IEEE Computer, 2011. (bib)

Cynthia Rudin, Rebecca Passonneau, Axinia Radeva, Steve Ierome, Delfina Isaac.
-One of three articles featured on the cover.

Proceedings of the 2011 INFORMS Data Mining and Health Informatics (DM-HI) Workshop

Eds. Peter Qian, Yilu Zhou, and Cynthia Rudin.

A Discrete Optimization Approach to Supervised Ranking. Working paper, 2011.

Allison Chang, Cynthia Rudin, Dimitris Bertsimas
-Finalist for Data Mining Best Student Paper Award, INFORMS 2011.
Shorter version: A Discrete Optimization Approach to Supervised Ranking. INFORMS Workshop on Data Mining and Health Informatics, 2010. (bib)

Estimation of System Reliability Using a Semiparametric Model. IEEE EnergyTech, 2011. (bib)

Leon Wu, Timothy Teravainen, Gail Kaiser, Roger Anderson, Albert Boulanger, and Cynthia Rudin.

Evaluating Machine Learning for Improving Power Grid Reliability.
ICML workshop on “Machine Learning for Global Challenges,” 2011. (bib)

Leon Wu, Gail Kaiser, Cynthia Rudin, David Waltz, Roger Anderson, Albert Boulanger, Ansaf Salleb-Aouissi, Haimonti Dutta, and Manoj Poolery.

Data Quality Assurance and Performance Measurement of Data Mining for Preventive Maintenance of Power Grid.
KDD Workshop on Data Mining for Service and Maintenance (KDD4Service), 2011. (bib)

Leon Wu, Gail Kaiser, Cynthia Rudin, Roger Anderson.

Ordered Rules for Classification: A Discrete Optimization Approach to Associative Classification. Working Paper, 2011. (bib)

Allison Chang, Cynthia Rudin, and Dimitris Bertsimas.

Treatment Effect of Repairs to an Electrical Grid: Leveraging a Machine Learned Model of Structure Vulnerability.
KDD Workshop on Data Mining Applications in Sustainability (SustKDD), 2011. (bib)

Rebecca Passonneau, Cynthia Rudin, Axinia Radeva, Ashish Tomar and Boyi Xie.

2010

A Process for Predicting Manhole Events in Manhattan. Machine Learning, 2010. (bib)

Cynthia Rudin, Rebecca Passonneau, Axinia Radeva, Haimonti Dutta, Steve Ierome, Delfina Isaac.
-WIRED article | Slashdot | US News and World Report

2009

The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List. Journal of Machine Learning Research, 2009. (bib)

Cynthia Rudin.
Shorter Version: Ranking with a P-Norm Push and its (bib), COLT, 2006.

Margin-Based Ranking and an Equivalence Between AdaBoost and RankBoost. Journal of Machine Learning Research, 2009. (bib)

Cynthia Rudin and Robert E. Schapire.
Shorter Version: Margin-Based Ranking and Boosting Meet in the Middle. (with Corinna Cortes and Mehryar Mohri) COLT, 2005. (bib)

Reducing Noise in Labels and Features for a Real World Dataset: Application of NLP Corpus Annotation Methods.
International Conference on Computational Linguistics and Intelligent Text Processing, 2009. (bib)

Rebecca Passonneau, Cynthia Rudin, Axinia Radeva, Zhi An Liu.

Report Cards for Manholes: Eliciting Expert Feedback for a Machine Learning Task.
International Conference on Machine Learning and Applications, 2009. (bib)

Axinia Radeva, Cynthia Rudin, Rebecca Passonneau and Delfina Isaac.
-Winner of Best Poster Award

2008 and before

Visualization of Manhole and Precursor-Type Events for the Manhattan Electrical Distribution System.
Workshop on GeoVisualization of Dynamics, Movement and Change, 11th AGILE International Conference on Geographic Information Science, 2008. (bib)

Haimonti Dutta, Cynthia Rudin, Becky Passonneau, Fred Seibel, Nandini Bhardwaj, Axinia Radeva, Zhi An Liu, Steve Ierome, Delfina Isaac.

Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking.
The 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL/HLT), 2008. (bib)

Ryan Roth, Owen Rambow, Nizar Habash, Mona Diab, and Cynthia Rudin.

Analysis of Boosting Algorithms using the Smooth Margin Function. Annals of Statistics, 2007. (bib)

Cynthia Rudin, Robert E. Schapire, Ingrid Daubechies.
Shorter versions:
Precise Statements of Convergence for AdaBoost and arc-gv. AMS-IMS-SIAM Joint Summer Research Conference, 2007. (bib)
Boosting Based on a Smooth Margin. COLT, 2004. (bib)

Re-Ranking Algorithms for Name Tagging. Human Language Technology conference - North American chapter of the Association for Computational Linguistics annual meeting (HLT-NAACL) Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing, 2006. (bib)

Heng Ji, Cynthia Rudin, Ralph Grishman.

The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins. Journal of Machine Learning Research, 2004. (bib)

Cynthia Rudin, Ingrid Daubechies, Robert E. Schapire.
- Solved well-known open theoretical problem as to whether AdaBoost attains maximum margins.
Shorter version: On the Dynamics of Boosting. NIPS, 2003. (bib) This version does not contain the main result from the JMLR paper.

Stability of Learning Algorithms. Notes, 2003.

Cynthia Rudin.

Equilibrium Island Arrays in Strained Solid Films. Journal of Applied Physics, 1999.

Cynthia Rudin and Brian Spencer.