Code

Most of our papers have links to code. I'm only including a few links here.

Computing Rashomon Sets

Rashomon sets of Generalized Additive Models (GAMs) (code) | (paper)

Rashomon sets of sparse decision trees (TreeFARMS algorithm) (code) | (paper)

Optimal Sparse Decision Tree code

Fast Sparse Decision Tree Optimization via Reference Ensembles (GOSDT+Guesses) (code) | (paper) | (speed up using guesses): Creates optimal sparse decision trees quickly. Remember not to remove the regularization term! The predecessor to this algorithm is OSDT.
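The warning about the regularization term can be illustrated with a minimal sketch. It assumes the objective is the misclassification rate plus a per-leaf penalty. The function name, the argument name `lam`, and the error counts are hypothetical and are not the package's API.

```python
def regularized_objective(n_errors, n_samples, n_leaves, lam):
    """Misclassification rate plus a penalty of `lam` per leaf (illustrative)."""
    return n_errors / n_samples + lam * n_leaves


def pick_tree(candidates, n_samples, lam):
    """Return the name of the candidate tree with the lowest objective."""
    return min(
        candidates,
        key=lambda name: regularized_objective(
            candidates[name]["errors"], n_samples, candidates[name]["leaves"], lam
        ),
    )


# Hypothetical candidates: a small tree and a much larger, slightly more accurate one.
CANDIDATES = {
    "small": {"errors": 10, "leaves": 3},
    "large": {"errors": 8, "leaves": 20},
}
```

In this sketch, a positive penalty makes `pick_tree` prefer the small tree. Setting `lam` to 0 always favors the tree with the fewest training errors, however large, so the result is no longer sparse.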

Sparse Generalized Additive Models code

Fast Sparse Generalized Linear and Additive Classification (code) | (paper): Produces sparse additive models quickly for classification or regression. Uses coordinate descent.

OKRidge (paper) | (code): Produces sparse solutions for regression with squared loss using beam search.

Risk Scores

FasterRisk: Fast and Accurate Interpretable Risk Scores (code) | (paper): Creates risk assessment scoring systems, which are linear models with integer coefficients that estimate risk.

The predecessor of FasterRisk is RiskSLIM, and its predecessor was SLIM.

Learning Optimized Risk Scores from Large-Scale Datasets (RiskSLIM) (code) | (paper): Creates risk assessment scoring systems, which are linear models with integer coefficients that estimate risk. This code is slower than FasterRisk but can incorporate constraints and get provable optimality. Try using FasterRisk instead.

Supersparse Linear Integer Models (SLIM) (matlab code) | (python code) | (matlab code) | (paper) | (bib): For building scoring systems, which are linear models with integer coefficients. Part of the winning entry for the 2016 INFORMS Innovative Applications in Analytics Award. Note that this code is slow; the two algorithms listed above (FasterRisk and RiskSLIM) are better.
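As a rough illustration of what a risk score is (a linear model with integer coefficients that estimates risk), here is a minimal sketch. The feature names, point values, and intercept are hypothetical, and the logistic link is an assumption of this example. It does not use the API of FasterRisk, RiskSLIM, or SLIM.

```python
import math

# Hypothetical scoring system: binary feature -> integer points (illustrative only).
POINTS = {"feature_a": 2, "feature_b": 3, "feature_c": -1}
INTERCEPT = -2  # hypothetical integer offset


def total_score(features):
    """Sum the integer points of every binary feature that is present."""
    return sum(pts for name, pts in POINTS.items() if features.get(name, False))


def predicted_risk(features):
    """Map the integer score to a probability, assuming a logistic link."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + total_score(features))))
```

Because the coefficients are small integers, the score can be added up by hand and looked up in a short score-to-risk table.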

Dimension Reduction code

PaCMAP for Dimension Reduction (code) | (paper): Winner of the 2023 John M. Chambers Statistical Software Award from the American Statistical Association.

Or's of And's (Disjunction of Conjunctions) code

Bayesian Or's of And's (code and coupon data) | (data on UCI repo) | (paper) | (bib) | (code by Ritwik Mitra, Emily Dodwell, Elena Khusainova, Deirdre Paul): For classification, an alternative to decision trees, inductive logic programming and associative classification.

Box Drawings for Learning with Imbalanced Data (matlab code) | (paper) | (bib): For imbalanced classification with real-valued features.

Variable Importance for the Rashomon Set

Rashomon Importance Distribution (code) | (paper): Averages variable importance over bootstrap samples and the Rashomon set of models for each bootstrap.

Variable Importance Clouds: A Way to Explore Variable Importance for the Set of Good Models (code) | (paper): For visualizing the importance of variables from the Rashomon set in a space where variable importance for each variable is on the axes.

MCR - Model Class Reliance (code) | (paper): For assessing variable importance of a model class for a dataset.

Matching for Causal Inference

FLAME - Fast Large Almost Matching Exactly; DAME - Dynamic Almost Matching Exactly; FLAME-IV - Almost Matching Exactly with Instrumental Variables; MALTS - Matching After Learning to Stretch; AHB - Adaptive Hyperboxes (code) | (CRAN site): For large-scale interpretable matching in causal inference. Honorable mention for the 2022 John M. Chambers Statistical Software Award from the American Statistical Association.

Interpretable Neural Networks

ProtoPNet Interpretable Prototype Neural Networks (This Looks Like That) (code) | (bib) | (paper)

ProtoConcepts (This Looks Like Those) (code) | (paper)

Concept Whitening (code) | (paper): Places the unique information about a concept along an axis in the latent space.

Rule Lists and Falling Rule Lists

Certifiably Optimal RulE ListS (CORELS) (code) | (R-bindings by Dirk Eddelbuettel) | (paper): For classification, an alternative to decision trees. Predecessors to this code are Bayesian Rule Lists (BRL) and Scalable Bayesian Rule Lists (SBRL) (R interface, C code - Creative Commons License). The CORELS code is more efficient, so please use that.

Optimized Falling Rule Lists and Softly Falling Rule Lists (paper) | (code) | (bib): For classification where the probabilities decrease along the list.

Falling Rule Lists (FRL) (python code) | (paper) | (bib): For classification where the probabilities decrease along the list. This algorithm is Bayesian, based on sampling. The one above uses optimization.
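To make "the probabilities decrease along the list" concrete, here is a minimal sketch of how a falling rule list is read. The rules, feature names, and probabilities are hypothetical. This shows only prediction with a fixed list, not the Bayesian sampling or optimization used to learn one.

```python
# Hypothetical falling rule list: ordered (condition, probability) pairs.
RULES = [
    (lambda x: x["a"] and x["b"], 0.9),
    (lambda x: x["c"], 0.6),
    (lambda x: x["a"], 0.3),
]
DEFAULT_PROBABILITY = 0.1  # used when no rule applies


def is_falling(rules, default):
    """Check that probabilities never increase from one rule to the next."""
    probs = [p for _, p in rules] + [default]
    return all(earlier >= later for earlier, later in zip(probs, probs[1:]))


def predict(x, rules=RULES, default=DEFAULT_PROBABILITY):
    """Return the probability attached to the first rule whose condition holds."""
    for condition, probability in rules:
        if condition(x):
            return probability
    return default
```

Since the highest-risk rules come first, a reader can stop at the first rule that applies and know that no later rule would assign a higher probability.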

Name-based Ethnicity Classification

EthnicIA (code) | (paper)

Superresolution

Photo Upsampling via Latent Space Exploration of Generative Models (PULSE) (project page with code and online demo) | (paper)

Recidivism

Age of Unfairness (code) | (paper)

Summary Explanations

Globally Consistent Summary-Explanations (code) | (paper)

Recovery Curves

Recovery Curves (code) | (paper)

Multi-armed Bandits with Time Series Information

Regulating Greed Over Time (code) | (paper)

Crime Series Analysis

Series Finder (code) | (paper): For detecting crime series.

ROC Flexibility Data

ROC Flexibility Data (data): Used for several ranking papers.