Farm Fresh Insights


Latest — 08 Jul 2025

(Better) Split Decisions: Toward A More Structurally Stable CART

Classification and Regression Trees (CART) condense data into one transparent decision tree. Their weakness is structural instability: a marginal change to the training data, sometimes a single row, can replace an early split (for example, “age > 65” with “income > $50k”), and that change propagates into a completely different tree. This instability makes

More issues

Measuring Tree Stability

What is the expected stability, E[Sim(T₁, T₂)], where T₁ and T₂ are trees trained on perturbed (or bootstrapped) versions of the same dataset? Tree stability under data perturbation can be decomposed into interpretive stability (how trees explain decisions) and behavioral stability (what trees predict). Interpretive Stability: Trees can

What's Wrong? Adversarial LLM Judges With Their Own Evaluation Criteria

How should we evaluate what large language models say? Metrics like BLEU or exact match don't work well when answers are open-ended or involve reasoning. Human judgments are better—but they're costly and inconsistent. So increasingly, researchers have turned to LLMs themselves to act as evaluators.

WAR For Cricket

In baseball, WAR (Wins Above Replacement) is a comprehensive metric to quantify a player’s contribution to team success, relative to a replacement-level player. The idea has slowly migrated to other sports. In cricket, however, WAR remains underdeveloped. Most public evaluations still rely on averages, strike rates, or wickets — useful,

Pareto ML Deployments

In machine learning, a common deployment strategy is to replace an existing model with one that performs better overall. Another common strategy refines this approach by limiting deployment to user segments or regions where the improvements are clear. Both approaches allow regressions: new errors on cases that the old model

(Don't) Forget About It: Toward Pareto Improving GD

Machine learning models don't improve like traditional software. When we "update" a model, it sometimes begins to mishandle cases it previously solved—an outcome known as regression or “forgetting.” This issue is well-studied in continual learning, where models learn multiple tasks sequentially (French, 1999). Standard solutions

Greedy is Good. Less Greedy May be Better.

Forward stepwise regression, agglomerative hierarchical clustering, and CART rely on a simple principle: make the best local choice at each step. Greedy choices can also be optimal when problems possess the greedy choice property—where globally optimal solutions can be reached through locally optimal decisions, as in minimum spanning trees

Hungary For More? Optimal 1-to-Many Matching for Causal Inference

The Hungarian algorithm (Kuhn–Munkres) efficiently finds optimal one-to-one matches between treated and control units by minimizing total matching cost (typically Euclidean distance in covariate space). It has been used for estimating treatment effects via matching. But it has a limitation: it is strictly one-to-one. In many causal inference settings,

Optimizing Early Trajectories in K-Means Clustering: Lookahead Initialization for K-Means

K-Means performance depends heavily on how clusters are initialized. While k-means++ improves over random starts by spreading centroids apart, it’s still greedy and can lock into suboptimal configurations—especially in noisy or high-dimensional data. This post explores a simple tweak: lookahead initialization. For each candidate seed, we simulate a