Farm Fresh Insights

Latest — 17 Jul 2025

Beam-GD

Gradient descent commits to a single direction at each step based on the local gradient. This myopic approach can be suboptimal when gradients are noisy, local geometry is misleading, or the loss landscape has multiple competing descent directions. The algorithm makes irrevocable decisions based on local information, potentially missing better

More issues

Good Enough: Satisficing in Production Machine Learning

Herbert Simon observed that managers rarely chase the global optimum. Instead, they set an aspiration level, a “good enough” performance, and quit searching once they found an option that met it. Simon called this satisficing. That habit makes sense because every decision is a trade‑off. In modeling, the benefit

Always-On Probability Calibration with Multiplicative Weights

Modern ad systems, recommendation pipelines, and risk models often rely on predicted probabilities: click-through rates (CTRs), conversion rates, or user intents. But raw model outputs are rarely well-calibrated. Left unchecked, miscalibration can distort bids, break fairness goals, or erode user trust. The problem: suppose your model produces raw probabilities, $p_

Advertising Without Signal: The Rise of the Grifter Equilibrium

Economists credit ads with two welfare‑enhancing roles: 1. Informative – trimming search costs (Stigler 1961). 2. Signaling – in classic models, high-quality sellers are more willing to incur large, sunk ad costs because they expect to recoup them through future sales, especially in experience-good markets where quality is learned over

Boosting Stability: Fixing XGBoost Instability Under Row Permutation

Shuffle your training data, and XGBoost might give you a different model, even when you keep features, hyperparameters, and random_state fixed. This behavior violates what most practitioners reasonably expect: that models should be invariant to row permutation. It can also lead to silent drift, flaky tests, and spurious alerts. This,

From Autopilot to Copilot: Designing Coding Assistants For Experts

Current assistants often generate large swaths of code in a single pass. This verbosity forces developers to reverse‑engineer the AI’s intent, verify subtle corner cases, and retrofit the output to project conventions. Because design rationale is rarely surfaced, reviewers struggle to trace decisions, and tests—if provided at

A Pink Revolution in Indian Policymaking

The 2025 Union budget earmarks 8.86%* of the total expenditure for gender-related programs (see MWCD). Pair that with the fact that over the last five years, more than 14 states have introduced women-centric unconditional cash transfer programs, reaching over 110 million women—nearly a fifth of India’s adult

(Better) Split Decisions: Toward A More Structurally Stable CART

Classification and Regression Trees (CART) condense data into one transparent decision tree. Their weakness is structural instability: a small change to the training data can replace an early split, such as “age > 65,” with “income > $50k,” and that change propagates into a completely new tree. This instability makes trees unsuitable for

Measuring Tree Stability

Consider training a decision tree to predict customer churn. You split your data in half and build a model on one half. The tree identifies tenure, monthly charges, and contract type as key factors. Satisfied with the results, you rebuild the model on the other random split—but now the

What's Wrong? Adversarial LLM Judges With Their Own Evaluation Criteria

How should we evaluate what large language models say? Metrics like BLEU or exact match don't work well when answers are open-ended or involve reasoning. Human judgments are better—but they're costly and inconsistent. So increasingly, researchers have turned to LLMs themselves to act as evaluators.