MCM / February 8, 2026

In MCM, we tried to estimate a number that is never publicly released

What fascinated me most about 2026 MCM Problem C was simple: the weekly fan vote is never disclosed, yet the eliminations are real. You are given incomplete information and still asked to reconstruct the structure behind it.

Core problem: Inferring fan votes from elimination outcomes

What we built: Three parallel models for comparison

Final recommendation: The percentage-based mechanism performed best overall

I like it because it is a classic “inverse problem”

This problem does not give you a complete dataset and ask you to run regression or classification. Instead, it presents partial outcomes and asks: can you infer the hidden components in between?

Fan votes are unobserved, but weekly eliminations, scoring rules, and judges’ scores are all observable. In other words, what you are given is not the answer, but a set of constraints. Your task is to construct an estimation system within these constraints that is internally consistent, logically sound, and capable of explaining subsequent outcomes.
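To make the "constraints, not answers" idea concrete, here is a minimal sketch. The scoring rule, names, and numbers are all hypothetical (the actual problem's rule may differ): each week, the eliminated contestant must have the lowest combined total, so every candidate fan-vote vector either satisfies that inequality or is ruled out.

```python
def combined(judge_share, fan_share, w=0.5):
    """Hypothetical percentage-based rule: a 50/50 blend of the two shares."""
    return w * judge_share + (1 - w) * fan_share

def satisfies_elimination(judge_shares, fan_shares, eliminated):
    """Check that `eliminated` has the lowest combined total.

    judge_shares / fan_shares: dict name -> share (each sums to 1).
    This is the inequality constraint an estimated fan-vote vector must satisfy.
    """
    totals = {name: combined(judge_shares[name], fan_shares[name])
              for name in judge_shares}
    return min(totals, key=totals.get) == eliminated

# One observed week (made-up numbers): judges scored A highest, C lowest.
judges = {"A": 0.45, "B": 0.35, "C": 0.20}

# Candidate fan-vote vectors; only one is consistent with C's elimination.
consistent = {"A": 0.30, "B": 0.45, "C": 0.25}    # C still lowest overall
inconsistent = {"A": 0.10, "B": 0.30, "C": 0.60}  # fans would have saved C

print(satisfies_elimination(judges, consistent, "C"))    # True
print(satisfies_elimination(judges, inconsistent, "C"))  # False
```

The feasible region carved out by all such inequalities, one per week, is what any estimation method, optimization-based or Bayesian, has to search within.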

There is something inherently compelling about producing an answer that is logically bounded, interpretable, and implementable in code, even under incomplete information.

Why we ended up with three models

Initially, we tried to solve everything with a single method. But as we progressed, it became clear that this problem does not lend itself to a single-path solution, so we eventually divided our approach into three categories.

The first was a relatively direct optimization-based method. It is fast, transparent, and suitable for running through the entire pipeline. The second was a more theoretically rigorous Bayesian + MCMC framework, which explicitly models how fan popularity evolves over time. The third was a hybrid MAP-Laplace model, designed to balance rigor and computational efficiency.
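The core of the Bayesian framework can be illustrated with a toy sampler. Everything below is a simplified sketch with made-up data, not our actual model: fan popularity follows a random walk over weeks, each elimination contributes a soft likelihood that the eliminated contestant had the lowest combined total, and a few random-walk Metropolis steps explore the posterior.

```python
import math
import random

random.seed(0)

judges = [  # judge shares per week, 3 contestants (made-up data)
    [0.45, 0.35, 0.20],
    [0.50, 0.30, 0.20],
]
eliminated = [2, 1]  # index of the contestant eliminated each week

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def log_post(theta):
    """theta: per-week logits of fan shares.

    Random-walk prior across weeks (week 0 is anchored at zero logits),
    plus a soft elimination likelihood: the lower a contestant's combined
    total, the more probable their elimination.
    """
    lp = 0.0
    for t, row in enumerate(theta):
        prev = theta[t - 1] if t > 0 else [0.0] * len(row)
        lp += sum(-0.5 * (a - b) ** 2 for a, b in zip(row, prev))
        fans = softmax(row)
        total = [0.5 * j + 0.5 * f for j, f in zip(judges[t], fans)]
        probs = softmax([-10.0 * v for v in total])  # "softmin" over totals
        lp += math.log(probs[eliminated[t]])
    return lp

# Random-walk Metropolis over the logits.
theta = [[0.0, 0.0, 0.0] for _ in judges]
cur = log_post(theta)
for _ in range(2000):
    prop = [[v + random.gauss(0, 0.3) for v in row] for row in theta]
    new = log_post(prop)
    if math.log(random.random()) < new - cur:  # accept/reject step
        theta, cur = prop, new

print([round(f, 2) for f in softmax(theta[0])])  # one draw of week-1 fan shares
```

The real model replaces the toy likelihood with the actual scoring rule and runs far more iterations; the MAP-Laplace variant instead maximizes `log_post` and approximates the posterior with a Gaussian at the mode, trading sampling cost for a closed-form uncertainty estimate.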

What interested me most was not which model is “more advanced,” but the clarity that emerges when placing them side by side. Each method represents a different trade-off among accuracy, efficiency, and interpretability. They do not replace one another; they answer different aspects of the same problem.

What this modeling process really trained me in was tightening logic

It is easy in competitions like MCM to fall into the illusion that more models are better. But this time, I felt more strongly that models are just shells. What truly matters is whether you have fully understood the problem.

Why can elimination outcomes be translated into inequality constraints? Why do rank-based and percentage-based approaches lead to fundamentally different information loss? Why is temporal continuity worth modeling? If these questions are not resolved, the code becomes hollow.
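The rank-versus-percentage question in particular has a crisp demonstration. In this sketch (made-up numbers, hypothetical 50/50 rule), two fan-vote vectors produce identical ranks but very different margins; a rank-based rule cannot distinguish them, while a percentage-based rule can, and that gap is exactly the information loss:

```python
def ranks(shares):
    """Rank each share, 1 = lowest. Assumes no exact ties."""
    order = sorted(range(len(shares)), key=lambda i: shares[i])
    r = [0] * len(shares)
    for pos, i in enumerate(order):
        r[i] = pos + 1
    return r

fans_close = [0.35, 0.33, 0.32]   # nearly tied
fans_spread = [0.64, 0.24, 0.12]  # same ordering, huge margins

print(ranks(fans_close))   # [3, 2, 1]
print(ranks(fans_spread))  # [3, 2, 1] -- indistinguishable to a rank-based rule

# Under a percentage-based blend, the margins matter and the
# eliminated contestant (lowest combined total) differs.
judges = [0.20, 0.35, 0.45]
for fans in (fans_close, fans_spread):
    total = [0.5 * j + 0.5 * f for j, f in zip(judges, fans)]
    print(min(range(3), key=lambda i: total[i]))  # prints 0, then 2
```

Ranks compress a continuous share vector into an ordering, so distinct fan-vote distributions collapse into the same observation; this is one reason the percentage-based mechanism recovered more.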

I came to value a particular working state: instead of rushing toward conclusions, we repeatedly questioned whether each assumption holds, whether each step truly explains the data, and what exactly each model solves. This matters more than simply making the program run, because it trains internal consistency in thinking.

If I had to describe what MCM left me with, it would be a kind of patience with complex problems, and a preference for work that requires structure, explanation, and actual implementation.