Predicting March Madness with an ML Model

March Madness is one of the most unpredictable events in sports. Every year millions of brackets get busted in the first round. I wanted to see if I could do better than gut feeling, so I built a logistic regression model from scratch and trained it on 572 college basketball games.
The idea was simple. Take two teams, compute the gap between them across six measurable dimensions, and let the model learn how much each dimension matters for predicting who wins. No neural networks, no ensemble methods. Just a clean sigmoid curve mapping feature vectors to win probabilities.
I settled on six features after experimenting with different combinations:
- Spread: the raw point spread between the teams.
- Offense gap: the difference in offensive efficiency ratings.
- Defense gap: the same difference on the defensive side.
- Win % gap: how often each team actually wins across the season.
- Experience gap: games played, since teams with more reps tend to perform better under pressure.
- Schedule strength: the quality of opponents each team has faced, so a 12-win team from a strong conference gets more credit than one from a weak conference.
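Concretely, each matchup gets reduced to a six-element vector of gaps. The sketch below is a minimal reconstruction, assuming each team's season stats live in a dict; the field names are placeholders rather than the app's exact schema, and the defense-gap sign convention (lower rating treated as better defense) is my assumption.
```python
# Minimal sketch of the feature computation. Field names are illustrative.
def feature_vector(team_a, team_b, spread):
    """Six feature gaps for team A vs. team B (positive values favor A)."""
    return [
        spread,                                           # point spread
        team_a["off_rating"] - team_b["off_rating"],      # offense gap
        team_b["def_rating"] - team_a["def_rating"],      # defense gap
        team_a["win_pct"] - team_b["win_pct"],            # win % gap
        team_a["games_played"] - team_b["games_played"],  # experience gap
        team_a["sos"] - team_b["sos"],                    # schedule strength
    ]
```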
Each feature gets its own learned weight, plus a bias term, for seven parameters in total. The model trains over 300 epochs of gradient descent with a learning rate of 0.01. Each epoch, it makes predictions on all 572 games, computes the binary cross-entropy loss (which penalizes confident wrong predictions more heavily than uncertain ones), and nudges each weight in the direction that reduces the error.
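The whole loop fits in a few lines of NumPy. This is a sketch under the hyperparameters described above; details like zero initialization are my assumptions, not the original code.
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, epochs=300, lr=0.01):
    """Batch gradient descent on binary cross-entropy.

    X: (n_games, 6) matrix of feature gaps; y: (n_games,) 1 if team A won.
    """
    n, d = X.shape
    w = np.zeros(d)  # six feature weights
    b = 0.0          # bias term
    for _ in range(epochs):
        p = sigmoid(X @ w + b)       # predicted win probabilities
        grad_w = X.T @ (p - y) / n   # gradient of mean BCE loss w.r.t. w
        grad_b = np.mean(p - y)      # ...and w.r.t. the bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```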
The learned weights after training:
- Spread: +0.0983 (point spread between teams)
- Offense Gap: +0.0971 (raw offensive rating gap)
- Defense Gap: +0.0995 (raw defensive rating gap)
- Win % Gap: +0.1196 (season win rate difference)
- Experience Gap: +0.1012 (games played difference)
- Schedule Strength: +0.0997 (overall power rating gap)
- Bias: +0.0833 (base offset before features)
The weights are fairly evenly distributed, which tells me all six features carry real signal. Win percentage gap came out strongest at +0.1196, which makes intuitive sense: teams that win more tend to keep winning, especially in high-pressure tournament settings. Experience gap (+0.1012), schedule strength (+0.0997), and defense gap (+0.0995) cluster close behind, reinforcing the old sayings that defense wins championships and that strength of schedule matters more than raw record.
The core math is straightforward. Logistic regression maps any input to a probability between 0 and 1 using the sigmoid function: P(win) = 1 / (1 + e^(-w·x)). The sigmoid is that S-shaped curve you see in textbooks. When the weighted sum of features is large and positive, the probability approaches 1. When it's large and negative, it approaches 0. Near zero, it's a coin flip.
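A few concrete values show the shape:
```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(4.0))   # 0.982 -> heavy favorite
print(sigmoid(0.0))   # 0.500 -> coin flip
print(sigmoid(-4.0))  # 0.018 -> heavy underdog
```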
Training accuracy landed at 78.1%, which I was happy with for a model this simple. More importantly, the calibration is solid. When the model predicts a 75% win probability, those teams actually win about 75% of the time. That calibration matters more than raw accuracy because it means the probabilities are trustworthy, not just the binary predictions.
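Checking calibration is straightforward: bin the predicted probabilities and compare each bin's average prediction to the actual win rate. A rough sketch of that check (not the exact script I used):
```python
import numpy as np

def calibration_table(probs, outcomes, n_bins=10):
    """Print average predicted probability vs. actual win rate per bin."""
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            print(f"predicted ~{probs[mask].mean():.2f}, "
                  f"actual {outcomes[mask].mean():.2f}, n={mask.sum()}")
```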
At inference time, every prediction in the app runs through the same pipeline. Take two teams, compute their six feature gaps, multiply by the learned weights, add the bias, pass through the sigmoid function, and you get a win probability. The whole thing runs client-side so predictions are instant.
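The app itself does this client-side, but the math is small enough to show in full. Here's the same pipeline sketched in Python, using the learned weights listed above:
```python
import math

# Learned parameters from the list above
WEIGHTS = [0.0983, 0.0971, 0.0995, 0.1196, 0.1012, 0.0997]
BIAS = 0.0833

def win_probability(gaps):
    """gaps: the six feature gaps for team A vs. team B."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, gaps))
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid
```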
There are obvious ways to improve this. Adding more features like turnover rate, free throw percentage, or recent form would probably push accuracy into the low 80s. Using a more complex model like a random forest or gradient-boosted trees could capture non-linear relationships between features. But part of the point was to see how far you can get with the simplest possible approach, and 78.1% with a single sigmoid is a pretty good answer to that question.
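For a sense of what that upgrade might look like, here's a hypothetical swap to scikit-learn's random forest; the placeholder data just stands in for the real 572-game feature matrix.
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for the real 572-game feature matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(572, 6))            # feature gaps
y = (rng.random(572) > 0.5).astype(int)  # 1 if team A won

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)
p = model.predict_proba(X[:1])[0, 1]     # win probability for one matchup
```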