~ Addison Kline & Ryan Heaton

Figure: Overview of the model’s forward pass. Context variables (pitcher ID, count, game state) are processed through a Context Adapter and fused with token embeddings, then fed through 12 xLSTM blocks to predict the next token in the sequence.


Can you predict what a baseball pitcher will throw next? Not just the pitch type, but the speed, the spin, the trajectory, where it crosses the plate, and what happens when it gets there? We trained a 20-million-parameter neural network on nearly a decade of MLB pitch data to find out. The model achieved 65.8% accuracy across all predicted variables: 37 percentage points better than always guessing the most common value. But the more interesting question isn't how well it predicts; it's what it can and can't predict. Some aspects of a pitch are nearly deterministic once you've seen a few from that pitcher (forward velocity: 95% accuracy). Others are fundamentally unpredictable, even for the pitcher themselves (plate location: 23%). The model learned real baseball strategy, picked up on pitcher mechanics, and figured out which parts of the game carry signal and which are just noise. This is the story of how we built it.


Background

Addison first began PitchPredict as a side project in October 2024, around the time of the World Series. Prior to that, he had extensive experience mining and analyzing baseball data in Python, including projects like projection algorithms for players and teams, daily MLB game odds, and up-to-date league leaderboards with custom statistics. He runs a sabermetrics blog, baseball-analytica.com, where these projects (and much more) can be found.

Ryan has been working in ML since 2018, with notable previous work training medical pathology AI. He now runs Charon Labs LLC, where his team (of which Addison is a founding member) applies traditional ML concepts to swarms of LLM agents to enable new capabilities and higher reliability.

Development History

The first versions of PitchPredict used a similarity-based algorithm: compute similarity scores between the current context and historical pitches, then predict outcomes from the most similar ones.
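The idea, in a minimal sketch (the feature columns and distance weighting here are assumptions for illustration, not the library's actual implementation):

```python
import numpy as np
import pandas as pd

# Hypothetical feature columns; the real schema may differ.
FEATURES = ["balls", "strikes", "outs_when_up", "release_speed"]

def predict_next_pitch(history: pd.DataFrame, context: dict, k: int = 50) -> str:
    """Score historical pitches against the current context,
    then vote among the k most similar ones."""
    X = history[FEATURES].to_numpy(dtype=float)
    q = np.array([context[f] for f in FEATURES], dtype=float)
    scale = X.std(axis=0) + 1e-8              # normalize so no feature dominates
    dist = np.linalg.norm((X - q) / scale, axis=1)
    nearest = history.iloc[np.argsort(dist)[:k]]
    return nearest["pitch_type"].mode().iloc[0]
```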

The code is straightforward Python, leveraging numpy, pandas, and pybaseball. PitchPredict exposes a basic API and, as of December 2024, is publicly available on PyPI (pip install pitchpredict).

In early 2025, Addison began developing a neural network algorithm alongside the similarity approach. This would leverage deep learning and GPU power to predict pitches using a custom architecture. Though he works in AI, lower-level ML and model architecture aren't his forte, and the initial model performed poorly. This discouraged further work on the project for several months.

In November 2025, Addison returned to PitchPredict unexpectedly. At work, the team was building agent environments to test models on—and Addison’s job was building an environment simulating a baseball game. An AI agent could play as a pitcher trying to get batters out, a batter trying to score runs, or a manager making the moment-to-moment decisions. The simulation needed to handle state transitions (the dynamics of the simulated baseball game), and rather than reinvent the wheel, Addison revisited PitchPredict.
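A minimal sketch of what "state transitions" means here; the state fields and rules below are a hypothetical simplification, not the environment's actual schema:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class GameState:
    # Hypothetical minimal state; a real environment tracks far more.
    balls: int = 0
    strikes: int = 0
    outs: int = 0

def transition(state: GameState, outcome: str) -> GameState:
    """Advance the count after one pitch outcome."""
    if outcome == "ball":
        return replace(state, balls=state.balls + 1)
    if outcome in ("called_strike", "swinging_strike"):
        if state.strikes == 2:                 # strikeout ends the at-bat
            return GameState(outs=state.outs + 1)
        return replace(state, strikes=state.strikes + 1)
    return state                               # fouls, balls in play, etc. omitted
```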

The existing API didn’t take enough parameters for what the simulation needed. Armed with a concrete goal, Addison cloned the repo and began rebuilding—updating the API, rewriting the similarity algorithm, and reconsidering the neural network approach. When he mentioned this to Ryan, it sparked a collaboration. Ryan suggested that rather than a primitive feedforward network, we could use an xLSTM architecture—and offered to help build it.

The division of labor made sense: Addison would handle the API, data pipeline, and pitch-level tokenization; Ryan would build the xLSTM and tune hyperparameters.

The choice of xLSTM over Transformers came down to hardware. Training locally on a 6× RTX 4090 system meant memory was the bottleneck. xLSTM’s linear memory scaling with sequence length—versus the quadratic scaling of attention—made it the right choice for our setup.
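To make that concrete, a back-of-envelope comparison (the dimensions below are illustrative, not our actual configuration):

```python
# Rough per-layer activation memory, fp16.
seq_len, n_heads, bytes_fp16 = 8192, 8, 2

# Self-attention materializes a (seq_len x seq_len) score matrix per head,
# so memory grows quadratically with sequence length.
attn_bytes = n_heads * seq_len**2 * bytes_fp16

# An xLSTM block carries a fixed-size recurrent state (e.g. a d x d matrix
# memory per head), independent of sequence length.
d_state = 512
xlstm_bytes = n_heads * d_state**2 * bytes_fp16

print(f"attention: {attn_bytes / 2**20:.0f} MiB, xLSTM state: {xlstm_bytes / 2**20:.0f} MiB")
# -> attention: 1024 MiB, xLSTM state: 4 MiB
```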

Together we developed a tokenization protocol for pitches. Rather than creating a token for every possible pitch, we encoded each pitch's attributes (type, speed, spin, location, etc.) as separate tokens. Since LSTMs predict one token at a time, decomposing a pitch into a short token sequence fit naturally. Throughout November and December, we mined gigabytes of MLB pitch data and experimented with architectures, implementations, and hyperparameters.
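A sketch of what that looks like (the attribute set, bin widths, and vocabulary layout here are assumptions, not our exact scheme):

```python
import numpy as np

# Hypothetical attribute vocabularies.
PITCH_TYPES = ["FF", "SI", "SL", "CH", "CU"]
SPEED_BINS = np.arange(70.0, 105.0, 1.0)     # 1 mph bins
SPIN_BINS = np.arange(0.0, 3500.0, 100.0)    # 100 rpm bins

# Each attribute gets its own contiguous slice of the token vocabulary.
TYPE_BASE = 0
SPEED_BASE = TYPE_BASE + len(PITCH_TYPES)
SPIN_BASE = SPEED_BASE + len(SPEED_BINS) + 1

def tokenize_pitch(pitch_type: str, speed_mph: float, spin_rpm: float) -> list[int]:
    """One pitch -> a short sequence of attribute tokens."""
    return [
        TYPE_BASE + PITCH_TYPES.index(pitch_type),
        SPEED_BASE + int(np.digitize(speed_mph, SPEED_BINS)),
        SPIN_BASE + int(np.digitize(spin_rpm, SPIN_BINS)),
    ]

# A 95 mph four-seamer with 2300 rpm of spin becomes three tokens:
print(tokenize_pitch("FF", 95.0, 2300.0))    # [0, 31, 65]
```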

The Model in Brief

At its core, PitchPredict is a neural network trained to predict sequences—specifically, sequences of pitches. Neural networks learn patterns from data, and this particular flavor (called an xLSTM, for “extended Long Short-Term Memory”) is designed for sequential prediction. It processes tokens one at a time, building up an internal representation of what it’s seen so far, then uses that representation to predict what comes next. The “extended” part refers to architectural improvements that allow longer sequences without running into memory constraints—important when a pitcher might throw 20+ pitches in a session.
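In toy form, that sequential loop looks something like this (a bare recurrent cell for illustration, not the actual xLSTM equations):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 256, 64

# Toy parameters for one recurrent block (PitchPredict stacks 12 xLSTM blocks).
E = rng.normal(size=(vocab, d)) * 0.02      # token embeddings
W_h = rng.normal(size=(d, d)) * 0.02        # state -> state
W_x = rng.normal(size=(d, d)) * 0.02        # input -> state
W_out = rng.normal(size=(d, vocab)) * 0.02  # state -> next-token scores

def next_token_logits(tokens: list[int]) -> np.ndarray:
    """Process tokens one at a time, carrying a fixed-size state."""
    h = np.zeros(d)
    for t in tokens:
        h = np.tanh(h @ W_h + E[t] @ W_x)   # O(1) memory in sequence length
    return h @ W_out                        # scores for every candidate next token

logits = next_token_logits([3, 17, 42])
print(int(np.argmax(logits)))               # the model's guess for the next token id
```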