Quantal Response Equilibrium - The Next Evolution of GTO Solvers

At the end of April, GTO Wizard introduced a new solver feature, Quantal Response Equilibrium (QRE). This is not an invention of the GTO Wizard developers, but a concept from game theory that appeared back in 1995.

Traditional Nash equilibrium solvers assume perfect play by our opponent and ignore non-GTO lines. A new term has recently emerged for the latter: ghost lines. When real opponents make unexpected moves, the solving process of regular solvers breaks down. GTO Wizard AI is the first solver to implement QRE. The feature lets you derive an accurate strategy against non-optimal opponents without resorting to node locking.

Poker theory expert Tom Boshoff compares QRE and Nash in a 30-minute video and shows the advantages of the former using poker hands as an example. Here's a quick rundown of the main points.

The first commercial GTO solvers appeared on the market in 2015, and for the last decade, Nash equilibrium has reigned supreme as the gold standard of optimal poker strategy. Today, we challenge that paradigm by introducing quantal response equilibrium—a groundbreaking new approach for solving optimal poker. QRE represents the next evolution of GTO solving and will soon become the state of the art.

Let me show you why.

The Problem With Nash Equilibrium

This is the Nash equilibrium strategy for a 30-big blind heads-up pot. Everything is optimized and finely balanced, like the gears of a Swiss watch. But what happens when your opponent takes that Swiss watch and smashes it on the ground? What happens when they make a move that has a 0% frequency, representing no hands whatsoever?

This is what happens—a ghost line.

This is the fundamental problem with Nash equilibrium solvers: they have very poor responses to nodes that aren't supposed to happen.

So what's going on here? Why is the solver calling with Ten-Six and folding Ace-Nine? What is this ridiculous strategy?

What you're looking at is essentially a node in the game tree that has not converged. The solver hasn't finished solving the strategy here. You see, very early in the solving process, the small blind figured out, "Hey, I never have to shove here." And as soon as that happened, the big blind stopped improving their response to the shove.

What we're left with is this barely converged strategy.
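One way to see why the response stops improving: the big blind's overall EV is a sum over the small blind's lines, each weighted by how often that line is taken. The sketch below uses made-up numbers (the line names, frequencies, and EVs are illustrative assumptions, not solver output). It shows that once the shove frequency is zero, a better answer to the shove does not change the big blind's total EV, so an optimizer has no incentive to refine it.

```python
def big_blind_ev(sb_freqs, bb_ev_by_line):
    """Overall big blind EV: each line's EV weighted by how often the small blind takes it."""
    return sum(sb_freqs[line] * bb_ev_by_line[line] for line in sb_freqs)

# Hypothetical small blind strategy in which the open-shove is never used.
never_shoves = {"limp": 0.60, "raise": 0.40, "shove": 0.00}
# Same idea, but the small blind now shoves 1% of the time.
rarely_shoves = {"limp": 0.59, "raise": 0.40, "shove": 0.01}

# Two hypothetical big blind strategies that differ only in how well they answer a shove.
sloppy_response = {"limp": 1.2, "raise": 0.8, "shove": -3.0}
sharp_response = {"limp": 1.2, "raise": 0.8, "shove": 0.5}

# With a 0% shove frequency the two strategies are indistinguishable.
ev_sloppy_zero = big_blind_ev(never_shoves, sloppy_response)
ev_sharp_zero = big_blind_ev(never_shoves, sharp_response)

# With even a small shove frequency, the sharper response is worth more.
ev_sloppy_rare = big_blind_ev(rarely_shoves, sloppy_response)
ev_sharp_rare = big_blind_ev(rarely_shoves, sharp_response)

print(ev_sloppy_zero, ev_sharp_zero)
print(ev_sloppy_rare, ev_sharp_rare)
```

The second comparison hints at why a model that keeps every line alive at some small frequency ends up with a meaningful response everywhere.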

Now here's the issue: in reality, your opponents will take non-GTO lines. In reality, they might just open-shove here. And it would be nice if we had a good strategy against that without having to node lock or force the small blind to take some strategy, and without making guesses about what their range ought to look like.

That is the problem that quantal response equilibrium solves.

Now I'll solve this spot using GTO Wizard AI, which uses quantal response automatically.

How QRE Solves This Problem

Note that anytime you're doing any sort of custom solving from now on, it's going to use QRE. The default solutions—the pre-solved spots—will continue to use Nash. So I'll load up the quantal response equilibrium.

The first thing you'll notice is that it looks basically the same in common spots. In regular spots, QRE and Nash look identical. Where QRE differs is in the uncommon spots—in the ghost lines.

So when the small blind shoves here, instead of getting noise, we get a nicely converged solution where every hand that calls is a plus-EV call. In addition, if I click the ranges tab, I can actually see what the small blind’s range ought to look like. Apparently, it's a lot of vulnerable pocket pairs, strong Ace-X hands, and some trash with decent blockers—kind of what you'd expect the range to look like in this spot.

Now, I didn’t tell the small blind to use this range. I didn’t node lock it. I didn’t force it to bet. It came up with this on its own.

Practical Benefits of QRE

So, how does this help me as a poker player?

There are a number of key benefits.

Increased EV:
Because QRE handles ghost lines better, it's going to capture more EV against real-world mistakes—without needing to node lock. You can node lock if you want, but you don’t need to. That’s all baked into the rationality framework of QRE. And because of that, it consistently outperforms Nash equilibrium against imperfect opponents. Imperfect opponents sometimes take non-GTO lines—and, well, frankly, all of your opponents are at least a little bit imperfect. That leads to better real-world results.
Find More Leaks and Solutions:
Another cool benefit is that you can see what rational mistakes look like and learn how to punish them.
Superior Training:
QRE also leads to a better training experience. In a regular GTO solver, if you land in a spot that’s not supposed to happen, the solver just has no idea what to do. It looks at you like, “How did you get here with Ace-Queen, man? That’s not a line—I don’t know what to do.” Quantal response, on the other hand, knows what to do. It has a well-defined strategy even in these ghost lines.
Quicker Solves:
QRE also tends to lead to faster solve times on large, complicated game trees. So if you're the kind of person who likes to solve with a bunch of different bet sizes at once, QRE is going to be for you. In general, we’ve improved our neural network architecture to produce sharper, less exploitable strategies earlier in the solving process. On early streets, QRE is about 25% less exploitable, and we’ll cover that in our benchmarks.
Rock Solid Stability:
Lastly, the real motivation behind QRE is that it has this special quality called perturbation stability. You can poke at the equilibrium, and it remains stable—even in wildly complicated multi-way spots where many different equilibria might exist. So QRE is going to be a key component for multi-way solving and a bunch of other innovations we plan to roll out.

Hang in there—2025 is going to be a really exciting year.

Example of QRE In Action on a Real Hand

Let me give you a real-world example.

Seven months ago, somebody posted this to the r/pokertheory subreddit: “Can’t get GTO Wizard to give me an answer. Villain limp-called a 4x raise heads-up, then called a 70% bet. Then, versus my check, he rips it in. I can’t remotely figure out what to do in these spots, and GTO Wizard refuses to balance a realistic range.”

So he's sitting there with , facing this 4x overbet on a flush-completing turn, thinking: “What the hell am I supposed to do here?”

And in the past, there just wasn’t a good solution to this. You either had to node lock your opponent’s strategy—guess what they’re shoving—or just walk into some ghost line that the solver doesn’t know how to handle.

But now, we actually have a proper solution. Let me first show you what the Nash response looks like to this shove.

Nash Equilibrium Response

Here’s the setup: we’re in a heads-up spot. Villain, in the small blind, limps. We raise 4x, and they call.

Flop comes .
We bet 75% pot. This is a spot where solvers often mix between overbets and small bets, mostly leaning toward small bets, but 75% is fine.
They call.
Turn is the , which completes the flush draw.
We check, and then villain rips it all in for 450% of the pot.

It may come as a surprise to you, but betting 450% of the pot is not an ideal strategy here. In fact, when they shove, they have no hands. The selected line is rarely used in GTO, so we get an error: “Solution may be inaccurate.”

If we go into what our opponent’s range looks like, we can see it’s... nothing.

So what is the best response to a nothing range? Well, we end up with a strategy that just hasn’t converged.

If I look at the equity, it says everything is 50%. It’s like the old saying: “It’s 50/50—either he has it, or he doesn’t.”

This is just the solver giving up, essentially. It’s an error based on the fact that the small blind has no range in this spot.

Even worse, it’s not a very good strategy. We’re folding some flushes. That’s pretty odd.

If we look at two pair hands, you’d think having the should make you call more than, say, . Or if you have top pair with the , that hand should pure call—or at least call more than other top pair hands.

But none of that is happening. Again, this spot just hasn’t converged. The small blind shouldn't be shoving here at all.

So that’s the Nash response. And it’s not good.

QRE Response

Again, they shouldn’t be shoving here. This is a bad line. But now, with QRE, we actually get a much better strategy.

First of all, we’re not folding flushes against this size.

If we take a look and compare two pair, we see things like: if you have the , you call more often than if you don’t. If we look at top pair hands, we see that calls more often than without a .

These are the obvious, intuitive adjustments you’d expect. And now we actually see them in the output.

We can also see the equity and expected value, which are clearly defined. Everything is nicely converged, even in this ghost line.

Even though the small blind shouldn’t be shoving, we still get a strong, rational response.

And even better, we can now look at what their perceived range ought to look like.

If I click on the ranges, we can actually see what they would shove if they were to take this action. On the right-hand side, we see shoves with bottom set—pocket twos. Interesting choice, since you’re going to get called by a lot of flushes, but you have outs. We see pocket threes—just a pure bluff—and pocket deuces.

There’s about 30% flushes, 16% sets. For bluffs, we see hands like —combo draws—and then a lot of top-end flushes.

So it’s a kind of realistic-looking strategy.

And that’s the benefit of QRE. It’s not just that we get good responses to ghost lines. We also get visibility into what a plausible mistaken strategy might look like.

Technical Explanation: How QRE Works

First, understand this: there are multiple paths to perfection.

People talk about GTO as if it’s one singular unexploitable strategy. In reality, there are many strategies that fall within even the strictest accuracy bounds.

Say we solve to 0.1% of the pot. That leaves a small margin of wiggle room, and within that space there are lots of valid strategies.

Unlike Nash equilibrium, which assumes perfect rationality, QRE allows for a small degree of irrationality. It intentionally makes tiny mistakes, defined by a constant we call lambda (λ). The higher the lambda, the more irrational the model. QRE uses a softmax function to choose actions, and lambda is part of that.

In Nash equilibrium, lambda equals zero—perfect rationality. QRE starts with a higher lambda and slowly reduces it (this is called annealing) as the solve progresses. You don’t have to anneal it, but it helps the solver converge faster. So we start with mistakes, then reduce them. But along the way, the solver learns how to respond to those mistakes—and remembers.
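To get a feel for annealing, here is a minimal sketch. It assumes a geometric decay schedule and a two-action logit choice rule, with EVs measured in fractions of the pot. The actual schedule and scaling GTO Wizard uses are not stated here, so the function names and numbers are purely illustrative.

```python
import math

def lambda_schedule(start, end, steps):
    """Geometric decay from `start` to `end` over `steps` values (an assumed schedule)."""
    if steps == 1:
        return [start]
    ratio = (end / start) ** (1.0 / (steps - 1))
    return [start * ratio ** i for i in range(steps)]

def mistake_frequency(ev_gap, lam):
    """Chance of picking the worse of two actions under a logit (softmax) rule.

    ev_gap is how much EV the better action gains over the worse one (>= 0).
    Written with exp(-x) so that large ev_gap / lam cannot overflow.
    """
    x = ev_gap / lam
    return math.exp(-x) / (1.0 + math.exp(-x))

schedule = lambda_schedule(start=1.0, end=0.01, steps=5)
frequencies = [mistake_frequency(0.05, lam) for lam in schedule]

for lam, freq in zip(schedule, frequencies):
    print(f"lambda={lam:.4f}  mistake frequency={freq:.4f}")
```

Early in the schedule the worse action is taken almost half the time, and by the end it is rare but never exactly zero. That is the sense in which every line keeps being visited while the solve tightens up.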

To illustrate this, I built an interactive calculator (Editor's note: you can test this for yourself with this spreadsheet from GTO Wizard). This is going to show how QRE works.

Let’s say you’re on the river with a hand that can either check and always lose, or bluff and make 5% of the pot.

In a classic Nash solver, it would always bluff—because that’s the higher EV play.

But in QRE, it sometimes chooses the mistake. The frequency depends on lambda.

If lambda is very high, the strategy becomes random. If lambda is low (like 0.1), it mostly chooses the right action but still occasionally makes a small mistake.
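As a rough worked example, assume the choice rule is the standard logit (softmax) form with λ acting as a temperature, and that EVs are measured in fractions of the pot. Neither detail is spelled out here, and the spreadsheet's scaling may differ, so treat the numbers as illustrative only. With EV(check) = 0 and EV(bluff) = 0.05:

```latex
\[
P(\text{bluff})
  = \frac{e^{0.05/\lambda}}{e^{0/\lambda} + e^{0.05/\lambda}}
  = \frac{1}{1 + e^{-0.05/\lambda}}
\]
\[
\lambda = 1:\ P(\text{bluff}) \approx 0.512, \qquad
\lambda = 0.1:\ P(\text{bluff}) \approx 0.622, \qquad
\lambda = 0.01:\ P(\text{bluff}) \approx 0.993
\]
```

Under these assumptions, only the ratio of the EV gap to λ matters: a bigger gap or a smaller λ both push the frequency of the better action toward 100%.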

In practice, we only use a very low amount of irrationality—just enough to create well-rounded strategies that handle ghost lines.

I thought this really helped me understand what was going on with QRE.

You might also wonder how it works when there are more than two available actions. So I built a second calculator for that situation—specifically for spots where you’re facing a bet.

In this case, you have three options: fold, call, or raise. I initially set all expected values to zero, so the model takes each action equally. Everything is perfectly split.

Now let’s say raising is clearly profitable, and calling is clearly not. In a rational model, you’d always raise because it's the highest EV action—and that’s exactly what Nash equilibrium would do.

But with an irrational model, sometimes it will still choose to call or fold. At a low frequency, these mistakes still occur. As we solve, these imperfections get reduced, but the solver remembers how to respond to them.
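A three-action version of the calculator can be sketched in a few lines. This assumes a plain softmax over EVs with λ as a temperature. The function name `quantal_response` and the EV values are hypothetical, chosen only to mirror the fold/call/raise example.

```python
import math

def quantal_response(evs, lam):
    """Softmax over action EVs with temperature `lam` (higher = more random)."""
    top = max(evs.values())
    # Subtract the best EV before exponentiating for numerical stability.
    weights = {action: math.exp((ev - top) / lam) for action, ev in evs.items()}
    total = sum(weights.values())
    return {action: w / total for action, w in weights.items()}

# All EVs equal: every action is taken a third of the time.
even = quantal_response({"fold": 0.0, "call": 0.0, "raise": 0.0}, lam=0.1)
print(even)

# Hypothetical EVs (fractions of the pot): raising is profitable, calling loses.
evs = {"fold": 0.0, "call": -0.2, "raise": 0.3}
results = {lam: quantal_response(evs, lam) for lam in (1.0, 0.1, 0.01)}

for lam, freqs in results.items():
    print(lam, {action: round(p, 4) for action, p in freqs.items()})
```

At every λ the actions stay ordered by EV (raise, then fold, then call). Lowering λ only concentrates more of the frequency on the raise.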

Of course, this is only a small part of what goes into QRE. There’s a lot more happening under the hood—some of it beyond my technical knowledge—but I still found it helpful to get a feel for how mistake probabilities work in practice. If you’re a spreadsheet nerd like me, I think you’ll appreciate it too.

Quantal Response Equilibrium is the natural next step in the evolution of GTO strategy.

While Nash equilibrium optimizes for standard lines, QRE optimizes every decision—including the ones that aren't supposed to happen.

Put simply, this algorithm outperforms Nash against opponents who make mistakes. And since all of your opponents make mistakes, that means it outperforms in the real world.

But this upgrade isn’t just about responding better to ghost lines. QRE is also a foundational step toward solving more complex formats, like multi-way pots, and enables several new projects our engine team is actively working on.

