Phil Galfond Explains Game Theory Optimal in 5 Levels

Hey everybody, it's Phil. Today, I've been asked to explain Game Theory Optimal (GTO) poker in five levels of increasing complexity.

I'm starting with my six-year-old son and working all the way up to a poker expert.

Game theory optimal play in any game is a balanced strategy that can't be exploited. In poker, we use GTO solvers to enhance our understanding of how the game works, even though the strategies are too complex to memorize.

GTO Level 1: My 6-Year-Old Son

Imagine we're playing a game where I put a piece of candy in one of my hands, and you have to guess which one has the candy. If you get it right, you get the candy; otherwise, you don't.

Let's say we played it again and again and again. And let's say I told you exactly what my strategy was going to be. So if my strategy was I'm going to keep it in my right hand every time, what would you guess on the first try?

Yeah, you'd guess my right hand. And what would you guess on the second try?

You'd guess my right hand.

So, that strategy for me would not be very good if I want to keep my candy.

How am I going to come up with a strategy that I tell you that you can't figure out? Do you have any guesses?

Let's say that between each turn, I went over into the bathroom and I tossed a coin. And if it landed on heads, I would put it in my right hand. And if it landed on tails, I'd put it in my left hand. Now, let's say the first time I go away, I toss a coin.

Which one would you guess, and would you know that it was in there?

No, you wouldn't.

And on the second time, which would you guess? And would you know?

No, you would never know because I was doing something called randomizing. I was flipping the coin, and you don't know what side it's going to land on, and that's how I picked which hand to put it in. So, I told you my strategy, and you still couldn't figure it out.

That is game theory optimal play.
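For older readers, here is a minimal sketch of why the coin flip works. It assumes the hider announces a probability of using the right hand and the guesser then picks the best possible guess. The function name `guesser_best_win_rate` is made up for this illustration.

```python
def guesser_best_win_rate(p_right: float) -> float:
    """Best win rate for a guesser who knows the hider uses the
    right hand with probability p_right."""
    # Knowing the strategy, the guesser just picks the more likely hand.
    return max(p_right, 1.0 - p_right)


for p in (1.0, 0.8, 0.5):
    rate = guesser_best_win_rate(p)
    print(f"right hand {p:.0%} of the time -> guesser wins {rate:.0%}")
```

Only the 50/50 coin flip holds the guesser to winning half the time. Any lean toward one hand gives the guesser something to exploit.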

GTO Level 2: Teenager

Sup? So, let's say we were playing rock, paper, scissors and I wanted to come up with a strategy that could not be beaten. What do you think that might be?

So, let's say I switched between paper and rock and occasionally threw scissors, but mostly paper, rock, paper, rock, a little bit of scissors. Could I be beat?

The thing is, I could, because somebody could play against me and notice a pattern. In actuality, the only way to have no pattern would be to randomize evenly between rock, paper, and scissors.

The way I would do this is, let's say I have a random number generator app on my phone, which you probably have too. I would put three numbers in there. I'd go away. I'd press it. Say it shows number two, which is scissors for me; then I would throw scissors. That way, you would never know what I was going to do next.

My strategy is not going to beat your strategy, because I'm not even guessing. It's just random. But you're not going to be able to come up with a strategy that beats mine.
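Here is a rough sketch of that claim in Python. It computes the most an opponent can win per throw, on average, against a known mixed strategy. The 45/45/10 split is only an assumed stand-in for "mostly paper and rock, a little bit of scissors", and the helper names are made up for this example.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine: str, theirs: str) -> int:
    """+1 if my throw wins, -1 if it loses, 0 on a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_response_value(strategy: dict) -> Fraction:
    """Average winnings per throw for the best fixed counter-move."""
    return max(
        sum(prob * payoff(counter, move) for move, prob in strategy.items())
        for counter in MOVES
    )


uniform = {move: Fraction(1, 3) for move in MOVES}
# Assumed frequencies for the "mostly rock and paper" pattern.
patterned = {
    "rock": Fraction(9, 20),
    "paper": Fraction(9, 20),
    "scissors": Fraction(1, 10),
}

print(best_response_value(uniform))    # 0
print(best_response_value(patterned))  # 7/20
```

Against the even mix, no counter-move wins anything on average. Against the patterned mix, always throwing paper wins 7/20 of a point per throw.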

Now, we apply that to poker, but in a much bigger way.

GTO Level 3: College Student / Amateur Poker Player

Today, I'm going to teach you about the Ace-King-Queen game. This is a toy game, which just means a simplified version of poker. You can have toy games for other games too, but here it's a simplified version of poker that we can actually solve to come up with a GTO solution.

There's $100 in the pot.
We each have $100 in our stack.
You are going to get dealt a King.
I'm going to get dealt an Ace or a Queen.
If either of us bets, we are all in.

It's your turn to act first. So, let's figure out the perfect way to play.

Starting with you, you have a King. Half the time I have an Ace, half the time I have a Queen. Are you going to bet or check?

Check?

So, why did you check?

There is no reason for a King to bet because if you go all-in, then I am going to call you when I have an Ace and win. I am going to fold when I have a Queen, and you are not going to get any more money. So your play is to check. We actually just solved for one player.

Solving the Counter-Strategy

Now, what is my strategy going to be? When I have an Ace, I am going to go all-in because there is no reason not to. When I have a Queen, I am not always going to go all-in.

If I go all-in with an Ace every time and a Queen every time, then you have to call $100 to win $200 and I would be bluffing half the time. You would win $200 half the time and lose $100 half the time, which makes it an easy call.

I actually have to balance how often I bluff with my Queen. Since you are getting those 2:1 odds, I need to have an Ace twice as often as I have a Queen. I am going to go all-in with all of my Aces and half of my Queens. That way, you cannot exploit my strategy.
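The arithmetic behind that balance can be sketched as follows. This is only an illustration under the rules of the toy game ($100 pot, $100 all-in bet, Ace and Queen each dealt half the time). The function name `ev_of_calling` and the convention that folding is worth 0 are choices made for this example.

```python
from fractions import Fraction

POT = 100  # dollars already in the pot
BET = 100  # size of the all-in bet


def ev_of_calling(bluff_freq: Fraction) -> Fraction:
    """EV of calling with the King, relative to folding (EV 0).

    Aces always bet; Queens bet with frequency bluff_freq.
    """
    aces = Fraction(1, 2)                 # share of deals: betting Aces
    queens = Fraction(1, 2) * bluff_freq  # share of deals: bluffing Queens
    p_bluff = queens / (aces + queens)    # chance a bet is a bluff
    # A winning call collects the pot plus the bet; a losing call loses the bet.
    return p_bluff * (POT + BET) - (1 - p_bluff) * BET


for freq in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(f"bluff {freq} of Queens -> EV of calling = {ev_of_calling(freq)}")
```

Bluffing half of the Queens makes calling worth exactly 0, the same as folding. Always bluffing makes the call worth $50, and never bluffing makes it lose $100.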

The Final Decision: Indifference

We have solved the first bet for you and the first one for me. Now we get back to you for one final decision. You are facing an all-in bet. What do you do?

Well, it does not actually matter because I balanced my value bets and bluffs perfectly. But you actually have to balance your calls because if you do not balance your calls perfectly, I can exploit you either by never bluffing with my Queen or always bluffing with my Queen.

You have to get your fold ratio correct. Let us look at what I am risking with my bluffs and what I am getting. I am risking $100 with my bluffs to win $100 in the pot. We call that 1:1 odds. When I have 1:1 odds, it means that when I win, I win $100, and when I lose, I lose $100, so my bluff breaks even only if you fold exactly half the time. Much like I am bluffing with my Queen half the time, you have to call with your King half the time.
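The same check can be run from the bluffer's side. This is a sketch under the toy-game rules. It assumes that checking a Queen is worth 0, since it loses the showdown to the King, and the name `ev_of_bluffing` is made up for the example.

```python
from fractions import Fraction

POT = 100  # dollars already in the pot
BET = 100  # size of the all-in bluff


def ev_of_bluffing(call_freq: Fraction) -> Fraction:
    """EV of shoving a Queen, relative to checking (EV 0).

    The King calls with frequency call_freq and folds otherwise.
    """
    # A fold wins the pot; a call loses the bet.
    return (1 - call_freq) * POT - call_freq * BET


for freq in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(f"King calls {freq} of the time -> EV of bluffing = {ev_of_bluffing(freq)}")
```

If the King never calls, every bluff wins $100. If it always calls, every bluff loses $100. Only the half-and-half mix leaves the Queen indifferent between bluffing and checking.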

Now we have solved the entire game. The Ace-King-Queen game is very similar to river play in poker. We use these concepts to get a rough idea of how we play on the river, but it is a lot more complicated in practice.

GTO Level 4: Grad Student / Regular Poker Player

You understand the Ace-King-Queen game. You understand that on the river we have to balance our value hands and our bluffs, but we also have to do that on the flop and on the turn. This is called Range Construction.

We have to think about our betting range on the flop, the turn, and the river, as well as pre-flop. This ensures we cannot be exploited throughout the hand. It gets more complicated as we go because there are so many flops, turns, and rivers.

Avoiding the Vacuum

A common mistake is playing the flop, turn, or river in a vacuum. If you play the river in a vacuum but your flop and turn ranges are out of whack, your river ranges will be out of whack as well.

One thing we must consider on the flop is Board Coverage. When you are making a river bet, there are no more cards to come, so you do not have to think about board coverage. But on the flop, we have a turn and a river to come.

If we construct our strategy so that we never connect with straight-completing turns or flush-completing turns, or we never connect with the middle card pairing because we do not bet middle pair, that is a problem. On those boards, our range will not be well-supported. We will not be able to take aggressive action or take pots down even with our bluffs.

Bet Sizing and Range Advantage

Often on the flop, we do a lot of small betting, whereas on the turn and river, we incorporate larger bets. Sometimes a small bet will be my only sizing on the flop. The reason for that is that if you bet big, there are hands you really do not want to include, like middle pair or bottom pair. You would be forced to either include them, which is bad, or not include them, which is also bad because you lose board coverage.

Usually, the player with the Range Advantage does a lot of small betting on the flop to nudge that advantage. Then on the turn and river, we start opening things up.

Minimum Defense Frequency (MDF)

Just like on the river, we have to think about Minimum Defense Frequency. When you are calling a bet, you have to defend enough that your opponent's bluffs are indifferent. This is also true on the flop and turn.
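As a rough guide, the commonly used formula is MDF = pot / (pot + bet). That formula is not spelled out in the explanation above, so treat the sketch below as an assumption-labeled illustration. The bet sizes are arbitrary examples and the function name is hypothetical.

```python
from fractions import Fraction


def minimum_defense_frequency(bet_fraction: Fraction) -> Fraction:
    """MDF for a bet expressed as a fraction of the pot.

    Uses the standard pot / (pot + bet) formula with the pot set to 1.
    """
    return 1 / (1 + bet_fraction)


# Example bet sizes (assumed for illustration), as fractions of the pot.
for bet in (Fraction(1, 3), Fraction(1, 2), Fraction(3, 4), Fraction(1), Fraction(2)):
    mdf = minimum_defense_frequency(bet)
    print(f"bet {float(bet):.2f}x pot -> defend at least {float(mdf):.1%}")
```

A pot-sized bet gives 50%, matching the Ace-King-Queen game. Smaller bets require more defending and larger bets require less.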

However, there are exceptions. On some flops, turns, and rivers, you should fold a lot more or call a lot more than MDF would indicate. This is usually due to range advantage or board texture.

If your opponent has a range advantage, you can actually over-fold. If the board is particularly draw-heavy, you want to over-call often because you have more hands with equity.

Ask yourself these questions to help construct your range:

What does my range look like here?
Does my range like this card or this board?
Am I going to be able to hit different kinds of river cards?
Will I have good hands and bluffs on all runouts?

GTO Level 5: Expert / Poker Pro

You already understand how GTO works. You have studied solvers, memorized some strategies, and implemented them decently well. The main thing I want to talk to you about today is shifting your mindset regarding studying with solvers.

When you go into a solver trying to memorize all of these outputs and strategies, you are trying to emulate a theoretically break-even strategy. You are not going to nail it. When you try to play perfect solver poker, you fail because it is impossible. In doing so, you leave yourself exploitable because you are not trying to counter the adjustments of other players.

The Risk of Improper Range Construction

If, for example, you do not realize that you are probing too many value hands on the turn and not probing enough with bluffs, an astute opponent will take advantage of you. They will over-fold to your turn probe and bluff frequently against your check. If you are not seeing what they are doing and trying to counter it, they will get away with it repeatedly because your range construction is off.

You have tried to emulate the solver strategy, and you have not done it successfully because it is impossible. This is why I always advocate for not trying to play like a solver, but learning to think like a solver.

Understanding the "Why"

When I study with solvers, I am not trying to memorize a strategy. I am basically never trying to memorize a strategy, with some exceptions for pre-flop play. What I am trying to do is understand exactly why the solver is doing what it is doing.

Even though a solver uses brute force calculations to come to its conclusions, and the solver itself does not "know" why it is making a move, it ends up playing a perfect strategy that contains human-understandable logic.

When I study, I might come up with general heuristics:

These are the boards I want to bet small on.
These are the boards I want to bet big on.

Other than bucketing those together, I am just trying to pull concepts and understand the reasoning behind this perfect game theory optimal play.

Freeing Up Mental Bandwidth

When I get to the table, I am not trying to recall different outputs of simulations and copy them. Instead, I think: "I understand conceptually how this spot plays. You have a range advantage, so I should be defending with these kinds of hands and making sure to bluff with roughly these kinds of hands."

I execute that, and because I am not digging deep into my memory and using up all my mental bandwidth, I have room to take advantage of what you are doing. That is the hidden value of solver study: when you simplify your strategy and focus on concepts, you free up room for great exploitative play.

Where the Money is Made

Exploitative play is where all of the money is made. I have been in many situations, even against great players, where they are betting the river and I think they are bluffing 60% of the time when they are supposed to be bluffing 30%. Or perhaps they are bluffing 5% of the time when they should be at 30%.

I know this because of:

What I know about them as players.
The bet sizes they used throughout the hand.
The way the board ran out.
How difficult it is to actually have enough bluffs on a specific texture.
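To put rough numbers on the bluffing frequencies mentioned above, here is a hedged sketch of what a bluff-catcher earns by calling. The bet size is not given in the text. The example assumes a bet of 75% of a 100-chip pot, because that is the size at which a balanced betting range contains exactly 30% bluffs (bet / (pot + 2 × bet) = 75 / 250). The function name is made up for this illustration.

```python
from fractions import Fraction

POT = 100  # assumed pot size in chips
BET = 75   # assumed river bet: 75% of the pot


def ev_of_bluff_catching(bluff_share: Fraction) -> Fraction:
    """EV of calling with a hand that beats only bluffs, relative to folding (EV 0)."""
    # A winning call collects the pot plus the bet; a losing call loses the bet.
    return bluff_share * (POT + BET) - (1 - bluff_share) * BET


for share in (Fraction(5, 100), Fraction(30, 100), Fraction(60, 100)):
    ev = ev_of_bluff_catching(share)
    print(f"opponent bluffs {float(share):.0%} -> EV of calling = {float(ev):+.1f} chips")
```

At 30% the call breaks even. At 60% the call wins 75 chips on average, and at 5% it loses 62.5, so the read on the opponent decides whether calling or folding is correct.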

If you are just sitting there trying to regurgitate a memorized strategy, you have no room to make those plays that are worth tons of big blinds.