Solving Jimbo
· 5 min · algorithms , game theory
I recently started playing the survival game Don’t Starve Together (DST). A while back DST had a collaboration with Balatro, and from it came a machine called JIMBO. Players can play a Balatro-style minigame to earn points, which may then result in either a reward or a penalty. While the full set of rules can be found in the wiki, here is a summary:
- The game starts by choosing one of three randomly offered Jokers out of a pool of 18
- A five-card hand is dealt from a standard 52-card deck
- The player may discard any subset of the hand up to two times. The empty slots will be redrawn from the deck.
- The chosen Joker can modify scoring and may react to draws, discards, or the final hand
- Each hand type provides a number of chips and a multiplier, Joker effects may modify either value, and the final score is their product
The game doesn’t take long to play, and the in-game rewards for getting high scores were actually really good, so I wanted to maximize my earnings. However, online strategies for this minigame were mostly undocumented, and the few resources that appeared were speculative and all anecdotal, so I decided to a more exact method: a solver that, given a joker and a starting hand, returns the discard with the highest expected final score.
What first stuck out to me was how small the decision space was. At each round, there are only possible moves the player can make, corresponding to the subsets of the hand to discard. The number of 5-card hands is also not that large, , small enough to enumerate once. However, since the game requires two rounds of discarding, a complete search was still infeasible due to the how quickly the search space balloons.
Expected Contribution#
The game uses this scoring rule:
By the time both discards are done, part of the score is already fixed. Let the current chip count and multiplier be and , respectively. A final hand contributes additional chips and additional multiplier , so the score becomes:
Once and are fixed, a hand enters the score through only three quantities: , , and their product . A discard induces a set of equally likely final hands, and we want its expected score. Taking the expectation of the expansion, each expectation is a sum over the set divided by its size.
If every final hand carries the vector
then summing over any set of final hands gives the three numbers we need: , , and .
Some interesting title#
The nice thing about is that it depends only on the final hand and not on how we got there. This means we can compute it once for every hand the deck can produce and reuse those numbers no matter what situation the solver is asked about.
For a set of cards , define as summed over every five-card hand that contains :
Using this, we can quickly calculate the expected contribution to a set of cards before the final hand using a single lookup. However, this will overcount as when a card is discarded, it will no longer show up in future rounds, but still counts hands that draw it.
Let be the set of discarded cards. Using inclusion-exclusion, we are able to group hands that contain and avoid :
A discard keeps some subset of the visible five-card hand and throws the rest away. The thrown-away cards are now dead too. Therefore, the intersection between and a legal final hand must be exactly :
Every hand counted by has some exact intersection with , and that is a superset of , so
Employing a Möbius inversion, we can recover the term we actually want:
It’s worth noting why these two corrections were split. Asking for is the same as asking for a hand that contains while avoiding both the cards discarded in earlier rounds, , and the newly discarded visible cards :
For one choice of , this formula is simpler. The problem is that the solver wants all 32 choices at once. If we add into the dead cards, then every choice of needs something slightly different. Across all choices, that becomes
lookups. Since the Möbius inversion is able to compute the values of in one pass, the split method becomes much less expensive.
Solving the Rounds#
We work backwards. In round 2 there are no future decisions left, so each possible choice of cards to keep can be evaluated directly: look up its value, convert it into an expected score, and keep the best move.
Starting from Round 1 is brute force. Pick which cards to keep in the first round, enumerate every possible replacement draw, and solve the resulting second-round hand optimally. The value of the first-round move is the average of those second-round values, and the solver chooses the move with the highest average.
Limitations#
The solver only supports joker effects that don’t require the set of cards to be ordered. For example, Wendy is a joker that rewards discards that become the same suit. This forces me to consider the order of the hand and redraws, which drastically increases the computation to levels that are unrealistic to run quickly and on consumer-grade hardware.
Code#
An implementation of this is available on github. I used a substantial amount of AI to make it, so it may not work that well.
A version of this with a UI is available here.