Solving Jimbo · Andy Chen

I recently started playing the survival game Don’t Starve Together (DST). A while back DST had a collaboration with Balatro, and from it came a machine called JIMBO. Players can play a Balatro-style minigame to earn points, which may then result in either a reward or a penalty. While the full set of rules can be found in the wiki, here is a summary:

The game starts by choosing one of three randomly offered Jokers out of a pool of 18
A five-card hand is dealt from a standard 52-card deck
The player may discard any subset of the hand up to two times. The empty slots will be redrawn from the deck.
The chosen Joker can modify scoring and may react to draws, discards, or the final hand
Each hand type provides a number of chips and a multiplier, Joker effects may modify either value, and the final score is their product

The game doesn’t take long to play, and the in-game rewards for getting high scores were actually really good, so I wanted to maximize my earnings. However, online strategies for this minigame were mostly undocumented, and the few resources that appeared were speculative and all anecdotal, so I decided to a more exact method: a solver that, given a joker and a starting hand, returns the discard with the highest expected final score.

What first stuck out to me was how small the decision space was. At each round, there are only $32$ possible moves the player can make, corresponding to the $2^5$ subsets of the hand to discard. The number of 5-card hands is also not that large, $\binom{52}{5} = 2{,}598{,}960$ , small enough to enumerate once. However, since the game requires two rounds of discarding, a complete search was still infeasible due to the how quickly the search space balloons.

Expected Contribution#

The game uses this scoring rule:

\text{score} = \text{chips} \times \text{mult}

By the time both discards are done, part of the score is already fixed. Let the current chip count and multiplier be $C$ and $M$ , respectively. A final hand $H$ contributes additional chips $\Delta C(H)$ and additional multiplier $\Delta M(H)$ , so the score becomes:

\bigl(C + \Delta C\bigr)\bigl(M + \Delta M\bigr) = CM + C\,\Delta M + M\,\Delta C + \Delta C\,\Delta M,

Once $C$ and $M$ are fixed, a hand enters the score through only three quantities: $\Delta C$ , $\Delta M$ , and their product $\Delta C\,\Delta M$ . A discard induces a set $\Omega$ of equally likely final hands, and we want its expected score. Taking the expectation of the expansion, each expectation is a sum over the set divided by its size.

\begin{aligned} \mathbb{E}[\text{score}] &= CM + M\,\mathbb{E}[\Delta C] + C\,\mathbb{E}[\Delta M] + \mathbb{E}[\Delta C\,\Delta M] \\ &=CM + \frac{M\sum \Delta C + C\sum \Delta M + \sum \Delta C\,\Delta M}{\left| \Omega \right|} \end{aligned}

If every final hand carries the vector

\mathbf{w}(H) = \bigl(\,\Delta C,\ \Delta M,\ \Delta C\,\Delta M\,\bigr)

then summing $\mathbf{w}$ over any set of final hands $\Omega$ gives the three numbers we need: $\sum \Delta C$ , $\sum \Delta M$ , and $\sum \Delta C\,\Delta M$ .

Some interesting title#

The nice thing about $\mathbf{w}(H)$ is that it depends only on the final hand and not on how we got there. This means we can compute it once for every hand the deck can produce and reuse those numbers no matter what situation the solver is asked about.

For a set of cards $S$ , define $A(S)$ as $\mathbf{w}(H)$ summed over every five-card hand that contains $S$ :

A(S) = \sum_{H:\ S \subseteq H} \mathbf{w}(H).

Using this, we can quickly calculate the expected contribution to a set of cards before the final hand using a single lookup. However, this will overcount as when a card is discarded, it will no longer show up in future rounds, but $A(S)$ still counts hands that draw it.

Let $D$ be the set of discarded cards. Using inclusion-exclusion, we are able to group hands that contain $S$ and avoid $D$ :

A_D(S) = \sum_{H:\ S \subseteq H,\ H \cap D = \emptyset} \mathbf{w}(H) = \sum_{X \subseteq D} (-1)^{|X|}\, A(S \cup X).

A discard keeps some subset $S$ of the visible five-card hand $V$ and throws the rest away. The thrown-away cards are now dead too. Therefore, the intersection between $V$ and a legal final hand $H$ must be exactly $S$ :

E(S) = \sum_{H:\ H \cap V = S,\ H \cap D = \emptyset} \mathbf{w}(H).

Every hand counted by $A_D(S)$ has some exact intersection $T$ with $V$ , and that $T$ is a superset of $S$ , so

A_D(S) = \sum_{T:\ S \subseteq T \subseteq V} E(T).

Employing a Möbius inversion, we can recover the term we actually want:

E(S) = \sum_{T:\ S \subseteq T \subseteq V} (-1)^{|T| - |S|}\, A_D(T).

It’s worth noting why these two corrections were split. Asking for $H \cap V = S$ is the same as asking for a hand that contains $S$ while avoiding both the cards discarded in earlier rounds, $D$ , and the newly discarded visible cards $V \setminus S$ :

E(S) = A_{D\,\cup\,(V \setminus S)}(S).

For one choice of $S$ , this formula is simpler. The problem is that the solver wants all 32 choices at once. If we add $V \setminus S$ into the dead cards, then every choice of $S$ needs something slightly different. Across all choices, that becomes

\sum_{S \subseteq V} 2^{|D| + |V \setminus S|} = 2^{|D|} 3^5

lookups. Since the Möbius inversion is able to compute the values of $E(S)$ in one pass, the split method becomes much less expensive.

2^5 2^{|D|} + 5 \cdot 2^5 = 32 \cdot 2^{|D|} + 160.

Solving the Rounds#

We work backwards. In round 2 there are no future decisions left, so each possible choice of cards to keep can be evaluated directly: look up its $E(S)$ value, convert it into an expected score, and keep the best move.

Starting from Round 1 is brute force. Pick which cards to keep in the first round, enumerate every possible replacement draw, and solve the resulting second-round hand optimally. The value of the first-round move is the average of those second-round values, and the solver chooses the move with the highest average.

Limitations#

The solver only supports joker effects that don’t require the set of cards to be ordered. For example, Wendy is a joker that rewards discards that become the same suit. This forces me to consider the order of the hand and redraws, which drastically increases the computation to levels that are unrealistic to run quickly and on consumer-grade hardware.

Code#

An implementation of this is available on github. I used a substantial amount of AI to make it, so it may not work that well.

A version of this with a UI is available here.