NOTE: The mod10-analytics project has been significantly modified since this post was written. For a more current view of it, see the project’s README in the GitHub repository.

This past January, I decided to learn Haskell. The wonderful Real World Haskell got me through my first sessions in ghci, and I still go back to it regularly. Learn You a Haskell for Great Good! was also useful, but in retrospect, it might have been more helpful had I started with it. I should mention that learning Haskell wasn’t my first experience with functional programming. Racket—or Scheme, if you prefer—is familiar from some of my coursework at Oberlin, so none of the basic concepts of FP were new when I started with the new language.

As I was taking my first steps in Haskell, I was also dusting off my Winter Term project from last year, for which I wrote a Cocoa application to play a card game called Mod10. I’ve written about that project recently; if you’ve never heard of Mod10, you should read up before continuing here. (If you’re on OS X and you’d like to try the game, you can find the Cocoa app I wrote on GitHub.) When the time came to try to “make something” in Haskell, I had that project on my mind and decided to run with it. Mod10 is not a game that anyone wins very often, and I’ve always wondered, “Just how hopeless is it?” A few hundred lines of Haskell later, I’ve got answers to more than just that.

Infinite Loops

Before I get to the details of this project, I should explain one of the quirks of the game. That is, the game occasionally enters an infinite loop that’s inescapable without breaking the rules. It happens when the ordering of the deck and the stacks is such that the deck is cycled in some constant order, where each card is consistently dealt to the same stack in the same order. To see this, consider the simplest example, in which a single stack is left on the table, and a single card is left in the deck:

The 3d—“three of diamonds”, that is—-is dealt, the (Kd,7c,3d) triplet is returned to the deck, the Kd and 7c are dealt, and the process starts over. Having a experienced this a few times in manual play, I knew to watch out for it in the simulation.

The Implementation

Building the simulation was remarkably straightforward (The interesting part has been figuring out what to do with all the resulting data!) The code is all on GitHub, so I’ll leave the grittier details to the code and the comments therein. It’s quite well documented, compiles without warnings, etc. For the sake of discussion, these are the project’s executables:

All three programs run without memory leaks, although making Scrape leak-free required some judicious use of strictness annotations and the Control.DeepSeq library. Compiled with the -O2 optimization flag, Mod10 plays about 2300 games per second on my pre-unibody MacBook Pro. Oberlin’s newer CS lab machines (iMacs running on 3.06GHz Core 2 Duos) manage to play almost 3000 games per second. All this code is single-threaded, but for big runs on a multi-core machine, it’s simple to run multiple instances of Mod10 and concatenate the output.

Getting Some Answers

I started out looking at the statistics of “small” runs of games. Here’s the output from Scrape for a run of 10,000 games:

output/data10k.txt
  games played: 10000
winning
      (n,frac): (239,2.39e-2)
     (min,max): (43,892)
  (mean,stDev): (307.10041841004187,170.91967731148208)
losing
      (n,frac): (9711,0.9711)
     (min,max): (37,1459)
  (mean,stDev): (137.34723509422304,75.4475497444211)
timeout
      (n,frac): (50,5.0e-3)
    win freq.s: [(7,0),(10,0),(13,0),...,(1993,0),(1996,0),(1999,0)]
   loss freq.s: [(37,1),(40,2),(43,7),...,(1993,0),(1996,0),(1999,0)]

So about 2.4% of the simulated games are winners, and about 0.4% either enter an infinite loop or randomly run longer than 2000 turns. The min, max, mean, and stDev figures refer to number of turns in a game. The obvious differences in the distribution of the number of rounds in winning games versus losing games inspired FreqPlot. Pass it a file with the info above (un-truncated, of course), and we get a .png file showing this plot:

Interesting, but the winning data is a little too sparse. The maximum is 5, and many “winnable” turns—every third turn in the game—have none. Each winnable turn has its own bin in these plots, so one option is to increase the bin size. Another option is simply to produce more data, and that’s what I did. A run of 50 million games resulted in this plot:

Further Steps

There’s obviously a lot more to be gleaned from all this data than what Scrape currently outputs. For example, the frequency plots suggest that the likelihood of winning goes up significantly after 95 turns. And what about the verbose output from Mod10, in which each game round is printed? In its current form, Scrape ignores everything except the endgame states. Rather than a single-file executable, Scrape really deserves to be its own library.

There isn’t much strategy that goes into playing Mod10, but there are a few circumstances in which the player has some choice. Consider when a stack has multiple valid triplets. Choosing one over the other(s) may lead to other immediately valid triplets that can be picked up. Also, stack configurations with cards in the top two, bottom two, or top and bottom positions which sum to 10 or 20 are advantageous, since 10-valued cards (which cause such a stack to have a valid triplet) are more common than other-valued cards, so picking up triplets that leave the stack in those configurations should be a good strategy. Investigating these strategic options is one avenue for further development of the simulation itself.

In addition to the programs' core functionality, they could all be more tightly integrated. Especially when working with the full verbose output from Mod10, it would be ideal to avoid writing all of it to disk, just to read it again immediately and probably delete it soon after. And the code is not free of the silly blunders of someone who’s new to a language. For example, I noticed while writing this post that Scrape accumulates a lot of redundant data while reading an input file.

There’s lots more to do and learn here, so you can count on more Mod10 posts in the future. In the meantime, hit the comments and tell me what you think so far!