Fall 2015: Life is Not Chess

LIFE IS NOT CHESS

COUNTED AS A LOSS OR A “STATISTICAL TIE,” POKER MATCHUP POINTS TO BETTER DECISION MAKING

Life is not a chess game.

The world is not a chessboard, with every piece visible. Life is more like a hand of poker, according to Tuomas Sandholm of Carnegie Mellon University’s School of Computer Science. Other players have cards we can’t see and try to trick us. Could our decisions be better if we leveraged artificial intelligence?

With graduate students Noam Brown and Sam Ganzfried, Sandholm has used PSC’s Blacklight to build an artificial poker player. This program, Claudico, recently lost a squeaker of a tournament against top human poker players. Lessons learned from that matchup promise to transform how we navigate a world with adversaries and incomplete information.

WEIGHTING DECISIONS

Sandholm is quick to say his team did not actually write a poker-playing program. “We really didn’t write a program called ‘Claudico,’” he says. “The algorithms we’ve developed for solving incomplete information games are general-purpose.” Claudico emerged from those algorithms as output, given only the rules of the game as input.

The algorithms try to approximate “game-theory-optimal” play, subject to computational limitations. Heads-up, no-limit Texas Hold’em poker, in which players can bet as much as they like on a given hand, contains some 10¹⁶¹ situations (called information sets). That’s far more than there are atoms in the universe and way beyond any foreseeable computing capability.
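To give a feel for what approximating “game-theory-optimal” play can mean, here is a minimal sketch on a toy game, rock-paper-scissors. It uses regret matching, a well-known textbook technique for two-player zero-sum games; the article does not say which algorithm the CMU team used, so this is an illustration under that assumption, not their method. All names in it (PAYOFF, strategy_from_regrets, solve) are hypothetical.

```python
# Payoff to player 0; rows and columns are rock, paper, scissors.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)  # no regret yet: play uniformly
    return [p / total for p in positive]


def solve(iterations=20000):
    """Run regret matching in self-play and return both average strategies."""
    n = 3
    # Assumption for illustration: seed player 0 with a bias toward rock
    # so that play starts away from the balanced strategy.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        s0 = strategy_from_regrets(regrets[0])
        s1 = strategy_from_regrets(regrets[1])
        # Expected payoff of each pure action against the opponent's current mix.
        u0 = [sum(PAYOFF[a][b] * s1[b] for b in range(n)) for a in range(n)]
        u1 = [sum(-PAYOFF[a][b] * s0[a] for a in range(n)) for b in range(n)]
        ev0 = sum(s0[a] * u0[a] for a in range(n))
        ev1 = sum(s1[b] * u1[b] for b in range(n))
        for a in range(n):
            regrets[0][a] += u0[a] - ev0  # how much better action a would have done
            regrets[1][a] += u1[a] - ev1
            strategy_sums[0][a] += s0[a]
            strategy_sums[1][a] += s1[a]
    return [[x / iterations for x in sums] for sums in strategy_sums]


average_strategies = solve()
```

In this toy game the balanced strategy is to play each move a third of the time, and the average strategies drift toward that mix. Poker’s 10¹⁶¹ information sets are what make the same idea computationally hard.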

“The first step is creating an abstraction,” Sandholm explains. “The algorithm takes the rules of the game and outputs a smaller game that’s strategically similar.”

The algorithm treats similar hands as identical; for example, possibly equating two Jacks with two Queens. But as the game progresses, this “rounding off” error amplifies. The CMU researchers countered this problem using Blacklight, whose large cache-coherent memory allowed a much finer-grained abstraction than otherwise possible. In computing the strategy for Claudico, the researchers routinely used an enormous eight terabytes of RAM, 4,000 times as much as in a top-line computer tablet.
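The idea of treating similar hands as identical can be sketched in a few lines. The example below is a deliberately simplified illustration, not the team’s abstraction algorithm: it only handles pocket pairs, and the equal-width bucketing rule and the names RANKS and pair_bucket are assumptions made for this sketch.

```python
RANKS = "23456789TJQKA"  # pocket-pair ranks from weakest to strongest


def pair_bucket(rank, num_buckets):
    """Map a pocket pair's rank to one of num_buckets roughly equal-width buckets."""
    index = RANKS.index(rank)  # 0 for deuces, 12 for aces
    return index * num_buckets // len(RANKS)


# Coarse abstraction: with 3 buckets, Jacks and Queens become the same hand.
coarse = (pair_bucket("J", 3), pair_bucket("Q", 3))

# Finer abstraction: with 13 buckets, every pair keeps its own identity.
fine = (pair_bucket("J", 13), pair_bucket("Q", 13))
```

More buckets mean less “rounding off” but a larger game to solve, which is the trade-off that a large shared memory helps with.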

The team’s earlier AI, Tartanian7, dominated the 2014 Annual Computer Poker Competition. Claudico, by comparison, performs an even finer-grained abstraction and uses two different abstractions, depending on whether it is the first or the second mover in the game. Also, its algorithm runs on Blacklight 24/7, in parallel with its lighter, real-time thinking, which takes place on a commodity server.

Claudico could beat the tar out of Tartanian7. But was it ready, like Deep Blue in chess in 1996 or Watson in Jeopardy! in 2011, to battle humanity’s best?

“PLAYING A MARTIAN”

Claudico comes up with some strategies that humans find to be downright alien.

“Playing Claudico is like playing a Martian,” Sandholm says. In particular, it likes to “pass” on the first move, meeting the other player’s bet without raising it or folding. Pros denigrate that as a rookie move, calling it “limping.” But Claudico owned that strategy: its name is Latin for “I limp.”

“Humans learn how to play poker in two ways,” Sandholm notes. “One is that they play a lot of poker against other humans; the other is that they read books about how to play poker—but who wrote those books?” Possibly, the researchers thought, humans had evolved into a point in the strategy space that Claudico could beat.

At the Rivers Casino in Pittsburgh from April 24 through May 8, 2015, and after 80,000 hands against four of the top-10 ranked poker players in the world, Claudico fell a little short of that goal. But it came close. Out of some 170 million virtual dollars wagered, Claudico wound up just $732,713 behind the humans. Statistically, the contest was a tie.
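To put the final margin in proportion, the figures quoted above can be worked through directly. This sketch only restates the article’s numbers as ratios; it does not reproduce the statistical test behind the “tie” verdict, which would need per-hand variance data that the article does not give. Variable names are illustrative.

```python
hands_played = 80_000
total_wagered = 170_000_000  # "some 170 million" virtual dollars, an approximate figure
claudico_deficit = 732_713   # virtual dollars behind the humans

# Deficit as a share of all money wagered, in percent.
margin_pct = 100 * claudico_deficit / total_wagered

# Average deficit per hand, in virtual dollars.
deficit_per_hand = claudico_deficit / hands_played

print(f"Margin: {margin_pct:.2f}% of money wagered")
print(f"Average deficit: ${deficit_per_hand:.2f} per hand")
```

On these figures the humans’ edge comes to under half a percent of the money wagered, or roughly nine virtual dollars per hand.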

“Claudico was a very strong opponent,” said Doug Polk, one of the contestants and the number one ranked poker player in the world. “It’s extremely aggressive.”

Polk noted that at times Claudico would risk tens of thousands of dollars to win hundreds. He said that human players often hesitate to take such risks, and pros succeed by being willing to lose in the short term with a strategy that will pay off in the long run.

“Claudico just takes that to the next level,” he added.

NOT JUST A POKER PLAYER

Ultimately, Claudico is about more than poker. Sandholm’s group has conducted AI research in several incomplete-information fields:

The FCC periodically runs “spectrum auctions,” in which tens of billions of dollars can be bid for little-used radio frequencies. An AI could help participants bid more rationally.
Allocation of air marshals, detection dog teams, and other resources might be more efficient if an AI could weigh the odds of a terror threat at a given place and time.
A Claudico-like AI could help avoid committing cybersecurity resources at moments when hacking attempts are unlikely.

Medical decision making may represent the most exciting opportunity, Sandholm says. “Most medical treatment today is myopic,” he explains. “We throw one treatment at a problem at a time.” An AI that solves incomplete-information games might help doctors design multi-step treatment plans with better outcomes.

A series of pharmaceutical “nudges,” for example, might steer an HIV infection to a less life-threatening state. Similarly, therapies could push cancer-cell populations toward less malignancy or populations of bacteria away from antibiotic resistance.

“It’s not specific to any particular disease,” much less any particular “game,” Sandholm adds. “That’s a big vision and I’m very excited about it.”

PSC staff supported Tuomas Sandholm’s work through two programs within the NSF’s XSEDE network of supercomputing centers: the Extended Collaborative Support Service and the Novel and Innovative Projects Program.