Diplomacy game AI can negotiate, form alliances, and persuade people

Called CICERO, it combines AI models for strategy and language.

November 30, 2022

Meta has debuted a new AI capable of besting human opponents in the game of Diplomacy.

The game has been seen “as a near-impossible grand challenge” for AI, Meta wrote in a blog post about the AI, called CICERO.

Diplomacy is especially difficult, even compared to complex games like chess and go, because it requires a mastery not of hard and fast rules, but of soft skills. Players must know the art of understanding other people’s perspectives and needs, wants and wonts; make complex, living plans that can change with human whims; and then persuade other players to work with them and against others. (That last one’s trickier for an AI than you or me.)

In short, it’s mainly a test of social — not strategic, logical, or mathematical — skill.

“Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge,” Meta’s research team wrote in a paper on CICERO, published in Science.

An AI in a game like Diplomacy must talk like a real person, the researchers wrote in their blog post, demonstrating empathy and a knowledge of the game to build relationships. And the imperative of language works both ways: if the AI cannot recognize a bluff, or the subtext of what others are saying, it will quickly be outmaneuvered.

To craft an AI able to win, the CICERO team combined a natural language processing model (think of the famed GPT-3) and a strategic reasoning model (like the Stockfish engine used to play chess), as Ars Technica summed it up.

Meta calls this capability a “controllable dialogue model.” Throughout the game, CICERO turns its analytical eye on the state of the game and the conversation history of the players, predicts how others will act, then designs a plan around those scenarios.

“CICERO can deduce, for example, that later in the game it will need the support of one particular player, and then craft a strategy to win that person’s favor – and even recognize the risks and opportunities that that player sees from their particular point of view,” the team wrote.

With its plan in place, CICERO then uses natural language processing to craft messages human enough to put its plan into motion. The model was pre-trained on the chaotic corpus of the internet, much like GPT-3, and then cut its teeth on 40,000 archived games on webDiplomacy.net.
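The loop described above (read the board and the conversation, predict what the other players will do, pick a plan, then write a message that serves it) can be sketched in toy form. This is not Meta's code: every function name, data field, and probability below is a hypothetical stand-in, and the hand-written rules take the place of what, in CICERO, are learned models.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    player: str
    action: str
    probability: float


def predict_actions(board_state, chat_history):
    """Hypothetical stand-in for a learned model that guesses rivals' moves."""
    predictions = []
    for player in board_state["rivals"]:
        # Toy rule (an assumption): a rival who mentioned "support" in chat
        # is treated as more likely to actually support us.
        offered_help = any(
            msg["sender"] == player and "support" in msg["text"].lower()
            for msg in chat_history
        )
        p_support = 0.7 if offered_help else 0.2
        predictions.append(Prediction(player, "support", p_support))
    return predictions


def choose_plan(predictions):
    """Plan around the rival judged most likely to cooperate."""
    best = max(predictions, key=lambda p: p.probability)
    return {"ally": best.player, "intent": "propose_joint_move"}


def write_message(plan):
    """Hypothetical stand-in for a dialogue model conditioned on the plan."""
    return f"{plan['ally']}, shall we coordinate our moves this turn?"


board_state = {"rivals": ["France", "Germany"]}
chat_history = [{"sender": "France", "text": "I can support you into Belgium."}]

plan = choose_plan(predict_actions(board_state, chat_history))
message = write_message(plan)
print(message)
```

The point of the sketch is the ordering: the message is generated only after a plan exists, so the words are chosen to serve the strategy rather than the other way around.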

The system seems to work pretty well. When CICERO played 40 games of webDiplomacy against human competitors, its average score was more than double that of the human players, placing it in the top 10% of online players who have played more than one game.

That’s not as dominant as today’s chess engines — chess has precisely quantifiable correct moves, and modern computers will essentially never lose (unless they’re programmed to take it easy on you).

But scoring among the top players in a game like Diplomacy is a huge advance, given how fuzzy and subjective the situation is. In fact, the CICERO AI was so good that people often preferred it as an ally.

“CICERO is so effective at using natural language to negotiate with people in Diplomacy that they often favored working with CICERO over other human participants,” the researchers wrote.

The Meta team believes that their model could eventually help streamline communication between people and AI — imagine an AI that can hold a convo long enough to teach you something, they suggest. Or, in a less ambitious but honestly more fun goal, it could power more realistic video game NPCs who will adapt to your player.

But, as Ars Technica pointed out, the ability to cooperate is only a knife’s edge away from the ability to manipulate. With that risk in mind, Meta’s team has taken measures to detect harmful speech CICERO may encounter, and it has “high hopes” that people will “build responsibly” with its work, which is available open-source on GitHub.
