Code for the paper:
TODO: Add paper title, authors, venue, and link here
TODO: arxiv / proceedings link
This repository implements a multi-player bilateral negotiation benchmark for evaluating AI negotiation agents. Players take turns proposing joint actions to partners in a round-robin schedule. Each proposal is either accepted (if it weakly improves the partner's payoff) or rejected. The benchmark supports exact solvers, heuristic methods, MCTS, dynamic programming lookahead, and LLM-based agents, and includes a procedural game generator for sweeping across diverse negotiation scenarios.
```
.
├── config/
│   └── game_configs.py    # Game generation and satisfaction mask utilities
├── core/
│   ├── equilibrium.py     # Nash equilibrium checking and regret computation
│   ├── game_logic.py      # Payoffs, goal satisfaction, offer generation
│   ├── game_state.py      # NegotiationState class (turn order, policy matrix)
│   └── one_shot_optim.py  # Baseline and MIP-based one-shot optimisation
├── methods/
│   ├── baselines.py       # LLM-based negotiation agent (OpenAI API)
│   ├── mcts.py            # MCTS and DP-based partner selection
│   └── negotiation.py     # Offer search, value estimators, exact solvers
├── experiments/
│   └── runner.py          # Single-game runner and multi-method experiment loop
├── main.py                # Local parallel sweep runner
└── README.md
```
```
pip install numpy cvxpy scipy joblib tqdm pandas openai tenacity
```

MOSEK is required for the MIP-based methods (`optimize_P_via_masks_with_NEandbest_offer_linear_mip`). A free academic licence is available at mosek.com.
For the LLM baseline, set your OpenAI API key:
```
export OPENAI_API_KEY="your-key-here"
```

Each game is defined by:

- `G` — an `(N_GOALS, N_PLAYERS)` matrix where `G[g, p]` is player `p`'s valuation of goal `g`.
- Policy matrix `P` — a binary `(N_PLAYERS, N_ACTIONS)` matrix where `P[p, a] = 1` means player `p` has committed to action `a`. Actions are binding: bits can only flip from 0 → 1, never 1 → 0.
- Satisfaction masks — one binary matrix per goal indicating which `(player, action)` pairs are required to satisfy it.
- Goal types — goals are either linear (satisfaction scales with the fraction of required actions taken) or binary (satisfied only when all required actions are taken).
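The linear/binary distinction above can be sketched in a few lines. This is a minimal illustration (not the repository's actual API — `goal_satisfaction` is a hypothetical helper name) of how satisfaction could be computed from a mask under the definitions above:

```python
import numpy as np

def goal_satisfaction(P, mask, goal_type):
    """Satisfaction of one goal under policy matrix P.

    P         -- binary (N_PLAYERS, N_ACTIONS) policy matrix
    mask      -- binary (N_PLAYERS, N_ACTIONS) required (player, action) pairs
    goal_type -- "linear" or "binary"
    """
    required = mask.sum()
    if required == 0:
        return 1.0  # no required actions: vacuously satisfied
    frac = (P * mask).sum() / required  # fraction of required actions taken
    if goal_type == "binary":
        return 1.0 if frac == 1.0 else 0.0  # all-or-nothing
    return frac  # linear goals scale with the fraction taken

P = np.array([[1, 0], [0, 0]])     # player 0 committed to action 0 only
mask = np.array([[1, 0], [0, 1]])  # goal requires (player 0, action 0) and (player 1, action 1)
print(goal_satisfaction(P, mask, "linear"))  # 0.5
print(goal_satisfaction(P, mask, "binary"))  # 0.0
```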
Players negotiate in a shuffled round-robin order. On each turn, the current proposer selects a partner and proposes a joint action. The partner accepts if the offer weakly improves their estimated terminal payoff; otherwise the turn is rejected. The game ends after all scheduled turns.
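The turn protocol above can be sketched as follows. `propose` and `estimate_value` are hypothetical stand-ins: the first picks a partner and offer for the current proposer, and the second plays the role of the partner's value estimator (where the repository's estimators such as `upper` or `lower_tighter` would plug in):

```python
import random

def play_round_robin(players, propose, estimate_value, n_rounds=1, seed=0):
    order = list(players)
    random.Random(seed).shuffle(order)  # shuffled round-robin order
    accepted = []
    for _ in range(n_rounds):
        for proposer in order:
            partner, offer = propose(proposer)
            # Accept iff the offer weakly improves the partner's estimated payoff
            # over the status quo (offer=None).
            if estimate_value(partner, offer) >= estimate_value(partner, None):
                accepted.append((proposer, partner, offer))
    return accepted

# Toy run: every offer is worth 1.0 to the partner and the status quo 0.0,
# so all three scheduled turns are accepted.
players = [0, 1, 2]
propose = lambda p: ((p + 1) % 3, "joint_action")
estimate_value = lambda partner, offer: 1.0 if offer is not None else 0.0
print(len(play_round_robin(players, propose, estimate_value)))  # 3
```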
| Method | Description |
|---|---|
| `reward` | Greedy offer maximising proposer payoff; random partner selection |
| `upper` | Greedy offer using an optimistic upper-bound value estimator |
| `lower_tighter` | Greedy offer using a tighter pessimistic lower-bound estimator |
| `LLM_full` | LLM agent (GPT-4o-mini) selecting partner and offer from raw game state |
Methods are passed as configuration dicts to the runner. The `how_fallback` key selects the value estimator; MCTS is used for partner selection when `n_sims > 0`.
TODO: Add precise description of what Figure 1 and Table 1 show once the paper link is confirmed.
The results are generated by running the full parameter sweep in `run_cloud.py` (or `run_local.py` for local execution). The sweep covers:
| Parameter | Values |
|---|---|
| Structure type | adversarial, cooperative |
| Binary fraction | 0.0, 0.15, 0.30, 0.50 |
| Latent factors (k) | 5, 15 |
| Zipf complexity | 1.6, 3.0 |
| Payoff shift | negative, positive, balanced |
| Game size | small (exact solver), large (baseline) |
| Seeds | 0–49 |
This produces 9,600 tasks in total.
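The task count can be cross-checked as the Cartesian product of the parameter values in the table above:

```python
from itertools import product

# One list per swept parameter, in table order.
grid = [
    ["adversarial", "cooperative"],        # structure type
    [0.0, 0.15, 0.30, 0.50],               # binary fraction
    [5, 15],                               # latent factors (k)
    [1.6, 3.0],                            # Zipf complexity
    ["negative", "positive", "balanced"],  # payoff shift
    ["small", "large"],                    # game size
    list(range(50)),                       # seeds 0-49
]
tasks = list(product(*grid))
print(len(tasks))  # 9600
```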
```
python run_local.py
```

Results are saved to `./results/<size>_<shift>_games/<uuid>.pkl.gz`. Uses all available CPU cores via joblib.
Results are saved to the negotiation-results-vol Modal Volume. To download:
```
modal volume get negotiation-results-vol /root/cloud_data/<folder> ./local_results
```

Each `.pkl.gz` file contains a dict with a single `"results"` key:

```python
import gzip, pickle

with gzip.open("path/to/file.pkl.gz", "rb") as f:
    data = pickle.load(f)

# data["results"] is a dict keyed by (method_name, game_name)
# Each value contains "payoff_vector", "sum_payoff", "is_equilibrium", etc.
```

To generate a game programmatically:

```python
from config.game_configs import ScenarioProfile, generate_game_config, create_sat_masks

profile = ScenarioProfile(
    structure_type="adversarial",  # or "cooperative"
    binary_fraction=0.2,           # fraction of goals that are binary
    complexity_zipf_a=2.0,         # Zipf shape for goal complexity (must be > 1)
)

game_config = generate_game_config(
    n_players=5,
    country_idx2num_actions={i: 4 for i in range(5)},
    n_goals=10,
    k_factors=3,
    seed=42,
    profile=profile,
    shift="negative",  # "negative", "positive", or None
    inject_pp=False,   # set True to inject a poison-pill scenario
)

sat_masks = create_sat_masks(game_config)
```

To run methods on a custom game:

```python
from experiments.runner import run_experiment

method_configs = [
    {
        "name": "reward",
        "how_fallback": "reward",
        "n_sims": 0,
        "c_ucb": 1.0,
        "use_prior": False,
        "max_changes": 2,
        "dp_k": 0,
        "k": 1,
    },
    {
        "name": "upper",
        "how_fallback": "upper",
        "n_sims": 0,
        "c_ucb": 1.0,
        "use_prior": False,
        "max_changes": 2,
        "dp_k": 0,
        "k": 1,
    },
]

results = run_experiment(
    game_names=["my_game"],
    method_configs=method_configs,
    n_trials=10,
    models={},
    allowed_actions_dict={},
    forbidden_actions_dict={},
    given_configs={"my_game": game_config},  # pass your own config here
)
```

To print a summary table:

```python
from experiments.runner import print_comparison_table

print_comparison_table(results)
```

TODO: Add BibTeX entry here once the paper link is confirmed.

TODO: Add licence information.