AoE2 · LLM Arena

Deployment Guide: Mac + Windows VM Setup

Run the detection server on macOS (Apple Silicon) and the gameplay agent in a Windows VM. The agent sends screenshots to the Mac over HTTP for fast YOLO inference via CoreML/Neural Engine. Screenshots leave the VM only for YOLO detection: the strategist reads the resource bar locally via OCR and sends Claude a text-only prompt, so no image is sent to any LLM.

┌─────────────────────────────┐         HTTP          ┌─────────────────────────────┐
│       macOS Host            │◄───────────────────────│       Windows VM            │
│                             │                        │                             │
│  Detection Server (:8420)   │  POST /detect          │  AoE2:DE (game)             │
│  CoreML / ONNX model        │  POST /detect/sahi     │  Gameplay Agent             │
│  ~15ms per tile inference   │  GET  /health          │  Screenshots → HTTP → Mac   │
│                             │                        │  Actions → pyautogui        │
└─────────────────────────────┘                        └─────────────────────────────┘

Prerequisites

Machine    | Requirements
-----------|-------------
Mac        | Python 3.11+, Apple Silicon recommended
Windows VM | Python 3.11+ (x64 installer, NOT ARM64), AoE2:DE installed, VMware Fusion or similar
Both       | Network connectivity between host and VM
API Key    | Anthropic API key (ANTHROPIC_API_KEY)

Part 1: macOS Host — Detection Server

1.1 Clone and install

# From your checkout of the repository (adjust the path to match)
cd ~/Projects/home/aoe2-llm-arena/agent

# Create a venv (if not done already)
python3 -m venv venv
source venv/bin/activate

# Install server dependencies
pip install -r server/requirements.txt

# Optional: install CoreML support (recommended on Apple Silicon)
pip install coremltools

1.2 Verify model file

The ONNX model should be at:

detection/inference/models/aoe2_yolo_v5.onnx

If you have the .pt weights and want CoreML (faster on Apple Silicon):

# Export to CoreML (optional, ONNX works fine)
just export-coreml detection/inference/models/aoe2_yolo_v5.pt

1.3 Start the server

# Using justfile
just server --model detection/inference/models/aoe2_yolo_v5.onnx

# Or directly
python -m detection_server --model detection/inference/models/aoe2_yolo_v5.onnx --host 0.0.0.0 --port 8420

For a CoreML model:

just server --model detection/inference/models/aoe2_yolo_v5.mlpackage

You should see:

INFO:     Model loaded: onnx_cpu (or coreml)
INFO:     Uvicorn running on http://0.0.0.0:8420

1.4 Verify the server is running

curl http://localhost:8420/health

Expected response:

{"backend": "onnx_cpu", "classes": 60, "model": "aoe2_yolo_v5.onnx"}
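
The same check can be scripted, which is handy before starting a long agent run. The following is a minimal sketch using only the standard library; it assumes the /health payload carries exactly the three keys shown in the expected response above. The helper names parse_health and check_health are hypothetical and not part of the project.

```python
import json
import urllib.request

# Keys taken from the expected /health response shown above (an assumption
# about the payload shape, not a documented schema).
EXPECTED_KEYS = {"backend", "classes", "model"}


def parse_health(body: str) -> dict:
    """Parse a /health response body and check it has the expected keys."""
    payload = json.loads(body)
    missing = EXPECTED_KEYS - set(payload)
    if missing:
        raise ValueError(f"health response missing keys: {sorted(missing)}")
    return payload


def check_health(base_url: str, timeout: float = 3.0) -> dict:
    """GET <base_url>/health and return the parsed payload."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

With the server running, check_health("http://localhost:8420") should return a dict whose "backend" entry tells you whether the ONNX or CoreML path was loaded.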

1.5 Find your Mac’s IP address

The VM needs to reach the Mac. How you find the Mac’s IP depends on your VM software:

VMware Fusion: the host is often reachable from the VM at an address such as 192.168.64.1, but the subnet varies by installation. Verify:

# On the Mac
ifconfig vmnet8 | grep inet
# or
ifconfig bridge100 | grep inet

Alternative — check your Mac’s local network IP:

ipconfig getifaddr en0

Note this IP (e.g., 192.168.64.1). You’ll need it in Part 2.


Part 2: Windows VM — Game Agent

2.1 Transfer the code

Option A — Git clone:

git clone <repo-url> aoe2-llm-arena
cd aoe2-llm-arena\agent

Option B — ZIP transfer:

# On Mac: create a zip of the agent directory
cd ~/Projects/home/aoe2-llm-arena
zip -r agent.zip agent/ -x "agent/venv/*" "agent/logs/*" "agent/.superset/*" "agent/*.tar.gz" "agent/*.pt" "agent/*.zip"

# Transfer to VM (replace VM_IP with your VM's IP)
scp agent.zip user@VM_IP:~/

Then on the VM:

cd %USERPROFILE%
mkdir aoe2-llm-arena
cd aoe2-llm-arena
tar -xf %USERPROFILE%\agent.zip
cd agent

2.2 Set up Python environment

Important: On a Windows ARM64 VM, install the x64 build of Python. Many packages do not ship prebuilt wheels for ARM64 Python on Windows, so installs fail or fall back to source builds.

python -m venv venv
venv\Scripts\activate
pip install -r gameplay_agent\requirements.txt

If you get torch/scipy conflicts:

pip install scipy numpy --force-reinstall
pip install -r gameplay_agent\requirements.txt

2.3 Set environment variables

Command Prompt:

set ANTHROPIC_API_KEY=sk-ant-...your-key...
set AOE2_DETECTION_HOST=http://192.168.64.1:8420

PowerShell:

$env:ANTHROPIC_API_KEY = "sk-ant-...your-key..."
$env:AOE2_DETECTION_HOST = "http://192.168.64.1:8420"

Replace 192.168.64.1 with your Mac’s IP from step 1.5.

Optional tuning:

set AOE2_LOOP_DELAY=0.3
set AOE2_EXECUTOR_EFFORT=low
set AOE2_STRATEGIST_INTERVAL=10
set AOE2_SAVE_SCREENSHOTS=true

2.4 Verify connectivity to the Mac server

curl http://192.168.64.1:8420/health

If this fails, see “Agent can’t connect to server” in Part 3.

2.5 Start AoE2

  1. Launch Age of Empires II: Definitive Edition
  2. Start a Single Player → Skirmish match
  3. Pick your civilization, set opponent to AI
  4. Start the game and wait until you see your Town Center

2.6 Run the agent

cd aoe2-llm-arena\agent
venv\Scripts\activate
python -m gameplay_agent

Or with options:

:: Run limited iterations
python -m gameplay_agent --iterations 50

:: Single test iteration (no action execution)
python -m gameplay_agent --test

You should see logs like:

detector_initialized         mode=remote server=http://192.168.64.1:8420
game_loop_start              detection=True executor_model=claude-sonnet-4-6
iteration_start              iteration=1
screenshot_captured          width=1920 height=1080
detection_complete           entity_count=12
strategist_goals_updated     turn=1 goal_count=4
llm_response                 iteration=1 action_count=3
actions_executed             iteration=1 total=3 successful=3

Part 3: Troubleshooting

Server won’t start

Problem                                       | Fix
----------------------------------------------|----
ModuleNotFoundError: No module named 'server' | Run from the agent/ directory: cd agent && python -m detection_server ...
onnxruntime import error on macOS             | pip install onnxruntime (not onnxruntime-gpu)
CoreML model fails to load                    | Fall back to ONNX: --model path/to/model.onnx

Agent can’t connect to server

Problem              | Fix
---------------------|----
Connection refused   | Verify the server is running and bound to 0.0.0.0 (not 127.0.0.1)
Connection timed out | Check the Mac firewall and allow incoming connections on port 8420 (System Settings → Network → Firewall)
Wrong IP             | Re-check with ifconfig on the Mac. VMware uses vmnet8 or bridge100
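
To tell the first two rows apart without reading curl output, a small TCP probe run from the VM can help. This is a sketch using only the standard library; the function name probe and its return strings are hypothetical, and the port default simply mirrors the 8420 used throughout this guide.

```python
import socket


def probe(host: str, port: int = 8420, timeout: float = 3.0) -> str:
    """Try a TCP connection and classify the outcome.

    Returns "ok", "refused", "timeout", or "error: <detail>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # Host reachable, nothing listening on that port (or bound to 127.0.0.1).
        return "refused"
    except socket.timeout:
        # Packets dropped on the way: usually a firewall or a wrong IP.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "no route to host" or a name resolution failure.
        return f"error: {exc}"
```

For example, probe("192.168.64.1") returning "refused" points at the server's bind address, while "timeout" points at the firewall or the IP.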

Agent can’t find the game window

Problem              | Fix
---------------------|----
game_not_found       | AoE2 must be running and visible. Don’t minimize it
could_not_focus_game | Click the game window once, then restart the agent
Coordinates are off  | Make sure the game runs windowed or the agent captures the right monitor

Detection quality issues

Problem               | Fix
----------------------|----
Few entities detected | The agent zooms in on turn 1. If entities are tiny, they may be too far away
False positives       | Per-class thresholds in packages/detection/src/inference/thresholds.py can be tuned
Slow detection        | Use CoreML model on Mac for ~15ms/tile vs ~100ms/tile with ONNX CPU
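
To see what the per-tile figures mean for a whole frame, here is a rough back-of-the-envelope sketch. It assumes tiled (SAHI-style) inference with 640-pixel square tiles and 20% overlap, processed one after another. Those two values are illustrative assumptions, not settings confirmed by this guide; only the 1920x1080 frame size and the ~15 ms and ~100 ms per-tile figures come from the text.

```python
import math


def tile_count(width: int, height: int, tile: int = 640, overlap: float = 0.2) -> int:
    """Number of overlapping square tiles needed to cover a width x height frame."""
    stride = round(tile * (1 - overlap))  # pixels between tile origins
    cols = max(1, math.ceil((width - tile) / stride) + 1)
    rows = max(1, math.ceil((height - tile) / stride) + 1)
    return cols * rows


def frame_latency_ms(width: int, height: int, ms_per_tile: float) -> float:
    """Estimated inference time per frame if tiles run sequentially."""
    return tile_count(width, height) * ms_per_tile


tiles = tile_count(1920, 1080)               # 4 columns x 2 rows = 8 tiles
coreml = frame_latency_ms(1920, 1080, 15)    # 8 * 15  = 120 ms
onnx_cpu = frame_latency_ms(1920, 1080, 100) # 8 * 100 = 800 ms
print(tiles, coreml, onnx_cpu)
```

Under these assumptions the CoreML path stays well below the default 0.3 s loop delay, while ONNX on CPU would dominate each iteration. Network transfer and image encoding are not included in the estimate.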

Agent falls back to local detection

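When the detection server is unreachable, the remote detector logs remote_detector_unavailable and falls back to local ONNX inference on the VM. The agent keeps working, only more slowly. Restore connectivity to the server (see “Agent can’t connect to server” above) to return to remote detection.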
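
The fallback behaviour can be pictured as a thin wrapper around two detectors. The sketch below is illustrative only: detect_with_fallback and the Detector type are hypothetical names, and the agent's real implementation may differ. Only the remote_detector_unavailable event name comes from the agent's logs.

```python
import logging
from typing import Any, Callable, List

log = logging.getLogger("gameplay_agent.detector")

# A detector takes encoded image bytes and returns a list of detections.
Detector = Callable[[bytes], List[Any]]


def detect_with_fallback(image: bytes, remote: Detector, local: Detector) -> List[Any]:
    """Prefer the remote detector; use the local one if the network call fails."""
    try:
        return remote(image)
    except OSError as exc:
        # Covers refused connections, timeouts, and urllib's URLError,
        # all of which derive from OSError.
        log.warning("remote_detector_unavailable error=%s", exc)
        return local(image)
```

Catching only OSError keeps genuine bugs in the remote detector (for example a malformed response raising ValueError) visible instead of silently masking them with the slower local path.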


Part 4: Configuration Reference

Environment Variables

Variable                 | Default           | Where | Purpose
-------------------------|-------------------|-------|--------
ANTHROPIC_API_KEY        | (none)            | VM    | Claude API key (required)
AOE2_DETECTION_HOST      | ""                | VM    | Detection server URL, e.g. http://192.168.64.1:8420
AOE2_MODEL               | claude-sonnet-4-6 | VM    | Executor LLM model
AOE2_EXECUTOR_EFFORT     | low               | VM    | Executor effort (low/medium/high)
AOE2_STRATEGIST_MODEL    | claude-sonnet-4-6 | VM    | Strategist LLM model
AOE2_STRATEGIST_INTERVAL | 10                | VM    | Run strategist every N turns
AOE2_LOOP_DELAY          | 0.3               | VM    | Seconds between game loop iterations
AOE2_SAVE_SCREENSHOTS    | true              | VM    | Save screenshots to logs/
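
As an illustration of how these variables and defaults fit together, the following sketch loads them into a typed config object. The AgentConfig class and load_config function are hypothetical, and the boolean parsing rule (1/true/yes) is an assumption; the variable names and defaults are the ones listed in the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgentConfig:
    api_key: str
    detection_host: str = ""
    model: str = "claude-sonnet-4-6"
    executor_effort: str = "low"
    strategist_model: str = "claude-sonnet-4-6"
    strategist_interval: int = 10
    loop_delay: float = 0.3
    save_screenshots: bool = True


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Build an AgentConfig from environment variables, applying the defaults."""
    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    effort = env.get("AOE2_EXECUTOR_EFFORT", "low")
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"AOE2_EXECUTOR_EFFORT must be low/medium/high, got {effort!r}")
    # Assumed truthy spellings; the agent's own parsing may differ.
    save = env.get("AOE2_SAVE_SCREENSHOTS", "true").strip().lower() in ("1", "true", "yes")
    return AgentConfig(
        api_key=api_key,
        detection_host=env.get("AOE2_DETECTION_HOST", "").rstrip("/"),
        model=env.get("AOE2_MODEL", "claude-sonnet-4-6"),
        executor_effort=effort,
        strategist_model=env.get("AOE2_STRATEGIST_MODEL", "claude-sonnet-4-6"),
        strategist_interval=int(env.get("AOE2_STRATEGIST_INTERVAL", "10")),
        loop_delay=float(env.get("AOE2_LOOP_DELAY", "0.3")),
        save_screenshots=save,
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise without touching the real environment.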

Server CLI Flags

python -m detection_server --model PATH --host HOST --port PORT

Flag    | Default    | Purpose
--------|------------|--------
--model | (required) | Path to .onnx or .mlpackage model
--host  | 0.0.0.0    | Bind address
--port  | 8420       | Bind port
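
For readers who want to see how such flags are typically wired up, here is a minimal argparse sketch mirroring the flags and defaults listed above. It is not the server's actual source: build_parser and backend_for are hypothetical names, and the mapping from file suffix to backend name is an assumption based on the onnx_cpu and coreml labels in the startup log.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """CLI parser with the same flags and defaults as the table above."""
    parser = argparse.ArgumentParser(prog="detection_server")
    parser.add_argument("--model", required=True, help="Path to .onnx or .mlpackage model")
    parser.add_argument("--host", default="0.0.0.0", help="Bind address")
    parser.add_argument("--port", type=int, default=8420, help="Bind port")
    return parser


def backend_for(model_path: str) -> str:
    """Guess the inference backend from the model file suffix (assumed mapping)."""
    suffix = Path(model_path).suffix.lower()
    if suffix == ".onnx":
        return "onnx_cpu"
    if suffix == ".mlpackage":
        return "coreml"
    raise ValueError(f"unsupported model format: {suffix or model_path}")
```

Binding to 0.0.0.0 by default is what lets the VM reach the server; a default of 127.0.0.1 would accept connections from the Mac only.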

Agent CLI Flags

python -m gameplay_agent [--test] [--iterations N] [--overlay]

Flag           | Purpose
---------------|--------
--test         | Single iteration, no action execution
--iterations N | Stop after N iterations
--overlay      | Show live detection overlay on game window

Quick Start Cheatsheet

Mac (Terminal 1):

cd ~/Projects/home/aoe2-llm-arena/agent
source venv/bin/activate
just server --model detection/inference/models/aoe2_yolo_v5.onnx

Windows VM (Command Prompt):

cd aoe2-llm-arena\agent
venv\Scripts\activate
set ANTHROPIC_API_KEY=sk-ant-...
set AOE2_DETECTION_HOST=http://192.168.64.1:8420
python -m gameplay_agent

Beyond the real-game tier

This guide covers the Mac + Windows VM setup for the real-game agent. If you also want to bring up the synthetic-arena tier (compose stack with Langfuse / Redis / Postgres / MinIO / ClickHouse, the arena CLI for offline evaluation, or the web UI for replay and fork), see: