AoE2 · LLM Arena

Chapter 14: Seven-Round Run Map

A complete timing breakdown of the first 7 iterations (rounds) of a game run, showing every step the agent executes and its estimated wall-clock cost. Includes analysis of two realistic optimizations: async strategist and loop delay reduction.

Note (since this analysis was written). Both optimizations analyzed below have shipped: the strategist runs asynchronously (Optimization A) and loop_delay now defaults to 0.3 s (Optimization B). The executor also gained a single-shot path for routine turns (one round trip instead of the agentic tool loop; see Chapter 4 §4.3). The tables below keep the original baseline (loop_delay = 1.0 s, always the tool loop) so that the optimization math in §14.5 stays self-consistent; treat them as a worked illustration of the methodology, not as today’s wall-clock figures.

14.1 Per-Step Timing Reference

Every iteration executes the same pipeline. Steps marked conditional only run on specific turns.

| # | Step | Typical Time | Source | Condition |
|---|------|--------------|--------|-----------|
| 1 | Game running check | ~5 ms | window.py:is_game_running() | Every turn |
| 2 | Ensure game focus | ~50 ms | window.py:ensure_game_focused() | Every turn |
| 3 | Screenshot capture | ~20 ms | screen.py:capture_screenshot() via mss | Every turn |
| 4 | YOLO detection (single-pass @640) | ~150 ms (one forward pass) | detector.detect_fast() (adaptive_sahi=False) | Every turn (the deployed path) |
| 5 | Entity ownership classification | ~5 ms | packages/detection/src/inference/ownership.py | Every turn (if entities detected) |
| 6 | Alarm check | ~10 ms | goals.py:check_alarm() | Every turn (if entities detected) |
| 7 | Strategist API call (Sonnet, text; resources via local OCR) | ~5000 ms | providers/strategist.py | Turn 1, every 10th turn, on alarm (3-turn cooldown) |
| 8 | Build LLM context | ~10 ms | game_loop.py:_build_llm_context() | Every turn |
| 9 | Executor agentic loop (1–7 tool calls) | ~2000 ms | providers/claude.py | Every turn |
| 10 | Process response + memory update | ~50 ms | game_loop.py:_process_response() | Every turn |
| 11 | Ground commands (zoom, scout) | ~250 ms | game_loop.py:_get_ground_commands() | Turn 1 only |
| 12 | Action execution (3–5 actions) | ~150–250 ms | executor.py at 50 ms/action | Every turn (or fallback) |
| 13 | Loop delay (sleep) | 1000 ms | config.loop_delay = 1.0 | Every turn |

Config defaults (from config.py): loop_delay=0.3 (the tables below use the pre-optimization 1.0 baseline — see the note above), strategist_interval=10, detection_imgsz=640, adaptive_sahi=False, full_sahi_interval=5 (only consulted when adaptive_sahi=True), action_delay=0.05, max_tool_iterations=7, executor_effort="low".
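As a rough illustration of how these defaults and the strategist trigger condition (turn 1, every 10th turn, on alarm with a 3-turn cooldown) fit together, the sketch below models them in Python. The AgentConfig dataclass, the should_run_strategist helper, the alarm_cooldown field name, and the modulo reading of "every 10th turn" are all assumptions made for illustration; the real config.py and game_loop.py may be organized differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical container mirroring the defaults listed above."""
    loop_delay: float = 0.3
    strategist_interval: int = 10
    detection_imgsz: int = 640
    adaptive_sahi: bool = False
    full_sahi_interval: int = 5   # only consulted when adaptive_sahi=True
    action_delay: float = 0.05
    max_tool_iterations: int = 7
    executor_effort: str = "low"
    alarm_cooldown: int = 3       # assumed field name for the 3-turn cooldown


def should_run_strategist(turn: int, alarm: bool,
                          last_alarm_run: Optional[int],
                          cfg: AgentConfig = AgentConfig()) -> bool:
    """True on turn 1, on every strategist_interval-th turn, or on an alarm
    that falls outside the cooldown window (turns are 1-indexed)."""
    if turn == 1 or turn % cfg.strategist_interval == 0:
        return True
    if alarm:
        return last_alarm_run is None or turn - last_alarm_run >= cfg.alarm_cooldown
    return False


# In a clean 7-round run (no alarms) only turn 1 triggers the strategist.
strategist_turns = [t for t in range(1, 8) if should_run_strategist(t, False, None)]
print(strategist_turns)
```

Under these assumptions a clean 7-round run calls the strategist exactly once, which is why the timelines below charge the 5 s cost to round 1 only.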

Detection mode (v6). The agent now runs a single forward pass at imgsz=640 on every turn (adaptive_sahi=False), because SAHI lowers real F1 at retina resolution (see Chapter 7 §7.4). Any remaining per-round “full/adaptive SAHI” distinctions or SAHI-era millisecond figures are historical, from the pre-v6 design; treat detection as one constant single-pass cost per turn regardless of round.


14.2 Round-by-Round Overview

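| Round | Strategist? | Detection Mode | Ground Cmds? | Notes |
|-------|-------------|----------------|--------------|-------|
| 1 | Yes | Single-pass @640 | Yes | Heaviest round: first strategist call plus zoom/scout |
| 2 | No | Single-pass @640 | No | Normal |
| 3 | No | Single-pass @640 | No | Normal |
| 4 | No | Single-pass @640 | No | Normal |
| 5 | No | Single-pass @640 | No | Normal (no SAHI; full_sahi_interval only applies when adaptive_sahi=True) |
| 6 | No | Single-pass @640 | No | Normal |
| 7 | No | Single-pass @640 | No | Normal |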

14.3 Detailed Timeline

Round 1 — First Iteration (Heaviest)

| # | Step | Time | Cumulative |
|---|------|------|------------|
| 1 | Game running check | 5 ms | 5 ms |
| 2 | Ensure game focus | 50 ms | 55 ms |
| 3 | Screenshot capture | 20 ms | 75 ms |
| 4 | YOLO detection (single-pass @640) | 150 ms | 225 ms |
| 5 | Entity ownership classification | 5 ms | 230 ms |
| 6 | Alarm check | 10 ms | 240 ms |
| 7 | Strategist API call (Sonnet, text; resources via local OCR) | 5000 ms | 5240 ms |
| 8 | Build LLM context | 10 ms | 5250 ms |
| 9 | Executor agentic loop | 2000 ms | 7250 ms |
| 10 | Process response + memory update | 50 ms | 7300 ms |
| 11 | Ground commands (scroll ×5, select scout, auto-scout) | 250 ms | 7550 ms |
| 12 | Action execution (~4 actions) | 200 ms | 7750 ms |
| 13 | Loop delay | 1000 ms | 8750 ms |

Round 1 total: ~8.75 s

Rounds 2–4, 6–7 — Normal Iterations

| # | Step | Time | Cumulative |
|---|------|------|------------|
| 1 | Game running check | 5 ms | 5 ms |
| 2 | Ensure game focus | 50 ms | 55 ms |
| 3 | Screenshot capture | 20 ms | 75 ms |
| 4 | YOLO detection (single-pass @640) | 150 ms | 225 ms |
| 5 | Entity ownership classification | 5 ms | 230 ms |
| 6 | Alarm check | 10 ms | 240 ms |
| 7 | Strategist (skipped) | 0 ms | 240 ms |
| 8 | Build LLM context | 10 ms | 250 ms |
| 9 | Executor agentic loop | 2000 ms | 2250 ms |
| 10 | Process response + memory update | 50 ms | 2300 ms |
| 11 | Ground commands (skipped) | 0 ms | 2300 ms |
| 12 | Action execution (~4 actions) | 200 ms | 2500 ms |
| 13 | Loop delay | 1000 ms | 3500 ms |

Normal round total: ~3.5 s

Round 5 — Normal Iteration

Pre-v6 this round forced a full SAHI scan (iteration % 5 == 0). With adaptive_sahi=False there is no forced full scan, so Round 5 is now an ordinary single-pass turn — identical in shape to Rounds 2–4.

| # | Step | Time | Cumulative |
|---|------|------|------------|
| 1 | Game running check | 5 ms | 5 ms |
| 2 | Ensure game focus | 50 ms | 55 ms |
| 3 | Screenshot capture | 20 ms | 75 ms |
| 4 | YOLO detection (single-pass @640) | 150 ms | 225 ms |
| 5 | Entity ownership classification | 5 ms | 230 ms |
| 6 | Alarm check | 10 ms | 240 ms |
| 7 | Strategist (skipped) | 0 ms | 240 ms |
| 8 | Build LLM context | 10 ms | 250 ms |
| 9 | Executor agentic loop | 2000 ms | 2250 ms |
| 10 | Process response + memory update | 50 ms | 2300 ms |
| 11 | Ground commands (skipped) | 0 ms | 2300 ms |
| 12 | Action execution (~4 actions) | 200 ms | 2500 ms |
| 13 | Loop delay | 1000 ms | 3500 ms |

Round 5 total: ~3.5 s
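The Cumulative column in the tables above is a running sum of the step costs. A minimal sketch of that arithmetic follows; the round_steps and cumulative helpers are hypothetical names introduced here, and the millisecond values are simply the estimates from the tables, not measurements.

```python
from itertools import accumulate


def round_steps(strategist, ground_cmds, loop_delay_ms=1000):
    """Return (step name, estimated cost in ms) pairs for one iteration."""
    return [
        ("Game running check", 5),
        ("Ensure game focus", 50),
        ("Screenshot capture", 20),
        ("YOLO detection", 150),
        ("Entity ownership classification", 5),
        ("Alarm check", 10),
        ("Strategist API call", 5000 if strategist else 0),
        ("Build LLM context", 10),
        ("Executor agentic loop", 2000),
        ("Process response + memory update", 50),
        ("Ground commands", 250 if ground_cmds else 0),
        ("Action execution", 200),
        ("Loop delay", loop_delay_ms),
    ]


def cumulative(steps):
    """Running total after each step, as in the Cumulative column."""
    return list(accumulate(ms for _, ms in steps))


round1 = cumulative(round_steps(strategist=True, ground_cmds=True))
normal = cumulative(round_steps(strategist=False, ground_cmds=False))
print(round1[-1], normal[-1])  # 8750 3500
```

The last element of each list is the round total: 8750 ms for round 1 and 3500 ms for every other round in a clean run.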


14.4 Cumulative 7-Round Timeline

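| Round | Type | Round Time | Cumulative Elapsed |
|-------|------|------------|--------------------|
| 1 | Strategist + Ground Cmds | 8.75 s | 8.75 s |
| 2 | Normal | 3.50 s | 12.25 s |
| 3 | Normal | 3.50 s | 15.75 s |
| 4 | Normal | 3.50 s | 19.25 s |
| 5 | Normal | 3.50 s | 22.75 s |
| 6 | Normal | 3.50 s | 26.25 s |
| 7 | Normal | 3.50 s | 29.75 s |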

Total 7-round run: ~29.8 s

Time distribution across the full run:

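| Component | Total Time | % of Run |
|-----------|------------|----------|
| Executor agentic loop (API) | 14.00 s | 47.1% |
| Loop delay (sleep) | 7.00 s | 23.5% |
| Strategist API call | 5.00 s | 16.8% |
| Action execution + ground cmds | 1.65 s | 5.5% |
| YOLO detection | 1.05 s | 3.5% |
| Other (game check, focus, screenshot, ownership, alarm, context, memory) | 1.05 s | 3.5% |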

The executor API calls dominate, followed by the sleep delay and the single strategist call on round 1.
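The distribution can be recomputed from the per-step estimates in §14.3, as in the sketch below. The grouping of the small steps into "Other" (game check, focus, screenshot, ownership, alarm check, context build, response processing) is an assumption about how the components were bucketed, and the resulting percentages depend entirely on the step costs used.

```python
ROUNDS = 7

# Estimated cost per round in ms, taken from the section 14.3 tables.
per_round_ms = {
    "Executor agentic loop (API)": 2000,
    "Loop delay (sleep)": 1000,
    "Action execution + ground cmds": 200,
    "YOLO detection": 150,
    "Other": 5 + 50 + 20 + 5 + 10 + 10 + 50,
}

# Costs that occur once, on round 1 only.
round1_only_ms = {
    "Strategist API call": 5000,
    "Action execution + ground cmds": 250,  # ground commands
}

totals_ms = {name: ms * ROUNDS for name, ms in per_round_ms.items()}
for name, ms in round1_only_ms.items():
    totals_ms[name] = totals_ms.get(name, 0) + ms

run_ms = sum(totals_ms.values())
shares = {name: round(100 * ms / run_ms, 1) for name, ms in totals_ms.items()}

# Print components from largest to smallest share of the run.
for name, ms in sorted(totals_ms.items(), key=lambda kv: -kv[1]):
    print(f"{name:32s} {ms / 1000:6.2f} s {shares[name]:5.1f}%")
```

With these inputs the run totals 29.75 s, and the executor loop accounts for a little under half of it.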


14.5 Optimization Analysis

Optimization A: Async Strategist

Problem: The strategist call on round 1 blocks the loop for ~5 s (3–8 s range). It is the most expensive single call in a 7-round run, at 16.8% of total time.

Proposal: Fire the strategist as a background asyncio.create_task(). The executor continues immediately with default/previous goals. When the strategist response arrives, goals update asynchronously.

Current (blocking):
  Round 1: ... → [Strategist 5000ms] → [Executor 2000ms] → ...
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                  Total: 7000ms sequential

Proposed (async):
  Round 1: ... → [Executor 2000ms with default goals] → ...
                  [Strategist 5000ms in background..........]
                  ^^^^^^^^^^^^^^^^
                  Total: 2000ms (strategist completes during round 2)

Impact on round 1:

| Step | Current | With Async Strategist |
|------|---------|-----------------------|
| Strategist API call | 5000 ms | 0 ms (background) |
| Round 1 total | 8750 ms | 3750 ms |

Tradeoffs:

| Aspect | Detail |
|--------|--------|
| Round 1 goals | Uses default base goals (“Queue villagers”, “Gather food”, “Advance to Feudal Age”) from strategist.py:190-198 until real goals arrive |
| Resource readings | Empty on round 1; executor works without resource context for 1–2 turns |
| Goal staleness | Goals update when the async task completes, typically during round 2. Acceptable since goals are high-level strategic directives, not per-action commands |
| Thread safety | No issue: asyncio is single-threaded and cooperative, so goal updates happen between awaits |
| Implementation | ~15 lines changed in game_loop.py: replace await _run_strategist() with asyncio.create_task() and pre-seed default goals in GoalManager.__init__() |

Verdict: Realistic. Saves 5.0 s on a 7-round run (a 16.8% improvement). The default goals are reasonable for early Dark Age play and closely match what the strategist would generate anyway.
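A minimal sketch of the background-task pattern described in this section is shown below. It is not the project's game_loop.py: the GoalManager class here is a stand-in, run_strategist and run_executor are simulated with short asyncio.sleep calls, the latencies are scaled down so the example finishes quickly, and the goals returned by the simulated strategist are placeholders.

```python
import asyncio

DEFAULT_GOALS = ["Queue villagers", "Gather food", "Advance to Feudal Age"]


class GoalManager:
    """Stand-in goal store, pre-seeded with default goals."""

    def __init__(self):
        self.goals = list(DEFAULT_GOALS)


async def run_strategist(goal_manager, latency_s):
    """Simulated strategist call; swaps in new goals when it completes."""
    await asyncio.sleep(latency_s)
    goal_manager.goals = ["Placeholder goal A", "Placeholder goal B"]


async def run_executor(goals, latency_s):
    """Simulated executor turn; returns the goals it was given."""
    await asyncio.sleep(latency_s)
    return list(goals)


async def game_loop(rounds, strategist_latency_s=0.05, executor_latency_s=0.03):
    goal_manager = GoalManager()
    # Fire the strategist in the background instead of awaiting it inline.
    task = asyncio.create_task(run_strategist(goal_manager, strategist_latency_s))
    goals_seen = []
    for _ in range(rounds):
        # Each turn reads whatever goals are current when the turn starts.
        goals_seen.append(await run_executor(goal_manager.goals, executor_latency_s))
    await task  # make sure the background task has finished before returning
    return goals_seen


seen = asyncio.run(game_loop(rounds=5))
print(seen[0], seen[-1])
```

The first turn runs with the default goals and later turns pick up the strategist's goals once the background task has completed, without any turn blocking on the strategist call.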

Optimization B: Loop Delay Reduction

Problem: The 1000 ms loop_delay sleep accounts for 7.0 s across 7 rounds (23.5% of total time). It runs on every iteration.

Why it exists: Prevents rapid-fire inputs to the game and allows the screen to update between iterations.

Analysis: The pipeline already introduces ~1.5–3.5 s of natural latency per iteration:

  • Detection: ~150 ms (single pass; up to 234 ms in the pre-v6 full-SAHI rounds)
  • Executor API loop: 1000–3500 ms
  • Action execution: 200+ ms with built-in delays (action_delay=0.05s, BUILD_SETTLE_DELAY=0.15s)

The game renders at 60 fps (~16.7 ms/frame). The screen fully updates within 50–100 ms of any action. The existing action-level delays already pace inputs.

Three scenarios:

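| Scenario | loop_delay | Per-round savings | 7-round savings | Risk |
|----------|------------|-------------------|-----------------|------|
| Current | 1.0 s | n/a | n/a | None |
| Conservative | 0.3 s | 0.7 s | 4.9 s | Minimal: 300 ms is still 18 frames of render time |
| Aggressive | 0.0 s | 1.0 s | 7.0 s | Screenshots may capture mid-animation; rapid API calls |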

Verdict: Reducing to 0.3 s is safe and realistic. It saves 4.9 s on a 7-round run (a 16.5% improvement). Eliminating the delay entirely (0.0 s) is viable but carries a minor risk of stale screenshots.
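The savings and frame counts in the scenario table follow from simple arithmetic, sketched below. The loop_delay_savings helper is a hypothetical name; the 60 fps render rate and the 1.0 s baseline are the figures stated above.

```python
FPS = 60                 # render rate stated above
ROUNDS = 7
BASELINE_DELAY_S = 1.0   # pre-optimization loop_delay


def loop_delay_savings(new_delay_s, rounds=ROUNDS, baseline_s=BASELINE_DELAY_S):
    """Return (per-round savings in s, total savings in s,
    frames rendered during the remaining delay)."""
    per_round = baseline_s - new_delay_s
    return (round(per_round, 3),
            round(per_round * rounds, 3),
            round(new_delay_s * FPS))


for label, delay in [("Current", 1.0), ("Conservative", 0.3), ("Aggressive", 0.0)]:
    print(label, loop_delay_savings(delay))
```

For the conservative setting this gives 0.7 s per round, 4.9 s over seven rounds, and 18 rendered frames during each remaining 300 ms pause.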

Combined Impact

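| Scenario | Round 1 | Rounds 2–4, 6–7 | Round 5 | 7-Round Total | Savings |
|----------|---------|-----------------|---------|---------------|---------|
| Current | 8.75 s | 3.50 s | 3.50 s | 29.75 s | n/a |
| Async strategist only | 3.75 s | 3.50 s | 3.50 s | 24.75 s | 5.0 s (16.8%) |
| Loop delay 0.3 s only | 8.05 s | 2.80 s | 2.80 s | 24.85 s | 4.9 s (16.5%) |
| Both combined | 3.05 s | 2.80 s | 2.80 s | 19.85 s | 9.9 s (33.3%) |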

With both optimizations, a 7-round run drops from ~30 s to ~20 s, a one-third reduction in wall-clock time with no expected loss of gameplay quality.
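The four scenarios can be recomputed from the §14.3 step estimates with a small model like the one below. The run_total_s function is a hypothetical helper, and it assumes a clean run (no alarms) in which an asynchronous strategist adds no blocking time to any round.

```python
def run_total_s(loop_delay_s=1.0, async_strategist=False, rounds=7):
    """Estimated wall-clock seconds for a clean run, from the section 14.3 step costs."""
    # Steps that run every round, excluding the loop delay (ms).
    work_ms = 5 + 50 + 20 + 150 + 5 + 10 + 10 + 2000 + 50 + 200
    # Round 1 extras: ground commands, plus the strategist when it blocks.
    round1_extra_ms = 250 + (0 if async_strategist else 5000)
    total_ms = rounds * (work_ms + loop_delay_s * 1000) + round1_extra_ms
    return round(total_ms / 1000, 2)


baseline = run_total_s()
scenarios = {
    "Current": {},
    "Async strategist only": {"async_strategist": True},
    "Loop delay 0.3 s only": {"loop_delay_s": 0.3},
    "Both combined": {"loop_delay_s": 0.3, "async_strategist": True},
}
for label, kwargs in scenarios.items():
    total = run_total_s(**kwargs)
    saved = round(baseline - total, 2)
    print(f"{label:24s} {total:6.2f} s  saves {saved:.2f} s ({100 * saved / baseline:.1f}%)")
```

Because the two optimizations remove independent costs, their savings add: 5.0 s from the strategist plus 4.9 s from the loop delay.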


14.6 Variability and Edge Cases

The timings above assume a clean run with no alarms. Real runs may vary:

| Event | Effect on Timing |
|-------|------------------|
| Alarm triggered (enemy detected) | Strategist runs on alarm turn (+5 s). Alarm goal injected at priority 10. (Pre-v6 an alarm also forced a full SAHI scan; with adaptive_sahi=False detection stays single-pass.) |
| Rescan during executor loop | A press action with rescan: true triggers mid-turn screenshot + detection. Adds ~50–300 ms per rescan depending on frame differ result. |
| Strategist retry (API error) | SDK retries up to 2× with exponential backoff. Could add 5–15 s on failure turns. Falls back to default goals on total failure. |
| Executor max iterations | If the executor uses all 7 tool call iterations, the agentic loop may take 3.5 s+ instead of the typical 2 s. |
| Game not focused | ensure_game_focused() fails → 1 s sleep + skip iteration entirely. Round time becomes ~1 s of wasted wall-clock. |
| Remote detection server down | Falls back to local ONNX inference. Detection time jumps from ~150 ms to ~1200 ms per turn. |
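The strategist-retry row can be pictured with the sketch below: one initial attempt, up to two retries with exponential backoff, then a fallback to the default goals. This is an application-level illustration only. In the agent the retries are performed by the SDK; the goals_with_fallback helper, the 1 s base delay, and the doubling factor are assumptions, not the SDK's actual values.

```python
import time

DEFAULT_GOALS = ["Queue villagers", "Gather food", "Advance to Feudal Age"]


def goals_with_fallback(call_strategist, max_retries=2, base_delay_s=1.0,
                        sleep=time.sleep):
    """Call the strategist once plus up to max_retries retries with
    exponential backoff; return the default goals if every attempt fails."""
    for attempt in range(max_retries + 1):
        try:
            return call_strategist()
        except Exception:  # broad on purpose: any failure counts in this sketch
            if attempt < max_retries:
                sleep(base_delay_s * 2 ** attempt)  # waits 1 s, then 2 s
    return list(DEFAULT_GOALS)


def always_failing_strategist():
    raise RuntimeError("simulated API error")


# Record the backoff delays instead of actually sleeping.
recorded_delays = []
fallback_goals = goals_with_fallback(always_failing_strategist,
                                     sleep=recorded_delays.append)
print(fallback_goals, recorded_delays)
```

Passing the sleep function as a parameter keeps the example fast to run and makes the backoff schedule easy to inspect.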

14.7 Visual Timeline (7 Rounds)

Blocks are schematic and not drawn to scale; start and round times are taken from §14.4.

Round     Start      Steps                                       Round time
Round 1    0.00 s    [DET][===STRATEGIST====][EXEC][ACT][SLP]    8.75 s
Round 2    8.75 s    [DET][EXEC][ACT][SLP]                       3.50 s
Round 3   12.25 s    [DET][EXEC][ACT][SLP]                       3.50 s
Round 4   15.75 s    [DET][EXEC][ACT][SLP]                       3.50 s
Round 5   19.25 s    [DET][EXEC][ACT][SLP]                       3.50 s
Round 6   22.75 s    [DET][EXEC][ACT][SLP]                       3.50 s
Round 7   26.25 s    [DET][EXEC][ACT][SLP]                       3.50 s
End       29.75 s

Legend: DET = detection (single-pass @640, every turn)  EXEC = executor API  ACT = action execution  SLP = sleep
        STRATEGIST = Sonnet API call (text — resources via local OCR; only round 1 in a 7-round run)