Chapter 6: Context Injection

Each API call sends more than just a screenshot. The LLM receives layered context: memory from previous turns, detected entities, and optionally dynamic game knowledge tailored to the current game state.

6.1 Context Assembly

The context string is built in two stages: _build_llm_context() in the game loop builds the base context, then the Claude provider optionally enhances it with game knowledge.

6.2 Memory System

Data Structures (`apps/agent/src/memory.py`)

Turn — a single decision cycle:

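from dataclasses import dataclass, field

@dataclass
class Turn:
    iteration: int
    timestamp: str
    reasoning: str  # the LLM's stated reasoning for this turn
    actions: list[dict]  # actions the LLM chose
    observed_resources: dict | None = None  # self-reported by the LLM, if any
    observed_events: list[str] = field(default_factory=list)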

GameState — cumulative state:

@dataclass
class GameState:
    resources: dict  # {"food": 0, "wood": 0, "gold": 0, "stone": 0}
    population: int
    population_cap: int
    current_age: str  # "Dark Age", "Feudal Age", etc.
    idle_tc: bool
    under_attack: bool
    enemy_located: bool
    enemy_location: str

Initial resources and population are defined as named constants (INITIAL_RESOURCES, INITIAL_POPULATION, INITIAL_POPULATION_CAP).

AgentMemory — the memory manager:

working_memory: deque(maxlen=10) — last 10 turns
episode_summary: string for long-term context (currently unused but plumbed)
game_state: a single GameState updated from observations
turn_count: monotonically increasing counter
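
The interaction between the bounded deque and the turn counter can be sketched as follows. This is a minimal illustration, not the real AgentMemory: the class name MemorySketch, the constant WORKING_MEMORY_SIZE, and the add_turn() method are hypothetical, and turns are reduced to (number, reasoning) tuples.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumed constant name; the chapter only states maxlen=10.
WORKING_MEMORY_SIZE = 10


@dataclass
class MemorySketch:
    """Illustrative stand-in for AgentMemory, not the real class."""

    working_memory: deque = field(
        default_factory=lambda: deque(maxlen=WORKING_MEMORY_SIZE)
    )
    episode_summary: str = ""
    turn_count: int = 0

    def add_turn(self, reasoning: str) -> None:
        # turn_count keeps increasing even after old turns are evicted.
        self.turn_count += 1
        # A full deque silently drops its oldest entry on append.
        self.working_memory.append((self.turn_count, reasoning))


memory = MemorySketch()
for i in range(12):
    memory.add_turn(f"decision {i}")

oldest_turn_number = memory.working_memory[0][0]
```

After twelve turns the deque holds only the ten most recent, while turn_count still reflects the full history. That is why the counter is kept separately from the deque length.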

Observation Feedback Loop

After each Claude response, create_turn():

Creates a Turn record with reasoning, actions, and extracted observations
Calls update_from_observations() to update GameState
Appends the turn to working memory

update_from_observations() parses the LLM’s self-reported observations:

resources dict updates directly
population string like "12/15" is split to set population and population_cap
age string overwrites current_age
Boolean flags (idle_tc, under_attack) update directly

This creates a feedback loop: the LLM reports what it sees, those observations become context for the next turn, and the LLM can track trends (e.g., resources increasing, population growing).
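
The parsing rules above can be sketched roughly as follows. This is an assumption-laden illustration rather than the actual update_from_observations(): it uses a plain dict instead of GameState, the function name apply_observations is hypothetical, and both merging resources key by key and ignoring malformed population strings are assumed behaviours.

```python
def apply_observations(state: dict, observations: dict) -> dict:
    """Return a copy of state updated with LLM-reported observations."""
    updated = dict(state)

    # Resources: merge reported values over the known ones (assumed behaviour).
    reported = observations.get("resources")
    if isinstance(reported, dict):
        updated["resources"] = {**state.get("resources", {}), **reported}

    # Population arrives as a string such as "12/15".
    population = observations.get("population")
    if isinstance(population, str) and "/" in population:
        current, _, cap = population.partition("/")
        current, cap = current.strip(), cap.strip()
        # Skip malformed values such as "12/??" instead of raising.
        if current.isdecimal() and cap.isdecimal():
            updated["population"] = int(current)
            updated["population_cap"] = int(cap)

    # The age string simply overwrites the previous value.
    if "age" in observations:
        updated["current_age"] = observations["age"]

    # Boolean flags update directly.
    for flag in ("idle_tc", "under_attack"):
        if flag in observations:
            updated[flag] = bool(observations[flag])

    return updated
```

Because the observations are self-reported by the model, a defensive parser like this keeps a single bad value from corrupting the state that every later turn depends on.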

Context Formatting

get_context_for_llm() builds a human-readable context string with three sections:

Current Game State:

## Current Game State
- Resources: Food=200, Wood=150, Gold=100, Stone=200
- Population: 7/10
- Housed: False
- Age: Dark Age
- TC Idle: True
- Under Attack: False

Episode Summary (if exists — currently a placeholder for future use)

Recent Decisions (last 3 turns):

## Recent Decisions
Turn 1: I see the TC and some sheep. Need to gather food...
  Actions: press(h), press(q)
Turn 2: Villagers are idle. Sending them to sheep...
  Actions: press(.), right_click(640,380)

The “housed” flag is computed in _format_game_state(): population >= population_cap and population_cap > 0. When true, it’s flagged prominently (HOUSED (cannot create villagers!)) to alert the LLM.
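
The housed check is small enough to sketch in full. The condition comes from the text above. The helper names is_housed and format_population_lines are hypothetical, and the exact layout of the flagged line is an assumption.

```python
def is_housed(population: int, population_cap: int) -> bool:
    # The cap must be positive so an uninitialised 0/0 state is not flagged.
    return population_cap > 0 and population >= population_cap


def format_population_lines(population: int, population_cap: int) -> list[str]:
    """Format the population lines of the game-state section."""
    if is_housed(population, population_cap):
        # Assumed layout: the warning text replaces the plain boolean.
        housed_text = "HOUSED (cannot create villagers!)"
    else:
        housed_text = "False"
    return [
        f"- Population: {population}/{population_cap}",
        f"- Housed: {housed_text}",
    ]
```

The population_cap > 0 guard matters at startup: before the first observation arrives, a zero cap would otherwise make 0 >= 0 look like a housed state.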

Stuck-loop detection counts consecutive turns with no visible change. After STUCK_LOOP_THRESHOLD (3) failures, a warning is injected: “Last N actions had NO EFFECT. You MUST try a completely different approach.”

6.3 Entity Context

Built by _build_llm_context() in the game loop, using build_entity_summary() from entity_utils.py:

def build_entity_summary(
    entities: list[object],
    max_count: int = 20,
    ownership_results: dict | None = None,
) -> str:
    lines = []
    for entity in entities[:max_count]:
        attrs = extract_attrs(entity)
        owner_tag = ""
        if ownership_results and attrs.entity_id in ownership_results:
            owner_tag = f" [{ownership_results[attrs.entity_id][0].value}]"
        lines.append(
            f"  {attrs.entity_id}: {attrs.class_name}{owner_tag}"
            f" at ({int(attrs.center[0])},{int(attrs.center[1])})"
            f" [{attrs.confidence:.0%}]"
        )
    return "\n".join(lines)

extract_attrs() normalizes both DetectedEntity objects and plain dicts into an EntityAttrs named tuple, eliminating the hasattr chains that previously existed inline.
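
A normalizer of this kind might look like the sketch below. The field names are taken from how build_entity_summary() uses attrs. The DetectedEntity stand-in, the fallback defaults, and the inner get() helper are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import NamedTuple


class EntityAttrs(NamedTuple):
    entity_id: str
    class_name: str
    center: tuple[float, float]
    confidence: float


@dataclass
class DetectedEntity:
    """Stand-in for the detector's entity type (fields assumed)."""

    entity_id: str
    class_name: str
    center: tuple[float, float]
    confidence: float


def extract_attrs(entity: object) -> EntityAttrs:
    """Normalize an entity object or a plain dict into EntityAttrs."""

    def get(name: str, default: object) -> object:
        # Dicts use key lookup; everything else uses attribute lookup.
        if isinstance(entity, dict):
            return entity.get(name, default)
        return getattr(entity, name, default)

    center = get("center", (0.0, 0.0))
    return EntityAttrs(
        entity_id=str(get("entity_id", "unknown")),
        class_name=str(get("class_name", "unknown")),
        center=(float(center[0]), float(center[1])),
        confidence=float(get("confidence", 0.0)),
    )
```

With the type branching confined to one helper, formatting code such as build_entity_summary() can read attrs.center or attrs.confidence without caring which representation the detector produced.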

The ENTITY_DISPLAY_LIMIT = 20 constant caps entity count to prevent token bloat. Entities are sorted by confidence in the detector (see Chapter 7), so the top 20 are the most reliable detections.

Entity context is prepended to memory context, so the LLM sees detections first.

6.4 Dynamic Game Knowledge

When the game knowledge database is available, _get_dynamic_context() in the Claude provider enhances the context:

Resource Extraction

Parses the memory context using regex to extract current resources and age:

import re

food_match = re.search(r"Food[=:]?\s*(\d+)", context, re.IGNORECASE)
age_match = re.search(r"(Dark|Feudal|Castle|Imperial)\s*Age", context, re.IGNORECASE)

Defaults to {"food": 200, "wood": 200, "gold": 100, "stone": 200} and "dark" if parsing fails.
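
Putting the regexes and defaults together, the extraction step could be sketched like this. The function name extract_state and the constant names are hypothetical. Applying the same pattern to all four resources and lowercasing the matched age are assumptions based on the food pattern and the "dark" default.

```python
import re

DEFAULT_RESOURCES = {"food": 200, "wood": 200, "gold": 100, "stone": 200}
DEFAULT_AGE = "dark"


def extract_state(context: str) -> tuple[str, dict[str, int]]:
    """Pull the age and resource counts out of the memory context string."""
    resources = dict(DEFAULT_RESOURCES)
    for name in DEFAULT_RESOURCES:
        # Same shape as the food pattern, applied to each resource name.
        match = re.search(rf"{name}[=:]?\s*(\d+)", context, re.IGNORECASE)
        if match:
            resources[name] = int(match.group(1))

    age_match = re.search(
        r"(Dark|Feudal|Castle|Imperial)\s*Age", context, re.IGNORECASE
    )
    # Fall back to the default age when nothing matches.
    age = age_match.group(1).lower() if age_match else DEFAULT_AGE
    return age, resources
```

Note that re.search returns the first match in the string, so a pattern like this relies on the game-state section being the first place a resource name is followed by a number.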

Database Queries

With the extracted state, queries the SQLite database:

get_context_for_state(age, resources) returns a 200-500 token string listing:
- Units affordable with current resources at current age
- Buildings affordable with current resources at current age
- Counter information for visible enemy units

get_early_game_priorities() returns static strategic tips:
- “Keep TC producing villagers at all times”
- “Build houses before getting housed”
- “Scout early to find resources and enemy”
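
To make the affordability filter concrete, here is a minimal sketch against an in-memory SQLite database. The table layout, the age_rank column, and the function affordable_units are all hypothetical; the real schema is covered in Chapter 10 and may differ. Unit costs are the ones quoted in this chapter.

```python
import sqlite3

# Assumed encoding of ages as ranks so "at or below current age" is one comparison.
AGE_RANK = {"dark": 0, "feudal": 1, "castle": 2, "imperial": 3}


def affordable_units(
    conn: sqlite3.Connection, age: str, resources: dict[str, int]
) -> list[str]:
    """Return names of units available at this age and within budget."""
    rows = conn.execute(
        """
        SELECT name FROM units
        WHERE age_rank <= :age_rank
          AND food <= :food AND wood <= :wood AND gold <= :gold
        ORDER BY name
        """,
        {
            "age_rank": AGE_RANK[age],
            "food": resources["food"],
            "wood": resources["wood"],
            "gold": resources["gold"],
        },
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical schema and sample rows for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE units "
    "(name TEXT, age_rank INTEGER, food INTEGER, wood INTEGER, gold INTEGER)"
)
conn.executemany(
    "INSERT INTO units VALUES (?, ?, ?, ?, ?)",
    [
        ("Villager", 0, 50, 0, 0),
        ("Militia", 0, 60, 0, 20),
        ("Knight", 2, 60, 0, 75),
    ],
)
```

Calling affordable_units(conn, "dark", {"food": 200, "wood": 200, "gold": 100}) would leave out the Knight on age alone, and dropping gold below 20 would remove the Militia as well. The model never sees the options it cannot act on.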

Context Assembly

The enhanced context prepends dynamic knowledge before the original context:

enhanced_context = f"{dynamic_context}\n{early_game_tips}\n{context}"

So the LLM sees, in order: dynamic knowledge, early game tips, detected entities, game state, and recent turns.

Key Insight: The dynamic context is resource-aware. If the player has 300 food and 200 wood in Feudal Age, the context lists only units and buildings affordable at those resource levels. This prevents the LLM from trying to build a Castle (650 stone) when it has 0 stone, or training Knights (60 food + 75 gold) when gold is scarce.

The pattern in this section is a generic technique that’s worth naming: scope the context to what’s currently possible, not what’s theoretically possible.

Most context injection looks like “here’s everything about the game; figure out what to do.” The model then has to do two hard jobs at once: filter the catalog down to what applies right now, then pick the best option. Doing both at once is where hallucinations live: the model picks “Castle: 650 stone” because the option was on the table, and it confidently emits an action that can’t possibly execute.

By pre-filtering the context to “units you can afford right now at your current age,” we offload the filtering step to deterministic code (a SQL query) and leave the model with the single, easier job of choosing among feasible options. Wrong choices still happen, but impossible choices stop showing up.

The same pattern is used in retrieval-augmented generation (only retrieve documents that are topically relevant), in tool use (only expose tools the model can actually invoke right now), and in agentic frameworks generally (the action space shrinks dynamically with state). When you find yourself writing “the model keeps trying to do X even though it can’t” in a postmortem, the answer is usually “remove X from the context until it can.”

6.5 Full Context Example

A typical context string sent to Claude (in addition to the screenshot):

Screenshot dimensions: 1920x1080 pixels. Center=(960,540). Valid x=0-1920, y=0-1080.

## Available Units (affordable)
- Villager: 50 food (from Town Center)
- Militia: 60 food, 20 gold (from Barracks)

## Available Buildings (affordable)
- House: 25 wood
- Barracks: 175 wood

## Early Game Tips
- Keep TC producing villagers
- Build houses before pop cap

## Detected Entities
  town_center_0: town_center at (960,520) [97%]
  sheep_0: sheep at (640,380) [92%]
  sheep_1: sheep at (680,400) [89%]
  villager_0: villager at (520,310) [88%]
  villager_1: villager at (550,340) [85%]

## Current Game State
- Resources: Food=200, Wood=150, Gold=100, Stone=200
- Population: 5/10
- Housed: False
- Age: Dark Age
- TC Idle: True
- Under Attack: False

## Recent Decisions
Turn 1: Starting game. I see the TC and some sheep...
  Actions: press(h), press(q)

What should I do next?

Summary

Three context layers: memory state, detected entities, dynamic game knowledge
Working memory keeps last 10 turns; LLM sees last 3
Observation feedback loop: LLM reports state, memory tracks it, next turn sees updates
Dynamic context filters by current age and resources
20-entity cap (ENTITY_DISPLAY_LIMIT) prevents token bloat; entity formatting via entity_utils.py

Chapter 2: Game Loop Pipeline — where context is built
Chapter 4: Provider Pattern — where context is sent to Claude
Chapter 10: Knowledge Database — the SQLite backend for dynamic queries