AoE2 · LLM Arena

Chapter 11: Sprite Extraction

Training data generation requires individual sprite images for each entity class. These are extracted from AoE2:DE’s proprietary SLD file format — a GPU-compressed sprite format used by the game engine.

11.1 SLD File Format

SLD (Sprite Layer Data) files use the SLDX signature and store multi-layer sprite data with GPU texture compression.

File Structure

Header:
  - Signature: "SLDX" (4 bytes)
  - Version: uint16
  - Num frames: uint16

Frame Headers (repeated):
  - Width, height: uint16 each
  - Hotspot X, Y: int16 each (anchor point for game positioning)
  - Layer flags: uint8 bitfield
  - Per-layer: content_length + compressed pixel data
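
As a rough illustration of the header layout above, the following sketch reads only the three listed fields. Little-endian byte order and the names SldHeader and parse_sld_header are assumptions made for this example; real files may carry further header fields that this chapter does not list.

```python
import struct
from typing import NamedTuple


class SldHeader(NamedTuple):
    """The three header fields listed above (hypothetical container type)."""
    signature: bytes
    version: int
    num_frames: int


def parse_sld_header(data: bytes) -> SldHeader:
    # "<4sHH" = 4-byte signature followed by two uint16 values.
    # Little-endian byte order is an assumption of this sketch.
    signature, version, num_frames = struct.unpack_from("<4sHH", data, 0)
    if signature != b"SLDX":
        raise ValueError(f"not an SLD file: signature {signature!r}")
    return SldHeader(signature, version, num_frames)
```

Checking the signature first lets a batch run skip non-SLD files early instead of failing deep inside frame parsing.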

Layer Flags (Bitfield)

Bit  Layer          Compression  Purpose
0    Main graphics  DXT1 (BC1)   Unit/building appearance
1    Shadow         BC4          Shadow overlay
2    Unknown        -            Unused in extraction
3    Damage         DXT1         Damaged variant
4    Player color   DXT1         Team color overlay

The extractor reads the main graphics layer (bit 0) and optionally the player color layer (bit 4).
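
A minimal sketch of decoding the bitfield, using the bit assignments from the table above. The names LAYER_BITS and layers_present are hypothetical, chosen only for this example.

```python
# Bit position -> layer name, following the table above.
LAYER_BITS = {
    0: "main",
    1: "shadow",
    2: "unknown",
    3: "damage",
    4: "player_color",
}


def layers_present(flags: int) -> list[str]:
    """Return the names of the layers whose bit is set, in bit order."""
    return [name for bit, name in LAYER_BITS.items() if flags & (1 << bit)]
```

For example, a flags byte of 0b10011 reports the main, shadow, and player color layers.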

11.2 DXT1 (BC1) Decompression

DXT1 is a lossy texture compression format that encodes 4x4 pixel blocks into 8 bytes:

Block Layout (8 bytes)

Bytes 0-1: Color 0 (RGB565)
Bytes 2-3: Color 1 (RGB565)
Bytes 4-7: 2-bit index table (16 pixels, 4x4)

Color Palette Generation

Two reference colors are stored as RGB565 (5 bits red, 6 bits green, 5 bits blue):

# RGB565 to RGB888 conversion
r = ((c >> 11) & 0x1F) * 255 // 31
g = ((c >> 5) & 0x3F) * 255 // 63
b = (c & 0x1F) * 255 // 31

Four palette colors are derived:

Mode                    Color 0  Color 1  Color 2            Color 3
Opaque (c0 > c1)        c0       c1       2/3 c0 + 1/3 c1    1/3 c0 + 2/3 c1
Transparent (c0 <= c1)  c0       c1       1/2 c0 + 1/2 c1    transparent (alpha=0)

Each pixel in the 4x4 block uses a 2-bit index to select one of these 4 colors.
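
Putting the block layout, the RGB565 conversion, and the palette table together, a single block can be decoded roughly as follows. This is a sketch of standard DXT1 decoding, not the extractor's actual code: the function names are made up for the example, and integer division is used for the interpolated colors, so results may differ by one from other decoders.

```python
import struct


def rgb565_to_rgb888(c: int) -> tuple:
    """Expand a packed RGB565 value to 8 bits per channel."""
    r = ((c >> 11) & 0x1F) * 255 // 31
    g = ((c >> 5) & 0x3F) * 255 // 63
    b = (c & 0x1F) * 255 // 31
    return (r, g, b)


def decode_dxt1_block(block: bytes) -> list:
    """Decode one 8-byte DXT1 block into 16 RGBA tuples, row-major."""
    # Two little-endian uint16 colors, then 32 bits of 2-bit indices.
    c0, c1, indices = struct.unpack("<HHI", block)
    p0, p1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)
    if c0 > c1:
        # Opaque mode: two interpolated colors at 1/3 and 2/3.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), p3 + (255,)]
    else:
        # Transparent mode: one midpoint color, index 3 is fully transparent.
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), (0, 0, 0, 0)]
    # Pixel i uses bits 2i and 2i+1, starting from the least significant bit.
    return [palette[(indices >> (2 * i)) & 0x3] for i in range(16)]
```

A full layer is decoded by running this over every block and copying each 4x4 result into the output image at the block's position.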

Deep dive — DXT1 / BC4 block compression (why 4×4 blocks and what they trade)

DXT1 (aka BC1) and BC4 belong to the block-compression family that GPUs have used since the late 1990s. Both compress 4×4 pixel blocks into 8 bytes. The motivation is simple and unchanged: GPUs need texture data to be small enough to fit in cache and stored in fixed-size blocks, so that the block containing any given pixel can be located and fetched directly without decoding its neighbors.

Why 4×4 blocks specifically. Bilinear filtering reads a 2×2 quad of texels per sample (trilinear filtering reads one such quad from each of two mip levels), and GPU texture caches are typically organized around small 2D tiles. So 4×4 is a small block size where the compressed unit fits neatly into one cache fetch; anything smaller wastes the slack, and anything larger spreads a single sample across multiple fetches.

DXT1’s encoding. Each block stores two reference colors as RGB565 (a packed 16-bit color: 5 bits red, 6 bits green, 5 bits blue; green gets extra precision because the eye is most sensitive to it). A 2-bit index per pixel picks one of four colors: the two reference colors plus two interpolated colors at 1/3 and 2/3 along the line between them. Per-pixel cost: 4 bits (8 bytes for 16 pixels), an 8× reduction over uncompressed RGBA8 at 32 bits per pixel.

The clever transparency bit. If the first reference color is less than or equal to the second (treating them as 16-bit ints), DXT1 switches modes: now there are only three colors, and the fourth index value (0b11) means fully transparent. The compression is the same 8 bytes, but you’ve gained 1-bit alpha for free. AoE2’s SLD format uses this for cutout sprites.

The lossy part. Two reference colors per 4×4 block is fine for smooth gradients (skin tones, sky) but visibly worse for sharp color transitions (text, fine line art, vibrant rainbow patterns). Sprites with strong color contrast inside a 4×4 region get ugly artifacts. AoE2’s sprites are visually noisy enough (organic textures, painted brush strokes) that DXT1 artifacts are invisible in practice.

BC4 — single-channel. Same idea, applied to one channel only (alpha or luminance). Each 4×4 block stores two reference single-byte values and uses 3 bits per pixel to interpolate between them. Result: 8 alpha levels per block, 3 bits × 16 pixels = 6 bytes + 2 bytes for refs = 8 bytes total. AoE2 shadows are perfect for this — pure alpha gradients with no color information.

The trade-off: 4× to 8× smaller textures, vs lossy compression that’s typically invisible on photorealistic or hand-painted art and ugly on text/icons. Newer alternatives exist, such as BC7 (better quality, but 16 bytes per 4×4 block, twice the size of DXT1) and ASTC (variable block size); both are considerably more complex to decode in software than DXT1 and BC4.

Why we decode in software rather than ship to a GPU. We’re not rendering sprites; we’re extracting them to PNGs for training data generation. A pure-Python decompressor runs once per sprite at extraction time, then never again. Decoding is ~50 lines of bit-twiddling per format.

Further reading. The OpenGL extension spec for EXT_texture_compression_s3tc is the canonical DXT1 reference. The openage project (open-source AoE engine) has the most complete public reverse-engineering notes on the SLD format that wraps these compression blocks.

11.3 BC4 Shadow Decompression

Shadows use BC4 compression — single-channel (alpha) with 8 bytes per 4x4 block:

Bytes 0-1: Two reference alpha values (one byte each)
Bytes 2-7: 3-bit index table (16 pixels)

Eight alpha levels are interpolated between the two reference values. The extractor uses these as shadow intensity masks.
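
A sketch of decoding one BC4 block under the standard BC4 rules. Note that the standard defines two modes: when the first reference value is greater than the second, six values are interpolated between them (eight levels in total); otherwise only four are interpolated and the last two indices mean 0 and 255. Whether AoE2:DE shadow data ever uses the second mode is not stated in this chapter, so handling it here is a precaution. The function name is hypothetical.

```python
def decode_bc4_block(block: bytes) -> list:
    """Decode one 8-byte BC4 block into 16 alpha values (0-255), row-major."""
    if len(block) != 8:
        raise ValueError("a BC4 block is exactly 8 bytes")
    a0, a1 = block[0], block[1]
    if a0 > a1:
        # Eight-level mode: six values interpolated between a0 and a1.
        levels = [a0, a1] + [((7 - k) * a0 + k * a1) // 7 for k in range(1, 7)]
    else:
        # Six-level mode: four interpolated values, then explicit 0 and 255.
        levels = [a0, a1] + [((5 - k) * a0 + k * a1) // 5 for k in range(1, 5)]
        levels += [0, 255]
    # 48 bits of 3-bit indices, least significant bits first.
    bits = int.from_bytes(block[2:8], "little")
    return [levels[(bits >> (3 * i)) & 0x7] for i in range(16)]
```

The returned values can be used directly as the alpha channel of a shadow mask.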

11.4 Command Array (Run-Length Encoding)

SLD sprites are sparse — most of the bounding box is transparent. A command array encodes skip/draw pairs:

For each row:
  - Skip N transparent pixels
  - Draw M opaque pixels (from compressed data)
  - Repeat until row width reached

This avoids storing and decompressing transparent regions, significantly reducing file size for small sprites on large canvases.
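
The skip/draw idea can be sketched for a single row as follows. This follows the description above literally; the on-disk encoding of the command array and its exact granularity are not specified in this chapter, so the (skip, draw) tuples, the pixel list, and the name expand_row are illustrative assumptions.

```python
from itertools import islice


def expand_row(commands, pixels, width, transparent=(0, 0, 0, 0)):
    """Rebuild one full-width row from (skip, draw) pairs.

    commands: iterable of (skip_count, draw_count) pairs
    pixels:   the decoded opaque pixels for this row, in order
    """
    row = []
    source = iter(pixels)
    for skip, draw in commands:
        row.extend([transparent] * skip)
        chunk = list(islice(source, draw))
        if len(chunk) != draw:
            raise ValueError("command array asks for more pixels than decoded")
        row.extend(chunk)
    if len(row) > width:
        raise ValueError("commands exceed the row width")
    # Anything after the last draw run is transparent.
    row.extend([transparent] * (width - len(row)))
    return row
```

Only the drawn runs consume decoded pixel data; the skipped runs cost nothing beyond the command itself.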

11.5 Player Color Recoloring

AoE2 uses 8 team colors. The base sprites use blue as the default player color, and the game recolors them at runtime.

The extractor performs luminance-preserving hue shift:

  1. Identify blue-range pixels in the player color layer (hue between 180° and 260°)
  2. Compute luminance from original pixel
  3. Map to target team color while preserving luminance
  4. Blend with main graphics layer

8 team colors: Blue, Red, Green, Yellow, Cyan, Purple, Gray, Orange.
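
Steps 1 to 3 can be sketched per pixel with the standard library's colorsys module. This is an approximation under stated assumptions: HLS lightness stands in for luminance, the target color's hue and saturation replace the original's, and the final blend with the main graphics layer (step 4) is left out. The function name and the RGB values used as targets are hypothetical.

```python
import colorsys


def recolor_pixel(rgb, target_rgb, hue_range=(180.0, 260.0)):
    """Shift a blue-range pixel to the target team color, keeping lightness."""
    r, g, b = (v / 255.0 for v in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    # Gray pixels have no meaningful hue; leave them and non-blue pixels alone.
    if s == 0 or not (hue_range[0] <= h * 360.0 <= hue_range[1]):
        return rgb
    # Take hue and saturation from the target, lightness from the original.
    th, _, ts = colorsys.rgb_to_hls(*(v / 255.0 for v in target_rgb))
    return tuple(round(v * 255) for v in colorsys.hls_to_rgb(th, l, ts))
```

With a pure red target, a dark blue pixel such as (0, 0, 128) becomes a dark red of the same lightness, which is what keeps shading on cloth and banners intact.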

For training data, sprites are extracted in 2-3 random player colors to teach the model that the same unit can appear in different colors.

11.6 Batch Extraction

packages/detection/src/extraction/extract_sprites.py defines 46 sprite categories with glob patterns:

("villager", [
    "u_vil_male_villager_idle*_x1.sld",
    "u_vil_female_villager_idle*_x1.sld", ...
], 6, "Worker units"),

("knight_line", [
    "u_cav_knight_idle*_x1.sld",
    "u_cav_cavalier_idle*_x1.sld",
    "u_cav_paladin_idle*_x1.sld",
], 4, "Knight, Cavalier, Paladin"),

Each category specifies:

  • Class name (matching the detection taxonomy)
  • Glob patterns for SLD files in game_graphics/
  • Number of variants to extract (4-6 per class)
  • Description

Animation frames [0, 4, 8, 12] are extracted to capture idle and walking poses.
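
How a category entry might be applied to a directory listing is sketched below. The selection policy (sort the matches, keep the first N) and the clipping of frame indices to the frames a file actually has are assumptions of this example, as are the function names and the file names in the usage note; extract_sprites.py may do this differently.

```python
from fnmatch import fnmatchcase

# One entry in the same shape as the category list above.
KNIGHT_LINE = (
    "knight_line",
    [
        "u_cav_knight_idle*_x1.sld",
        "u_cav_cavalier_idle*_x1.sld",
        "u_cav_paladin_idle*_x1.sld",
    ],
    4,
    "Knight, Cavalier, Paladin",
)

FRAME_INDICES = [0, 4, 8, 12]


def select_files(filenames, patterns, max_variants):
    """Return up to max_variants file names matching any glob pattern."""
    matches = sorted(
        name for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    )
    return matches[:max_variants]


def select_frames(num_frames, wanted=FRAME_INDICES):
    """Keep only the requested frame indices that exist in the file."""
    return [i for i in wanted if i < num_frames]
```

Given a listing that contains idle and walk files at several scales, select_files keeps only the idle files ending in _x1.sld, and select_frames(10) yields frames 0, 4, and 8.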

11.7 Output

Extracted sprites are saved as RGBA PNG files in tmp/sprites/{class_name}/:

tmp/sprites/
├── villager/
│   ├── villager_0_blue.png
│   ├── villager_0_red.png
│   ├── villager_1_blue.png
│   └── ...
├── sheep/
│   ├── sheep_0.png
│   └── ...
└── town_center/
    ├── town_center_0.png
    └── ...

These PNGs are consumed by generate_training_data.py (see Chapter 8) to composite synthetic training images.
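
The naming scheme in the tree above can be expressed as a small helper. The function name and its parameters are hypothetical; only the resulting path pattern is taken from the listing.

```python
from pathlib import PurePosixPath


def sprite_path(class_name, variant, color=None, root="tmp/sprites"):
    """Build the output path for one extracted sprite.

    Sprites with a team color get a color suffix; others (such as sheep
    in the listing above) do not.
    """
    suffix = f"_{color}" if color else ""
    return PurePosixPath(root) / class_name / f"{class_name}_{variant}{suffix}.png"
```

Keeping the class name as the directory name means the data generator can recover the label from the path alone.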

Key Insight: The SLD format is not publicly documented by Microsoft. The implementation is reverse-engineered from the openage project (open-source AoE engine), with additional AoE2:DE-specific discoveries around optional 2-byte markers before certain layers and content_length semantics (includes its own size). Some building variants with unusual layer configurations cause parsing failures and are skipped.


Summary

  • SLD files use DXT1/BC4 GPU texture compression with run-length encoded command arrays
  • 4x4 pixel blocks with 4-color palettes (DXT1) or 8-level alpha (BC4)
  • Player color recoloring via luminance-preserving hue shift
  • 46 sprite categories extracted with multiple animation frames and team colors
  • Output: RGBA PNGs consumed by the synthetic data generator