Chapter 11: Sprite Extraction
Training data generation requires individual sprite images for each entity class. These are extracted from AoE2:DE’s proprietary SLD file format — a GPU-compressed sprite format used by the game engine.
11.1 SLD File Format
SLD (Sprite Layer Data) files use the SLDX signature and store multi-layer sprite data with GPU texture compression.
File Structure
Header:
- Signature: "SLDX" (4 bytes)
- Version: uint16
- Num frames: uint16
Frame Headers (repeated):
- Width, height: uint16 each
- Hotspot X, Y: int16 each (anchor point for game positioning)
- Layer flags: uint8 bitfield
- Per-layer: content_length + compressed pixel data
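The field list above can be read with Python's struct module. The following is a minimal sketch, assuming little-endian fields packed back to back in exactly the order listed; the names FILE_HEADER, FRAME_HEADER, FrameHeader, parse_file_header and parse_frame_header are hypothetical, and any padding or unknown fields in real files would shift these offsets.

```python
import struct
from dataclasses import dataclass

# Assumption: little-endian fields packed back to back in the order listed above.
FILE_HEADER = struct.Struct("<4sHH")    # signature, version, num_frames
FRAME_HEADER = struct.Struct("<HHhhB")  # width, height, hotspot_x, hotspot_y, layer_flags


@dataclass
class FrameHeader:
    width: int
    height: int
    hotspot_x: int   # signed: the anchor may lie outside the bounding box
    hotspot_y: int
    layer_flags: int


def parse_file_header(data: bytes) -> tuple[int, int]:
    """Return (version, num_frames) after checking the SLDX signature."""
    if len(data) < FILE_HEADER.size:
        raise ValueError("data too short for an SLD file header")
    signature, version, num_frames = FILE_HEADER.unpack_from(data, 0)
    if signature != b"SLDX":
        raise ValueError(f"unexpected signature: {signature!r}")
    return version, num_frames


def parse_frame_header(data: bytes, offset: int) -> FrameHeader:
    """Read one frame header starting at byte `offset`."""
    return FrameHeader(*FRAME_HEADER.unpack_from(data, offset))
```

Checking the signature first lets the extractor reject non-SLD files before it attempts any block decoding.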
Layer Flags (Bitfield)
| Bit | Layer | Compression | Purpose |
|---|---|---|---|
| 0 | Main graphics | DXT1 (BC1) | Unit/building appearance |
| 1 | Shadow | BC4 | Shadow overlay |
| 2 | Unknown | — | Unused in extraction |
| 3 | Damage | DXT1 | Damaged variant |
| 4 | Player color | DXT1 | Team color overlay |
The extractor reads the main graphics layer (bit 0) and optionally the player color layer (bit 4).
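As an illustration of the bitfield in the table, the flags can be modeled with an IntFlag. This is a sketch only: LayerFlags and layers_to_extract are hypothetical names, not identifiers from the extractor.

```python
from enum import IntFlag


class LayerFlags(IntFlag):
    MAIN = 0x01          # bit 0: main graphics
    SHADOW = 0x02        # bit 1: shadow
    UNKNOWN = 0x04       # bit 2: unused in extraction
    DAMAGE = 0x08        # bit 3: damaged variant
    PLAYER_COLOR = 0x10  # bit 4: team color overlay


def layers_to_extract(flags: int, want_player_color: bool = True) -> list[LayerFlags]:
    """Return the layers the extractor would read for this frame, in file order."""
    present = LayerFlags(flags)
    wanted = []
    if LayerFlags.MAIN in present:
        wanted.append(LayerFlags.MAIN)
    if want_player_color and LayerFlags.PLAYER_COLOR in present:
        wanted.append(LayerFlags.PLAYER_COLOR)
    return wanted
```

Layers that are present but not wanted still have to be skipped over in the byte stream, which is where each layer's content_length comes in.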
11.2 DXT1 (BC1) Decompression
DXT1 is a lossy texture compression format that encodes 4x4 pixel blocks into 8 bytes:
Block Layout (8 bytes)
Bytes 0-1: Color 0 (RGB565)
Bytes 2-3: Color 1 (RGB565)
Bytes 4-7: 2-bit index table (16 pixels, 4x4)
Color Palette Generation
Two reference colors are stored as RGB565 (5 bits red, 6 bits green, 5 bits blue):
# RGB565 to RGB888 conversion
r = ((c >> 11) & 0x1F) * 255 // 31
g = ((c >> 5) & 0x3F) * 255 // 63
b = (c & 0x1F) * 255 // 31
Four palette colors are derived:
| Mode | Index 0 | Index 1 | Index 2 | Index 3 |
|---|---|---|---|---|
| Opaque (c0 > c1) | c0 | c1 | 2/3·c0 + 1/3·c1 | 1/3·c0 + 2/3·c1 |
| Transparent (c0 <= c1) | c0 | c1 | 1/2·c0 + 1/2·c1 | transparent (alpha=0) |
Each pixel in the 4x4 block uses a 2-bit index to select one of these 4 colors.
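Putting the block layout, the RGB565 conversion, and the palette table together, a single-block decoder might look like the sketch below. It follows the standard BC1 layout (little-endian words, pixel i in row-major order using index bits 2i and 2i+1); the function names are hypothetical and the extractor's actual code may be organized differently.

```python
import struct


def rgb565_to_rgb888(c: int) -> tuple[int, int, int]:
    """Expand a packed RGB565 value to 8 bits per channel."""
    r = ((c >> 11) & 0x1F) * 255 // 31
    g = ((c >> 5) & 0x3F) * 255 // 63
    b = (c & 0x1F) * 255 // 31
    return r, g, b


def decode_dxt1_block(block: bytes) -> list[tuple[int, int, int, int]]:
    """Decode one 8-byte DXT1 block into 16 RGBA tuples (row-major 4x4)."""
    if len(block) != 8:
        raise ValueError("a DXT1 block is exactly 8 bytes")
    c0, c1, indices = struct.unpack("<HHI", block)
    p0 = rgb565_to_rgb888(c0)
    p1 = rgb565_to_rgb888(c1)
    if c0 > c1:
        # Opaque mode: two interpolated colors at 1/3 and 2/3.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), p3 + (255,)]
    else:
        # Transparent mode: one midpoint color, index 3 is fully transparent.
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), (0, 0, 0, 0)]
    # Each pixel consumes 2 bits of the index word, lowest bits first.
    return [palette[(indices >> (2 * i)) & 0b11] for i in range(16)]
```

Note that the mode test compares the raw 16-bit values, not the expanded RGB888 colors.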
Deep dive — DXT1 / BC4 block compression (why 4×4 blocks and what they trade)
DXT1 (aka BC1) and BC4 are part of the block-compression family used by GPUs since the late 1990s. Both compress 4×4 pixel blocks into 8 bytes. The motivation is simple and unchanged: GPUs need texture data to be small enough to fit in cache and aligned to a power-of-two pixel grid so they can fetch a sample in one memory transaction.
Why 4×4 blocks specifically. GPU texture samplers fetch a 2×2 quad of texels for bilinear filtering (trilinear filtering reads one such quad from each of two mip levels), and most modern GPUs further coalesce that into 4×4 cache lines. So 4×4 is the smallest block size where the compressed unit fits neatly into one cache fetch: anything smaller wastes the slack, and anything larger spreads a single sample across multiple fetches.
DXT1’s encoding. Each block stores two reference colors as RGB565 (a packed 16-bit color: 5 bits red, 6 bits green, 5 bits blue; green gets extra precision because the eye is most sensitive to it). A 2-bit index per pixel picks one of four colors: the two reference colors plus two interpolated colors at 1/3 and 2/3 along the line between them. Per-pixel cost: 4 bits (8 bytes × 8 bits / 16 pixels), an 8× reduction over uncompressed RGBA8 at 32 bits per pixel.
The clever transparency bit. If the first reference color is less than or equal to the second (treating them as 16-bit ints), DXT1 switches modes: now there are only three colors, and the fourth index value (0b11) means fully transparent. The compression is the same 8 bytes, but you’ve gained 1-bit alpha for free. AoE2’s SLD format uses this for cutout sprites.
The lossy part. Two reference colors per 4×4 block is fine for smooth gradients (skin tones, sky) but visibly worse for sharp color transitions (text, fine line art, vibrant rainbow patterns). Sprites with strong color contrast inside a 4×4 region get ugly artifacts. AoE2’s sprites are visually noisy enough (organic textures, painted brush strokes) that DXT1 artifacts are invisible in practice.
BC4 — single-channel. Same idea, applied to one channel only (alpha or luminance). Each 4×4 block stores two reference single-byte values and uses 3 bits per pixel to interpolate between them. Result: 8 alpha levels per block, 3 bits × 16 pixels = 6 bytes + 2 bytes for refs = 8 bytes total. AoE2 shadows are perfect for this — pure alpha gradients with no color information.
The trade-off: DXT1 textures are 6× to 8× smaller than uncompressed RGB8 or RGBA8, and BC4 halves an 8-bit single-channel texture, in exchange for lossy compression that’s typically invisible on photorealistic or hand-painted art and ugly on text/icons. Modern alternatives exist, such as BC7 (better quality, but 16 bytes per block, twice DXT1’s size) and ASTC (variable block size), but both are considerably more complex to decode.
Why we decode in software rather than ship to a GPU. We’re not rendering sprites; we’re extracting them to PNGs for training data generation. A pure-Python decompressor runs once per sprite at extraction time, then never again. Decoding is ~50 lines of bit-twiddling per format.
Further reading. The OpenGL extension spec for EXT_texture_compression_s3tc is the canonical DXT1 reference. The openage project (open-source AoE engine) has the most complete public reverse-engineering notes on the SLD format that wraps these compression blocks.
11.3 BC4 Shadow Decompression
Shadows use BC4 compression — single-channel (alpha) with 8 bytes per 4x4 block:
Bytes 0-1: Two reference alpha values
Bytes 2-7: 3-bit index table (16 pixels)
Each block provides eight alpha levels: the two reference values plus six steps interpolated between them. The extractor uses these as shadow intensity masks.
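A single BC4 block can be decoded along the following lines. This sketch follows the standard unsigned BC4 layout, which also defines a second mode when the first reference value is not greater than the second (four interpolated steps plus fixed 0 and 255); whether SLD shadow layers ever use that mode is not something this chapter confirms. The name decode_bc4_block is hypothetical.

```python
def decode_bc4_block(block: bytes) -> list[int]:
    """Decode one 8-byte BC4 block into 16 alpha values (row-major 4x4)."""
    if len(block) != 8:
        raise ValueError("a BC4 block is exactly 8 bytes")
    a0, a1 = block[0], block[1]
    # 48 bits of 3-bit indices, least significant bits first.
    bits = int.from_bytes(block[2:8], "little")
    if a0 > a1:
        # Eight-level mode: six interpolated steps between the references.
        levels = [a0, a1] + [((7 - i) * a0 + i * a1) // 7 for i in range(1, 7)]
    else:
        # Standard BC4 second mode: four interpolated steps, then 0 and 255.
        levels = [a0, a1] + [((5 - i) * a0 + i * a1) // 5 for i in range(1, 5)]
        levels += [0, 255]
    return [levels[(bits >> (3 * i)) & 0b111] for i in range(16)]
```

The integer division truncates rather than rounds, so interpolated values may differ by one from a hardware decoder; for shadow masks that difference is unlikely to matter.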
11.4 Command Array (Run-Length Encoding)
SLD sprites are sparse — most of the bounding box is transparent. A command array encodes skip/draw pairs:
For each row:
- Skip N transparent pixels
- Draw M opaque pixels (from compressed data)
- Repeat until row width reached
This avoids storing and decompressing transparent regions, significantly reducing file size for small sprites on large canvases.
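The skip/draw idea can be sketched independently of the exact unit the counts refer to. The code below assumes each command is a (skip, draw) pair of non-negative counts and treats a "unit" abstractly; whether a unit is a single pixel or a whole 4×4 compressed block should be checked against the openage notes. Both function names are hypothetical.

```python
def expand_commands(commands, total):
    """Expand (skip, draw) pairs into a mask of `total` units (True = drawn)."""
    mask = []
    for skip, draw in commands:
        mask.extend([False] * skip)
        mask.extend([True] * draw)
    if len(mask) > total:
        raise ValueError("commands cover more units than the layer holds")
    # Anything after the last command is implicitly transparent.
    mask.extend([False] * (total - len(mask)))
    return mask


def place_units(commands, drawn, total, empty=None):
    """Lay decoded units out at their positions, filling skipped slots with `empty`."""
    mask = expand_commands(commands, total)
    if sum(mask) != len(drawn):
        raise ValueError("number of drawn units does not match the commands")
    units = iter(drawn)
    return [next(units) if is_drawn else empty for is_drawn in mask]
```

For example, place_units([(2, 3), (1, 1)], ["a", "b", "c", "d"], 8) leaves the two leading slots, the slot after "c", and the final slot empty. Only the four drawn units ever need to be stored or decompressed.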
11.5 Player Color Recoloring
AoE2 uses 8 team colors. The base sprites use blue as the default player color, and the game recolors them at runtime.
The extractor performs luminance-preserving hue shift:
- Identify blue-range pixels in the player color layer (hue between 180° and 260°)
- Compute luminance from original pixel
- Map to target team color while preserving luminance
- Blend with main graphics layer
8 team colors: Blue, Red, Green, Yellow, Cyan, Purple, Gray, Orange.
For training data, sprites are extracted in 2-3 random player colors to teach the model that the same unit can appear in different colors.
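The first three recoloring steps can be approximated per pixel with the standard colorsys module. This is a sketch under several assumptions: HLS lightness stands in for luminance, the target RGB values in TEAM_COLORS are placeholders rather than the game's palette, and the final blend with the main graphics layer is left out. The extractor's actual method may differ.

```python
import colorsys

# Hypothetical target colors, for illustration only (not the game's palette).
TEAM_COLORS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "gray": (128, 128, 128),
}


def recolor_pixel(rgb, target_rgb, hue_range=(180.0, 260.0)):
    """Shift a blue-range pixel to the target hue while keeping its lightness."""
    r, g, b = (v / 255.0 for v in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    # Leave gray pixels and pixels outside the blue range untouched.
    if s == 0.0 or not (hue_range[0] <= h * 360.0 <= hue_range[1]):
        return tuple(rgb)
    target_h, _, target_s = colorsys.rgb_to_hls(*(v / 255.0 for v in target_rgb))
    # Keep the source lightness; scale saturation so a gray target stays gray.
    out = colorsys.hls_to_rgb(target_h, l, s * target_s)
    return tuple(round(v * 255) for v in out)
```

Because lightness is carried over from the source pixel, shading on cloth and shields survives the hue change: a dark blue fold becomes a dark red fold rather than a flat red patch.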
11.6 Batch Extraction
packages/detection/src/extraction/extract_sprites.py defines 46 sprite categories with glob patterns:
("villager", [
"u_vil_male_villager_idle*_x1.sld",
"u_vil_female_villager_idle*_x1.sld", ...
], 6, "Worker units"),
("knight_line", [
"u_cav_knight_idle*_x1.sld",
"u_cav_cavalier_idle*_x1.sld",
"u_cav_paladin_idle*_x1.sld",
], 4, "Knight, Cavalier, Paladin"),
Each category specifies:
- Class name (matching the detection taxonomy)
- Glob patterns for SLD files in game_graphics/
- Number of variants to extract (4-6 per class)
- Description
Animation frames [0, 4, 8, 12] are extracted to capture idle and walking poses.
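A category entry of the form shown above could be resolved to concrete files roughly as follows. This is a sketch, not the contents of extract_sprites.py: select_sld_files and frames_to_extract are hypothetical helpers, and taking the first N sorted matches is an assumption about how variants are chosen.

```python
from pathlib import Path

FRAMES = [0, 4, 8, 12]  # animation frames named in the text


def select_sld_files(graphics_dir, patterns, num_variants):
    """Return up to `num_variants` SLD paths matching any of the glob patterns."""
    root = Path(graphics_dir)
    matches = []
    for pattern in patterns:
        # Sort each pattern's matches so runs are reproducible.
        matches.extend(sorted(root.glob(pattern)))
    unique = list(dict.fromkeys(matches))  # drop duplicates, keep order
    return unique[:num_variants]


def frames_to_extract(num_frames, wanted=FRAMES):
    """Keep only the wanted frame indices that exist in this file."""
    return [i for i in wanted if i < num_frames]
```

Filtering the frame list against the file's frame count keeps short animations from raising index errors.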
11.7 Output
Extracted sprites are saved as RGBA PNG files in tmp/sprites/{class_name}/:
tmp/sprites/
├── villager/
│ ├── villager_0_blue.png
│ ├── villager_0_red.png
│ ├── villager_1_blue.png
│ └── ...
├── sheep/
│ ├── sheep_0.png
│ └── ...
└── town_center/
├── town_center_0.png
└── ...
These PNGs are consumed by generate_training_data.py (see Chapter 8) to composite synthetic training images.
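The naming scheme in the tree above can be captured in a small helper. This is a sketch: sprite_output_path is a hypothetical name, and the rule that classes without player color omit the color suffix is inferred from the sheep and town_center examples.

```python
from pathlib import Path
from typing import Optional


def sprite_output_path(out_root, class_name: str, variant: int,
                       color: Optional[str] = None) -> Path:
    """Build the PNG path for one extracted sprite."""
    # Classes without a player color layer get no color suffix.
    suffix = f"_{color}" if color else ""
    return Path(out_root) / class_name / f"{class_name}_{variant}{suffix}.png"
```

Grouping by class name in the directory structure means a consumer can recover each sprite's label from its parent directory alone.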
Key Insight: The SLD format is not publicly documented by Microsoft. The implementation builds on the reverse-engineering work of the openage project (open-source AoE engine), with additional AoE2:DE-specific discoveries: optional 2-byte markers before certain layers, and a content_length field whose value includes its own size. Some building variants with unusual layer configurations cause parsing failures and are skipped.
Summary
- SLD files use DXT1/BC4 GPU texture compression with run-length encoded command arrays
- 4x4 pixel blocks with 4-color palettes (DXT1) or 8-level alpha (BC4)
- Player color recoloring via luminance-preserving hue shift
- 46 sprite categories extracted with multiple animation frames and team colors
- Output: RGBA PNGs consumed by the synthetic data generator
Related Topics
- Chapter 8: Training Pipeline — how sprites become training data
- Chapter 7: Detector Architecture — the 60-class taxonomy these sprites map to