⚠️ Clean-Room / Educational

This project is educational and Open Source. No code is copied from other emulators. Implementation based solely on technical documentation and permitted tests.

DMG VRAM Tiledata Write Audit + Force-Writes A/B + IRQ Tracking Fix

Date: 2026-01-10 Step ID: 0501 State: VERIFIED

Summary

This step implements a complete VRAM write auditing system to diagnose why DMG games (specifically tetris.gb) display no visual content. Extensive instrumentation was added to MMU::write() for VRAM (0x8000-0x9FFF); it captures the PPU state and the reason a write was blocked, and verifies each successful write with an immediate readback. A flag, VIBOY_VRAM_FORCE_WRITES, was added for isolation tests; it forces writes through even when they would be blocked. An inconsistency in IRQ tracking was fixed: irq_vblank_services_ and interrupt_taken_counts_[0] were not incremented at the same point. PPU mode tracking was added to detect timing problems (Mode 3 stuck). Finally, a DMG v3 classifier was implemented that uses these new metrics for more accurate diagnosis.

Hardware Concept

VRAM Access Rules (Pan Docs): VRAM (0x8000-0x9FFF) is accessible by the CPU during certain PPU modes:

  • LCD OFF (LCDC bit 7 = 0): VRAM is always accessible
  • LCD ON (LCDC bit 7 = 1):
    • Mode 0 (HBlank): VRAM accessible
    • Mode 1 (VBlank): VRAM accessible
    • Mode 2 (OAM Search): VRAM accessible
    • Mode 3 (Pixel Transfer): VRAM NOT accessible (locked)

If the CPU attempts to write to VRAM during Mode 3 when LCD is ON, the write is blocked (ignored). This is critical because the PPU is reading data from VRAM for rendering, and a simultaneous write could corrupt the data.
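The access rule above reduces to a single predicate. The following is a minimal sketch, not the project's C++ code; the function name and the reason strings are hypothetical and chosen only for illustration.

```python
from typing import Tuple

LCDC_LCD_ENABLE = 0x80  # LCDC bit 7


def vram_write_allowed(lcdc: int, stat_mode: int) -> Tuple[bool, str]:
    """Return (allowed, blocked_reason) for a CPU write to 0x8000-0x9FFF.

    The reason strings are hypothetical labels, not taken from the project.
    """
    lcd_on = (lcdc & LCDC_LCD_ENABLE) != 0
    if not lcd_on:
        return True, "NONE"            # LCD off: VRAM is always accessible
    if stat_mode == 3:
        return False, "MODE3_LOCKED"   # Pixel Transfer: the write is ignored
    return True, "NONE"                # Modes 0, 1 and 2: accessible
```

With the LCD off, a write is allowed even if the stale mode value reads 3, which is why the LCD check comes first.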

PPU Modes: The PPU operates in 4 main modes (documented in STAT register):

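  • Mode 0 (HBlank): Remainder of the line after pixel transfer (cycles 252-455 of the 456-cycle scanline)
  • Mode 1 (VBlank): Period between frames (lines 144-153, 10 × 456 = 4560 cycles)
  • Mode 2 (OAM Search): Sprite search (cycles 0-79 of the scanline)
  • Mode 3 (Pixel Transfer): Pixel transfer to LCD (cycles 80-251 of the scanline; this is the nominal length, and on hardware Mode 3 can run longer at the expense of Mode 0)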

If Mode 3 lasts longer than an entire scanline (more than 456 cycles), it indicates a PPU timing problem that could cause incorrect VRAM locks.
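As an illustration of the nominal boundaries listed above, the mode can be derived from LY and the cycle position within the scanline. This is a simplified sketch with a fixed-length Mode 3; the function name is hypothetical, and it does not describe how the project's PPU computes its mode.

```python
CYCLES_PER_SCANLINE = 456
VBLANK_START_LINE = 144
MODE2_END = 80    # cycles 0-79: OAM Search
MODE3_END = 252   # cycles 80-251: Pixel Transfer (nominal length)


def nominal_mode(ly: int, line_cycle: int) -> int:
    """Map (LY, cycle within the scanline) to the nominal STAT mode."""
    if not 0 <= line_cycle < CYCLES_PER_SCANLINE:
        raise ValueError("line_cycle must be in 0..455")
    if ly >= VBLANK_START_LINE:
        return 1  # VBlank covers lines 144-153 entirely
    if line_cycle < MODE2_END:
        return 2
    if line_cycle < MODE3_END:
        return 3
    return 0
```

Under this model Mode 3 can never exceed 172 cycles on one line, so any measured Mode 3 run above 456 cycles cannot be explained by normal timing variation.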

IRQ Tracking Consistency: For IRQ tracking to be reliable, the "service" and "IRQ taken" counters must be incremented at the exact same point. If they increase at different points, they can become out of sync and cause incorrect diagnoses.

Reference: Pan Docs - VRAM Access, LCD Timing, STAT Register

Implementation

Phase A: IRQ Tracking Fix ✅

A1) Unification of IRQ Tracking Semantics:

  • Modification of CPU::handle_interrupts() to ensure that irq_vblank_services_ and interrupt_taken_counts_[0] are incremented at the exact same point
  • Added desynchronization log if counters do not match (debugging)
  • Ensures that "IRQ serviced" and "IRQ taken" are semantically equivalent
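One way to guarantee that both counters move together is to update them inside a single helper that is the only place either is touched. The sketch below models that idea in Python; the class and function names are hypothetical, and the real change lives in the C++ CPU::handle_interrupts().

```python
from typing import Optional


class IrqTracker:
    """Hypothetical model: both counters are updated in one place only."""

    def __init__(self) -> None:
        self.irq_vblank_services = 0
        self.interrupt_taken_counts = [0] * 5  # bits 0-4 of IE/IF

    def record_taken(self, bit: int) -> None:
        self.interrupt_taken_counts[bit] += 1
        if bit == 0:  # VBlank: bump the service counter at the same point
            self.irq_vblank_services += 1

    def is_consistent(self) -> bool:
        return self.irq_vblank_services == self.interrupt_taken_counts[0]


def service_interrupt(tracker: IrqTracker, ime: bool, ie: int, if_: int) -> Optional[int]:
    """Return the serviced interrupt bit, or None if nothing is serviced."""
    pending = ie & if_ & 0x1F
    if not ime or pending == 0:
        return None
    bit = (pending & -pending).bit_length() - 1  # lowest set bit has priority
    tracker.record_taken(bit)
    return bit
```

A desynchronization log then only needs to check is_consistent() after each serviced interrupt.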

Phase B: VRAM Write Audit ✅

B1) Instrumentation in MMU::write() for VRAM:

  • New structure VRAMWriteEvent:
    • frame_id: Frame ID
    • pc, addr, value: Program counter, write address, and value
    • region: TILE_DATA (0x8000-0x97FF) or TILE_MAP (0x9800-0x9FFF)
    • lcdc, lcd_on, stat_mode, ly: PPU state at the time of the write
    • allowed, blocked_reason: Whether the write was allowed and, if not, why it was blocked
    • readback_value, readback_matches: Immediate integrity check
    • forced: Whether the write was forced (VIBOY_VRAM_FORCE_WRITES=1)
  • New structure VRAMWriteAuditStats:
    • Counters: tiledata_write_attempts, tiledata_write_allowed, tiledata_write_blocked, tiledata_write_readback_mismatch
    • The same counters for the tilemap
    • last_blocked_pc, last_blocked_addr, last_blocked_reason: Last blocked write, for debugging
  • Ring buffer of 256 VRAM events (vram_write_ring_)
  • Full capture in MMU::write() when addr >= 0x8000 && addr <= 0x9FFF
  • Pan Docs Rules Enforcement: Block writes during Mode 3 if LCD is ON
  • Immediate readback after successful write to verify integrity
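The audit flow described above (classify the region, apply the Mode 3 rule, write, read back, record the event) can be sketched as follows. This is a reduced Python model with fewer fields than the real VRAMWriteEvent; the class names and the counter storage are assumptions, while the counter names follow the list above.

```python
from collections import Counter, deque
from dataclasses import dataclass

VRAM_START, VRAM_END, TILE_DATA_END = 0x8000, 0x9FFF, 0x97FF


@dataclass
class VramWriteEvent:  # reduced, illustrative subset of the fields
    pc: int
    addr: int
    value: int
    region: str
    lcd_on: bool
    stat_mode: int
    allowed: bool
    readback_matches: bool


class VramAudit:
    def __init__(self, ring_size: int = 256) -> None:
        self.vram = bytearray(VRAM_END - VRAM_START + 1)
        self.ring = deque(maxlen=ring_size)  # oldest events drop off automatically
        self.stats = Counter()

    def write(self, pc: int, addr: int, value: int, lcdc: int, stat_mode: int) -> bool:
        if not VRAM_START <= addr <= VRAM_END:
            raise ValueError("address outside VRAM")
        is_tiledata = addr <= TILE_DATA_END
        prefix = "tiledata" if is_tiledata else "tilemap"
        lcd_on = (lcdc & 0x80) != 0
        allowed = not (lcd_on and stat_mode == 3)  # Mode 3 rule
        self.stats[prefix + "_write_attempts"] += 1
        readback_matches = False
        if allowed:
            offset = addr - VRAM_START
            self.vram[offset] = value & 0xFF
            readback_matches = self.vram[offset] == (value & 0xFF)
            self.stats[prefix + "_write_allowed"] += 1
            if not readback_matches:
                self.stats[prefix + "_write_readback_mismatch"] += 1
        else:
            self.stats[prefix + "_write_blocked"] += 1
        region = "TILE_DATA" if is_tiledata else "TILE_MAP"
        self.ring.append(VramWriteEvent(pc, addr, value, region, lcd_on,
                                        stat_mode, allowed, readback_matches))
        return allowed
```

A bounded deque keeps memory use constant: once the ring is full, each new event evicts the oldest one, so the buffer always holds the most recent writes.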

B2) Flag VIBOY_VRAM_FORCE_WRITES:

  • Environment variable VIBOY_VRAM_FORCE_WRITES, which forces VRAM writes through even when VRAM is locked
  • Useful for isolation tests: distinguishes timing/mode problems from MMU/decoder bugs
  • If the game works with force-writes, the problem is timing; if not, it is a bug in the MMU or decoder
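The A/B reasoning can be written down as a small decision function. The sketch below is illustrative only: the helper names and result labels are hypothetical, and treating exactly the value "1" as enabled is an assumption based on the VIBOY_VRAM_FORCE_WRITES=1 form used in this step.

```python
import os
from typing import Mapping, Optional


def force_writes_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumption: the flag counts as enabled only when set to "1"."""
    env = os.environ if env is None else env
    return env.get("VIBOY_VRAM_FORCE_WRITES", "0") == "1"


def interpret_ab(works_normal: bool, works_forced: bool) -> str:
    """Interpret one A/B pair: run A without the flag, run B with it."""
    if works_normal:
        return "NO_VRAM_ISSUE"            # nothing to isolate
    if works_forced:
        return "TIMING_OR_MODE_PROBLEM"   # the blocks themselves were the cause
    return "MMU_OR_DECODER_BUG"           # writes get through but output is still wrong
```

Passing the environment as a parameter keeps the check testable without mutating the process environment.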

Phase C: PPU Mode Reality Check ✅

C) PPU Mode Metrics per Frame:

  • New structure PPUModeStats:
    • mode_entries_count[4]: Number of entries into each mode (0, 1, 2, 3)
    • mode_cycles[4]: Total cycles in each mode
    • ly_min, ly_max: Observed LY range
    • frames_with_mode3_stuck: Frames where Mode 3 lasted more than 456 cycles
  • Update in PPU::step():
    • Increment mode_entries_count[mode_] and mode_cycles[mode_] in update_mode()
    • Update ly_min and ly_max after incrementing ly_
    • Mode 3 stuck detection: if mode_ == 3 and cycles > CYCLES_PER_SCANLINE, increment the counter
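The per-frame metrics can be modelled as a small accumulator. In this sketch the field names follow the list above, but the class, the step()/end_frame() split, and the choice to measure "stuck" as one uninterrupted Mode 3 run are assumptions made for illustration.

```python
CYCLES_PER_SCANLINE = 456


class PpuModeStats:
    def __init__(self) -> None:
        self.mode_entries_count = [0, 0, 0, 0]
        self.mode_cycles = [0, 0, 0, 0]
        self.ly_min = 255
        self.ly_max = 0
        self.frames_with_mode3_stuck = 0
        self._current_mode = None       # no mode seen yet
        self._run_cycles = 0            # cycles spent in the current mode run
        self._stuck_this_frame = False

    def step(self, mode: int, ly: int, cycles: int) -> None:
        if mode != self._current_mode:  # mode transition: count one entry
            self.mode_entries_count[mode] += 1
            self._current_mode = mode
            self._run_cycles = 0
        self._run_cycles += cycles
        self.mode_cycles[mode] += cycles
        self.ly_min = min(self.ly_min, ly)
        self.ly_max = max(self.ly_max, ly)
        if mode == 3 and self._run_cycles > CYCLES_PER_SCANLINE:
            self._stuck_this_frame = True

    def end_frame(self) -> None:
        if self._stuck_this_frame:      # count each affected frame once
            self.frames_with_mode3_stuck += 1
        self._stuck_this_frame = False
```

Counting per frame rather than per step keeps frames_with_mode3_stuck comparable to the total frame count, which a classifier can turn into a ratio.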

Phase E: DMG Classifier v3 ✅

E) DMG v3 Classifier with VRAM Audit:

  • New method _classify_dmg_quick_v3() in rom_smoke_0442.py
  • Uses get_vram_write_audit_stats(), get_vram_write_ring(), and get_ppu_mode_stats()
  • New classifications:
    • VRAM_BLOCKED_INCORRECTLY: Writes blocked in modes that should not block (not Mode 3)
    • VRAM_BLOCKED_MODE3_STUCK: Writes blocked because Mode 3 is stuck
    • VRAM_WRITE_READBACK_MISMATCH: Writes allowed but readback does not match
    • PPU_MODE3_STUCK: PPU in Mode 3 too long
    • PPU_MODE3_DOMINANT: Mode 3 takes up more than 60% of the time
    • VRAM_NO_ATTEMPTS: No attempts to write to VRAM (program does not try to load tiles)
    • VRAM_ALLOWED_BUT_NOT_READABLE: Writes allowed but then cannot be read correctly
  • Fallback to v2 if no new metrics are available
  • Updated _classify_dmg_quick() to use v3 by default
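To make the classification rules concrete, here is a rough sketch of how such a classifier could combine the three metric sources. It is not the code in rom_smoke_0442.py: the function name, the dict-based inputs, the rule order, and the FALLBACK_V2 and NO_VRAM_ANOMALY labels are all assumptions. VRAM_ALLOWED_BUT_NOT_READABLE is left out because its exact criterion is not stated here.

```python
from typing import List, Optional


def classify_dmg_v3_sketch(audit: Optional[dict], ring: List[dict],
                           ppu: Optional[dict]) -> str:
    """Illustrative rule chain; inputs are plain dicts keyed by the field names above."""
    if audit is None or ppu is None:
        return "FALLBACK_V2"  # hypothetical label: metrics unavailable, use v2
    if audit["tiledata_write_attempts"] == 0:
        return "VRAM_NO_ATTEMPTS"
    blocked = audit["tiledata_write_blocked"]
    if ppu["frames_with_mode3_stuck"] > 0:
        return "VRAM_BLOCKED_MODE3_STUCK" if blocked > 0 else "PPU_MODE3_STUCK"
    # A block is only legitimate when the LCD is on and the PPU is in Mode 3.
    if any(not e["allowed"] and not (e["lcd_on"] and e["stat_mode"] == 3)
           for e in ring):
        return "VRAM_BLOCKED_INCORRECTLY"
    if audit["tiledata_write_readback_mismatch"] > 0:
        return "VRAM_WRITE_READBACK_MISMATCH"
    total = sum(ppu["mode_cycles"])
    if total > 0 and ppu["mode_cycles"][3] / total > 0.60:
        return "PPU_MODE3_DOMINANT"
    return "NO_VRAM_ANOMALY"  # hypothetical label: no rule matched
```

Checking the cheapest and most decisive conditions first (no attempts, stuck Mode 3) keeps later rules from misattributing symptoms that have a more fundamental cause.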

Cython Exposure ✅

Cython Wrappers:

  • mmu.pxd: Added structures VRAMWriteEvent and VRAMWriteAuditStats
  • mmu.pyx: Implemented methods get_vram_write_audit_stats() and get_vram_write_ring()
  • ppu.pxd: Added structure PPUModeStats
  • ppu.pyx: Implemented method get_ppu_mode_stats()
  • Type correction: use != 0 rather than bool() to convert bint to a Python bool
  • Explicit import of PPUModeStats in ppu.pyx

Affected Files

  • src/core/cpp/MMU.hpp: Structures VRAMWriteEvent and VRAMWriteAuditStats
  • src/core/cpp/MMU.cpp: VRAM instrumentation in write(), getter implementation
  • src/core/cpp/CPU.cpp: IRQ tracking unification fix
  • src/core/cpp/PPU.hpp: Structure PPUModeStats
  • src/core/cpp/PPU.cpp: PPU mode tracking in step()
  • src/core/cython/mmu.pxd: Structure and method declarations
  • src/core/cython/mmu.pyx: Python wrappers for VRAM audit
  • src/core/cython/ppu.pxd: Declaration of PPUModeStats
  • src/core/cython/ppu.pyx: Python wrapper for PPU mode stats
  • tools/rom_smoke_0442.py: DMG v3 Classifier

Tests and Verification

Compilation:

python3 setup.py build_ext --inplace

Successful build without errors. Fixes applied:

  • Type correction: bool() replaced with != 0 for bint in Cython
  • Explicit import of PPUModeStats in ppu.pyx

Unit Tests:

pytest tests/test_core_cpu.py -v

Result: 6 passed in 0.14s

Build Test:

python3 test_build.py

Result: SUCCESS - The build pipeline works correctly

Native Validation: All new features are available and compiled correctly:

  • mmu.get_vram_write_audit_stats()
  • mmu.get_vram_write_ring(max_events)
  • ppu.get_ppu_mode_stats()

Results and Diagnosis

Current Status:

  • ✅ Unified and consistent IRQ tracking
  • ✅ Complete VRAM audit system implemented
  • ✅ PPU mode tracking to detect timing problems
  • ✅ DMG v3 Classifier with new metrics
  • ✅ Flag VIBOY_VRAM_FORCE_WRITES for isolation tests

Next Steps:

  • Phase D (Pending): Run 3 tests per ROM (tetris.gb, pkmn.gb) with different VIBOY_VRAM_FORCE_WRITES configurations and capture snapshots
  • Phase F (Conditional): Apply minimum fix only if the diagnosis is conclusive

Note: The audit system is ready to collect detailed evidence about how VRAM is being accessed, whether writes are being locked correctly according to Pan Docs, and whether there are PPU timing issues that may be causing incorrect locks.

References

  • Pan Docs - VRAM Access Rules
  • Pan Docs - LCD Timing, PPU Modes, STAT Register
  • Pan Docs - Interrupts, IRQ Handling
  • Cython Documentation - Type Conversion, C++ Integration