⚠️ Clean-Room / Educational

This project is educational and Open Source. No code is copied from other emulators. The implementation is based solely on technical documentation and permitted tests.

Direct Loads to Memory (LD (nn), A and LD A, (nn))

Date: 2025-12-17 Step ID: 0030 State: Verified

Summary

Implemented the critical opcodes 0xEA (LD (nn), A) and 0xFA (LD A, (nn)), which allow direct memory access using 16-bit absolute addresses encoded in the instruction itself. Games rely on these instructions to save and read global variables, game state and graphics settings. The emulator was crashing at 0xEA when running Tetris DX, which prevented any graphics from being drawn. With this implementation, the emulator can advance beyond that point and should be able to start rendering the title screen, which still has to be confirmed with a new run.

Hardware Concept

Direct addressing is a memory access mode in which the 16-bit address is written directly in the code, right after the opcode. Unlike indirect addressing (e.g. LD (HL), A, where the address is held in the HL register), here the address is part of the instruction itself.

On the LR35902 architecture, these instructions are essential to:

  • Access global variables: Games keep game state, scores, lives, etc. in fixed memory locations.
  • Access hardware registers: Many I/O registers (such as PPU registers, timers, etc.) are mapped to specific addresses.
  • Initialize buffers: They are used to initialize memory areas without first loading the address into a register pair.

LD (nn), A (0xEA): Reads the next 2 bytes of the code (the address, in Little-Endian order) and writes the value of accumulator A to that address. Consumes 4 M-Cycles (1 opcode fetch + 2 address-byte fetches + 1 write).

LD A, (nn) (0xFA): Reads the next 2 bytes of the code (the address, in Little-Endian order), reads the byte at that address and stores it in A. Consumes 4 M-Cycles (1 opcode fetch + 2 address-byte fetches + 1 read).
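
The two descriptions above can be sketched in a few lines. This is only an illustration over a flat 64 KiB bytearray, not the project's code: the function names read_word_le, ld_nn_a and ld_a_nn are hypothetical, and the real emulator goes through its CPU and MMU classes.

```python
# Minimal sketch: a flat 64 KiB memory stands in for the real MMU (assumption).
memory = bytearray(0x10000)


def read_word_le(mem, addr):
    """Build a 16-bit value from two bytes stored low byte first."""
    low = mem[addr & 0xFFFF]
    high = mem[(addr + 1) & 0xFFFF]
    return (high << 8) | low


def ld_nn_a(mem, pc, a):
    """0xEA: write A to the absolute address that follows the opcode.

    Returns (new_pc, m_cycles).
    """
    target = read_word_le(mem, pc + 1)
    mem[target] = a & 0xFF
    return (pc + 3) & 0xFFFF, 4


def ld_a_nn(mem, pc):
    """0xFA: read the byte at the absolute address that follows the opcode.

    Returns (new_a, new_pc, m_cycles).
    """
    source = read_word_le(mem, pc + 1)
    return mem[source], (pc + 3) & 0xFFFF, 4


# LD (0xC000), A with A = 0x55, placed at 0x0100.
memory[0x0100:0x0103] = bytes([0xEA, 0x00, 0xC0])
pc, cycles = ld_nn_a(memory, 0x0100, 0x55)

# LD A, (0xC000) placed right after it.
memory[0x0103:0x0106] = bytes([0xFA, 0x00, 0xC0])
a, pc2, cycles2 = ld_a_nn(memory, pc)
```

Both helpers advance PC by 3 (opcode plus two address bytes) and report 4 M-Cycles, matching the breakdown given for each instruction.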

Source: Pan Docs - Instruction Set (LD (nn), A and LD A, (nn))

Implementation

Two new methods were added to the CPU class, following the pattern established by the indirect load operations such as LD (HL), A and LD (BC), A. The key difference is that here the address is read from the code using fetch_word() instead of being taken from a register.

Components created/modified

  • src/cpu/core.py: Added the methods _op_ld_nn_ptr_a() (0xEA) and _op_ld_a_nn_ptr() (0xFA), and registered them in the dispatch table.
  • tests/test_cpu_load_direct.py: New file with 4 unit tests that validate writing, reading, roundtrip and multiple addresses.

Design decisions

The same pattern was followed as other indirect memory operations:

  • Use of fetch_word() to read the 16-bit address (it already handles Little-Endian order and advancing the PC).
  • Logging with logger.debug() to trace operations during debugging.
  • Explicit return of 4 M-Cycles, as per the Pan Docs specification.
  • Implicit masking: fetch_word() already handles 16-bit wrap-around.

The methods are simple and direct, without premature optimizations, prioritizing clarity and correctness.
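
As a rough illustration of these decisions, the handlers might look like the sketch below. The names fetch_word(), _op_ld_nn_ptr_a(), _op_ld_a_nn_ptr(), _opcode_table and logger.debug() come from this entry; everything else (the SketchMMU and SketchCPU classes, fetch_byte(), the plain pc and a attributes, the log message format) is an assumption made so the example runs on its own, and the real code in src/cpu/core.py may differ.

```python
import logging

logger = logging.getLogger(__name__)


class SketchMMU:
    """Hypothetical flat memory; the real MMU maps ROM, RAM and I/O."""

    def __init__(self):
        self._mem = bytearray(0x10000)

    def read_byte(self, addr):
        return self._mem[addr & 0xFFFF]

    def write_byte(self, addr, value):
        self._mem[addr & 0xFFFF] = value & 0xFF


class SketchCPU:
    """Hypothetical CPU reduced to PC, A and the two new handlers."""

    def __init__(self, mmu):
        self.mmu = mmu
        self.pc = 0x0100
        self.a = 0x00
        # Dispatch table: opcode -> bound handler returning M-Cycles.
        self._opcode_table = {
            0xEA: self._op_ld_nn_ptr_a,
            0xFA: self._op_ld_a_nn_ptr,
        }

    def fetch_byte(self):
        value = self.mmu.read_byte(self.pc)
        self.pc = (self.pc + 1) & 0xFFFF  # 16-bit wrap-around
        return value

    def fetch_word(self):
        low = self.fetch_byte()   # low byte comes first (Little-Endian)
        high = self.fetch_byte()
        return (high << 8) | low

    def _op_ld_nn_ptr_a(self):
        addr = self.fetch_word()
        self.mmu.write_byte(addr, self.a)
        logger.debug("LD (0x%04X), A -> wrote 0x%02X", addr, self.a)
        return 4

    def _op_ld_a_nn_ptr(self):
        addr = self.fetch_word()
        self.a = self.mmu.read_byte(addr)
        logger.debug("LD A, (0x%04X) -> read 0x%02X", addr, self.a)
        return 4

    def step(self):
        opcode = self.fetch_byte()
        return self._opcode_table[opcode]()


# Usage: write A to 0xC000, clear A, then read it back.
mmu = SketchMMU()
cpu = SketchCPU(mmu)
cpu.a = 0x55
for offset, byte in enumerate([0xEA, 0x00, 0xC0, 0xFA, 0x00, 0xC0]):
    mmu.write_byte(0x0100 + offset, byte)
write_cycles = cpu.step()
cpu.a = 0x00
read_cycles = cpu.step()
```

Keeping the address decoding inside fetch_word() means each handler is only three lines long, and any PC wrap-around is handled in one place.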

Affected Files

  • src/cpu/core.py - Added the methods _op_ld_nn_ptr_a() and _op_ld_a_nn_ptr(), and registered them in _opcode_table
  • tests/test_cpu_load_direct.py - New file with 4 unit tests

Tests and Verification

Unit tests were run to validate the behavior of both instructions:

Test Execution

Command executed:

python3 -m pytest tests/test_cpu_load_direct.py -v

Environment:

  • OS: macOS (darwin 21.6.0)
  • Python: 3.9.6
  • pytest: 8.4.2

Result:

============================== test session starts ==============================
platform darwin -- Python 3.9.6, pytest-8.4.2, pluggy-1.6.0
collected 4 items

tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_write PASSED [ 25%]
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_read PASSED [ 50%]
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_write_read_roundtrip PASSED [ 75%]
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_different_addresses PASSED [100%]

============================== 4 passed in 0.39s ==============================

What is validated:

  • test_ld_direct_write: Verifies that LD (nn), A correctly writes the value of A to the specified address, consumes 4 M-Cycles, and advances PC correctly (3 bytes).
  • test_ld_direct_read: Verifies that LD A, (nn) correctly reads from the specified address, loads the value into A, consumes 4 M-Cycles, and advances PC correctly.
  • test_ld_direct_write_read_roundtrip: Validates that the same value can be written and read back, demonstrating that both instructions work correctly together.
  • test_ld_direct_different_addresses: Verifies that the instructions work with different memory addresses, ensuring that there are no side effects between addresses.

Test code (essential fragment):

def test_ld_direct_write(self):
    """Verify direct writing to memory (LD (nn), A - 0xEA)."""
    mmu = MMU()
    cpu = CPU(mmu)
    cpu.registers.set_pc(0x0100)
    cpu.registers.set_a(0x55)
    
    # Write opcode + address 0xC000 (Little-Endian: 0x00 0xC0)
    mmu.write_byte(0x0100, 0xEA) # Opcode
    mmu.write_byte(0x0101, 0x00) # Low byte
    mmu.write_byte(0x0102, 0xC0) # High byte
    
    cycles = cpu.step()
    
    assert mmu.read_byte(0xC000) == 0x55
    assert cycles == 4
    assert cpu.registers.get_pc() == 0x0103

Why these tests demonstrate hardware behavior:

  • The tests verify that the address is read correctly in Little-Endian format (the bytes 0x00, 0xC0 form 0xC000).
  • They validate that the memory access is correct (read/write at the specified address).
  • They confirm the correct timing (4 M-Cycles according to Pan Docs).
  • They show that the PC advances correctly (3 bytes: opcode + 2 address bytes).

Impact on ROM execution

Before this implementation, the emulator would crash when it encountered the 0xEA opcode while executing Tetris DX. This prevented the game from progressing far enough to initialize the PPU and draw the title screen. With these opcodes implemented, the emulator can execute more instructions and potentially render graphics.

Sources consulted

Note: The implementation exactly follows the Pan Docs specification for timing (4 M-Cycles) and address format (Little-Endian).

Educational Integrity

What I Understand Now

  • Direct vs indirect addressing: Direct addressing specifies the address in the code, while indirect addressing obtains it from a register. Each has its own use cases: direct for global variables and hardware registers, indirect for loops and dynamic pointers.
  • Little-Endian in addresses: 16-bit addresses are stored in memory with the low byte first (0x00) and the high byte second (0xC0), forming 0xC000. This is consistent across the entire LR35902 architecture.
  • Instruction timing: Instructions that read addresses from the code consume more cycles because they must make several memory accesses: the opcode fetch, the fetch of the 2 address bytes, and then the final access (read or write).
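
The timing point can be made concrete by counting bus accesses. The sketch below assumes one byte access per M-Cycle, which holds for the two loads compared here but is not a general rule for every instruction; CountingBus, run_ld_nn_a and run_ld_hl_a are hypothetical names used only for this illustration.

```python
class CountingBus:
    """Hypothetical bus that counts every byte access."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.accesses = 0

    def read(self, addr):
        self.accesses += 1
        return self.mem[addr & 0xFFFF]

    def write(self, addr, value):
        self.accesses += 1
        self.mem[addr & 0xFFFF] = value & 0xFF


def run_ld_nn_a(bus, pc, a):
    """Direct store: the address is fetched from the code."""
    opcode = bus.read(pc)            # M-Cycle 1: opcode fetch
    assert opcode == 0xEA
    low = bus.read(pc + 1)           # M-Cycle 2: low address byte
    high = bus.read(pc + 2)          # M-Cycle 3: high address byte
    bus.write((high << 8) | low, a)  # M-Cycle 4: the actual write
    return bus.accesses


def run_ld_hl_a(bus, pc, hl, a):
    """Indirect store: the address is already in HL."""
    opcode = bus.read(pc)            # M-Cycle 1: opcode fetch
    assert opcode == 0x77
    bus.write(hl, a)                 # M-Cycle 2: the actual write
    return bus.accesses


direct_bus = CountingBus()
direct_bus.mem[0x0100:0x0103] = bytes([0xEA, 0x00, 0xC0])
direct_m_cycles = run_ld_nn_a(direct_bus, 0x0100, 0x55)

indirect_bus = CountingBus()
indirect_bus.mem[0x0100] = 0x77  # LD (HL), A
indirect_m_cycles = run_ld_hl_a(indirect_bus, 0x0100, 0xC000, 0x55)

# Each M-Cycle lasts 4 clock cycles (T-Cycles).
direct_t_cycles = direct_m_cycles * 4
```

The two extra accesses in the direct form are exactly the two address bytes that follow the opcode.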

What remains to be confirmed

  • Behavior with I/O addresses: Although the tests validate RAM addresses, the behavior still has to be verified when accessing memory-mapped I/O addresses (0xFF00-0xFFFF). This will be validated when testing with real ROMs.
  • Impact on rendering: Although these opcodes are expected to unblock the graphics, Tetris DX has to be run again to confirm that the title screen is now visible.

Hypotheses and Assumptions

It is assumed that the behavior of these instructions with I/O addresses is the same as with RAM, delegating to the MMU the correct management of the mapping. This will be validated when running the emulator with real ROMs.
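
A minimal sketch of what this delegation could look like, under stated assumptions: RoutingMMU, its on_io_write callback and the ld_nn_a helper are hypothetical and are not the project's MMU API. The range 0xFF00-0xFF7F is the I/O register block in the Pan Docs memory map, and 0xFF40 is the LCDC register. The point is only that the opcode handler stays identical for RAM and I/O, while the MMU decides what a write means.

```python
class RoutingMMU:
    """Hypothetical MMU that reports writes to the I/O register block."""

    IO_START = 0xFF00
    IO_END = 0xFF7F

    def __init__(self, on_io_write):
        self._mem = bytearray(0x10000)
        self._on_io_write = on_io_write

    def read_byte(self, addr):
        return self._mem[addr & 0xFFFF]

    def write_byte(self, addr, value):
        addr &= 0xFFFF
        value &= 0xFF
        if self.IO_START <= addr <= self.IO_END:
            # A real MMU would forward this to the PPU, timer, etc.
            self._on_io_write(addr, value)
        self._mem[addr] = value


def ld_nn_a(mmu, pc, a):
    """0xEA handler: no special case for I/O, the MMU does the routing."""
    low = mmu.read_byte(pc + 1)
    high = mmu.read_byte(pc + 2)
    mmu.write_byte((high << 8) | low, a)
    return (pc + 3) & 0xFFFF


io_log = []
mmu = RoutingMMU(lambda addr, value: io_log.append((addr, value)))

# LD (0xFF40), A with A = 0x91: a write that lands on an I/O register.
for offset, byte in enumerate([0xEA, 0x40, 0xFF]):
    mmu.write_byte(0x0100 + offset, byte)
pc = ld_nn_a(mmu, 0x0100, 0x91)

# LD (0xC000), A: a plain RAM write that must not reach the I/O callback.
for offset, byte in enumerate([0xEA, 0x00, 0xC0]):
    mmu.write_byte(pc + offset, byte)
pc = ld_nn_a(mmu, pc, 0x55)
```

If the real MMU follows a similar split, the existing unit tests for RAM addresses remain valid and only MMU-level tests are needed for the I/O side.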

Next Steps

  • [ ] Run Tetris DX again to verify that the emulator no longer crashes at 0xEA
  • [ ] Check if graphics can now be seen on the screen (Tetris title screen)
  • [ ] If the graphics appear, proceed with the Joypad implementation (Original Step 27)
  • [ ] If opcodes are still missing, identify them through logs and continue implementing them