This project is educational and open source. No code is copied from other emulators. The implementation is based solely on technical documentation and permitted tests.
Direct Loads to Memory (LD (nn), A and LD A, (nn))
Summary
Implemented the critical opcodes 0xEA (LD (nn), A) and 0xFA (LD A, (nn)), which allow direct memory access using 16-bit absolute addresses encoded in the instruction itself. Games rely on these instructions to save and read global variables, game state and graphics settings. The emulator was crashing at 0xEA when running Tetris DX, which prevented any graphics from being drawn. With this implementation, the emulator can advance beyond that point, which is expected to let it begin rendering the title screen.
Hardware Concept
Direct addressing is a memory access mode where the 16-bit address is written directly in the code, right after the opcode. Unlike indirect addressing (e.g. LD (HL), A, where the address is in the HL register), here the address is part of the instruction itself.
On the LR35902 architecture, these instructions are used for:
- Global variable access: Games keep game state, scores, lives, etc. in fixed memory locations.
- Hardware register access: Many I/O registers (such as PPU registers, timers, etc.) are mapped to specific addresses.
- Buffer initialization: They can initialize memory areas without first loading the address into a register pair.
LD (nn), A (0xEA): Reads the next 2 bytes of the code (Little-Endian address) and writes the value of accumulator A to that address. It consumes 4 M-Cycles (fetch opcode + fetch 2 bytes + write).
LD A, (nn) (0xFA): Reads the next 2 bytes of the code (Little-Endian address), reads the byte at that address, and stores it in A. It consumes 4 M-Cycles (fetch opcode + fetch 2 bytes + read).
Source: Pan Docs - Instruction Set (LD (nn), A and LD A, (nn))
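To make the operand layout concrete, here is a minimal sketch that decodes the two operand bytes of LD (nn), A into a 16-bit address. The helper name decode_nn is hypothetical and not part of the emulator; it only illustrates the low-byte-first order described above.

```python
def decode_nn(low: int, high: int) -> int:
    """Combine two operand bytes (low byte first) into a 16-bit address."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


# Bytes as they would appear in ROM for LD (0xC000), A
instruction = bytes([0xEA, 0x00, 0xC0])
opcode, low, high = instruction  # unpacking a bytes object yields ints

address = decode_nn(low, high)
print(f"opcode=0x{opcode:02X} address=0x{address:04X}")
```

Running it prints opcode=0xEA address=0xC000, matching the byte order used in the unit tests.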
Implementation
Two new methods were implemented in the CPU class, following the pattern established by other indirect load operations such as LD (HL), A and LD (BC), A. The key difference is that here the address is read from the code using fetch_word() instead of being taken from a register.
Components created/modified
- src/cpu/core.py: Added methods _op_ld_nn_ptr_a() (0xEA) and _op_ld_a_nn_ptr() (0xFA), and registered them in the dispatch table.
- tests/test_cpu_load_direct.py: Created new file with 4 unit tests that validate writing, reading, roundtrip and multiple addresses.
Design decisions
The same pattern was followed as other indirect memory operations:
- Use of fetch_word() to read the 16-bit address (it already handles Little-Endian order and advancing the PC).
- Logging with logger.debug() to trace operations during debugging.
- Explicit return of 4 M-Cycles, as per the Pan Docs specification.
- Implicit masking: fetch_word() already handles 16-bit wrap-around.
The methods are simple and direct, without premature optimizations, prioritizing clarity and correctness.
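The design decisions above can be sketched as follows. This is a simplified illustration, not the project's actual code: FlatMemory and MiniCPU are hypothetical stand-ins, and the bodies of fetch_word(), step() and both handlers are assumptions. Only the method names, the opcode table and the 4 M-Cycle return value come from the description above.

```python
import logging

logger = logging.getLogger(__name__)


class FlatMemory:
    """Hypothetical stand-in for the MMU: 64 KiB of plain RAM."""

    def __init__(self) -> None:
        self._data = bytearray(0x10000)

    def read_byte(self, address: int) -> int:
        return self._data[address & 0xFFFF]

    def write_byte(self, address: int, value: int) -> None:
        self._data[address & 0xFFFF] = value & 0xFF


class MiniCPU:
    """Hypothetical, stripped-down CPU that only tracks A and PC."""

    def __init__(self, mmu: FlatMemory) -> None:
        self.mmu = mmu
        self.a = 0
        self.pc = 0
        self._opcode_table = {
            0xEA: self._op_ld_nn_ptr_a,
            0xFA: self._op_ld_a_nn_ptr,
        }

    def fetch_byte(self) -> int:
        value = self.mmu.read_byte(self.pc)
        self.pc = (self.pc + 1) & 0xFFFF  # 16-bit wrap-around
        return value

    def fetch_word(self) -> int:
        low = self.fetch_byte()   # low byte comes first (Little-Endian)
        high = self.fetch_byte()
        return (high << 8) | low

    def _op_ld_nn_ptr_a(self) -> int:
        """LD (nn), A: write A to the address that follows the opcode."""
        address = self.fetch_word()
        self.mmu.write_byte(address, self.a)
        logger.debug("LD (0x%04X), A <- 0x%02X", address, self.a)
        return 4  # M-Cycles

    def _op_ld_a_nn_ptr(self) -> int:
        """LD A, (nn): load A from the address that follows the opcode."""
        address = self.fetch_word()
        self.a = self.mmu.read_byte(address)
        logger.debug("LD A, (0x%04X) -> 0x%02X", address, self.a)
        return 4  # M-Cycles

    def step(self) -> int:
        opcode = self.fetch_byte()
        handler = self._opcode_table.get(opcode)
        if handler is None:
            raise NotImplementedError(f"Opcode 0x{opcode:02X} not implemented")
        return handler()
```

Both handlers differ only in the direction of the final memory access, which is why they can share fetch_word() for the address and return the same cycle count.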
Affected Files
- src/cpu/core.py - Added methods _op_ld_nn_ptr_a() and _op_ld_a_nn_ptr(), and registered them in _opcode_table
- tests/test_cpu_load_direct.py - Created new file with 4 unit tests
Tests and Verification
Four unit tests were run to validate the behavior of both instructions:
Test Execution
Command executed:
python3 -m pytest tests/test_cpu_load_direct.py -v
Environment:
- OS: macOS (darwin 21.6.0)
- Python: 3.9.6
- pytest: 8.4.2
Result:
============================== test session starts ==============================
platform darwin -- Python 3.9.6, pytest-8.4.2, pluggy-1.6.0
collected 4 items
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_write PASSED [ 25%]
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_read PASSED [ 50%]
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_write_read_roundtrip PASSED [ 75%]
tests/test_cpu_load_direct.py::TestLoadDirect::test_ld_direct_different_addresses PASSED [100%]
============================== 4 passed in 0.39s ==============================
What is validated:
- test_ld_direct_write: Verifies that LD (nn), A writes the value of A to the specified address, consumes 4 M-Cycles, and advances the PC by 3 bytes.
- test_ld_direct_read: Verifies that LD A, (nn) reads from the specified address, loads the value into A, consumes 4 M-Cycles, and advances the PC by 3 bytes.
- test_ld_direct_write_read_roundtrip: Validates that the same value can be written and read back, showing that both instructions work correctly together.
- test_ld_direct_different_addresses: Verifies that the instructions work with different memory addresses and that writing to one address has no side effects on the others.
Test code (essential fragment):
def test_ld_direct_write(self):
    """Verify direct writing to memory (LD (nn), A - 0xEA)."""
    mmu = MMU()
    cpu = CPU(mmu)
    cpu.registers.set_pc(0x0100)
    cpu.registers.set_a(0x55)
    # Write opcode + address 0xC000 (Little-Endian: 0x00 0xC0)
    mmu.write_byte(0x0100, 0xEA)  # Opcode
    mmu.write_byte(0x0101, 0x00)  # Low byte
    mmu.write_byte(0x0102, 0xC0)  # High byte
    cycles = cpu.step()
    assert mmu.read_byte(0xC000) == 0x55
    assert cycles == 4
    assert cpu.registers.get_pc() == 0x0103
Why these tests demonstrate hardware behavior:
- The tests verify that the address is read correctly in Little-Endian format (0x00 0xC0 = 0xC000).
- They validate that the memory access is correct (read/write at the specified address).
- They confirm the correct timing (4 M-Cycles according to Pan Docs).
- They show that the PC is advancing correctly (3 bytes: opcode + 2 bytes of address).
Impact on ROM execution
Before this implementation, the emulator would crash when it encountered the 0xEA opcode while running Tetris DX. This prevented the game from progressing far enough to initialize the PPU and draw the title screen. With these opcodes implemented, the emulator can execute more instructions and potentially render graphics.
Sources consulted
- Pan Docs - Instruction Set: LD (nn), A (0xEA) and LD A, (nn) (0xFA)
Note: The implementation exactly follows the Pan Docs specification for timing (4 M-Cycles) and address format (Little-Endian).
Educational Integrity
What I Understand Now
- Direct vs indirect addressing: Direct addressing specifies the address in the code, while indirect addressing obtains it from a register. Each has its own use cases: direct for global variables and hardware registers, indirect for loops and dynamic pointers.
- Little-Endian in addresses: 16-bit addresses are stored in memory with the low byte first and the high byte second, so the bytes 0x00 0xC0 form 0xC000. This is consistent with the rest of the LR35902 architecture.
- Instruction timing: Instructions that read an address from the code consume more cycles because they make multiple memory accesses: the opcode fetch, the fetch of the 2 address bytes, and then the final access (read or write).
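The timing point can be illustrated with a small sketch that counts one M-Cycle per memory access. The ACCESSES table and m_cycles helper are hypothetical names used only for this illustration. The one-access-per-cycle model holds for these simple loads but is not a general timing model, since some instructions also spend internal cycles without touching memory.

```python
T_CYCLES_PER_M_CYCLE = 4  # each machine cycle spans four clock ticks

# One entry per memory access; each access costs one M-Cycle.
ACCESSES = {
    "LD (HL), A": ["fetch opcode", "write to (HL)"],
    "LD (nn), A": ["fetch opcode", "fetch low byte", "fetch high byte", "write to (nn)"],
    "LD A, (nn)": ["fetch opcode", "fetch low byte", "fetch high byte", "read from (nn)"],
}


def m_cycles(mnemonic: str) -> int:
    """Count the M-Cycles of an instruction from its memory accesses."""
    return len(ACCESSES[mnemonic])


for mnemonic in ACCESSES:
    m = m_cycles(mnemonic)
    print(f"{mnemonic}: {m} M-Cycles = {m * T_CYCLES_PER_M_CYCLE} T-Cycles")
```

The register-indirect form needs only 2 M-Cycles because the address is already in HL, while both direct forms need 4 because the address itself must be fetched from the code.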
What remains to be confirmed
- Behavior with I/O addresses: The tests only cover RAM addresses. The behavior still has to be verified for memory-mapped I/O in the 0xFF00-0xFFFF region (I/O registers at 0xFF00-0xFF7F and the interrupt enable register at 0xFFFF). This will be validated when testing with real ROMs.
- Impact on rendering: Although these opcodes are expected to unblock graphics output, Tetris DX needs to be run again to confirm that the title screen is now visible.
Hypotheses and Assumptions
It is assumed that the behavior of these instructions with I/O addresses is the same as with RAM, delegating to the MMU the correct management of the mapping. This will be validated when running the emulator with real ROMs.
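A minimal sketch of that assumption, under stated assumptions: RoutingMMU, on_write and ld_nn_ptr_a are hypothetical names, and the project's real MMU may route I/O very differently. The point is only that the CPU side issues the same write_byte() call whether the target is RAM or an I/O register, and any special handling lives in the MMU.

```python
from typing import Callable, Dict, List


class RoutingMMU:
    """Hypothetical MMU that stores every byte and notifies I/O write hooks."""

    def __init__(self) -> None:
        self._memory = bytearray(0x10000)
        self._write_hooks: Dict[int, Callable[[int], None]] = {}

    def on_write(self, address: int, hook: Callable[[int], None]) -> None:
        """Register a callback that runs whenever `address` is written."""
        self._write_hooks[address & 0xFFFF] = hook

    def read_byte(self, address: int) -> int:
        return self._memory[address & 0xFFFF]

    def write_byte(self, address: int, value: int) -> None:
        address &= 0xFFFF
        value &= 0xFF
        self._memory[address] = value
        hook = self._write_hooks.get(address)
        if hook is not None:
            hook(value)


def ld_nn_ptr_a(mmu: RoutingMMU, a: int, address: int) -> None:
    """CPU side of LD (nn), A: a single write, whatever the target region."""
    mmu.write_byte(address, a)


LCDC = 0xFF40  # LCD control register
lcdc_writes: List[int] = []

mmu = RoutingMMU()
mmu.on_write(LCDC, lcdc_writes.append)

ld_nn_ptr_a(mmu, 0x55, 0xC000)  # work RAM: no hook fires
ld_nn_ptr_a(mmu, 0x91, LCDC)    # I/O register: the hook records the value
```

With this split, the opcode handlers stay unchanged if the MMU later gains real PPU or timer behavior behind specific addresses.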
Next Steps
- [ ] Run Tetris DX again to verify that the emulator no longer crashes at 0xEA
- [ ] Check if graphics can now be seen on the screen (Tetris title screen)
- [ ] If the graphics appear, proceed with the Joypad implementation (Original Step 27)
- [ ] If opcodes are still missing, identify them through logs and continue implementing them