This project is educational and Open Source. No code is copied from other emulators. Implementation based solely on technical documentation and permitted tests.
LD Indirect Opcodes (0x0A, 0x1A, 0x3A)
Summary
Implemented three indirect load opcodes that were missing from the emulator: LD A, (BC) (0x0A), LD A, (DE) (0x1A) and LD A, (HL-) (0x3A). These opcodes are essential for Tetris DX to run correctly, since the game frequently uses them to read memory through different register pointers. Five tests were created to validate their behavior, and all of them pass.
Hardware Concept
The Game Boy's LR35902 CPU supports several ways of accessing memory using 16-bit registers as pointers. The register pairs BC, DE and HL can all be used as pointers to read and write memory, which gives flexibility in data transfer operations.
LD A, (BC) (0x0A): Reads a byte from the memory address pointed to by BC and loads it into A. It is the twin of LD (BC), A (0x02): while 0x02 writes A to memory, 0x0A reads from memory into A.
LD A, (DE) (0x1A): Same as LD A, (BC), but using DE as the pointer. It is the twin of LD (DE), A (0x12). Useful for reading data with DE as the source pointer in copy operations.
LD A, (HL-) (0x3A): Reads a byte from the address pointed to by HL and loads it into A, then decrements HL. It is the complement of LD (HL-), A. Useful for fast read loops that traverse memory backwards: the automatic decrement allows efficient iteration over arrays or buffers in memory.
All these opcodes consume 2 M-Cycles: one for fetching the opcode and another for reading memory. The pointer registers (BC, DE, HL) are not modified, except in the case of LD A, (HL-), where HL is decremented.
Source: Pan Docs - Instruction Set (LD A, (BC)), (LD A, (DE)), (LDD A, (HL))
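The behavior described above can be modelled in a few lines. The following is a minimal sketch, not the emulator's actual code: the `ToyCPU` class, its attribute names and the flat 64 KiB `bytearray` memory are assumptions made purely for illustration.

```python
class ToyCPU:
    """Hypothetical, simplified CPU model (not the project's CPU class)."""

    def __init__(self) -> None:
        self.mem = bytearray(0x10000)  # flat 64 KiB address space (assumption)
        self.a = 0
        self.bc = 0
        self.de = 0
        self.hl = 0
        self.pc = 0

    def step(self) -> int:
        """Execute one instruction and return the M-Cycles consumed."""
        opcode = self.mem[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        if opcode == 0x0A:    # LD A, (BC)
            self.a = self.mem[self.bc]
        elif opcode == 0x1A:  # LD A, (DE)
            self.a = self.mem[self.de]
        elif opcode == 0x3A:  # LD A, (HL-)
            self.a = self.mem[self.hl]
            self.hl = (self.hl - 1) & 0xFFFF  # post-decrement with 16-bit wrap
        else:
            raise NotImplementedError(f"opcode 0x{opcode:02X}")
        return 2  # one M-Cycle for the opcode fetch, one for the memory read


# Usage: execute LD A, (HL-) located at 0x0100 with HL pointing at 0xC005.
cpu = ToyCPU()
cpu.mem[0x0100] = 0x3A
cpu.mem[0xC005] = 0x7E
cpu.pc, cpu.hl = 0x0100, 0xC005
cycles = cpu.step()
```

After the step, A holds the byte that was at 0xC005, HL has moved down by one, and PC has advanced past the one-byte opcode.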
Implementation
Three methods were implemented in the CPU class to handle these opcodes:
Components created/modified
- src/cpu/core.py: Added methods _op_ld_a_bc_ptr(), _op_ld_a_de_ptr() and _op_ldd_a_hl_ptr()
- src/cpu/core.py: Registered opcodes 0x0A, 0x1A and 0x3A in the dispatch table
- tests/test_cpu_ld_indirect.py: New file with 5 tests to validate the opcodes
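As an illustration of the dispatch-table registration mentioned above, here is a minimal sketch. The handler method names come from this entry, but the `MiniCPU` class, the `_opcode_table` attribute and the `execute()` method are hypothetical; the real table in src/cpu/core.py may be organized differently.

```python
from typing import Callable, Dict


class MiniCPU:
    """Hypothetical stand-in for the CPU class, showing only the dispatch pattern."""

    def __init__(self) -> None:
        # Map each opcode byte to the bound method that handles it.
        self._opcode_table: Dict[int, Callable[[], int]] = {
            0x0A: self._op_ld_a_bc_ptr,
            0x1A: self._op_ld_a_de_ptr,
            0x3A: self._op_ldd_a_hl_ptr,
        }

    def execute(self, opcode: int) -> int:
        """Look up the handler and return the M-Cycles it reports."""
        handler = self._opcode_table.get(opcode)
        if handler is None:
            raise NotImplementedError(f"Unimplemented opcode 0x{opcode:02X}")
        return handler()

    # Handler bodies are omitted here; each stub only returns its cycle count.
    def _op_ld_a_bc_ptr(self) -> int:
        return 2

    def _op_ld_a_de_ptr(self) -> int:
        return 2

    def _op_ldd_a_hl_ptr(self) -> int:
        return 2


cycles = MiniCPU().execute(0x1A)
```

With this pattern, an opcode that has no entry in the table fails loudly, which is the kind of "unimplemented opcode" error that exposes missing instructions while running a game.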
Design decisions
LD A, (BC) and LD A, (DE) are practically identical in implementation; they differ only in the register used as the pointer. They were kept as separate methods for clarity and consistency with the code structure.
For LD A, (HL-), the HL decrement is implemented with 16-bit wrap-around using (hl_addr - 1) & 0xFFFF, ensuring that if HL is 0x0000 it becomes 0xFFFF when decremented (standard Game Boy behavior).
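The mask matters because Python integers are unbounded: without it, 0x0000 - 1 would give -1 rather than 0xFFFF. A small sketch of the arithmetic, where the helper name `dec16` is hypothetical and not part of the project:

```python
def dec16(value: int) -> int:
    """Decrement a 16-bit value, wrapping 0x0000 around to 0xFFFF."""
    return (value - 1) & 0xFFFF


unmasked = 0x0000 - 1          # -1: not a valid 16-bit register value
wrapped = dec16(0x0000)        # 0xFFFF
ordinary = dec16(0xC000)       # 0xBFFF
```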
All methods include DEBUG-level logging to ease debugging, showing the value read, the memory address and the final state of the affected registers.
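A DEBUG line of this kind could look roughly like the sketch below. It uses Python's standard `logging` module; the logger name, the helper function and the exact message format are assumptions, not the project's actual output.

```python
import logging

logger = logging.getLogger("cpu")  # logger name is an assumption


def log_ldd_a_hl(addr: int, value: int, new_hl: int) -> None:
    """Emit one DEBUG line describing an LD A, (HL-) execution."""
    # %-style arguments are only formatted if DEBUG is actually enabled.
    logger.debug(
        "LD A, (HL-): read 0x%02X from 0x%04X -> A=0x%02X, HL=0x%04X",
        value, addr, value, new_hl,
    )
```

Passing the values as arguments instead of pre-formatting the string keeps the cost low when DEBUG logging is disabled, which is relevant for code that runs once per emulated instruction.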
Affected Files
- src/cpu/core.py - Added methods _op_ld_a_bc_ptr(), _op_ld_a_de_ptr() and _op_ldd_a_hl_ptr(); registered opcodes in the dispatch table
- tests/test_cpu_ld_indirect.py - New file with 5 tests: test_ld_a_bc_ptr, test_ld_a_de_ptr, test_ld_a_bc_ptr_wrap_around, test_ld_a_de_ptr_zero, test_ld_a_hl_ptr_decrement
Tests and Verification
Five tests were created in tests/test_cpu_ld_indirect.py to validate the behavior of the opcodes:
- test_ld_a_bc_ptr: Verifies that LD A, (BC) correctly reads from memory, updates A and does not modify BC
- test_ld_a_de_ptr: Verifies that LD A, (DE) correctly reads from memory, updates A and does not modify DE
- test_ld_a_bc_ptr_wrap_around: Verifies that LD A, (BC) works correctly with addresses on the boundary (0xFFFF)
- test_ld_a_de_ptr_zero: Verifies that LD A, (DE) works correctly with address 0x0000
- test_ld_a_hl_ptr_decrement: Verifies that LD A, (HL-) reads correctly, updates A and decrements HL
Command executed: pytest -q tests/test_cpu_ld_indirect.py
Environment: macOS, Python 3.10+
Result: ✅ 5 passed (all tests pass)
What is validated:
- Opcodes correctly read from memory using registers as pointers
- Register A is updated with the read value
- The pointer registers BC and DE are not modified; only HL changes, in LD A, (HL-)
- The timing is correct (2 M-Cycles for all three opcodes)
- The wrap-around works correctly in extreme cases
Test code (example - test_ld_a_de_ptr):
def test_ld_a_de_ptr(self) -> None:
    """Test: Verify LD A, (DE) - opcode 0x1A"""
    mmu = MMU(None)
    cpu = CPU(mmu)
    # Initial state: PC at the opcode, DE pointing at the data byte, A cleared
    cpu.registers.set_pc(0x0100)
    cpu.registers.set_de(0xD000)
    cpu.registers.set_a(0x00)
    mmu.write_byte(0xD000, 0x55)  # byte to be loaded into A
    mmu.write_byte(0x0100, 0x1A)  # LD A, (DE)
    cycles = cpu.step()
    assert cycles == 2                        # fetch + memory read
    assert cpu.registers.get_a() == 0x55      # A holds the byte read
    assert cpu.registers.get_pc() == 0x0101   # one-byte instruction
    assert cpu.registers.get_de() == 0xD000   # DE is left untouched
Validation with Tetris DX: The game no longer stops with unimplemented opcode errors (0x1A and 0x3A), confirming that these opcodes are required for the game to run correctly.
Sources consulted
- Pan Docs: CPU Instruction Set - LD A, (BC), LD A, (DE), LD A, (HL-)
- Pan Docs: CPU Registers and Flags - BC, DE, HL registers as pointers
Educational Integrity
What I Understand Now
- Pointers on the LR35902: The 16-bit registers (BC, DE, HL) can be used as pointers to access memory, providing flexibility in data transfer operations.
- Twin opcodes: Many opcodes have read and write versions (e.g. LD A, (BC) and LD (BC), A), allowing bidirectional transfers.
- Opcodes with post-increment/decrement: LD A, (HL-) is part of a family of opcodes that modify the pointer after the operation, useful for efficient loops.
- Consistent timing: Indirect load opcodes consume 2 M-Cycles (fetch + read), regardless of the register used as the pointer.
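To make the "twin opcodes" and "efficient loops" points concrete, the sketch below models a backwards copy loop built from LD A, (HL-), LD (DE), A and DEC DE. It is an illustrative Python model only: the function name and the `bytearray` memory are assumptions, and it is not taken from Tetris DX or from the emulator.

```python
def copy_backwards(mem: bytearray, src_hl: int, dst_de: int, count: int) -> int:
    """Copy `count` bytes while walking both pointers downwards.

    Returns the final HL value so the caller can see the post-decrements.
    """
    hl, de = src_hl, dst_de
    for _ in range(count):
        a = mem[hl]              # LD A, (HL-): read the byte ...
        hl = (hl - 1) & 0xFFFF   # ... then post-decrement HL
        mem[de] = a              # LD (DE), A: the write twin of LD A, (DE)
        de = (de - 1) & 0xFFFF   # DEC DE
    return hl


mem = bytearray(0x10000)
mem[0xC000:0xC004] = b"\x01\x02\x03\x04"
final_hl = copy_backwards(mem, src_hl=0xC003, dst_de=0xD003, count=4)
```

The read side needs no separate pointer update because the decrement is folded into the load, whereas DE needs an explicit DEC DE.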
What remains to be confirmed
- LD A, (BC+) and LD A, (DE+): These do not exist in the LR35902 instruction set; only HL has auto-increment/decrement versions (e.g. LD A, (HL+)). This is consistent with the architecture, where HL is the most versatile register.
- Behavior with invalid addresses: On real hardware, all 16-bit addresses are valid (0x0000-0xFFFF), but some regions have special behaviors (e.g. video memory, I/O). Our emulator handles this through the MMU.
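The asymmetry between HL and the other pointer registers is easy to see when the indirect accumulator loads and stores are laid out as a table. The encodings below are the standard LR35902 ones; the dictionary names and tuple layout are only an illustrative convention, and 0x2A, 0x22 and 0x32 are listed for context rather than being part of this entry's changes.

```python
# opcode -> (mnemonic, pointer register, pointer adjustment after the access)
LD_A_FROM_PTR = {
    0x0A: ("LD A, (BC)", "BC", 0),
    0x1A: ("LD A, (DE)", "DE", 0),
    0x2A: ("LD A, (HL+)", "HL", +1),
    0x3A: ("LD A, (HL-)", "HL", -1),
}

LD_PTR_FROM_A = {
    0x02: ("LD (BC), A", "BC", 0),
    0x12: ("LD (DE), A", "DE", 0),
    0x22: ("LD (HL+), A", "HL", +1),
    0x32: ("LD (HL-), A", "HL", -1),
}

# Registers that ever get adjusted automatically by these opcodes.
auto_adjusted = {
    reg
    for table in (LD_A_FROM_PTR, LD_PTR_FROM_A)
    for _, reg, delta in table.values()
    if delta != 0
}
```

Each read opcode sits exactly 0x08 above its write twin, and `auto_adjusted` contains only HL.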
Hypotheses and Assumptions
No critical assumptions. The implementation directly follows the Pan Docs specification, and the tests validate the expected behavior. The 16-bit wrap-around for the HL decrement in LD A, (HL-) is standard for 16-bit registers and is documented.
Next Steps
- [ ] Investigate why BGP=0x00 (completely white palette) in Tetris DX
- [ ] Check if tiles are loaded into VRAM after initialization
- [ ] Implement Window (WX/WY) to support overlapping windows
- [ ] Implement Sprites (OAM) to render moving objects
- [ ] Continue to implement missing opcodes as found during game execution