⚠️ Clean-Room / Educational

This project is educational and open source. No code is copied from other emulators. The implementation is based solely on technical documentation and permitted tests.

8-bit transfers (LD r, r') and HALT

Date: 2025-12-17 StepID: 0015 State: Verified

Summary

The complete 8-bit transfer block (LD r, r') in the range 0x40-0x7F has been implemented, covering 63 new opcodes that move data between registers and memory. The HALT instruction (0x76), which puts the CPU into low-power mode, was also implemented. This block is critical because real game code constantly transfers data between registers during normal operation, so the emulator cannot run it without these opcodes.

Hardware Concept

Transfer Block 0x40-0x7F

On the LR35902 (and Z80) architecture, the core block of opcodes 0x40-0x7F is dedicated almost exclusively to data transfers between registers. It is an 8x8 matrix where each opcode encodes a source and a destination using 3 bits for each.

Opcode structure:

  • Bits 0-2: Source register code (000=B, 001=C, 010=D, 011=E, 100=H, 101=L, 110=(HL), 111=A)
  • Bits 3-5: Destination register code (same mapping)

This structure allows 64 possible combinations (8x8), but opcode 0x76 is special: instead of being LD (HL), (HL) (which would make no sense), it is the HALT instruction. That leaves 63 LD opcodes.
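
As an illustration of the encoding above, the following Python sketch decodes any opcode of the block into its mnemonic. The names REGISTER_NAMES and decode_ld_block are hypothetical, chosen for this example, and are not taken from the project's code.

```python
# Register codes as listed above; index 6 is the (HL) memory operand.
REGISTER_NAMES = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]


def decode_ld_block(opcode: int) -> str:
    """Return the mnemonic for an opcode in the 0x40-0x7F block."""
    if not 0x40 <= opcode <= 0x7F:
        raise ValueError(f"opcode 0x{opcode:02X} is outside 0x40-0x7F")
    if opcode == 0x76:
        # The slot that would be LD (HL), (HL) is HALT instead.
        return "HALT"
    dest = (opcode >> 3) & 0x07  # bits 3-5
    src = opcode & 0x07          # bits 0-2
    return f"LD {REGISTER_NAMES[dest]}, {REGISTER_NAMES[src]}"
```

For example, decode_ld_block(0x7A) yields "LD A, D", which matches the opcode used in the test suite described later.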

HALT (0x76) - Low Power Mode

HALT puts the CPU into a low-power state in which it stops executing instructions. The Program Counter (PC) does not advance and the CPU simply waits. The CPU wakes up automatically when an interrupt occurs (if IME is enabled), or it can be woken up manually.

While halted, the CPU consumes 1 cycle per tick (a busy wait) but executes no instructions. This is useful for games that wait for events such as V-Blank (the screen refresh interrupt) to synchronize game logic.

Transfer Timing

Transfers have different execution times depending on whether they involve memory:

  • LD r, r': 1 M-Cycle (register-to-register transfer, no memory access)
  • LD r, (HL) or LD (HL), r: 2 M-Cycles (indirect memory access)

This difference reflects the real cost of the hardware: accessing memory is slower than accessing internal CPU registers.
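
The timing rule can be expressed directly in terms of the register codes. This is a minimal sketch, assuming the cycle counts listed above; the helper name ld_m_cycles and the constant HL_INDIRECT are hypothetical.

```python
HL_INDIRECT = 6  # register code 110 selects the (HL) memory operand


def ld_m_cycles(opcode: int) -> int:
    """Return the M-cycle cost of an LD opcode in the 0x40-0x7F block."""
    if not 0x40 <= opcode <= 0x7F or opcode == 0x76:
        raise ValueError(f"opcode 0x{opcode:02X} is not an LD r, r' opcode")
    dest = (opcode >> 3) & 0x07
    src = opcode & 0x07
    # Any transfer that touches (HL) needs one extra memory access.
    return 2 if HL_INDIRECT in (dest, src) else 1
```

Under this rule, 14 of the 63 opcodes cost 2 M-Cycles (7 reads from (HL) and 7 writes to (HL)) and the remaining 49 cost 1 M-Cycle.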

Implementation

The entire transfer block was implemented using lazy initialization: the handlers are created dynamically the first time they are accessed. This lets the handlers use helper methods that are defined after the constructor.

Components created/modified

  • CPU.__init__(): Added the halted flag and a call to _init_ld_handlers()
  • CPU.step(): Modified to handle the HALT state (check interrupts, consume cycles)
  • CPU._get_register_value(): Helper to get a register value by its code (0-7)
  • CPU._set_register_value(): Helper to set a register value by its code (0-7)
  • CPU._op_ld_r_r(): Generic handler for all LD r, r' transfers
  • CPU._op_halt(): Handler for HALT (0x76)
  • CPU._init_ld_handler_lazy(): Lazy initialization of transfer handlers
  • CPU._execute_opcode(): Changed to initialize lazy handlers when accessed

Design decisions

Lazy initialization: Transfer handlers are created lazily (when first accessed) because the helpers (_get_register_value, _set_register_value, _op_ld_r_r) are defined after the constructor. This avoids definition-order problems and keeps the code organized.
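
The lazy scheme can be sketched as a dispatch table that builds each handler on first lookup and caches it afterwards. This is only an illustration of the idea, not the project's code: the class LazyLdTable and its get method are hypothetical, and the handler here merely returns the decoded (destination, source) pair.

```python
from typing import Callable, Dict, Tuple


class LazyLdTable:
    """Hypothetical dispatch table that builds LD handlers on first use."""

    def __init__(self) -> None:
        # Starts empty: nothing is built until an opcode is requested.
        self._handlers: Dict[int, Callable[[], Tuple[int, int]]] = {}

    def get(self, opcode: int) -> Callable[[], Tuple[int, int]]:
        handler = self._handlers.get(opcode)
        if handler is None:
            dest, src = (opcode >> 3) & 0x07, opcode & 0x07
            # Default arguments bind this opcode's codes to the closure.
            handler = lambda d=dest, s=src: (d, s)
            self._handlers[opcode] = handler
        return handler
```

A second lookup of the same opcode returns the cached handler, so the construction cost is paid once per opcode actually executed.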

Generic Helper: _op_ld_r_r() was created as a generic handler that accepts register codes (0-7), instead of writing 63 individual functions. This reduces code duplication and eases maintenance.
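
A minimal sketch of how such a generic handler might look is shown below. MiniCPU is a hypothetical stand-in reduced to what the transfers need; the method names follow the list of components above, but their bodies, the register dictionary, and the flat memory array are assumptions made for this example.

```python
# Register names indexed by code; code 6 is (HL) and has no register name.
NAMES = ("B", "C", "D", "E", "H", "L", None, "A")


class MiniCPU:
    """Hypothetical stand-in for the CPU class, reduced to 8-bit transfers."""

    def __init__(self) -> None:
        self.regs = {name: 0 for name in "BCDEHLA"}
        self.memory = bytearray(0x10000)  # assumption: flat 64 KB memory

    def _hl(self) -> int:
        return (self.regs["H"] << 8) | self.regs["L"]

    def _get_register_value(self, code: int) -> int:
        if code == 6:
            return self.memory[self._hl()]  # read through (HL)
        return self.regs[NAMES[code]]

    def _set_register_value(self, code: int, value: int) -> None:
        value &= 0xFF
        if code == 6:
            self.memory[self._hl()] = value  # write through (HL)
        else:
            self.regs[NAMES[code]] = value

    def _op_ld_r_r(self, dest: int, src: int) -> int:
        """Copy src into dest and return the M-cycles consumed."""
        self._set_register_value(dest, self._get_register_value(src))
        return 2 if 6 in (dest, src) else 1
```

With this shape, a single method serves every opcode of the block; only the two register codes change per opcode.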

HALT Management: The HALT state is checked at the start of step(). If the CPU is halted and interrupts are pending (currently approximated as IME being on), it wakes up automatically. If not, it consumes 1 cycle and returns without executing any instruction.
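
The control flow described here can be sketched as follows. HaltDemoCPU is a hypothetical class that knows only HALT (0x76) and NOP (0x00), and its wake-up test deliberately mirrors the simplification used in this step (IME on is treated as "an interrupt is pending"), not real hardware behavior.

```python
class HaltDemoCPU:
    """Hypothetical CPU that only understands HALT (0x76) and NOP (0x00)."""

    def __init__(self) -> None:
        self.pc = 0x0100
        self.ime = False
        self.halted = False
        self.memory = bytearray(0x10000)  # assumption: flat 64 KB memory

    def step(self) -> int:
        """Run one step and return the M-cycles consumed."""
        if self.halted:
            if self.ime:
                # Simplification from this step: IME on counts as a wake-up.
                self.halted = False
            else:
                return 1  # idle tick: PC stays put, nothing is executed
        opcode = self.memory[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        if opcode == 0x76:
            self.halted = True
            return 1
        if opcode == 0x00:
            return 1
        raise NotImplementedError(f"opcode 0x{opcode:02X}")
```

Checking the halted flag before the fetch is what keeps PC frozen: the early return skips both the fetch and the increment.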

Affected Files

  • src/cpu/core.py - Added the halted flag, modified step(), implemented helpers and handlers for transfers and HALT
  • tests/test_cpu_load8.py - Complete TDD test suite (8 tests) validating transfers and HALT

Tests and Verification

A complete TDD test suite with 8 tests was created to validate the functionality:

  • test_ld_r_r: Verifies a transfer between registers (LD A, D - 0x7A) with correct timing (1 M-Cycle)
  • test_ld_r_hl: Verifies a read from indirect memory (LD B, (HL) - 0x46) with correct timing (2 M-Cycles)
  • test_ld_hl_r: Verifies a write to indirect memory (LD (HL), C - 0x71) with correct timing (2 M-Cycles)
  • test_ld_all_registers: Checks multiple combinations of transfers between basic registers
  • test_halt_sets_flag: Verifies that HALT sets the halted flag correctly
  • test_halt_pc_does_not_advance: Verifies that during HALT the PC does not advance and 1 cycle is consumed per tick
  • test_halt_wake_on_interrupt: Verifies that the CPU wakes from HALT when IME is enabled
  • test_ld_hl_hl_is_halt: Verifies that 0x76 is HALT, not LD (HL), (HL)

Result: All tests pass ✅ (8/8)

Test with Real ROM (Tetris DX)

The emulator was run with Tetris DX to verify that the transfers allow the game to advance beyond initialization:

  • ROM: tetris_dx.gbc (524,288 bytes, 512 KB ROM, 8 KB RAM)
  • Cycles executed: 30 M-Cycles
  • Final PC: 0x1389
  • Result: The game successfully advanced from initialization (PC=0x0100) to PC=0x1389, running actual game code.
  • Error found: Opcode 0xB3 (OR E / OR A, E) not implemented

Error analysis: Opcode 0xB3 is a logical OR between the accumulator A and register E. It belongs to the range 0xB0-0xB7, which contains the OR operations with the different registers:

  • 0xB0: OR B
  • 0xB1: OR C
  • 0xB2: OR D
  • 0xB3: OR E ← error here
  • 0xB4: OR H
  • 0xB5: OR L
  • 0xB6: OR (HL)
  • 0xB7: OR A
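
For reference, the operation that the missing opcode performs can be sketched in a few lines. This is not part of the current implementation: the function or_a and the constant FLAG_Z are hypothetical names, and the sketch covers only the arithmetic and the resulting flag byte (Z set when the result is zero; N, H, and C cleared).

```python
from typing import Tuple

FLAG_Z = 0x80  # zero flag, bit 7 of the F register


def or_a(a: int, operand: int) -> Tuple[int, int]:
    """Return (new A, new F) for OR A, operand."""
    result = (a | operand) & 0xFF
    # OR clears N, H and C; only Z depends on the result.
    flags = FLAG_Z if result == 0 else 0x00
    return result, flags
```

The operand would be selected from the low 3 bits of the opcode, using the same register codes as the transfer block.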

Conclusion: The transfer implementation works correctly and lets the game run real code. The next logical step is to implement the logical operations (OR, AND, XOR) in the range 0xA0-0xBF to keep moving forward.

Sources consulted

  • Pan Docs: CPU Instruction Set - LD r, r' encoding (block 0x40-0x7F)
  • Pan Docs: CPU Instruction Set - HALT instruction (0x76)
  • Pan Docs: CPU Timing - M-Cycles for transfers with/without memory

Implementation based on Pan Docs technical documentation on the LR35902 architecture.

Educational Integrity

What I Understand Now

  • Transfer Block: The range 0x40-0x7F is an 8x8 matrix that encodes all possible combinations of transfers between registers. The structure is elegant and allows 63 opcodes to be covered with one generic implementation.
  • HALT as Exception: Opcode 0x76 is special because it breaks the matrix pattern. Instead of being LD (HL), (HL) (which would make no sense), it is HALT. This is a quirk of the hardware design.
  • Differential Timing: Transfers involving memory (using (HL)) consume 2 M-Cycles, while those involving only registers consume 1 M-Cycle. This reflects the actual cost on the hardware.
  • HALT status: HALT is a special CPU state that lets a program wait for events (such as interrupts) without consuming unnecessary resources. It is essential for synchronization in games.

What remains to be confirmed

  • HALT Awakening: The current implementation simplifies waking from HALT by assuming that if IME is enabled, there are interrupts pending. When implementing full interrupt handling, the IF (Interrupt Flag) and IE (Interrupt Enable) registers should be checked to determine whether interrupts are actually pending.
  • HALT behavior with IME disabled: On real hardware, when IME is disabled and HALT is executed, the CPU may exhibit special behaviors. This needs verification against more detailed documentation or tests on real hardware.

Hypotheses and Assumptions

The HALT wakeup implementation assumes that if IME is on, there are pending interrupts. This simplification will work for most cases, but once full interrupt handling is implemented, the IF and IE registers must be checked explicitly.
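
The explicit check could look roughly like the sketch below. It assumes the register addresses documented in Pan Docs (IF at 0xFF0F, IE at 0xFFFF, with the five interrupt sources in the low bits) and a flat, indexable memory; the function name has_pending_interrupt is hypothetical. How the result should combine with IME is deliberately left out, since that behavior is still listed as unconfirmed above.

```python
IF_ADDR = 0xFF0F        # Interrupt Flag register
IE_ADDR = 0xFFFF        # Interrupt Enable register
INTERRUPT_MASK = 0x1F   # only bits 0-4 correspond to interrupt sources


def has_pending_interrupt(memory) -> bool:
    """Return True if any interrupt is both requested (IF) and enabled (IE)."""
    return (memory[IE_ADDR] & memory[IF_ADDR] & INTERRUPT_MASK) != 0
```

For example, with V-Blank (bit 0) set in both registers the function returns True, while a request for a source that is not enabled returns False.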

Next Steps

  • [ ] Implement more opcodes from the instruction set (logical operations, rotations, etc.)
  • [ ] Implement full interrupt handling (IF, IE, interrupt vectors)
  • [ ] Improve HALT wakeup to explicitly check for pending interrupts
  • [x] Test the emulator with Tetris DX to verify that it can progress beyond initialization ✅
  • [ ] Implement logical operations (OR, AND, XOR) in the range 0xA0-0xBF