⚠️ Clean-Room / Educational

This project is educational and Open Source. No code is copied from other emulators. Implementation based solely on technical documentation and permitted tests.

16-bit Loads (BC, DE) and Comparisons (CP)

Date: 2025-12-16 | Step ID: 0013 | State: Verified

Summary

This step implements 16-bit immediate loads into the BC and DE register pairs, indirect stores that use BC and DE as pointers, and the comparison instruction CP (Compare). A new _cp() helper performs a "phantom subtraction": it updates the flags without modifying A. Six opcodes were implemented: LD BC, d16 (0x01), LD DE, d16 (0x11), LD (BC), A (0x02), LD (DE), A (0x12), CP d8 (0xFE) and CP (HL) (0xBE). These instructions are essential for the emulator to progress beyond initialization, since they allow constants to be loaded into register pairs and conditional decisions to be made. A complete TDD test suite (9 tests) validates all of this functionality. An MMU bug was also fixed: the ROM area (0x0000-0x7FFF) returned 0xFF when no cartridge was loaded, which prevented tests from writing to memory and reading it back.

Hardware Concept

BC and DE Register Pairs: The LR35902 CPU has four 16-bit register pairs: AF, BC, DE, and HL. We already had HL and SP implemented. BC and DE are equally important:

  • BC: It is frequently used as a counter or secondary pointer in loops.
  • DE: It is frequently used as a destination pointer in data copy (memcpy) operations.

Indirect Storage with BC and DE: As with HL, we can use BC and DE as memory pointers. The instructions LD (BC), A and LD (DE), A write the value of A to the address pointed to by BC or DE respectively. They are very common in memory-clearing and data copy loops.

The CP (Compare) Instruction - A "Phantom" Subtraction: CP is fundamentally a SUBTRACTION (SUB), but with a critical difference: it discards the numerical result and keeps only the flags. The A register is NOT modified.

CP is used for comparisons in code: "A == value?", "A < value?", etc.:

  • If A == value, the subtraction A - value is 0, so the Z (Zero) flag is set.
  • If A < value, the subtraction needs a borrow, so the C (Carry/Borrow) flag is set.
  • If A > value, the subtraction is positive, so Z=0 and C=0.

The flags are updated the same as in SUB:

  • Z (Zero): 1 if A == value (equal)
  • N (Subtract): Always 1 (it is a subtraction)
  • H (Half-Borrow): 1 if there was a borrow from bit 4 (the low nibble of A is smaller than the low nibble of the value)
  • C (Borrow): 1 if the full 8-bit subtraction needed a borrow (A < value)
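The flag rules above can be sketched as a small pure function. This is an illustrative sketch only, not the project's code: the name cp_flags and the FLAG_* constants are hypothetical, and the only hardware fact assumed is the LR35902 layout of the F register (Z, N, H, C in bits 7, 6, 5, 4).

```python
# Bit positions of the flags inside the F register (LR35902 layout).
FLAG_Z = 0x80  # Zero
FLAG_N = 0x40  # Subtract
FLAG_H = 0x20  # Half-borrow
FLAG_C = 0x10  # Borrow


def cp_flags(a: int, value: int) -> int:
    """Return the F byte that CP would produce for A and value.

    Hypothetical helper for illustration; A itself is never changed.
    """
    a &= 0xFF
    value &= 0xFF
    flags = FLAG_N                        # CP is a subtraction: N is always set
    if a == value:
        flags |= FLAG_Z                   # A - value == 0
    if (a & 0x0F) < (value & 0x0F):
        flags |= FLAG_H                   # low nibble needed a borrow
    if a < value:
        flags |= FLAG_C                   # whole byte needed a borrow
    return flags


equal = cp_flags(0x10, 0x10)      # Z and N set
less = cp_flags(0x05, 0x10)       # N and C set
half = cp_flags(0x10, 0x01)       # N and H set
greater = cp_flags(0x20, 0x10)    # only N set
```

Keeping the rule as a pure function of (A, value) makes each of the three comparison outcomes easy to check in isolation.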

Without CP, games can't make conditional decisions like "Have I reached the end of the loop?" or "Has the user pressed START?" It is an absolutely critical instruction for the control logic.

Implementation

Six new opcodes were implemented, following the established pattern of dedicated handlers in the dispatch table.

Helper _cp() for Comparisons

The helper _cp(value) was created. It reuses the logic of _sub(), with the critical difference that it DOES NOT modify the A register. It computes A - value to determine the flags, and A is never written: it is only read as an operand.

The helper updates the flags Z, N, H, C exactly as _sub() does, but leaves A intact. This is essential for CP to function properly as a "non-destructive" comparison.
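One way the sharing between _sub() and _cp() could look is sketched below. The class MiniCPU and the method _sub_core() are hypothetical names introduced for illustration; the real layout of src/cpu/core.py is not shown in this log and may differ.

```python
class MiniCPU:
    """Hypothetical stand-in for the CPU class; only A and F are modelled."""

    FLAG_Z, FLAG_N, FLAG_H, FLAG_C = 0x80, 0x40, 0x20, 0x10

    def __init__(self) -> None:
        self.a = 0  # accumulator
        self.f = 0  # flags register

    def _sub_core(self, value: int) -> int:
        """Set Z, N, H, C for A - value and return the 8-bit result."""
        value &= 0xFF
        result = (self.a - value) & 0xFF
        flags = self.FLAG_N
        if result == 0:
            flags |= self.FLAG_Z
        if (self.a & 0x0F) < (value & 0x0F):
            flags |= self.FLAG_H
        if self.a < value:
            flags |= self.FLAG_C
        self.f = flags
        return result

    def _sub(self, value: int) -> None:
        # SUB stores the result back into A.
        self.a = self._sub_core(value)

    def _cp(self, value: int) -> None:
        # CP computes the same flags but discards the result: A is untouched.
        self._sub_core(value)


cpu = MiniCPU()
cpu.a = 0x3C
cpu._cp(0x40)
a_after_cp, f_after_cp = cpu.a, cpu.f   # A unchanged, N and C set
cpu._sub(0x3C)
a_after_sub, f_after_sub = cpu.a, cpu.f  # A becomes 0, Z and N set
```

With this split, the flag calculation lives in one place, so a fix to SUB's flags applies to CP automatically.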

16-bit loads: LD BC, d16 and LD DE, d16

_op_ld_bc_d16() (0x01) and _op_ld_de_d16() (0x11) were implemented following exactly the same pattern as _op_ld_hl_d16() and _op_ld_sp_d16(). They read 2 bytes in Little-Endian format using fetch_word() and load the value into the corresponding register pair using registers.set_bc() or registers.set_de().

Both instructions consume 3 M-Cycles (fetch opcode + fetch 2 bytes of value).
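The Little-Endian fetch can be sketched as follows. This is a minimal model under stated assumptions: MiniCPU, its flat bytearray memory, and the method bodies are hypothetical, and only the names fetch_word() and set_bc() are taken from the text above.

```python
class MiniCPU:
    """Hypothetical CPU model with a flat 64 KiB memory, for illustration."""

    def __init__(self, memory: bytearray) -> None:
        self.memory = memory
        self.pc = 0
        self.b = 0
        self.c = 0

    def fetch_byte(self) -> int:
        value = self.memory[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return value

    def fetch_word(self) -> int:
        # Little-Endian: the low byte comes first in memory.
        low = self.fetch_byte()
        high = self.fetch_byte()
        return (high << 8) | low

    def set_bc(self, value: int) -> None:
        self.b = (value >> 8) & 0xFF
        self.c = value & 0xFF

    def op_ld_bc_d16(self) -> int:
        self.set_bc(self.fetch_word())
        return 3  # M-Cycles: opcode fetch + 2 operand bytes


memory = bytearray(0x10000)
memory[0:3] = bytes([0x01, 0x34, 0x12])  # LD BC, 0x1234
cpu = MiniCPU(memory)
opcode = cpu.fetch_byte()                # the dispatcher would read 0x01 here
cycles = cpu.op_ld_bc_d16()
```

After this runs, B holds the high byte (0x12) and C the low byte (0x34), which is the byte order the tests for LD BC, d16 and LD DE, d16 check.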

Indirect Storage: LD (BC), A and LD (DE), A

_op_ld_bc_ptr_a() (0x02) and _op_ld_de_ptr_a() (0x12) were implemented following the pattern of _op_ld_hl_ptr_a(). They get the address pointed to by BC or DE, read the value of A, and write it to memory using mmu.write_byte().

Both instructions consume 2 M-Cycles (fetch opcode + write to memory).
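A minimal sketch of the two indirect stores, assuming a flat bytearray in place of the real MMU. MiniCPU, get_bc(), get_de() and this write_byte() are hypothetical stand-ins, not the project's API.

```python
class MiniCPU:
    """Hypothetical CPU model; memory is a flat bytearray, not the real MMU."""

    def __init__(self) -> None:
        self.memory = bytearray(0x10000)
        self.a = self.b = self.c = self.d = self.e = 0

    def get_bc(self) -> int:
        return (self.b << 8) | self.c

    def get_de(self) -> int:
        return (self.d << 8) | self.e

    def write_byte(self, addr: int, value: int) -> None:
        self.memory[addr & 0xFFFF] = value & 0xFF

    def op_ld_bc_ptr_a(self) -> int:
        self.write_byte(self.get_bc(), self.a)  # (BC) <- A
        return 2  # M-Cycles: opcode fetch + memory write

    def op_ld_de_ptr_a(self) -> int:
        self.write_byte(self.get_de(), self.a)  # (DE) <- A
        return 2


cpu = MiniCPU()
cpu.a = 0x7F
cpu.b, cpu.c = 0xC0, 0x10   # BC = 0xC010
cpu.d, cpu.e = 0xC1, 0x00   # DE = 0xC100
cycles_bc = cpu.op_ld_bc_ptr_a()
cycles_de = cpu.op_ld_de_ptr_a()
```

Neither handler touches A, BC or DE; only the addressed memory byte changes.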

Comparisons: CP d8 and CP (HL)

_op_cp_d8() (0xFE) and _op_cp_hl_ptr() (0xBE) were implemented. The first reads an immediate 8-bit value and compares it with A. The second reads a value from the memory address pointed to by HL and compares it with A. Both use the helper _cp() to update flags without modifying A.

Both instructions consume 2 M-Cycles (fetch opcode + fetch operand or read from memory).
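The sketch below shows how the two CP handlers might plug into a dispatch table keyed by opcode. It is an assumption-laden illustration: MiniCPU, step(), the dispatch dictionary and the flat memory are hypothetical, and the real dispatch table in src/cpu/core.py may be organised differently.

```python
from typing import Callable, Dict


class MiniCPU:
    """Hypothetical CPU model showing handler dispatch for CP d8 and CP (HL)."""

    def __init__(self, memory: bytearray) -> None:
        self.memory = memory
        self.pc = 0
        self.a = self.f = self.h = self.l = 0
        # Opcode -> handler; each handler returns the M-Cycles it consumed.
        self.dispatch: Dict[int, Callable[[], int]] = {
            0xFE: self.op_cp_d8,
            0xBE: self.op_cp_hl_ptr,
        }

    def fetch_byte(self) -> int:
        value = self.memory[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return value

    def _cp(self, value: int) -> None:
        flags = 0x40                                  # N always set
        if self.a == value:
            flags |= 0x80                             # Z
        if (self.a & 0x0F) < (value & 0x0F):
            flags |= 0x20                             # H
        if self.a < value:
            flags |= 0x10                             # C
        self.f = flags                                # A is not written

    def op_cp_d8(self) -> int:
        self._cp(self.fetch_byte())                   # operand follows the opcode
        return 2

    def op_cp_hl_ptr(self) -> int:
        self._cp(self.memory[(self.h << 8) | self.l])  # operand at address HL
        return 2

    def step(self) -> int:
        opcode = self.fetch_byte()
        handler = self.dispatch.get(opcode)
        if handler is None:
            raise NotImplementedError(f"opcode 0x{opcode:02X}")
        return handler()


memory = bytearray(0x10000)
memory[0:3] = bytes([0xFE, 0x42, 0xBE])   # CP 0x42 ; CP (HL)
memory[0xC000] = 0x50
cpu = MiniCPU(memory)
cpu.a, cpu.h, cpu.l = 0x42, 0xC0, 0x00
cycles_d8 = cpu.step()
flags_d8 = cpu.f                          # A == 0x42: Z and N set
cycles_hl = cpu.step()
flags_hl = cpu.f                          # A < 0x50: N and C set
```

Raising on an unknown opcode, as step() does here, is one way to surface which instructions are still missing when running real code.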

Bug fix in MMU

Fixed a critical bug in MMU.read_byte() where the ROM area (0x0000-0x7FFF) always returned 0xFF when there was no cartridge, even if the address had previously been written to self._memory. This prevented the tests from working correctly, as they wrote opcodes to memory but then read them back as 0xFF.

The solution was to modify the logic so that, when there is no cartridge, the value is read from self._memory instead of returning 0xFF. This allows tests to write to and read from the ROM area, although on real hardware this area would be read-only (cartridge ROM).
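The corrected read path might look roughly like the sketch below. MiniMMU is a hypothetical model: representing the cartridge as a raw bytes object, and returning 0xFF for addresses beyond its length, are assumptions made for the example, since the real cartridge interface is not shown in this log.

```python
from typing import Optional


class MiniMMU:
    """Hypothetical MMU model illustrating the ROM-area fallback."""

    ROM_END = 0x7FFF

    def __init__(self, cartridge: Optional[bytes] = None) -> None:
        self._memory = bytearray(0x10000)
        self._cartridge = cartridge  # assumption: raw ROM bytes, or None

    def read_byte(self, addr: int) -> int:
        addr &= 0xFFFF
        if addr <= self.ROM_END and self._cartridge is not None:
            # With a cartridge, ROM reads come from the cartridge.
            if addr < len(self._cartridge):
                return self._cartridge[addr]
            return 0xFF
        # Without a cartridge (or outside ROM), fall back to internal memory.
        return self._memory[addr]

    def write_byte(self, addr: int, value: int) -> None:
        self._memory[addr & 0xFFFF] = value & 0xFF


# No cartridge: a test can write an opcode at 0x0100 and read it back.
mmu = MiniMMU()
mmu.write_byte(0x0100, 0x01)
no_cart_read = mmu.read_byte(0x0100)

# With a cartridge: writes to the ROM area do not change what is read.
cart_mmu = MiniMMU(bytes([0xAA, 0xBB]))
cart_mmu.write_byte(0x0000, 0x55)
cart_read = cart_mmu.read_byte(0x0000)
past_end_read = cart_mmu.read_byte(0x0002)
```

The fallback only applies when no cartridge is present, so ROM reads keep their read-only behaviour once a cartridge is loaded.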

Components created/modified

  • src/cpu/core.py: Added 6 new opcode handlers and the _cp() helper
  • src/memory/mmu.py: Fixed bug in read_byte() for ROM area without cartridge
  • tests/test_cpu_load16_cp.py: Complete suite of 9 TDD tests

Design decisions

It was decided to reuse the logic of _sub() for _cp() instead of duplicating code, while making sure that A is not modified. This maintains consistency in flag calculation and reduces the possibility of errors.

To correct the MMU bug, reading and writing in the ROM area were allowed when there is no cartridge, since this is necessary for testing. In a more complete implementation this would be better handled with specific memory regions, but for this stage it is sufficient.

Affected Files

  • src/cpu/core.py - Added 6 new opcodes, _cp() helper, and dispatch table entries
  • src/memory/mmu.py - Fixed bug in read_byte() for ROM area without cartridge
  • tests/test_cpu_load16_cp.py - New test suite with 9 test cases

Tests and Verification

A complete TDD test suite was created with 9 test cases:

  • 16-bit load tests:
    • test_ld_bc_d16: Verifies that LD BC, d16 correctly loads Little-Endian values
    • test_ld_de_d16: Verifies that LD DE, d16 correctly loads Little-Endian values
  • Indirect storage tests:
    • test_ld_bc_indirect_write: Verifies that LD (BC), A writes correctly to memory
    • test_ld_de_indirect_write: Verifies that LD (DE), A writes correctly to memory
  • Comparison tests:
    • test_cp_equality: Verifies that CP sets Z when A == value
    • test_cp_less: Verifies that CP sets C when A < value
    • test_cp_greater: Verifies that CP does not set C when A > value
    • test_cp_hl_ptr: Verifies that CP (HL) compares with value in memory
    • test_cp_half_carry: Verifies that CP updates H correctly when there is half-borrow

All tests pass successfully (9/9), validating:

  • Correct loading of 16-bit values in BC and DE registers
  • Correct indirect storage using BC and DE as pointers
  • Correct update of flags in comparisons (Z, N, H, C)
  • Preservation of the A register in CP operations
  • Correct consumption of M-Cycles for each instruction

Sources consulted

Note: The CP implementation follows the standard behavior of the Z80/8080 architecture family, from which the LR35902 is derived. The behavior of CP as a "phantom subtraction" is consistent with the technical documentation.

Educational Integrity

What I Understand Now

  • CP is a phantom subtraction: CP calculates A - value to update flags, but discards the numerical result. Only the flags matter, A remains intact. This is essential for conditional comparisons.
  • BC and DE as pointers: Just like HL, BC and DE can be used as memory pointers. LD (BC), A and LD (DE), A are very common in initialization and data copy loops.
  • Importance of CP: Without CP, games cannot make conditional decisions. It is a critical instruction for any control logic (if/else, loops, state machines).
  • ROM area in tests: For testing, we need to be able to write to the ROM area (0x0000-0x7FFF) even though on real hardware it is read-only. The fix in MMU allows this when there is no cartridge.

What remains to be confirmed

  • Complete CP behavior: All tests pass, but we have not yet tested CP in more complex situations (boundary values, wrap-around, etc.). It will be validated when the emulator runs real code.
  • Exact timing: M-Cycles are correct according to the documentation, but the precise timing of T-Cycles within each M-Cycle is not modeled. This will be added later when needed for accuracy.

Hypotheses and Assumptions

We assume that CP calculates its flags exactly as SUB does, the only difference being that it does not modify A. This is supported by technical documentation and the standard behavior of Z80/8080 architectures.

The correction in MMU to allow reading/writing in the ROM area when there is no cartridge is a simplification for testing. On real hardware, this area would be read-only (cartridge ROM), but for our educational implementation it is acceptable.

Next Steps

  • [ ] Continue running the emulator with Tetris DX to identify which opcodes are missing
  • [ ] Implement more load opcodes (LD between registers, LD with additional indirection)
  • [ ] Implement more arithmetic and logical operations (ADD, SUB, AND, OR, XOR with registers)
  • [ ] Consider implementing more CP variants (CP with other registers, not just d8 and (HL))
  • [ ] Improve the handling of memory regions in MMU to be more faithful to real hardware