⚠️ Clean-Room / Educational

This project is educational and Open Source. No code is copied from other emulators. The implementation is based solely on technical documentation and permitted tests.

Implementation of Control Flow and Jumps in C++

Date: 2025-12-19 StepID: 0106 State: Filled

Summary

Implemented basic CPU flow control in C++, adding the absolute jump (JP nn) and relative jump (JR e, JR NZ, e) instructions. These instructions break the linearity of execution, allowing loops and conditional decisions. With them, the CPU is practically Turing complete. C++'s native signed integer handling was used to simplify relative jumps, removing the two's complement simulation that the Python version needed. All tests pass (8/8).

Hardware Concept

Flow control is essential for any CPU: without jumps, execution could only be linear, with no possibility of loops, conditionals or subroutines. The LR35902 CPU implements two main types of jumps:

  • Absolute Jumps (JP nn): The instruction reads a 16-bit address (Little-Endian) and sets the PC directly to that address. Consumes 4 M-Cycles.
  • Relative Jumps (JR e): The opcode reads a signed byte (-128 to +127) and adds it to the current PC. The offset is relative to the position AFTER the offset is read. Consumes 3 M-Cycles if the jump is taken, 2 if not (in conditional versions).
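
The two address calculations above can be sketched as follows. This is a minimal illustration rather than the project's code: `jp_target` and `jr_target` are hypothetical names, and `pc_after_operand` is assumed to be the PC value once the opcode and its operand have been read.

```cpp
#include <cstdint>

// Hypothetical helper: JP nn target, built from the two operand bytes
// in the order they appear in memory (low byte first).
constexpr std::uint16_t jp_target(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}

// Hypothetical helper: JR e target. pc_after_operand is the PC value once
// both the opcode and the offset byte have been fetched.
constexpr std::uint16_t jr_target(std::uint16_t pc_after_operand,
                                  std::uint8_t offset_raw) {
    const auto offset = static_cast<std::int8_t>(offset_raw);
    // The sum is computed as int, then truncated to 16 bits (wraps at 0xFFFF).
    return static_cast<std::uint16_t>(pc_after_operand + offset);
}
```

For the byte sequence C3 00 C0, `jp_target(0x00, 0xC0)` gives 0xC000. For 18 FE placed at 0x0100, the PC after the operand is 0x0102 and the offset is -2, so `jr_target(0x0102, 0xFE)` gives 0x0100: the instruction jumps back to itself.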

C++ vs Python Optimization: In Python, we had to use arithmetic to simulate two's complement (e.g. if the byte is >= 128, subtract 256). In C++, the cast from uint8_t to int8_t maps directly onto the hardware: the compiler reinterprets the same bit pattern as a signed number and emits the appropriate instruction (the conversion is guaranteed to be modular since C++20 and behaves that way on mainstream compilers before it). This makes the code cleaner and more efficient: pc += (int8_t)offset; versus Python's conditional logic.
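
To make the comparison concrete, here is a small sketch that puts the Python-style formula next to the cast and checks that they agree for every byte value. The function names are hypothetical and chosen only for this illustration.

```cpp
#include <cstdint>

// Python-style conversion: bytes >= 128 represent negative values.
constexpr int to_signed_manual(std::uint8_t raw) {
    return raw >= 128 ? static_cast<int>(raw) - 256 : static_cast<int>(raw);
}

// C++ conversion: reinterpret the same 8 bits as a signed value.
// Modular by definition since C++20; implementation-defined before that,
// but two's complement in practice on mainstream compilers.
constexpr int to_signed_cast(std::uint8_t raw) {
    return static_cast<std::int8_t>(raw);
}

// Returns true when both conversions agree for all 256 byte values.
constexpr bool conversions_agree() {
    for (int v = 0; v < 256; ++v) {
        const auto raw = static_cast<std::uint8_t>(v);
        if (to_signed_manual(raw) != to_signed_cast(raw)) return false;
    }
    return true;
}
```

For example, the byte 0xFE converts to -2 with either function, and 0x80 converts to -128.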

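Branch Prediction: Although we are emulating, keeping the jump cases together in the switch may help the host processor's branch prediction and so the performance of the switch statement. Compilers often lower a dense switch to a jump table, so any gain is best confirmed by measurement.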

Conditional Timing: The conditional versions of JR (such as JR NZ, e) always read the offset (to advance PC), but only execute the jump if the condition is true. This causes them to consume different cycles: 3 M-Cycles if they jump, 2 if they don't.
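
The timing rule can be modeled as a pure function. This is a sketch under stated assumptions: `JumpResult` and `jr_nz` are hypothetical names, `pc_after_opcode` is assumed to point at the offset byte, and the Z flag is passed in as a plain bool.

```cpp
#include <cstdint>

struct JumpResult {
    std::uint16_t pc;  // PC after the instruction completes
    int m_cycles;      // machine cycles consumed
};

// Hypothetical model of JR NZ, e.
constexpr JumpResult jr_nz(std::uint16_t pc_after_opcode,
                           std::uint8_t offset_raw, bool zero_flag) {
    // The offset byte is always consumed, so PC moves past it either way.
    const auto next = static_cast<std::uint16_t>(pc_after_opcode + 1);
    if (zero_flag) {
        return {next, 2};  // NZ is false: fall through, 2 M-Cycles
    }
    const auto offset = static_cast<std::int8_t>(offset_raw);
    return {static_cast<std::uint16_t>(next + offset), 3};  // taken: 3 M-Cycles
}
```

With the opcode at 0x0100 and an offset of +5, the call `jr_nz(0x0101, 0x05, false)` yields PC 0x0107 in 3 M-Cycles, while `jr_nz(0x0101, 0x05, true)` yields PC 0x0102 in 2 M-Cycles.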

Implementation

Added the helper fetch_word() to read 16-bit addresses in Little-Endian format (reads the LSB first, then the MSB, and combines them). Implemented 3 new opcodes: JP nn (0xC3), JR e (0x18) and JR NZ, e (0x20).
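
A helper of this kind might look like the sketch below. Only the names fetch_byte() and fetch_word() come from this log; the `MiniCpu` struct and its flat 64 KiB memory array are assumptions made to keep the example self-contained, and the real CPU class may access memory differently.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the CPU: a flat 64 KiB memory and a PC.
struct MiniCpu {
    std::array<std::uint8_t, 0x10000> memory{};
    std::uint16_t pc = 0;

    // Reads the byte at PC and advances PC; the 16-bit truncation wraps
    // PC from 0xFFFF back to 0x0000.
    std::uint8_t fetch_byte() {
        const std::uint8_t value = memory[pc];
        pc = static_cast<std::uint16_t>(pc + 1);
        return value;
    }

    // Reads two bytes through fetch_byte(): low byte first, then high byte.
    // Separate statements keep the read order well defined.
    std::uint16_t fetch_word() {
        const std::uint8_t low = fetch_byte();
        const std::uint8_t high = fetch_byte();
        return static_cast<std::uint16_t>(low | (high << 8));
    }
};
```

Writing the two reads as separate statements matters: combining them in a single expression such as `fetch_byte() | (fetch_byte() << 8)` would leave the order of the two calls unspecified.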

Components created/modified

  • CPU.hpp: Added the declaration of the fetch_word() helper.
  • CPU.cpp:
    • Implementation of fetch_word() (reads 2 bytes in Little-Endian order).
    • Implementation of JP nn (0xC3) - absolute jump of 4 M-Cycles.
    • Implementation of JR e (0x18) - unconditional relative jump of 3 M-Cycles.
    • Implementation of JR NZ, e (0x20) - conditional relative jump (3 M-Cycles if taken, 2 if not).
  • tests/test_core_cpu_jumps.py: Complete suite of 8 tests that validate:
    • Absolute jumps (JP nn) with normal and wrap-around addresses.
    • Relative positive and negative jumps.
    • Conditional jumps with true and false condition.
    • Critical verification of negative number handling in C++.

Design decisions

  • fetch_word() reuses fetch_byte(): This maintains consistency and takes advantage of the PC wrap-around handling that already exists in fetch_byte(). It also simplifies the code and reduces duplication.
  • Explicit cast to int8_t: Although an implicit conversion would also work, using static_cast<int8_t>(offset_raw) makes the intention explicit and improves the readability of the code.
  • Jump grouping in the switch: Jump opcodes are grouped together in the switch (after the ALU operations) with the aim of helping the host processor's branch prediction. This is a minor optimization, but one that matters in emulation loops.
  • JR NZ always reads the offset: Even if the jump is not taken, the offset is always read so that the PC advances correctly. This is critical for correct emulation of the timing and hardware behavior.
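
Taken together, these decisions could lead to a dispatch routine along the lines of the sketch below. It is illustrative only: the `JumpCpu` struct, its flat memory array, the `zero_flag` bool standing in for the Z bit, and the convention of returning 0 for unhandled opcodes are all assumptions, not the project's actual CPU class.

```cpp
#include <array>
#include <cstdint>

struct JumpCpu {
    std::array<std::uint8_t, 0x10000> memory{};
    std::uint16_t pc = 0;
    bool zero_flag = false;  // stand-in for the Z bit of the F register

    std::uint8_t fetch_byte() {
        const std::uint8_t value = memory[pc];
        pc = static_cast<std::uint16_t>(pc + 1);
        return value;
    }

    std::uint16_t fetch_word() {
        const std::uint8_t low = fetch_byte();
        const std::uint8_t high = fetch_byte();
        return static_cast<std::uint16_t>(low | (high << 8));
    }

    // Executes one instruction and returns the M-Cycles consumed,
    // or 0 for opcodes this sketch does not handle.
    int step() {
        const std::uint8_t opcode = fetch_byte();
        switch (opcode) {
            // Jump opcodes are kept next to each other.
            case 0xC3: {  // JP nn
                pc = fetch_word();
                return 4;
            }
            case 0x18: {  // JR e
                const std::uint8_t offset_raw = fetch_byte();
                pc = static_cast<std::uint16_t>(
                    pc + static_cast<std::int8_t>(offset_raw));
                return 3;
            }
            case 0x20: {  // JR NZ, e
                // The offset is read before the condition is checked.
                const std::uint8_t offset_raw = fetch_byte();
                if (!zero_flag) {
                    pc = static_cast<std::uint16_t>(
                        pc + static_cast<std::int8_t>(offset_raw));
                    return 3;
                }
                return 2;
            }
            default:
                return 0;
        }
    }
};
```

Reading `offset_raw` before testing the flag is what keeps the PC correct on the not-taken path: the PC already points past the operand when the function returns 2.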

Affected Files

  • src/core/cpp/CPU.hpp - Added declaration of fetch_word()
  • src/core/cpp/CPU.cpp - Implementation of fetch_word() and 3 jump opcodes
  • tests/test_core_cpu_jumps.py - Complete test suite (8 tests)

Tests and Verification

A complete suite of 8 tests was created in test_core_cpu_jumps.py:

  • TestJumpAbsolute (2 tests):
    • test_jp_absolute(): Checks absolute jump to specific address (0xC000).
    • test_jp_absolute_wraparound(): Checks a jump to the maximum address (0xFFFF).
  • TestJumpRelative (3 tests):
    • test_jr_relative_positive(): Checks positive relative jump (+5).
    • test_jr_relative_negative(): CRITICAL - Checks the handling of a negative offset (-2) in C++.
    • test_jr_relative_loop(): Simulates a loop with negative relative jump (-3).
  • TestJumpRelativeConditional (3 tests):
    • test_jr_nz_condition_true(): Verifies that it jumps when Z=0 (true condition, 3 cycles).
    • test_jr_nz_condition_false(): Verifies that it does NOT jump when Z=1 (false condition, 2 cycles).
    • test_jr_nz_negative_when_condition_true(): Checks negative jumps with true condition.

Validation results:

tests/test_core_cpu_jumps.py::TestJumpAbsolute::test_jp_absolute PASSED
tests/test_core_cpu_jumps.py::TestJumpAbsolute::test_jp_absolute_wraparound PASSED
tests/test_core_cpu_jumps.py::TestJumpRelative::test_jr_relative_positive PASSED
tests/test_core_cpu_jumps.py::TestJumpRelative::test_jr_relative_negative PASSED
tests/test_core_cpu_jumps.py::TestJumpRelative::test_jr_relative_loop PASSED
tests/test_core_cpu_jumps.py::TestJumpRelativeConditional::test_jr_nz_condition_true PASSED
tests/test_core_cpu_jumps.py::TestJumpRelativeConditional::test_jr_nz_condition_false PASSED
tests/test_core_cpu_jumps.py::TestJumpRelativeConditional::test_jr_nz_negative_when_condition_true PASSED

============================== 8 passed in 0.05s ==============================

All tests pass, including the critical case of negative number handling in C++. The CPU can now execute loops and make conditional decisions.

Sources consulted

Educational Integrity

What I Understand Now

  • Native two's complement: In C++, the cast from uint8_t to int8_t is a bit-level operation: the same bit pattern is interpreted differently. This is much more efficient than simulating it in Python with arithmetic operations.
  • Little-Endian: The Game Boy stores 16-bit values in memory in Little-Endian (LSB first). This is critical to reading addresses correctly.
  • Conditional Timing: Conditional jump instructions always read the offset (to maintain hardware behavior), but they only execute the jump if the condition is true. This causes different execution times (3 vs 2 M-Cycles).
  • PC relative to full read: In relative jumps, the offset is added to the PC AFTER reading the entire instruction (opcode + offset). This is important to correctly calculate the destination address.

What remains to be confirmed

  • Other conditional jumps: There are more conditional variants (JR Z, JR C, JR NC). They should follow the same pattern. Pending implementation in future steps.
  • CALL and RET: For subroutines, we will need CALL (an absolute jump that saves the return address on the stack) and RET (return from subroutine). Pending for advanced flow control.
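
As a possible starting point for the pending conditional variants, the condition check could be factored into one helper. This is a sketch of code that does not exist yet: `jr_condition_met` is a hypothetical name, and the opcode values (0x20 NZ, 0x28 Z, 0x30 NC, 0x38 C) and flag positions (Z in bit 7, C in bit 4 of F) are taken from the standard LR35902 opcode and register layout rather than from this log.

```cpp
#include <cstdint>

// Flag bit masks in the F register (Z is bit 7, C is bit 4).
constexpr std::uint8_t FLAG_Z = 0x80;
constexpr std::uint8_t FLAG_C = 0x10;

// Hypothetical helper: returns true when the condition encoded by a
// conditional JR opcode holds for the given F register value.
constexpr bool jr_condition_met(std::uint8_t opcode, std::uint8_t f) {
    switch (opcode) {
        case 0x20: return (f & FLAG_Z) == 0;  // JR NZ, e
        case 0x28: return (f & FLAG_Z) != 0;  // JR Z, e
        case 0x30: return (f & FLAG_C) == 0;  // JR NC, e
        case 0x38: return (f & FLAG_C) != 0;  // JR C, e
        default:   return false;              // not a conditional JR
    }
}
```

With such a helper, all four variants could share the JR NZ logic: read the offset, then jump (3 M-Cycles) or fall through (2 M-Cycles) depending on the result.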

Hypotheses and Assumptions

The actual hardware behavior is assumed to be exactly as documented in Pan Docs: relative jumps always read the offset (even if the jump is not taken), and the timing is exactly 3 M-Cycles if the jump is taken, 2 if not. The test suite checks that the implementation matches this documented behavior.

Next Steps

  • [ ] Implement more conditional jumps (JR Z, JR C, JR NC)
  • [ ] Implement CALL and RET for subroutines
  • [ ] Implement more load/store (LD) instructions
  • [ ] Continue to expand the CPU's basic instruction set