This project is educational and Open Source. No code is copied from other emulators. Implementation based solely on technical documentation and permitted tests.
HALT Architecture: "Fast Forward" to the Next Event
Summary
The scanline architecture solved the polling deadlock, but it revealed a subtler one: the CPU executes the HALT instruction and our main loop does not advance time efficiently, leaving LY stuck at 0. This Step documents an intelligent HALT handler that "fast-forwards" time to the end of the current scanline, correctly simulating a standby CPU while the rest of the hardware (PPU) keeps running.
Hardware Concept: HALT and Event Synchronization
The HALT instruction (opcode 0x76) puts the CPU into a low-power state. The CPU stops executing instructions and waits for an interrupt to occur. However, the rest of the hardware (such as the PPU) does not stop. The system clock keeps ticking.
HALT's Crawling Problem
Our previous simulation of HALT was too simplistic:
else:  # If the CPU is in HALT
    cycles_this_scanline += 4
This is terribly inefficient. We are simulating a "sleeping" CPU by advancing time at a snail's pace (4 cycles at a time). It takes 114 iterations of our Python loop (456 / 4) just to complete one scanline. Meanwhile, the Heartbeat log fires, shows us LY=0, and makes us believe that the system is frozen. It is not frozen; it is crawling.
The hardware's HALT does not "crawl". The CPU stops, but the rest of the system (the PPU) keeps running at full speed until the next event. We must simulate this.
The Solution: "Fast Forward" to the Next Event
When the CPU goes into HALT, we should not advance time 4 cycles at a time. We must calculate how many cycles are left until the next significant event (the end of the scanline) and advance time in a single jump. This is a critical optimization and a much more accurate simulation of hardware behavior.
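As a rough illustration of the difference, the following minimal sketch counts how many loop iterations a halted CPU costs per scanline under each strategy. It assumes 456 T-Cycles per scanline; the function names are hypothetical and not part of the project code.

```python
CYCLES_PER_SCANLINE = 456  # T-Cycles per scanline (assumed value)


def cycles_until_scanline_end(cycles_this_scanline: int) -> int:
    """T-Cycles remaining before the current scanline completes."""
    return CYCLES_PER_SCANLINE - cycles_this_scanline


def count_halt_iterations(start_cycles: int, fast_forward: bool) -> int:
    """Count loop iterations spent in HALT until the scanline ends."""
    cycles = start_cycles
    iterations = 0
    while cycles < CYCLES_PER_SCANLINE:
        if fast_forward:
            # New behavior: jump to the end of the scanline at once.
            cycles += cycles_until_scanline_end(cycles)
        else:
            # Old behavior: advance one M-Cycle (4 T-Cycles) per iteration.
            cycles += 4
        iterations += 1
    return iterations


print(count_halt_iterations(0, fast_forward=False))  # 114
print(count_halt_iterations(0, fast_forward=True))   # 1
```

The jump does not change how much emulated time passes per scanline; it only changes how many Python iterations are spent getting there.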
Source: Pan Docs - HALT behavior, Interrupts
Implementation
Two main components were modified to implement intelligent management of HALT:
A. Signal HALT from C++
First, we need CPU::step() to tell us that it has entered the HALT state. We will use a special (negative) return value for this.
In src/core/cpp/CPU.cpp, we modify case 0x76 (HALT) and PHASE 2 of HALT management:
// ========== PHASE 2: HALT Management ==========
// If the CPU is in HALT, do not execute instructions.
// Return -1 to signal the orchestrator to "fast forward"
// until the next event (end of scanline).
if (halted_) {
    cycles_ += 1;
    return -1;  // Special code: signals HALT for fast forward
}

// ... inside the switch (opcode)
case 0x76:  // HALT
    halted_ = true;
    cycles_ += 1;  // HALT consumes 1 M-Cycle
    return -1;     // Special code: signals HALT for fast forward
B. Modify viboy.py to Handle the HALT Signal
Now, the orchestrator in Python reacts to this signal:
# In src/viboy.py, inside the run() method
while cycles_this_scanline < CYCLES_PER_SCANLINE:
    # Execute one CPU instruction; step() returns M-Cycles.
    # m_cycles can be negative (-1) when the CPU is in HALT.
    m_cycles = self._cpu.step()
    if m_cycles == -1:
        # The CPU has entered HALT!
        # "Fast forward": compute the cycles remaining to
        # complete the scanline and add them in a single jump.
        remaining_cycles_in_scanline = CYCLES_PER_SCANLINE - cycles_this_scanline
        t_cycles = remaining_cycles_in_scanline
        cycles_this_scanline += t_cycles
    else:
        # Normal instruction: convert M-Cycles to T-Cycles
        t_cycles = m_cycles * 4
        cycles_this_scanline += t_cycles
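To try the loop in isolation, here is a self-contained sketch that swaps the real C++ CPU for a stub. StubCPU, HALT_SIGNAL, and run_scanline are hypothetical names used only for this illustration; the stub executes a fixed number of 1 M-Cycle instructions and then reports HALT on every later call.

```python
CYCLES_PER_SCANLINE = 456  # T-Cycles per scanline (assumed value)
HALT_SIGNAL = -1           # Same convention as CPU::step() in the text


class StubCPU:
    """Hypothetical stand-in for the C++ CPU, for illustration only."""

    def __init__(self, instructions_before_halt: int) -> None:
        self._remaining = instructions_before_halt
        self.step_calls = 0

    def step(self) -> int:
        self.step_calls += 1
        if self._remaining > 0:
            self._remaining -= 1
            return 1  # A normal instruction costing 1 M-Cycle
        return HALT_SIGNAL  # Halted: ask the orchestrator to fast forward


def run_scanline(cpu: StubCPU) -> int:
    """Run one scanline and return the T-Cycles consumed."""
    cycles_this_scanline = 0
    while cycles_this_scanline < CYCLES_PER_SCANLINE:
        m_cycles = cpu.step()
        if m_cycles == HALT_SIGNAL:
            # Fast forward: consume whatever is left of the scanline.
            t_cycles = CYCLES_PER_SCANLINE - cycles_this_scanline
        else:
            # Normal instruction: convert M-Cycles to T-Cycles.
            t_cycles = m_cycles * 4
        cycles_this_scanline += t_cycles
    return cycles_this_scanline


cpu = StubCPU(instructions_before_halt=3)
total = run_scanline(cpu)
print(total, cpu.step_calls)  # 456 4
```

Three normal instructions consume 12 T-Cycles, and the fourth call to step() reports HALT, so the remaining 444 T-Cycles are added in one jump.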
Design Decisions
- Special Return Value: We use -1 as a special code to signal HALT. This is safe because no normal instruction returns a negative M-Cycles value.
- Fast Forward: When we detect HALT, we calculate the remaining cycles in the current scanline and add them in one fell swoop. This correctly simulates that the CPU is asleep but the rest of the hardware is still running.
- Compatibility: The Cython wrapper already returns int, so we don't need to modify it.
Affected Files
- src/core/cpp/CPU.cpp - Modified to return -1 when it enters HALT (case 0x76 and PHASE 2).
- src/viboy.py - Modified the main loop to handle the special code -1 and perform the fast forward.
- tests/test_core_cpu_interrupts.py - Updated test test_halt_stops_execution and added new test test_halt_instruction_signals_correctly.
Tests and Verification
The interrupt tests were run to validate the new behavior of HALT:
Command Executed
pytest tests/test_core_cpu_interrupts.py::TestHALT -v
Result
tests/test_core_cpu_interrupts.py::TestHALT::test_halt_stops_execution PASSED
tests/test_core_cpu_interrupts.py::TestHALT::test_halt_instruction_signals_correctly PASSED
tests/test_core_cpu_interrupts.py::TestHALT::test_halt_wakeup_on_interrupt PASSED
============================== 3 passed in 0.05s ==============================
Test Code
The new test test_halt_instruction_signals_correctly validates that HALT sets the flag and that step() signals it:
def test_halt_instruction_signals_correctly(self):
    """
    Step 0172: Verify that HALT (0x76) sets the 'halted' flag and
    that step() returns -1 to signal it.
    """
    mmu = PyMMU()
    regs = PyRegisters()
    cpu = PyCPU(mmu, regs)

    # Set up
    mmu.write(0x0100, 0x76)  # HALT
    regs.pc = 0x0100
    assert cpu.get_halted() == 0, "CPU must not be halted initially"

    # Run
    cycles = cpu.step()

    # Check
    assert cycles == -1, "step() must return -1 to signal HALT"
    assert cpu.get_halted() == 1, "The 'halted' flag must be set"
    assert regs.pc == 0x0101, "PC must have moved forward 1 byte"
Compiled C++ module validation: All tests pass, confirming that the compiled C++ module works as expected.
Sources consulted
- Pan Docs: HALT behavior, Interrupts
- GBEDG: HALT (0x76)
Educational Integrity
What I Understand Now
- HALT and Emulated Time: When the CPU goes into HALT, it doesn't mean that time stops. The rest of the hardware (PPU, Timer, etc.) continues to work. Our simulation should reflect this.
- Fast Forward Optimization: Instead of advancing time 4 cycles at a time during HALT, we can calculate the remaining cycles until the next event and advance in one fell swoop. This is more efficient and more accurate.
- Signaling between Components: Using special return values (such as -1) is an elegant way to communicate special states between the C++ core and the Python orchestrator.
What remains to be confirmed
- Running with a Real ROM: Verify that with this new architecture, when the game enters HALT waiting for V-Blank, time advances correctly and LY increases.
- HALT Wake-up: Confirm that when the PPU generates a V-Blank interrupt, the CPU correctly wakes up from HALT and continues execution.
Hypotheses and Assumptions
This implementation assumes that the next significant event after HALT is always the end of the current scanline. This holds in most cases, but there may be situations where we want to advance to a more specific event (such as a Timer interrupt). For now, advancing to the end of the scanline is sufficient.
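If a finer-grained event is ever needed, one possible shape is to jump to whichever event comes first. The sketch below is purely hypothetical and is not part of the current implementation: cycles_to_next_event and the cycles_until_timer_overflow parameter are invented names, and how the Timer would report that value is left open.

```python
from typing import Optional

CYCLES_PER_SCANLINE = 456  # T-Cycles per scanline (assumed value)


def cycles_to_next_event(
    cycles_this_scanline: int,
    cycles_until_timer_overflow: Optional[int] = None,
) -> int:
    """Return how far a halted CPU may fast forward, in T-Cycles.

    cycles_until_timer_overflow is a hypothetical input; None means the
    timer is disabled or not modelled, which matches the current behavior.
    """
    remaining_in_scanline = CYCLES_PER_SCANLINE - cycles_this_scanline
    if cycles_until_timer_overflow is None:
        return remaining_in_scanline
    if cycles_until_timer_overflow <= 0:
        # A zero-length jump would stall the main loop.
        raise ValueError("timer distance must be positive")
    # Never jump past the earlier of the two events.
    return min(remaining_in_scanline, cycles_until_timer_overflow)


print(cycles_to_next_event(100))      # 356
print(cycles_to_next_event(100, 64))  # 64
print(cycles_to_next_event(400, 64))  # 56
```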
Next Steps
This is the moment of truth. With this new architecture:
- The game will enter HALT to wait for V-Blank.
- Our run() will detect it and advance time to the end of the current scanline.
- ppu.step(456) will be called, and LY will increase.
- This will be repeated for each line. We will see in the Heartbeat how LY cycles from 0 to 153.
- When LY reaches 144, the PPU will generate a V-Blank interrupt.
- handle_interrupts() on the C++ CPU will detect it and wake the CPU up from HALT.
- The game will continue to run.
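The expected sequence can be rehearsed with a toy model before running a real ROM. This is only a sketch under stated assumptions: StubPPU and run_halted_until_vblank are hypothetical names, the stub tracks nothing but LY and a V-Blank request flag, and waking the CPU is reduced to clearing a local variable.

```python
from typing import List

CYCLES_PER_SCANLINE = 456  # T-Cycles per scanline
VBLANK_START_LINE = 144    # LY value at which V-Blank begins
LINES_PER_FRAME = 154      # LY runs from 0 to 153


class StubPPU:
    """Hypothetical PPU stand-in that only tracks LY and a V-Blank request."""

    def __init__(self) -> None:
        self.ly = 0
        self.vblank_requested = False
        self._cycles = 0

    def step(self, t_cycles: int) -> None:
        self._cycles += t_cycles
        while self._cycles >= CYCLES_PER_SCANLINE:
            self._cycles -= CYCLES_PER_SCANLINE
            self.ly = (self.ly + 1) % LINES_PER_FRAME
            if self.ly == VBLANK_START_LINE:
                self.vblank_requested = True


def run_halted_until_vblank(ppu: StubPPU) -> List[int]:
    """Fast forward a halted CPU one scanline at a time until V-Blank."""
    seen = [ppu.ly]
    halted = True
    while halted:
        ppu.step(CYCLES_PER_SCANLINE)  # A whole scanline in one jump
        seen.append(ppu.ly)
        if ppu.vblank_requested:
            halted = False  # The interrupt wakes the CPU up
    return seen


ly_values = run_halted_until_vblank(StubPPU())
print(ly_values[0], ly_values[-1], len(ly_values))  # 0 144 145
```

In this model the halted CPU costs exactly one loop iteration per scanline, and LY advances from 0 to 144 before the wake-up.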
If everything goes well, we should see the Nintendo logo or the Tetris copyright screen for the first time.