This project is educational and Open Source. No code is copied from other emulators. Implementation based solely on technical documentation and permitted tests.
Diagnosing Segmentation Fault with Native Traces
Summary
The emulator crashes with a Segmentation Fault when running ROMs, which indicates a memory access error in the C++ core. To diagnose the problem, the C++ CPU has been instrumented with detailed logging via std::cout that prints the CPU state (PC, opcode, registers) on every instruction. This makes it possible to identify the last instruction executed before the crash and, specifically, to detect whether a jump instruction (JP, JR, CALL, RET) is calculating an invalid destination address.
Hardware Concept
A Segmentation Fault (or "access violation" on Windows) is a low-level error that occurs when a program tries to access a memory address that does not belong to it or that is protected by the operating system. In the context of an emulator, this generally means that:
- The CPU attempted to read an opcode from an invalid memory address (outside the range 0x0000-0xFFFF).
- A jump instruction computed a corrupt or out-of-range destination address.
- The PC (Program Counter) pointer was corrupted and points to unmapped memory.
On the Game Boy, every address in the 16-bit range (0x0000-0xFFFF) is valid, but the host memory that backs those addresses is managed and protected by the host operating system. If the C++ CPU calls fetch_byte() with an address that the operating system considers invalid for the process (for example, if PC was corrupted and holds a value like 0xFFFFFFFF), the operating system stops the program with a segfault.
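To make the failure mode concrete, here is a minimal sketch rather than the project's actual code. It assumes the emulated memory is a single flat 64 KiB host buffer and that an address wider than 16 bits can reach the read path; `Memory`, `kAddressSpaceSize`, and `checked_read` are hypothetical names introduced only for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Assumption: emulated memory is one flat 64 KiB buffer on the host.
constexpr std::size_t kAddressSpaceSize = 0x10000;
using Memory = std::array<std::uint8_t, kAddressSpaceSize>;

// Hypothetical helper: reads one byte, but rejects any address that does not
// fit in the 16-bit Game Boy address space instead of indexing past the
// buffer, which is undefined behavior and a typical source of segfaults.
std::uint8_t checked_read(const Memory& mem, std::uint32_t addr) {
    if (addr >= kAddressSpaceSize) {
        throw std::out_of_range("address outside 0x0000-0xFFFF");
    }
    return mem[addr];
}
```

An unchecked `mem[addr]` with `addr = 0xFFFFFFFF` would index far beyond the buffer. A check like this turns that silent out-of-bounds access into an error that can be reported.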
Debugging Strategy: Logging in C++ is a classic low-level debugging technique. By printing the CPU's state before each instruction, we create a "black box" that records everything it does. When the program crashes, the last line printed to the console corresponds to the instruction that caused the problem. This is especially useful for finding bugs in jump instructions, since we can see exactly which address was calculated and compare it with the expected one.
Implementation
Temporary logging has been added to the C++ CPU to track each instruction cycle. Logging is implemented at two levels:
1. General Logging in CPU::step()
Right after reading the opcode (before the switch), the full CPU status is printed:
- PC: Current address of the Program Counter (before reading the opcode)
- Opcode: Operation code read from memory
- Registers: AF, BC, DE, HL, SP (in hexadecimal)
This log is printed for each instruction, which generates massive output but shows exactly what the CPU was doing when it crashed.
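The trace line described above could be produced along these lines. This is a sketch under stated assumptions: `CpuSnapshot`, `put_hex`, `format_trace_line`, and `log_trace` are hypothetical names, the line layout is illustrative, and the real CPU class may store its registers differently.

```cpp
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical snapshot of the values the log prints.
struct CpuSnapshot {
    std::uint16_t pc;      // address the opcode was fetched from
    std::uint8_t opcode;   // opcode byte read at that address
    std::uint16_t af, bc, de, hl, sp;
};

// Writes v as zero-padded uppercase hexadecimal with the given digit count.
// Taking an unsigned parameter avoids printing uint8_t values as characters.
void put_hex(std::ostream& os, unsigned v, int digits) {
    os << std::hex << std::uppercase << std::setfill('0')
       << std::setw(digits) << v;
}

// Builds one trace line such as "PC:0150 OP:C3 AF:01B0 ...".
std::string format_trace_line(const CpuSnapshot& s) {
    std::ostringstream os;
    os << "PC:";  put_hex(os, s.pc, 4);
    os << " OP:"; put_hex(os, s.opcode, 2);
    os << " AF:"; put_hex(os, s.af, 4);
    os << " BC:"; put_hex(os, s.bc, 4);
    os << " DE:"; put_hex(os, s.de, 4);
    os << " HL:"; put_hex(os, s.hl, 4);
    os << " SP:"; put_hex(os, s.sp, 4);
    return os.str();
}

// Prints the line. std::endl flushes the stream, so the text is handed to
// the operating system before the next instruction has a chance to crash.
void log_trace(const CpuSnapshot& s) {
    std::cout << format_trace_line(s) << std::endl;
}
```

The flush matters: std::cout is buffered, so lines written without a flush may never reach the console if the process is killed by a segfault. Flushing on every line is also part of why this kind of logging is slow.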
2. Specific Logging in Jump Instructions
Additional logging was added to the instructions that modify PC:
- JP nn (0xC3): Shows the calculated destination address
- JR e (0x18): Shows the offset (with sign) and the new calculated PC
- JR NZ, e (0x20): Shows whether the jump was taken (according to the Z flag) and the resulting PC
- CALL nn (0xCD): Displays the destination address and return address saved on the stack
- RET (0xC9): Displays the return address retrieved from the stack
This specific logging is crucial because jumps are the most likely cause of the segfault: if an instruction computes an invalid address, the next fetch_byte() will try to read from that address and crash.
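As an illustration of what the JP nn log needs to show, the sketch below combines the two operand bytes into the destination address and formats a log line. The helper names `make_u16` and `format_jp_log`, and the wording of the message, are assumptions made for this example and not the project's code.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Combines the two operand bytes of JP nn or CALL nn. The Game Boy CPU
// stores 16-bit operands little-endian: the low byte comes first in memory.
std::uint16_t make_u16(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>((high << 8) | low);
}

// Hypothetical log line for JP nn: the address of the instruction and the
// destination built from its operand bytes, both as 4-digit hexadecimal.
std::string format_jp_log(std::uint16_t pc, std::uint8_t low, std::uint8_t high) {
    std::ostringstream os;
    os << std::hex << std::uppercase << std::setfill('0')
       << "JP nn at " << std::setw(4) << pc
       << " -> " << std::setw(4) << make_u16(low, high);
    return os.str();
}
```

For the bytes 0x50, 0x01 following a JP at 0x0101, this yields "JP nn at 0101 -> 0150". Swapping the two bytes by mistake would yield 0x5001 instead, which is exactly the kind of discrepancy the log is meant to expose.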
Design Decisions
- Temporary Logging: This logging is temporary and will be removed after the bug is found and fixed. Console I/O is extremely slow and should not be in the critical emulation loop in production.
- Hexadecimal format: All values are printed in hexadecimal with zero padding for easy reading and comparison with technical documentation.
- PC Before Fetch: The PC is saved before calling fetch_byte(), because fetch_byte() increments PC. This shows the exact address the CPU was at when it read the opcode.
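The "PC before fetch" decision can be sketched as follows. `MiniCpu` and its fields are hypothetical stand-ins, assuming only what is stated above: fetch_byte() reads the byte at PC and then increments PC.

```cpp
#include <array>
#include <cstdint>

// Minimal stand-in for the CPU, for illustration only.
struct MiniCpu {
    std::array<std::uint8_t, 0x10000> memory{};
    std::uint16_t pc = 0;
    std::uint16_t last_opcode_pc = 0;  // the value a trace would print as PC

    // Reads the byte at PC, then advances PC (wraps from 0xFFFF to 0x0000).
    std::uint8_t fetch_byte() {
        return memory[pc++];
    }

    // Captures PC first, so the logged address is where the opcode lives
    // and not the address of the byte that follows it.
    std::uint8_t step() {
        const std::uint16_t pc_before = pc;
        const std::uint8_t opcode = fetch_byte();  // pc is now pc_before + 1
        last_opcode_pc = pc_before;
        return opcode;
    }
};
```

If PC were read after fetch_byte() instead, every logged address would be off by one, which makes the trace harder to match against a disassembly.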
Affected Files
- src/core/cpp/CPU.hpp - Added includes for iostream and iomanip
- src/core/cpp/CPU.cpp - Added logging in CPU::step() and in jump instructions (JP, JR, CALL, RET)
Tests and Verification
Note: This step is diagnostic, not implementation. The tests will be run after the bug has been identified and fixed.
Debugging Process:
- Recompile the C++ module with logging enabled: .\rebuild_cpp.ps1
- Run the emulator with a ROM: python main.py roms/tetris.gb
- Watch the console output until it crashes
- Analyze the last 5-10 lines printed before the crash
- Identify the problematic instruction and the address it calculated
- Fix the bug in the identified instruction
- Remove the temporary logging
- Recompile and verify that the crash has been resolved
Native Validation: Logging runs directly in C++, bypassing Python, ensuring we capture the exact state of the CPU right before the crash.
Sources consulted
- Pan Docs: CPU Instruction Set - Reference for jump instructions
- C++ Documentation: I/O Manipulators - For hexadecimal formatting with std::hex, std::setw, std::setfill
Educational Integrity
What I Understand Now
- Segmentation Fault: It is a memory access error that occurs when the program tries to read or write an invalid address. In our case, PC probably got corrupted and points outside the valid memory range.
- Logging in C++: Although it is slow, logging with std::cout is a powerful tool for low-level debugging. It lets us see the exact state of the CPU on each instruction.
- Emulator Debugging: Segfaults in emulators are usually caused by jump instructions that calculate incorrect addresses. Logging lets us identify exactly which instruction and which address caused the problem.
What remains to be confirmed
- Offending Instruction: We need to run the emulator with logging enabled and analyze the output to identify which instruction is computing an invalid address.
- Root Cause: Once the instruction is identified, we need to understand why it is miscalculating the address. It could be an error in the calculation logic, a problem with the little-endian byte order, or a bug in the handling of signed offsets.
- Other Jumps: Although we have added logging to the most common jumps, there may be other conditional jumps (JR Z, JR C, etc.) that also need review.
Hypotheses and Assumptions
Main Hypothesis: The bug is in a jump instruction that miscalculates the destination address. The most probable causes are:
- An error in the calculation of the signed offset in JR (the offset is read as uint8_t but must be interpreted as int8_t).
- A little-endian byte order error when reading 16-bit addresses in JP or CALL.
- Stack corruption causing RET to recover an invalid address.
Logging will allow us to confirm or refute these hypotheses.
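The first hypothesis can be made concrete with a small comparison. This is a sketch, not the project's code: both function names are hypothetical, and `pc_after_operand` is assumed to be the PC value after the opcode and its offset byte have been fetched, which is the base that JR adds its offset to.

```cpp
#include <cstdint>

// Buggy variant: the offset byte is added as an unsigned value, so 0xFE
// moves PC forward by 254 instead of back by 2.
std::uint16_t jr_target_unsigned(std::uint16_t pc_after_operand, std::uint8_t raw) {
    return static_cast<std::uint16_t>(pc_after_operand + raw);
}

// Intended variant: reinterpret the byte as a signed 8-bit offset first.
// The conversion of 0x80-0xFF to negative values is guaranteed since C++20
// and is what mainstream compilers do in earlier standards as well.
std::uint16_t jr_target_signed(std::uint16_t pc_after_operand, std::uint8_t raw) {
    const auto offset = static_cast<std::int8_t>(raw);
    return static_cast<std::uint16_t>(pc_after_operand + offset);
}
```

With a base of 0x0152 and an offset byte of 0xFE, the signed variant gives 0x0150 while the unsigned variant gives 0x0250. A jump log that prints both the raw offset and the resulting PC makes this mismatch visible immediately.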
Next Steps
- [ ] Recompile the C++ module with logging enabled: .\rebuild_cpp.ps1
- [ ] Run the emulator and capture the output until the crash
- [ ] Analyze the last lines of the log to identify the problematic instruction
- [ ] Correct the bug in the identified instruction
- [ ] Remove the temporary logging code
- [ ] Recompile and verify that the segfault has been fixed
- [ ] Run tests to ensure nothing else broke