This project is educational and Open Source. No code is copied from other emulators. Implementation based solely on technical documentation and permitted tests.
Integration of the C++ Core into the Frontend
Summary
Completed the integration of the C++ core (CPU, MMU, PPU, Registers) into the Python frontend, replacing the slow Python components with their compiled native versions. The system can now execute machine code directly, potentially reaching speeds of thousands of FPS. The renderer was adapted to use the C++ framebuffer through Zero-Copy (memoryview), which removes the tile calculations in Python and allows a direct blit from the native framebuffer to Pygame.
Hardware Concept
In a real Game Boy, all components (CPU, MMU, PPU) are implemented in hardware and communicate directly through memory buses and control signals. There is no interpretation overhead: each instruction executes in a precise number of clock cycles.
In an emulator, everything is often implemented first in a high-level language (Python, JavaScript, etc.), which is easy to understand but slow because each operation goes through multiple layers of abstraction. The solution is to migrate the "critical loop" (the code that executes millions of times per second) to a compiled language (C++) that runs directly on the CPU without interpreter overhead.
Zero-Copy Integration: The C++ PPU framebuffer is exposed to Python as a memoryview usable from NumPy, which is a direct view of C++ memory without copies. Pygame can read directly from this memory, eliminating the intermediate step of calculating tiles in Python. This is critical for performance: instead of decoding 23,040 pixels in Python every frame, we just blit the framebuffer already rendered in C++.
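To make the zero-copy idea concrete, here is a minimal sketch that uses only the Python standard library. It assumes a stand-in buffer (an `array` of 32-bit values) in place of the real C++ framebuffer; the names `native_framebuffer`, `frame_view`, and `native_write_pixel` are hypothetical and are not part of viboy_core.

```python
from array import array

WIDTH, HEIGHT = 160, 144  # Game Boy screen size: 23,040 pixels

# Stand-in for the native framebuffer (hypothetical; the real one lives in C++).
native_framebuffer = array("I", [0] * (WIDTH * HEIGHT))

# A memoryview exposes the same memory; no pixel is copied here.
frame_view = memoryview(native_framebuffer)


def native_write_pixel(x, y, color):
    """Simulate the native side writing one pixel into its own buffer."""
    native_framebuffer[y * WIDTH + x] = color


native_write_pixel(3, 2, 0xFF00FF00)

# The write is visible through the view without any transfer step.
pixel_seen_by_python = frame_view[2 * WIDTH + 3]
```

The view and the buffer share one block of memory, so the consumer sees every write made by the producer. NumPy's `np.frombuffer` wraps a buffer in the same way, sharing memory instead of copying it.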
Implementation
src/viboy.py was modified to detect and use the viboy_core module when available. The system maintains backward compatibility: if the C++ core is not
compiled, it uses the Python components as a fallback.
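The detection described above could look roughly like the following sketch. It is an assumption about the general pattern, not the actual code in src/viboy.py; `load_native_core` and `select_backend` are hypothetical helper names.

```python
import importlib


def load_native_core(module_name="viboy_core"):
    """Return the compiled module, or None if it is missing or fails to import."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Covers both "never built" and "built but broken" extension modules.
        return None


def select_backend(module_name="viboy_core"):
    """Report which implementation the emulator would use."""
    return "cpp" if load_native_core(module_name) is not None else "python"
```

Catching ImportError in one place keeps the rest of the frontend free of conditional imports: callers only ask which backend was selected.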
Components created/modified
- src/viboy.py: Hybrid integration that detects and uses C++ or Python components depending on availability
- src/gpu/renderer.py: Adapted to use the C++ framebuffer via Zero-Copy when available
- tests/test_integration_cpp.py: Complete integration test that validates the system with the C++ core
Design decisions
Backwards Compatibility: The system detects whether viboy_core is available
and uses the C++ components if possible, but keeps the Python code as a fallback. This lets
the project work even if the build fails or has not been run.
Zero-Copy Framebuffer: The C++ framebuffer is exposed as a NumPy-compatible memoryview,
which works with Pygame through pygame.surfarray. This eliminates unnecessary copies
and lets the 23,040 pixels (160x144) go from C++ to the display surface without per-pixel work in Python.
Timer and Joypad still in Python: For now, the Timer and Joypad remain in Python because they are not critical bottlenecks. The Timer is updated on every instruction, but its logic is simple. The Joypad is only read when there are keyboard events. These components can be migrated later if necessary.
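The split between native and Python components can be pictured as a per-frame loop. The sketch below uses stub classes and hypothetical method names (`step`, `tick`); it is not the real viboy_core interface. The only hardware fact it relies on is that a Game Boy frame lasts 70,224 T-cycles (154 scanlines of 456 cycles each).

```python
CYCLES_PER_FRAME = 154 * 456  # 70,224 T-cycles per frame


class StubCPU:
    """Stand-in for the native CPU: every instruction costs 4 T-cycles."""

    def step(self):
        return 4


class StubPPU:
    """Stand-in for the native PPU: accumulates the cycles it is given."""

    def __init__(self):
        self.cycles = 0

    def step(self, cycles):
        self.cycles += cycles


class StubTimer:
    """Stand-in for the Python Timer: counts how often it is updated."""

    def __init__(self):
        self.updates = 0

    def tick(self, cycles):
        self.updates += 1


def run_frame(cpu, ppu, timer):
    """Run one frame: step the CPU, then feed the same cycles to PPU and Timer."""
    elapsed = 0
    while elapsed < CYCLES_PER_FRAME:
        cycles = cpu.step()
        ppu.step(cycles)
        timer.tick(cycles)  # one Python call per instruction
        elapsed += cycles
    return elapsed
```

With 4-cycle instructions, the Timer is called 17,556 times per frame, which gives a sense of how much Python-side work remains in the loop.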
Affected Files
- src/viboy.py - Hybrid C++/Python integration with auto detection
- src/gpu/renderer.py - Adapted to use the C++ framebuffer with Zero-Copy
- tests/test_integration_cpp.py - Complete integration test (7 tests)
Tests and Verification
A complete integration test was created that validates all aspects of the system with the C++ core:
- Initialization test: Verifies that Viboy is initialized correctly with the C++ components
- ROM loading test: Validates that the ROM loads correctly into the C++ MMU
- CPU run test: Executes 1000 instructions without errors
- PPU synchronization test: Validates that the PPU synchronizes correctly with the CPU
- Framebuffer access test: Verifies that the framebuffer is accessible and has the correct size
- Register access test: Validates reading and writing of the C++ registers
- Full cycle test: Runs 100 full cycles (CPU + PPU) without errors
Result: All tests pass (7/7) in 25.06 seconds.
$ python -m pytest tests/test_integration_cpp.py -v
============================= test session starts =============================
platform win32 -- Python 3.13.5, pytest-9.0.2, pluggy-1.6.0
collecting ... collected 7 items
tests/test_integration_cpp.py::TestIntegrationCPP::test_viboy_initialization_with_cpp_core PASSED
tests/test_integration_cpp.py::TestIntegrationCPP::test_load_rom_into_cpp_mmu PASSED
tests/test_integration_cpp.py::TestIntegrationCPP::test_execute_cpu_instructions PASSED
tests/test_integration_cpp.py::TestIntegrationCPP::test_ppu_synchronization PASSED
tests/test_integration_cpp.py::TestIntegrationCPP::test_framebuffer_access PASSED
tests/test_integration_cpp.py::TestIntegrationCPP::test_registers_access PASSED
tests/test_integration_cpp.py::TestIntegrationCPP::test_full_cycle_execution PASSED
============================= 7 passed in 25.06s =============================
Compiled C++ module validation: The test explicitly verifies that the components are instances of the Cython classes (PyCPU, PyMMU, PyPPU, PyRegisters) and not of the Python classes.
Sources consulted
- Pan Docs: Game Boy Pan Docs - General technical reference
- Cython Documentation - Memoryviews and Zero-Copy
- NumPy Documentation - Memoryviews and arrays
Note: Zero-Copy integration is based on knowledge of Cython and NumPy, not code from other emulators. The hybrid architecture is a design decision specific to the project.
Educational Integrity
What I Understand Now
- Zero-Copy Integration: NumPy-compatible memoryviews allow Python to access C++ memory directly, without copies. This is critical to performance when transferring large amounts of data (such as a 23,040-pixel framebuffer) every frame.
- Hybrid Compatibility: It is possible to keep both systems (Python and C++) in the same codebase, automatically detecting which one to use. This allows incremental development and easier debugging.
- Actual Performance: The C++ core can execute thousands of instructions per second without problems. The bottleneck is now in rendering and FPS synchronization, not in the emulation itself.
What remains to be confirmed
- Production performance: We need to measure real FPS with a full ROM to confirm that we reach a stable 60 FPS. Unit tests do not measure graphical performance.
- Sprites in C++: Sprites are still rendered in Python. We need to migrate sprite rendering to C++ to complete the graphics engine (Step 114).
- Timer in C++: The Timer is still in Python. Although it is not critical, it could benefit from the migration for the sake of consistency.
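For the production performance item above, a measurement could start from something as simple as the sketch below. `measure_fps` is a hypothetical helper, and `run_frame` stands for whatever callable advances the emulator by one frame; neither name comes from the project.

```python
import time


def measure_fps(run_frame, frames=60):
    """Call run_frame() `frames` times and return the average frames per second."""
    start = time.perf_counter()
    for _ in range(frames):
        run_frame()
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very coarse clocks.
    return frames / elapsed if elapsed > 0 else float("inf")
```

Running it once with rendering enabled and once without would help separate emulation cost from rendering cost, which is the open question noted above.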
Hypotheses and Assumptions
Assumption: The C++ ARGB32 framebuffer converts correctly to RGBA for Pygame. The conversion uses vectorized NumPy operations, which should be fast, but we have not measured the real overhead of this conversion.
Assumption: The system works correctly without a C++ Timer because the Timer is only updated once per instruction and its logic is simple. If there are RNG accuracy issues in games like Tetris, we will need to migrate the Timer.
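The ARGB32 to RGBA conversion in the first assumption can be sketched with vectorized NumPy operations as follows. This is an illustration, not the project's renderer code: it assumes each pixel is a 32-bit value laid out as 0xAARRGGBB, and `argb32_to_rgba` is a hypothetical name.

```python
import numpy as np

WIDTH, HEIGHT = 160, 144


def argb32_to_rgba(framebuffer):
    """Convert WIDTH*HEIGHT pixels of 0xAARRGGBB into a (HEIGHT, WIDTH, 4) uint8 array."""
    pixels = np.asarray(framebuffer, dtype=np.uint32).reshape(HEIGHT, WIDTH)
    rgba = np.empty((HEIGHT, WIDTH, 4), dtype=np.uint8)
    # Shifts and masks operate on pixel values, so they do not depend on byte order.
    rgba[..., 0] = ((pixels >> 16) & 0xFF).astype(np.uint8)  # red
    rgba[..., 1] = ((pixels >> 8) & 0xFF).astype(np.uint8)   # green
    rgba[..., 2] = (pixels & 0xFF).astype(np.uint8)          # blue
    rgba[..., 3] = ((pixels >> 24) & 0xFF).astype(np.uint8)  # alpha
    return rgba
```

Note that this step allocates a new array on every call, so it is the one place where the pipeline does copy pixel data; that is the overhead the assumption says has not yet been measured.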
Next Steps
- [ ] Step 114: Implement Sprite Rendering in C++
- [ ] Measure real performance with full ROMs (FPS, CPU usage)
- [ ] Optimize ARGB32 → RGBA conversion if necessary
- [ ] Consider migrating Timer to C++ if there are accuracy issues
- [ ] Implement the Joypad in C++ if necessary to complete the migration