This project is educational and open source. No code is copied from other emulators; the implementation is based solely on technical documentation and permitted tests.
Complete ALU Block (0x80-0xBF)
Summary
The complete ALU (Arithmetic Logic Unit) block in the range 0x80-0xBF was implemented, covering 64 opcodes that include all the main arithmetic and logical operations: ADD, ADC (Add with Carry), SUB, SBC (Subtract with Carry), AND, XOR, OR and CP (Compare). This block is critical because it lets the emulator execute the calculation and comparison logic that games need to work. Generic helpers were implemented for each operation, and the special behavior of the H flag in the AND operation was documented.
Hardware Concept
ALU block (0x80-0xBF)
The ALU block is one of the most organized and predictable blocks in the Game Boy instruction set. It contains 64 opcodes organized in 8 rows of 8 opcodes, where each row corresponds to a different operation and each column to a different operand.
Block structure:
- 0x80-0x87: ADD A, r (Sum)
- 0x88-0x8F: ADC A, r (Sum with Carry)
- 0x90-0x97: SUB A, r (Subtraction)
- 0x98-0x9F: SBC A, r (Subtraction with Borrow)
- 0xA0-0xA7: AND A, r (Logical AND operation)
- 0xA8-0xAF: XOR A, r (Logical XOR operation)
- 0xB0-0xB7: OR A, r (Logical OR operation)
- 0xB8-0xBF: CP A, r (Comparison)
Where r is one of the 8 possible operands:
B (0x80, 0x88, ...), C (0x81, 0x89, ...), D, E, H, L, (HL) (indirect memory), A.
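The row/column layout means an opcode in this block can be split back into its operation and operand with two bit operations. The following is a minimal sketch of that decoding; the names `OPERATIONS`, `OPERANDS` and `decode_alu_opcode` are hypothetical and are not taken from the project.

```python
# Hypothetical tables mirroring the row (operation) / column (operand) layout.
OPERATIONS = ("ADD", "ADC", "SUB", "SBC", "AND", "XOR", "OR", "CP")
OPERANDS = ("B", "C", "D", "E", "H", "L", "(HL)", "A")


def decode_alu_opcode(opcode):
    """Split an opcode in 0x80-0xBF into (operation, operand)."""
    if not 0x80 <= opcode <= 0xBF:
        raise ValueError(f"not an ALU-block opcode: {opcode:#04x}")
    op_idx = (opcode >> 3) & 0x07  # bits 5-3 select the row (operation)
    reg_idx = opcode & 0x07        # bits 2-0 select the column (operand)
    return OPERATIONS[op_idx], OPERANDS[reg_idx]
```

For example, `decode_alu_opcode(0xB3)` yields `("OR", "E")`, which matches the OR A, E opcode mentioned later in this entry.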
Arithmetic Operations with Carry
ADC (Add with Carry) and SBC (Subtract with Carry) are critical operations for multi-byte arithmetic. They allow additions and subtractions to be chained while preserving the carry/borrow of the previous operation, which is essential for working with 16- or 32-bit numbers using 8-bit registers.
Example of 16-bit addition using ADC:
; Add BC + DE and store in HL
LD A, C ; A = C (BC low byte)
ADD A, E ; A = C + E
LD L, A ; L = low byte result
LD A, B ; A = B (BC high byte)
ADC A, D ; A = B + D + Carry (from the low byte)
LD H, A ; H = high byte result
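The same chaining can be modeled in Python to check the arithmetic by hand. This is only an illustrative sketch: `add8` and `add16_via_adc` are hypothetical names, and the functions model the carry alone, not the full flag register.

```python
def add8(a, value, carry_in=0):
    """8-bit add with optional carry-in; returns (result, carry_out)."""
    total = a + value + carry_in
    return total & 0xFF, 1 if total > 0xFF else 0


def add16_via_adc(bc, de):
    """Add two 16-bit values one byte at a time, as ADD then ADC would."""
    low, carry = add8(bc & 0xFF, de & 0xFF)             # ADD A, E
    high, carry = add8(bc >> 8, de >> 8, carry)         # ADC A, D
    return (high << 8) | low, carry                     # (HL, final carry)
```

With `bc = 0x12FF` and `de = 0x0001`, the low-byte addition overflows and its carry is absorbed by the high byte, so the sketch returns `(0x1300, 0)`.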
Logical Operations and Flags
Logical operations (AND, OR, XOR) have specific behaviors with flags:
- AND: Z=calc, N=0, H=1 (Hardware Quirk!), C=0
- OR: Z=calc, N=0, H=0, C=0
- XOR: Z=calc, N=0, H=0, C=0
CRITICAL: The H flag in AND is always set to 1, regardless of the result. This is a special behavior of the actual Game Boy hardware that many emulators fail to implement correctly. It matters for the DAA (Decimal Adjust Accumulator) instruction, which reads the H flag when adjusting A to BCD.
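The flag rules for the three logical operations can be sketched as a single pure function. The function name `logic_op` is hypothetical; the bit positions used for Z, N, H and C (bits 7 to 4 of the F register) follow the usual Game Boy layout, and the project's real helpers may well be structured differently.

```python
# F register bit masks (Z=bit 7, N=bit 6, H=bit 5, C=bit 4).
FLAG_Z, FLAG_N, FLAG_H, FLAG_C = 0x80, 0x40, 0x20, 0x10


def logic_op(op, a, value):
    """Return (result, flags) for AND / OR / XOR on 8-bit operands."""
    if op == "AND":
        result = a & value
        flags = FLAG_H          # quirk: H is always set by AND
    elif op == "OR":
        result = a | value
        flags = 0
    elif op == "XOR":
        result = a ^ value
        flags = 0
    else:
        raise ValueError(f"unknown logical operation: {op}")
    if result == 0:
        flags |= FLAG_Z         # Z is the only computed flag
    return result, flags
```

N and C never appear in the returned flags, which reflects that all three operations clear them.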
CP (Compare) - Unmodified Comparison
CP is fundamentally a subtraction, but with one critical difference: it discards the numerical result and only updates the flags. The A register is NOT modified. It is used for comparisons in code: "A == value?", "A < value?", etc.
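As a rough illustration of how the flags answer those questions, the sketch below computes only the flags of A - value and never returns a new A. `cp_flags` is a hypothetical name, and the half-borrow rule shown (low nibble of A smaller than low nibble of the operand) is the standard one for 8-bit subtraction.

```python
FLAG_Z, FLAG_N, FLAG_H, FLAG_C = 0x80, 0x40, 0x20, 0x10


def cp_flags(a, value):
    """Flags produced by CP: the subtraction result itself is thrown away."""
    flags = FLAG_N                       # CP is a subtraction, so N=1
    if (a - value) & 0xFF == 0:
        flags |= FLAG_Z                  # A == value
    if (a & 0x0F) < (value & 0x0F):
        flags |= FLAG_H                  # borrow from bit 4
    if a < value:
        flags |= FLAG_C                  # A < value (unsigned)
    return flags
```

After a CP, code typically branches on Z for equality and on C for an unsigned "less than" test.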
Implementation
Generic helpers were implemented for each ALU operation, and all 64 opcodes of the block were then generated automatically using a loop that creates handlers dynamically.
Generic Helpers
The following helpers were created in src/cpu/core.py:
- _adc(value): Sum with carry (A = A + value + Carry)
- _sbc(value): Subtraction with borrow (A = A - value - Carry)
- _and(value): Logical AND operation (with the H=1 quirk)
- _or(value): Logical OR operation
- _xor(value): Logical XOR operation
These helpers were added alongside the existing _add, _sub and _cp.
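The arithmetic behind the carry-aware helpers can be sketched with standalone functions. This is not the code in src/cpu/core.py: the real `_adc` and `_sbc` are methods that update A and the flags on the CPU object, whereas the hypothetical `adc` and `sbc` below take the carry as an argument and return `(result, flags)` so the rules are easy to check in isolation.

```python
FLAG_Z, FLAG_N, FLAG_H, FLAG_C = 0x80, 0x40, 0x20, 0x10


def adc(a, value, carry):
    """A + value + carry on 8 bits; returns (result, flags)."""
    total = a + value + carry
    result = total & 0xFF
    flags = 0
    if result == 0:
        flags |= FLAG_Z
    if (a & 0x0F) + (value & 0x0F) + carry > 0x0F:
        flags |= FLAG_H                  # carry out of bit 3
    if total > 0xFF:
        flags |= FLAG_C                  # carry out of bit 7
    return result, flags


def sbc(a, value, carry):
    """A - value - carry on 8 bits; returns (result, flags)."""
    total = a - value - carry
    result = total & 0xFF
    flags = FLAG_N
    if result == 0:
        flags |= FLAG_Z
    if (a & 0x0F) - (value & 0x0F) - carry < 0:
        flags |= FLAG_H                  # borrow from bit 4
    if total < 0:
        flags |= FLAG_C                  # borrow from bit 8
    return result, flags
```

Note that the incoming carry takes part in the half-carry and carry checks as well as in the result.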
Automatic Opcode Generation
The method _init_alu_handlers() was implemented; it automatically generates the 64 opcodes of the block using a nested loop:
- Outer loop: Iterates over the 8 operations (ADD, ADC, SUB, SBC, AND, XOR, OR, CP)
- Inner loop: Iterates over the 8 operands (B, C, D, E, H, L, (HL), A)
- For each combination, it calculates the opcode: 0x80 + (op_idx * 8) + reg_idx
- It creates a handler that obtains the value of the operand and calls the corresponding helper
The handlers correctly handle the special case of (HL), which requires a memory access (2 M-Cycles), versus plain registers (1 M-Cycle).
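The generation loop described above might look roughly like the sketch below. It is written as free functions rather than as the project's `_init_alu_handlers()` method, and `make_handler`, `build_alu_table`, `helpers` and `read_operand` are hypothetical stand-ins (the last two play the roles of the ALU helpers and `_get_register_value`).

```python
OPERATIONS = ("ADD", "ADC", "SUB", "SBC", "AND", "XOR", "OR", "CP")
OPERANDS = ("B", "C", "D", "E", "H", "L", "(HL)", "A")


def make_handler(helper, read_operand, operand):
    """Return a handler bound to one helper and one operand."""
    # Reading (HL) goes through memory, which costs one extra M-cycle.
    cycles = 2 if operand == "(HL)" else 1

    def handler():
        helper(read_operand(operand))
        return cycles

    return handler


def build_alu_table(helpers, read_operand):
    """Map every opcode in 0x80-0xBF to its generated handler."""
    table = {}
    for op_idx, op_name in enumerate(OPERATIONS):
        for reg_idx, operand in enumerate(OPERANDS):
            opcode = 0x80 + (op_idx * 8) + reg_idx
            table[opcode] = make_handler(helpers[op_name], read_operand, operand)
    return table
```

Here `helpers` is assumed to be a dict from operation name to a one-argument callable, and each handler returns the number of M-cycles it consumed. Building each handler inside `make_handler` gives every handler its own copy of `helper`, `operand` and `cycles`.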
Design Decisions
- Closures in Python: Closures were used to correctly capture variables in the dynamically generated handlers, avoiding the common problem where all handlers end up using the final values of the loop.
- Helper reuse: Existing helpers (_get_register_value) were used to obtain operand values consistently.
- Consistent logging: Each handler logs the name of the operation and the operand to facilitate debugging.
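The closure problem mentioned in the first point is easy to reproduce in isolation. The sketch below is a generic Python illustration, not project code: both function names are hypothetical, and the handlers simply return the name they captured.

```python
def make_handlers_buggy(names):
    """Every lambda shares the same loop variable, read when it is called."""
    handlers = []
    for name in names:
        handlers.append(lambda: name)
    return handlers


def make_handlers_fixed(names):
    """A factory function gives each handler its own binding of `name`."""
    def make(name):
        return lambda: name

    return [make(name) for name in names]
```

Called with `["ADD", "OR", "CP"]`, every handler from the buggy version returns `"CP"`, while the fixed version returns `"ADD"`, `"OR"` and `"CP"` in order.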
Affected Files
- src/cpu/core.py - Added generic ALU helpers (_adc, _sbc, _and, _or, _xor) and the _init_alu_handlers() method that generates the 64 opcodes of the 0x80-0xBF block
- tests/test_cpu_alu_full.py - New file with 8 TDD tests validating all ALU operations, including the quirk of the H flag in AND
Tests and Verification
A complete TDD test suite was created with 8 test cases:
- test_and_h_flag: Verify that AND always sets H=1 (hardware quirk)
- test_or_logic: Check basic OR operation (0x00 OR 0x55 = 0x55)
- test_adc_carry: Check ADC with active carry (A=0, value=0, Carry=1 → result=1)
- test_sbc_borrow: Check SBC with active borrow (A=0, value=0, Carry=1 → result=0xFF)
- test_alu_register_mapping: Verify that the register mapping is correct (0xB3 = OR A, E)
- test_xor_logic: Verify basic XOR operation
- test_and_memory_indirect: Check AND with indirect memory (HL)
- test_cp_register: Check CP with a register (A should not be modified)
Result: All 8 tests pass.
Validation with Tetris DX: The emulator can now run the 0xB3 opcode (OR A, E) that Tetris DX requests at address 0x1389, allowing the game to progress further beyond initialization.
Test result:
- Initial PC: 0x0100
- Final PC: 0x12CB (an offset of 0x11CB = 4,555 bytes from the initial PC)
- Cycles executed: 70,077 M-Cycles
- Missing opcode: 0xE6 (AND A, d8 - AND immediate)
The emulator successfully executed thousands of instructions, including all operations of the implemented ALU block. The next opcode needed is 0xE6 (AND A, d8), which is an immediate variant of AND that reads the operand from the next byte of memory.
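Since 0xE6 is not implemented yet, the following is only a sketch of what reading the operand from the next byte could look like. `and_immediate` is a hypothetical function, `memory` is assumed to be an indexable 64 KiB byte array, and `pc` is assumed to point at the byte that follows the opcode.

```python
FLAG_Z, FLAG_H = 0x80, 0x20


def and_immediate(a, pc, memory):
    """Model AND A, d8; returns (new_a, flags, new_pc, m_cycles)."""
    value = memory[pc & 0xFFFF]      # d8: the byte right after the opcode
    pc = (pc + 1) & 0xFFFF           # skip past the operand
    result = a & value
    flags = FLAG_H                   # same H=1 quirk as the register form
    if result == 0:
        flags |= FLAG_Z
    return result, flags, pc, 2      # opcode fetch + operand fetch
```

The flag behavior is the same as for the register variants; only the way the operand is fetched changes.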
Sources consulted
- Pan Docs - CPU Instruction Set: https://gbdev.io/pandocs/CPU_Instruction_Set.html
- Pan Docs - CPU Flags: Description of the behavior of flags in logical and arithmetic operations
- Z80/8080 Architecture Manual: Reference for ADC/SBC behavior and carry flags
Note: The quirk of the H flag in AND is documented in Pan Docs as a special behavior of the Game Boy hardware.
Educational Integrity
What I Understand Now
- Structured ALU block: The 0x80-0xBF block follows a very predictable pattern, which allows it to be implemented systematically with loops.
- ADC/SBC for multi-byte arithmetic: These operations are essential for working with 16- or 32-bit numbers using 8-bit registers, chaining operations and preserving the carry/borrow between them.
- Quirk of the H flag in AND: Game Boy hardware always sets H=1 after an AND operation, regardless of the result. This behavior is important for DAA (Decimal Adjust Accumulator).
- CP as phantom subtraction: CP calculates A - value but only updates the flags; it does not modify A. It is essential for conditional comparisons.
What remains to be confirmed
- Exact timing: Machine cycles (M-Cycles) are implemented according to Pan Docs, but they still need to be validated with test ROMs that measure precise timing.
- Behavior of flags in edge cases: Although the tests cover the basic cases, validation with boundary values (0xFF, 0x00, etc.) across all possible combinations is still missing.
Hypotheses and Assumptions
The ADC/SBC implementation assumes that the Carry flag is interpreted as 1 if it is active and 0 if not, which is standard on Z80/8080 architectures. This is supported by the Pan Docs documentation.
The behavior of the H flag in AND is explicitly documented in Pan Docs as a quirk of the hardware, so it is not an assumption but a documented fact.
Next Steps
- [x] Validate with Tetris DX that the emulator can advance beyond 0x1389 ✅ (it reached 0x12CB)
- [ ] Implement immediate ALU operations (AND A, d8 (0xE6), OR A, d8, XOR A, d8, etc.)
- [ ] Implement rotations and shifts (CB prefix: RLC, RRC, RL, RR, SLA, SRA, SRL, SWAP)
- [ ] Implement bit operations (BIT, RES, SET) of the CB prefix
- [ ] Implement direct rotation instructions (RLCA, RRCA, RLA, RRA)
- [ ] Implement PPU (Graphics Processor) to render the screen