Oct 11, 2024

How Computers Understand 0s and 1s

I built an emulator for an 8-bit CPU, and somewhere along the way, computers just started to make sense.


Overview

I recently started an exciting project: building an NES emulator in C++. An emulator allows a computer to mimic the behavior of other hardware, in my case the Nintendo Entertainment System. It works by loading a copy of an NES game’s memory and executing the program the way a real NES would. This means replicating the behavior of the CPU, the PPU (Picture Processing Unit), and the APU (Audio Processing Unit).

I chose to start with the brain of the system, the CPU. The NES used a variant of the MOS Technology 6502 CPU, a legendary 8-bit microprocessor whose family also powered devices like the Commodore 64, the Atari 5200, and the Apple II computer. Compared to modern chips, the 6502 is very simple, and it’s a great place to start learning about how computers work under the hood.

Instruction Execution

CPUs execute instructions. The “8-bit” part means that data is processed in 8-bit chunks. The 6502’s general-purpose registers (small, high-speed memory areas used to hold temporary values) are 8 bits wide, which means instructions typically manipulate 8-bit numbers. The program counter is the exception: it is 16 bits wide, so it can address 64 KB of memory.
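
To make the register widths concrete, here is a minimal C++ sketch. The `Registers` struct and the `increment` helper are hypothetical names chosen for illustration, and the stack pointer and status register are left out to keep it short.

```cpp
#include <cstdint>

// Hypothetical register file for a 6502-style CPU (names are illustrative).
struct Registers {
    std::uint8_t a = 0;    // accumulator, 8 bits
    std::uint8_t x = 0;    // index register X, 8 bits
    std::uint8_t y = 0;    // index register Y, 8 bits
    std::uint16_t pc = 0;  // program counter, 16 bits
};

// An 8-bit value can only hold 0x00..0xFF, so adding 1 to 0xFF wraps to 0x00.
std::uint8_t increment(std::uint8_t value) {
    return static_cast<std::uint8_t>(value + 1);
}
```

With this helper, `increment(0xFF)` yields `0x00`, which is the kind of wrap-around an 8-bit register exhibits.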

What is an Instruction?

An instruction is a directive telling the CPU to do something, like moving data around, performing a calculation, or jumping to another section of the program. Here’s a snippet of 6502 assembly code to illustrate:

    LDA #$01   ; load the value 1 into register A
    TAX        ; transfer A into X
    INX        ; increment X
    BRK        ; stop

We load a value into register A, transfer it into X, and then increment X. In a higher-level language, the operation might look like:
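
As a rough C++ equivalent, assuming the loaded value is the `#$01` mentioned in this article: the variables `a` and `x` stand in for the registers, and the function name is made up for this sketch.

```cpp
#include <cstdint>

// `a` and `x` are stand-ins for the A and X registers.
std::uint8_t load_transfer_increment() {
    std::uint8_t a = 0x01;  // LDA #$01: load the value 1 into A
    std::uint8_t x = a;     // TAX: copy A into X
    ++x;                    // INX: increment X
    return x;               // X now holds 2
}
```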

Assembly vs. Machine Code

The CPU doesn’t understand assembly directly. Each instruction first has to be translated, or “assembled,” into machine code; going the other way, from machine code back to assembly, is called disassembling. Figuring out how to assemble and disassemble instructions sounds like a great weekend, but it isn’t required for my project just yet.

Executing Machine Code

Looking back at the first line of the assembly snippet, LDA #$01 can be written as the two bytes 0xA9 0x01. Each hex value has a role: 0xA9 is the opcode for LDA with an immediate value, and 0x01 is the operand, the value to load.

To understand how, let’s load the above program into memory, starting at the address 0x0000:
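
One way to picture this in C++ is a 64 KB byte array with the program copied to the start. This is only a sketch: the `Memory` alias and `load_program` function are hypothetical names, and the byte values are the standard 6502 opcodes for the instructions discussed in this article.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// 64 KB of addressable memory, indexed by a 16-bit address.
using Memory = std::array<std::uint8_t, 0x10000>;

// Copies the five program bytes into memory starting at address 0x0000.
Memory load_program() {
    Memory memory{};  // every byte starts at zero
    const std::array<std::uint8_t, 5> program = {
        0xA9, 0x01,  // 0x0000: LDA #$01 (opcode, then operand)
        0xAA,        // 0x0002: TAX
        0xE8,        // 0x0003: INX
        0x00,        // 0x0004: BRK
    };
    std::copy(program.begin(), program.end(), memory.begin());
    return memory;
}
```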

Let’s briefly talk about how the CPU would execute this code.

  • The program counter (PC) starts program execution at the address 0x0000.
  • The CPU sees 0xA9 and recognizes it as the opcode for LDA with an immediate value, a 2-byte instruction that tells the CPU to load the next byte of memory into the A register.
  • The PC now points at that next byte, at address 0x0001. The value at this address, 0x01, gets loaded into the A register, and the PC increments to the next address, 0x0002.
  • The CPU recognizes 0xAA as the opcode for TAX, a 1-byte instruction. It transfers the value from register A to register X and increments the program counter to the next address, 0x0003.
  • At 0x0003 we have another 1-byte instruction, INX (0xE8), which increments the value in register X and moves the PC to 0x0004.
  • At 0x0004 the CPU reads 0x00, the opcode for BRK, and stops execution.
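
The steps above can be sketched as a small fetch-decode-execute loop in C++. This is a simplified illustration rather than a full 6502 core: the `Cpu` struct and `run` function are hypothetical names, only the four opcodes from the walkthrough are handled, status flags are ignored, and BRK is treated as a plain halt.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Cpu {
    std::uint8_t a = 0;    // A register
    std::uint8_t x = 0;    // X register
    std::uint16_t pc = 0;  // program counter
};

// Executes the bytes in `memory` from address 0x0000 until BRK is reached.
Cpu run(const std::vector<std::uint8_t>& memory) {
    Cpu cpu;
    while (true) {
        // Fetch the opcode and advance the PC past it.
        const std::uint8_t opcode = memory.at(cpu.pc++);
        switch (opcode) {
            case 0xA9:  // LDA immediate: load the next byte into A
                cpu.a = memory.at(cpu.pc++);
                break;
            case 0xAA:  // TAX: copy A into X
                cpu.x = cpu.a;
                break;
            case 0xE8:  // INX: increment X (wraps at 0xFF)
                ++cpu.x;
                break;
            case 0x00:  // BRK: treated as "stop" in this sketch
                return cpu;
            default:
                throw std::runtime_error("unhandled opcode");
        }
    }
}
```

Calling `run({0xA9, 0x01, 0xAA, 0xE8, 0x00})` leaves A at 0x01 and X at 0x02. Because the sketch advances the PC as soon as it fetches each opcode, the PC ends at 0x0005, one past the BRK byte.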

How do we know which hex value (opcode) maps to which instruction? The chip manufacturer designs the instruction set and decides which opcode maps to which instruction. Programmers rely on the data sheet, a chip manual provided by the manufacturer, to look up this information.
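
In an emulator, that data sheet usually ends up as a lookup table. Here is a minimal C++ sketch covering only the four opcodes used in this article; `OpcodeInfo` and `lookup` are hypothetical names, and the byte counts are the lengths commonly listed in 6502 opcode tables.

```cpp
#include <cstdint>
#include <string>

struct OpcodeInfo {
    std::string mnemonic;  // human-readable instruction name
    int bytes;             // instruction length, including the opcode byte
};

// Returns the table entry for an opcode, or {"???", 0} if it is not in this sketch.
OpcodeInfo lookup(std::uint8_t opcode) {
    switch (opcode) {
        case 0x00: return OpcodeInfo{"BRK", 1};
        case 0xA9: return OpcodeInfo{"LDA", 2};  // immediate addressing mode
        case 0xAA: return OpcodeInfo{"TAX", 1};
        case 0xE8: return OpcodeInfo{"INX", 1};
        default:   return OpcodeInfo{"???", 0};
    }
}
```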

More than Just Bits and Bytes

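The 6502 CPU has 56 instructions, and every opcode is a single byte between 0x00 and 0xFF. Many instructions come in several addressing modes, so one instruction can have more than one opcode; 0xA9, for example, is only the immediate-value form of LDA. The “aha!” moment came when I realized that, at its core, the CPU is just an interpreter (sort of). In essence, the process of developing the 6502 emulator has been: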

  • Load an opcode into memory
  • Implement the instruction for the given opcode
  • Execute the instruction and test to ensure the registers and memory are updated correctly
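
For a single instruction, that loop might look like the following C++ sketch. The `Cpu` struct, `tax`, and `test_tax` are hypothetical names, and the sketch checks only the register values; the status flags that a real 6502 also updates are left out.

```cpp
#include <cstdint>

struct Cpu {
    std::uint8_t a = 0;  // A register
    std::uint8_t x = 0;  // X register
};

// TAX (opcode 0xAA): copy the value in A into X.
void tax(Cpu& cpu) {
    cpu.x = cpu.a;
}

// Returns true if TAX copies A into X and leaves A unchanged.
bool test_tax() {
    Cpu cpu;
    cpu.a = 0x2A;
    cpu.x = 0x00;
    tax(cpu);
    return cpu.x == 0x2A && cpu.a == 0x2A;
}
```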

Of course, it’s not the whole picture; I’m still deep in the weeds of trying to understand how everything works, but the work I’ve done so far has given me a newfound appreciation for how computers work.

If you’re curious about NES emulation, here are some of the resources I used to get started:

Assembly

C++

Emulation
