Nuts and Bolts of writing 8-bit emulator: Part 6: The Stack and Related Operations

Foreword

In the previous post we covered the compare and branching instructions.

In this post we will cover the stack and related instructions.

Introduction to the stack

A real world example of a stack is a stack of receipts on a spike in a restaurant.

Consider the situation where you have 100 receipts on the spike and you need to get to the very first receipt you pushed onto the spike. This would mean pulling off all 99 receipts above it to get to it. Clearly this system is not very suitable for this kind of operation.

However, if you need to get to the last receipt you pushed onto the spike, the task is much simpler, since the last pushed receipt will always be on the top of the stack.

How is the stack implemented on the 6502? Firstly, the stack is located in page 1 of memory. These are memory locations 100 to 1FF (hexadecimal).

Also, on the 6502 the stack grows downwards. This means from location 1FF towards location 100.

How does the 6502 keep track of where we are on the stack? The 6502 contains a register called the Stack Pointer (SP). This register is also 8 bits in size, so the ninth bit of the stack address (which is always one) is implied.
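As a minimal sketch of how the implied ninth bit works, the helper below forms the full stack address from the 8-bit stack pointer. The name stackAddress is hypothetical and used only for illustration; it is not part of the emulator.

```javascript
// Hypothetical helper, for illustration only: builds the full stack
// address from the 8-bit stack pointer by setting the implied ninth bit.
function stackAddress(sp) {
  return 0x100 | (sp & 0xff);
}

console.log(stackAddress(0xff).toString(16)); // 1ff
console.log(stackAddress(0x00).toString(16)); // 100
```

Whatever value the stack pointer holds, the resulting address always stays inside page 1.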

The most basic operations of the stack are push and pop. A push puts a byte of data on the stack and decrements the stack pointer (remember, the stack grows downwards). A pop increments the stack pointer and retrieves a byte of data from the stack.

Implementing the stack

OK. Let's implement the stack in our emulator, together with its basic operations, push and pop.

First, we need to create a private variable in our Cpu class for the stackpointer:

  var sp = 0xff;

As you can see, we initialise this register with 0xff from the start. This is because the stack starts to grow from memory address 1FF.

Next, let's implement the push operation:

    function Push(value) {
      localMem.writeMem((sp | 0x100), value);
      sp--;
      sp = sp & 0xff;
    }

And, finally the pop:

    function Pop() {
      sp++;
      sp = sp & 0xff;
      var result = localMem.readMem(sp | 0x100);
      return result;
    }

Note that in the pop we increment SP before the memory operation, which is the opposite of what we do in the push. This is because the stack pointer always points to the next free location on the stack, not to the data itself. The most recently pushed byte always sits one address above the location the stack pointer points to.
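To see this asymmetry in action, here is a small self-contained sketch. It uses a plain typed array in place of the emulator's memory class, which is an assumption made purely for illustration; mem, push and pop below are stand-ins and not the emulator's own code.

```javascript
// Stand-in for the emulator's memory: 64 KB of zeroed bytes.
var mem = new Uint8Array(0x10000);
var sp = 0xff;

function push(value) {
  mem[0x100 | sp] = value;  // write to the free slot SP points at
  sp = (sp - 1) & 0xff;     // then move SP down to the next free slot
}

function pop() {
  sp = (sp + 1) & 0xff;     // move SP up to the last written slot
  return mem[0x100 | sp];   // then read it
}

push(0x11);                 // stored at 0x1ff
push(0x22);                 // stored at 0x1fe
var spAfterPushes = sp;     // 0xfd, the next free slot

var first = pop();          // 0x22, the last byte pushed
var second = pop();         // 0x11
```

After the two pops the stack pointer is back at 0xff, and the bytes came off in the reverse order of the pushes.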

Implementing Push and Pop opcodes

Time to implement the Push and Pop opcodes.

Two of these opcodes require special mention: PHP (Push Processor Status on Stack) and PLP (Pull Processor Status from Stack). These two opcodes work with all the status flags as a single byte. Currently this is not how we implement status flags in our emulator. To help us out, I will create two convenience methods that retrieve and set the flags as a single byte.

Firstly, we need to know which flags the different bits in the status byte represent. We can also get this info from the masswerk website:

SR Flags (bit 7 to bit 0):

N .... Negative
V .... Overflow
- .... ignored
B .... Break
D .... Decimal (use BCD for arithmetics)
I .... Interrupt (IRQ disable)
Z .... Zero
C .... Carry

For the moment we will not worry about the Break, Decimal or Interrupt flags. We will implement them as required.

Next, let's create a method for retrieving the status flags as a byte:

    function getStatusFlagsAsByte() {
      var result = (negativeflag << 7) | (overflowflag << 6) | (zeroflag << 1) |
        (carryflag);
      return result;
    }

Here is the method to set the status flags with a byte:

    function setStatusFlagsAsByte(value) {
      negativeflag = (value >> 7) & 1;
      overflowflag = (value >> 6) & 1;
      zeroflag = (value >> 1) & 1;
      carryflag = (value) & 1;
    }
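As a quick sanity check, the two methods can be exercised outside the emulator. In the sketch below the four flag variables are declared as plain globals, which is an assumption made for illustration; in the emulator they are private to the Cpu class.

```javascript
// Stand-ins for the emulator's private flag variables.
var negativeflag = 0, overflowflag = 0, zeroflag = 0, carryflag = 0;

function getStatusFlagsAsByte() {
  return (negativeflag << 7) | (overflowflag << 6) | (zeroflag << 1) | carryflag;
}

function setStatusFlagsAsByte(value) {
  negativeflag = (value >> 7) & 1;
  overflowflag = (value >> 6) & 1;
  zeroflag = (value >> 1) & 1;
  carryflag = value & 1;
}

// Pack: N and C set gives bits 7 and 0.
negativeflag = 1;
carryflag = 1;
var packed = getStatusFlagsAsByte();    // 0x81

// Unpack: 0x42 has bits 6 and 1 set, so V and Z are set, N and C cleared.
setStatusFlagsAsByte(0x42);
var repacked = getStatusFlagsAsByte();  // 0x42

// Bits we do not track yet (Break, Decimal, Interrupt) are dropped.
setStatusFlagsAsByte(0xff);
var dropped = getStatusFlagsAsByte();   // 0xc3
```

The last case shows the consequence of postponing the other flags: a byte with all bits set does not survive the round trip unchanged.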

With this implemented, it is now straightforward to implement the push and pop instructions:

/*PHA  Push Accumulator on Stack

     push A                           N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PHA           48    1     3 */

      case 0x48:
        Push(acc);
      break;


/*PHP  Push Processor Status on Stack

     push SR                          N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PHP           08    1     3 */

      case 0x08:
        Push(getStatusFlagsAsByte());
      break;


/*PLA  Pull Accumulator from Stack

     pull A                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PLA           68    1     4 */

      case 0x68:
        acc = Pop();
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;



/*PLP  Pull Processor Status from Stack

     pull SR                          N Z C I D V
                                      from stack

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PLP           28    1     4 */

      case 0x28:
        setStatusFlagsAsByte(Pop());
      break;

JSR and RTS

The JSR (Jump to Subroutine) is almost the same as the Jump (JMP) instruction. The only difference is that JSR remembers the address it was called from, so that when RTS (Return from Subroutine) is called, it jumps back to this address.

Let's have a look at the definition of JSR:

JSR  Jump to New Location Saving Return Address

     push (PC+2),                     N Z C I D V
     (PC+1) -> PCL                    - - - - - -
     (PC+2) -> PCH

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     absolute      JSR oper      20    3     6

From this we see that we do not store the address of the next instruction, but that of the preceding memory location (i.e. the third byte of the JSR instruction). This is something to keep in mind when executing the RTS instruction.

Something that is not clear from this definition is the order in which the address is pushed onto the stack. Googling this question doesn't give you a straight answer. However, after some digging I found this link:

http://nesdev.com/6502.txt

At the end of this page it gives some snippets of code from the Vice emulator showing how each instruction is implemented. Here is the snippet for JSR:

/* JSR */
    PC--;
    PUSH((PC >> 8) & 0xff); /* Push return address onto the stack. */
    PUSH(PC & 0xff);
    PC = (src);

The first line, PC--, means that we don't use the address of the next instruction, but that of the preceding byte. This is as expected from our earlier discussion.

The next two lines give us the answer to our question: first the high byte is pushed, and then the low byte. Not very little-endian like :-) However, when RTS executes, the pops return the parts of the return address in reverse order, low byte first, which is little-endian!

Let's implement JSR:

/*JSR  Jump to New Location Saving Return Address

     push (PC+2),                     N Z C I D V
     (PC+1) -> PCL                    - - - - - -
     (PC+2) -> PCH

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     absolute      JSR oper      20    3     6 */

      case 0x20:
        var tempVal = pc - 1;
        Push((tempVal >> 8) & 0xff);
        Push(tempVal & 0xff);
        pc = effectiveAdrress;
      break;

Let us also implement RTS:

/*RTS  Return from Subroutine

     pull PC, PC+1 -> PC              N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       RTS           60    1     6 */

      case 0x60:
        var tempVal = Pop();
        tempVal = tempVal + Pop() * 256;
        pc = tempVal + 1;
      break;
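To convince ourselves that JSR and RTS really are mirror images of each other, here is a self-contained sketch of the round trip. The names mem, push, pop, jsr and rts are stand-ins invented for this illustration, and a plain typed array replaces the emulator's memory class. The sketch assumes, as in our emulator, that pc already points past the three bytes of the JSR when the instruction executes.

```javascript
var mem = new Uint8Array(0x10000);  // stand-in for the memory class
var sp = 0xff;
var pc = 0;

function push(value) { mem[0x100 | sp] = value; sp = (sp - 1) & 0xff; }
function pop() { sp = (sp + 1) & 0xff; return mem[0x100 | sp]; }

function jsr(target) {
  var ret = (pc - 1) & 0xffff;      // address of the third byte of the JSR
  push((ret >> 8) & 0xff);          // high byte first
  push(ret & 0xff);                 // then low byte
  pc = target;
}

function rts() {
  var lo = pop();                   // low byte comes off first
  var hi = pop();
  pc = (((hi << 8) | lo) + 1) & 0xffff;
}

// A JSR located at 0x1234 occupies 0x1234..0x1236, so pc is 0x1237.
pc = 0x1237;
jsr(0x2000);
var pcInSubroutine = pc;            // 0x2000
rts();
var pcAfterReturn = pc;             // 0x1237, the instruction after the JSR
```

Note that after the JSR the low byte (0x36) sits at 0x1fe and the high byte (0x12) at 0x1ff. Because the stack grows downwards, pushing the high byte first leaves the return address in memory in little-endian order.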

Testing

As usual, we end the post with a small test program:

0000 LDA #$52   A9 52
0002 PHA        48
0003 LDA #$07   A9 07
0005 JSR $000A  20 0A 00
0008 PLA        68
0009 BRK        00
000A SBC #$06   E9 06
000C RTS        60

Here is the program as a byte array to copy to the memory class:

0xA9, 0x52,
0x48,
0xA9, 0x07,
0x20, 0x0A, 0x00,
0x68,
0x00,
0xE9, 0x06,
0x60
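If you want to know what to expect before running this on the emulator, the following stand-alone sketch steps through the program by hand. It is not the emulator's code: it only decodes the six opcodes used here, treats the 0x00 byte as an end marker, ignores all flags except carry, and assumes the carry flag starts out cleared.

```javascript
var program = [0xA9, 0x52, 0x48, 0xA9, 0x07, 0x20, 0x0A, 0x00,
               0x68, 0x00, 0xE9, 0x06, 0x60];

var mem = new Uint8Array(0x10000);  // stand-in for the memory class
mem.set(program, 0);

var acc = 0, pc = 0, sp = 0xff;
var carryflag = 0;                  // assumption: carry starts cleared

function push(v) { mem[0x100 | sp] = v; sp = (sp - 1) & 0xff; }
function pop() { sp = (sp + 1) & 0xff; return mem[0x100 | sp]; }

var running = true;
while (running) {
  var opcode = mem[pc++];
  switch (opcode) {
    case 0xA9: acc = mem[pc++]; break;   // LDA immediate
    case 0x48: push(acc); break;         // PHA
    case 0x68: acc = pop(); break;       // PLA (flags omitted)
    case 0x20: {                         // JSR absolute
      var target = mem[pc] | (mem[pc + 1] << 8);
      pc += 2;
      var ret = pc - 1;                  // third byte of the JSR
      push((ret >> 8) & 0xff);
      push(ret & 0xff);
      pc = target;
      break;
    }
    case 0xE9: {                         // SBC immediate, binary mode
      var diff = acc - mem[pc++] - (1 - carryflag);
      carryflag = diff >= 0 ? 1 : 0;
      acc = diff & 0xff;
      break;
    }
    case 0x60: {                         // RTS
      var lo = pop();
      pc = (lo | (pop() << 8)) + 1;
      break;
    }
    default: running = false;            // 0x00 ends the run
  }
}

console.log(acc.toString(16), sp.toString(16)); // 52 ff
```

The accumulator ends up holding 0x52 again: the subroutine changed it, but the PLA after the return restores the value saved by the PHA. The stack pointer is back at 0xff, and the return address 0x0007 is still visible in memory at 0x1fd (low byte) and 0x1fe (high byte).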

I also adjusted getDebugReg() in the Cpu class to return the contents of the stack pointer as well. You can get all of this from the ch6 tag on GitHub.

In Summary

In this post we covered the stack and related instructions.

In the next post I will cover the remaining instructions we still need to implement in our emulator. That post will be my final post on implementing 6502 instructions in our emulator.

With the majority of instructions implemented in our emulator, I will then write a post where I give the Klaus Test suite a spin on our emulator and fix the bugs it brings forth.

This is all for this post.

Till next time!


Thursday 5 May 2016
