Nuts and Bolts of writing 8-bit emulator: April 2016

Tuesday 26 April 2016

Part 4: The ADC and SBC instructions

Foreword

In the previous section we extended our index html page to show more state of the emulator.

In this section we will implement the ADC and SBC instructions.

Before we go into implementation details, I will cover signed and unsigned numbers and then two's complement.

Signed and unsigned integers

An 8-bit integer can be either signed or unsigned.

Unsigned integers can only represent positive numbers in the range 0 to 255.

Signed integers can represent both positive and negative numbers in the range -128 to 127. In this type the most significant bit is used to indicate the sign. If it is zero, the number is positive. If it is a one it is negative.

Negative numbers are represented in a form called two's complement, which we will discuss in the next section.

Two's Complement

How do you determine the two's complement of a number? Here is the algorithm:

Take the magnitude of the number and convert it to binary
Invert the binary number. This means changing each one to a zero and each zero to a one
Add one to the result

Let us take an example. Let's say you want to convert -5 to two's complement.

The magnitude of 5 in binary is 0000 0101.

If you invert this number you get 1111 1010, and adding one gives 1111 1011. Read as an unsigned number, this is 251 in decimal.
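The three steps above can be sketched in JavaScript as follows. The function name toTwosComplement is my own choice for illustration and is not part of the emulator source.

```javascript
// Hypothetical helper (not part of the emulator): returns the 8-bit
// two's complement encoding of -magnitude.
function toTwosComplement(magnitude) {
  var inverted = ~magnitude & 0xff;   // step 2: flip every bit, keep 8 bits
  return (inverted + 1) & 0xff;       // step 3: add one, discard any ninth bit
}

console.log(toTwosComplement(5).toString(2)); // 11111011
console.log(toTwosComplement(5));             // 251
```

The mask with 0xff is needed because JavaScript's ~ operator works on 32-bit integers, so without it the inverted value would be negative.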

Two's complement gives you the ability to do subtraction by means of addition.

Let's take an example again. Say you want to calculate 5 - 3. You can rewrite this as 5 + (-3).

-3 can be converted to two's complement, which results in 1111 1101.

We can now add these two numbers together

0000 0101

1111 1101

(1)0000 0010

The end result is a nine digit binary number. This ninth bit is called the carry. The other eight bits have our result 0000 0010 which is 2.

Let's take an example with two negative numbers: -7 - 9. Again, we can rewrite this as (-7) + (-9).

In this example we need to take the two's complement of both 7 and 9. This yields 1111 1001 and 1111 0111. Let's add them together:

1111 1001
1111 0111

(1) 1111 0000

Again, we got a carry. We know the result should be a negative number, so the answer is also in two's complement form. To check that it is correct, we do another two's complement conversion to get the positive equivalent. This yields 0001 0000, which is 16. So we now know that this two's complement number represents -16, which is correct.
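The two additions above can be reproduced with a few lines of JavaScript. This is only a sketch: addBytes and toSigned are names I made up for this illustration, and they do not appear in the emulator source.

```javascript
// Hypothetical helpers, not part of the emulator source.

// Adds two 8-bit values and splits the nine-bit sum into result and carry.
function addBytes(a, b) {
  var sum = a + b;
  return {
    result: sum & 0xff,               // lower eight bits
    carry: (sum & 0x100) ? 1 : 0      // ninth bit
  };
}

// Reads an 8-bit value as a signed two's complement number.
function toSigned(value) {
  return (value & 0x80) ? value - 0x100 : value;
}

var first = addBytes(0x05, 0xfd);     // 5 + (-3)
console.log(toSigned(first.result), first.carry);   // 2 1

var second = addBytes(0xf9, 0xf7);    // (-7) + (-9)
console.log(toSigned(second.result), second.carry); // -16 1
```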

Finally, let's see what happens when a calculation exceeds the signed number range of -128 to 127. For this, we calculate -125 - 4, which is (-125) + (-4):

1000 0011
1111 1100
(1) 0111 1111

The eighth bit of the result is 0, implying a positive number, which is incorrect. The true answer, -129, does not fit in the signed range. This condition is called overflow. The 6502 actually has an overflow flag to indicate such a condition.

Implementing ADC and SBC

Time to implement the ADC and SBC instructions in our emulator.

The trickiest part of implementing the ADC and SBC instructions is determining when to set the overflow flag. As usual, Google is your friend :-)

The website http://www.righto.com/2012/12/the-6502-overflow-flag-explained.html gives you a small formula for determining whether the overflow flag should be set:

(M^result)&(N^result)&0x80

Just to clarify the notation. ^ means XOR and & means AND.

Basically, this formula says: set the overflow flag if the signs of both inputs differ from the sign of the result.
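To get a feel for the formula, here is a small standalone sketch that applies it to the three additions worked out earlier. The function name signedOverflow is an assumption of mine and is not used in the emulator.

```javascript
// Sketch: returns 1 when adding the 8-bit values m and n overflows the
// signed range -128..127, using the formula quoted above.
function signedOverflow(m, n) {
  var result = m + n;
  return ((m ^ result) & (n ^ result) & 0x80) ? 1 : 0;
}

console.log(signedOverflow(0x83, 0xfc)); // (-125) + (-4): 1
console.log(signedOverflow(0xf9, 0xf7)); // (-7) + (-9):   0
console.log(signedOverflow(0x05, 0xfd)); // 5 + (-3):      0
```

Only bit 7 of each XOR matters, so it does not hurt that the unmasked sum may carry a ninth bit.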

With this in mind we can start coding.

First, let's get the definition of ADC from Appendix A:

 
ADC  Add Memory to Accumulator with Carry

     A + M + C -> A, C                N Z C I D V
                                      + + + - - +

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     immediate     ADC #oper     69    2     2
     zeropage      ADC oper      65    2     3
     zeropage,X    ADC oper,X    75    2     4
     absolute      ADC oper      6D    3     4
     absolute,X    ADC oper,X    7D    3     4*
     absolute,Y    ADC oper,Y    79    3     4*
     (indirect,X)  ADC (oper,X)  61    2     6
     (indirect),Y  ADC (oper),Y  71    2     5*

From this definition, we see that we should declare two new flags in our CPU: Carry and Overflow. So, here it goes:

 
    var carryflag =0;
    var overflowflag =0;

Next, a function should be created for performing the addition:

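    function ADC(operand1, operand2) {
      // Nine-bit sum of both operands plus the incoming carry.
      var temp = operand1 + operand2 + carryflag;
      // Bit 8 of the sum is the carry out.
      carryflag = ((temp & 0x100) == 0x100) ? 1 : 0;
      // Overflow: both operands differ in sign from the result.
      overflowflag = (((operand1^temp) & (operand2^temp) & 0x80) == 0x80) ? 1 : 0;
      // Strip the carry so that an 8-bit value is returned.
      temp = temp & 0xff;
      return temp;
    }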

You will see that when we do the addition, we also add the carry flag. This is part of the definition of ADC: A + M + C -> A, C

Before the result is returned, we AND it with 0xff. This ensures that a proper 8-bit value is returned, with the carry bit stripped.

Next, we should implement this instruction within our switch statement:

 
/*ADC  Add Memory to Accumulator with Carry

     A + M + C -> A, C                N Z C I D V
                                      + + + - - +

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     immediate     ADC #oper     69    2     2
     zeropage      ADC oper      65    2     3
     zeropage,X    ADC oper,X    75    2     4
     absolute      ADC oper      6D    3     4
     absolute,X    ADC oper,X    7D    3     4*
     absolute,Y    ADC oper,Y    79    3     4*
     (indirect,X)  ADC (oper,X)  61    2     6
     (indirect),Y  ADC (oper),Y  71    2     5* */

      case 0x69:
        acc = ADC (acc, arg1);
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;
      case 0x65:
      case 0x75:
      case 0x6D:
      case 0x7D:
      case 0x79:
      case 0x61:
      case 0x71:
        acc = ADC (acc, localMem.readMem(effectiveAdrress));
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;

Next, we need to implement the SBC instruction. First we write a method for SBC:

 
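   function SBC(operand1, operand2) {
      // One's complement of the operand.
      operand2 = ~operand2 & 0xff;
      // On the 6502 a set carry means "no borrow", so
      // A - M - (1 - C) is calculated as A + ~M + C.
      var temp = operand1 + operand2 + carryflag;
      carryflag = ((temp & 0x100) == 0x100) ? 1 : 0;
      overflowflag = (((operand1^temp) & (operand2^temp) & 0x80) == 0x80) ? 1 : 0;
      temp = temp & 0xff;
      return temp;
    }

The first line takes the one's complement of operand2. Adding the carry flag then completes the two's complement when the carry is set. On the 6502 the carry flag acts as an inverted borrow during subtraction, so a program normally sets the carry (SEC) before a subtraction.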

Finally, let us create a case statement for SBC:

/*SBC  Subtract Memory from Accumulator with Borrow

     A - M - (1 - C) -> A             N Z C I D V
                                      + + + - - +

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     immediate     SBC #oper     E9    2     2
     zeropage      SBC oper      E5    2     3
     zeropage,X    SBC oper,X    F5    2     4
     absolute      SBC oper      ED    3     4
     absolute,X    SBC oper,X    FD    3     4*
     absolute,Y    SBC oper,Y    F9    3     4*
     (indirect,X)  SBC (oper,X)  E1    2     6
     (indirect),Y  SBC (oper),Y  F1    2     5*  */

      case 0xE9:
        acc = SBC (acc, arg1);
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;
      case 0xE5:
      case 0xF5:
      case 0xED:
      case 0xFD:
      case 0xF9:
      case 0xE1:
      case 0xF1:
        acc = SBC (acc, localMem.readMem(effectiveAdrress));
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;

Increment and Decrement Instructions

Increment and Decrement are closely related to the ADC and SBC instructions. I will therefore try to squeeze a discussion of them into this blog post.

Increment increases the value of an index register or memory location by one.

Decrement decreases the value of an index register or memory location by one.

It is important to note that the Increment and Decrement instructions do not affect the Carry or Overflow flags.
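Because the carry is not involved, an increment of 0xff simply wraps around to 0, and a decrement of 0 wraps around to 0xff. The sketch below shows this wrap-around together with the two flags that are affected. The name stepByte is hypothetical and only serves this illustration.

```javascript
// Sketch with hypothetical names: increments or decrements an 8-bit value
// and reports the Zero and Negative flags. Carry and Overflow are untouched.
function stepByte(value, delta) {
  var result = (value + delta) & 0xff;   // wrap around within 8 bits
  return {
    value: result,
    zero: (result == 0) ? 1 : 0,
    negative: (result & 0x80) ? 1 : 0
  };
}

console.log(stepByte(0xff, 1));  // { value: 0, zero: 1, negative: 0 }
console.log(stepByte(0x00, -1)); // { value: 255, zero: 0, negative: 1 }
```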

Let's implement these instructions.

Firstly, the INC instruction. This instruction increments the value of a specific memory location:

/*INC  Increment Memory by One

     M + 1 -> M                       N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     zeropage      INC oper      E6    2     5
     zeropage,X    INC oper,X    F6    2     6
     absolute      INC oper      EE    3     6
     absolute,X    INC oper,X    FE    3     7 */
 
      case 0xE6:
      case 0xF6:
      case 0xEE:
      case 0xFE:
        var tempVal = localMem.readMem(effectiveAdrress);
        tempVal++; tempVal = tempVal & 0xff;        
        localMem.writeMem(effectiveAdrress, tempVal);
        zeroflag = (tempVal == 0) ? 1 : 0;
        negativeflag = ((tempVal & 0x80) != 0) ? 1 : 0;
      break;

Next up, INX (Increment X register):

/*INX  Increment Index X by One

     X + 1 -> X                       N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       INX           E8    1     2*/
 
      case 0xE8:

        x++; x = x & 0xff;        
        zeroflag = (x == 0) ? 1 : 0;
        negativeflag = ((x & 0x80) != 0) ? 1 : 0;
      break;

And, we do the same instruction, but for the Y register:

/*INY  Increment Index Y by One

     Y + 1 -> Y                       N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       INY           C8    1     2*/
 
      case 0xC8:

        y++; y = y & 0xff;        
        zeroflag = (y == 0) ? 1 : 0;
        negativeflag = ((y & 0x80) != 0) ? 1 : 0;
      break;

Now it is time to do a similar exercise for the Decrement instructions:

First the DEC instruction:

/*DEC  Decrement Memory by One

     M - 1 -> M                       N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     zeropage      DEC oper      C6    2     5
     zeropage,X    DEC oper,X    D6    2     6
     absolute      DEC oper      CE    3     6
     absolute,X    DEC oper,X    DE    3     7 */
 
      case 0xC6:
      case 0xD6:
      case 0xCE:
      case 0xDE:
        var tempVal = localMem.readMem(effectiveAdrress);
        tempVal--; tempVal = tempVal & 0xff;        
        localMem.writeMem(effectiveAdrress, tempVal);
        zeroflag = (tempVal == 0) ? 1 : 0;
        negativeflag = ((tempVal & 0x80) != 0) ? 1 : 0;
      break;

Then, DEX (Decrement X register):

/*DEX  Decrement Index X by One

     X - 1 -> X                       N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       DEX           CA    1     2*/
 
      case 0xCA:

        x--; x = x & 0xff;        
        zeroflag = (x == 0) ? 1 : 0;
        negativeflag = ((x & 0x80) != 0) ? 1 : 0;
      break;

And finally DEY (decrement Y register):

/*DEY  Decrement Index Y by One

     Y - 1 -> Y                       N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       DEY           88    1     2*/
 
      case 0x88:

        y--; y = y & 0xff;        
        zeroflag = (y == 0) ? 1 : 0;
        negativeflag = ((y & 0x80) != 0) ? 1 : 0;
      break;

Testing

Let us end off this post with a small assembly program to test the functionality we have implemented:

LDA #$0a  a9 0a
ADC #$0f  69 0f
LDA #$32  a9 32
ADC #$32  69 32
LDA #$E7  a9 e7
ADC #$CE  69 ce
LDA #$88  a9 88
ADC #$EC  69 ec
LDA #$43  a9 43
ADC #$3C  69 3c

LDA #$0f  a9 0f
SBC #$0a  e9 0a
LDA #$0a  a9 0a
SBC #$0d  e9 0d
LDA #$78  a9 78
SBC #$F9  e9 f9
LDA #$88  a9 88
SBC #$F8  e9 f8
LDX #$FE  a2 fe
INX       e8
INX       e8
DEX       ca
DEX       ca
LDA #$FE  a9 fe
STA $64   85 64
INC $64   e6 64
INC $64   e6 64
DEC $64   c6 64
DEC $64   c6 64

This translates to the following sequence of bytes you need to add to the memory class:

0xa9, 0x0a, 0x69, 0x0f, 0xa9, 0x32, 0x69, 0x32, 0xa9, 0xe7, 0x69, 0xce, 0xa9, 0x88, 0x69, 0xec, 0xa9, 0x43, 0x69, 0x3c, 0xa9, 0x0f, 0xe9, 0x0a, 0xa9, 0x0a, 0xe9, 0x0d, 0xa9, 0x78, 0xe9, 0xf9, 0xa9, 0x88, 0xe9, 0xf8, 0xa2, 0xfe, 0xe8, 0xe8, 0xca, 0xca, 0xa9, 0xfe, 0x85, 0x64, 0xe6, 0x64, 0xe6, 0x64, 0xc6, 0x64, 0xc6, 0x64

While testing this program I noticed that the getDecodedStr() function doesn't return anything for implied mode instructions (like INX). I fixed the code, and the fix will be available in the tag for this blog post.

In Summary

In this blog post I have covered the ADC and SBC instructions, together with the related Increment and Decrement instructions.

In the next post I will be covering comparisons and branching.

As in the previous post, I will provide the source code covered in this post in a tag. The tag for this post is ch4.

Till Next time!

Thursday 21 April 2016

Part 3: Viewing emulator state

Foreword

In the previous section we implemented all address mode variants of the LOAD and STORE instructions.

Up to this point in time our emulator doesn't give much information about its state while executing instructions. We will give some attention to this shortfall in this post.

Something else I was busy with the last week or so, was to put the source code of our JavaScript emulator on GitHub.

After each part in my blog series, I will commit my code to GitHub and tag it. Doing it like this you can immediately access the source code covered in any part of my series.

To get to the source code, please follow this link and click releases:

https://github.com/ovalcode/c64jscript

What we are working towards

As mentioned previously, in this post we will be jacking up our HTML page so that we can have more info on the state of our emulator while it is executing instructions.

So, let us start with a picture showing what we will be working towards:

As you can see, the screen will be divided into three regions. The first region will show the values of all the registers: Accumulator, X, Y and Program Counter.

The second region will show a small dump of memory starting at memory location 0. I have plans for future posts that will allow users to specify a different region in memory, but for now this hardcoding will be sufficient.

The last region will show a disassembled version of the next instruction to be executed.

Finally, we will be adding a button called Step, allowing us to single step through a 6502 assembly program and viewing the state of our emulator. This is in contrast with the earlier posts where we actually hardcoded a couple of calls to step().

Getting HTML layout right

Now that we have something to work from, let us get the HTML together so that a browser will show this layout.

The HTML is fairly straightforward and I present it here:

<!DOCTYPE html>
<html>
  <head>
    <title>6502 Emulator From Scratch</title>
    <script src="Memory.js"></script>
    <script src="Cpu.js"></script>
  </head>
  <body>
    <h1>6502 Emulator From Scratch</h1>
    <p>This is JavaScript Test</p>
<textarea id="registers" name="reg" rows="1" cols="60"></textarea>
<br/>
<textarea id="memory" name="mem" rows="15" cols="60"></textarea>
<br/>
<textarea id="diss" name="diss" rows="11" cols="60"></textarea>
<button>Step</button>
  </body>
</html>

I have highlighted the three regions and the button for clarity.

Let us proceed with a small discussion on this piece of HTML.

First of all, to display each region you will be making use of the HTML element textarea. This element has a couple of important attributes:

id: allows JavaScript code to reach a particular element on an HTML page in order to manipulate it, for example by outputting text
rows: specifies the height of the textarea as a number of text lines
cols: specifies the width of the textarea as a number of characters

Finally, to display our Step button, we make use of the html element button. For now we will not be adding any attributes to our button. In a later section, however, we will be adding the attribute onclick, which will inform the browser which JavaScript Function to invoke when the button is clicked.

Extracting Emulator State Data

Next, we need to consider how to generate the data with which to populate our three textareas.

Register State

Firstly, how do we get a string showing the content of all our registers? We can create a function on our CPU class that will generate this for us. We will call this function getDebugReg. Here is the implementation:

  this.getDebugReg = function ()  {
    var astr = "00" + acc.toString(16); astr = astr.slice(-2);
    var xstr = "00" + x.toString(16); xstr = xstr.slice(-2);
    var ystr = "00" + y.toString(16); ystr = ystr.slice(-2);
    var pcstr = "0000" + pc.toString(16); pcstr = pcstr.slice(-4);
    var result = "";
    result = result + "Acc:" + astr + " X:" + xstr + " Y:" + ystr + " PC:" + pcstr;
    return result;
  }

Let's quickly discuss this code. First, you will see lots of string concatenation throughout the method (the plus operator). This is pretty much expected.

Next, you will also see the frequent use of toString(16). Numbers in JavaScript provide this function, which, as expected, returns the string representation of the number.

When you call the toString method without any parameters, it will return the string representation of the number as a decimal. In our emulator project, however, we are more interested in dealing with numbers as hexadecimal.

Luckily, JavaScript caters for this requirement. The toString method on a number takes an optional parameter called radix. So, if you call toString with 16, you get a hexadecimal number. If you wanted to, you could even pass 2 to this method, and you would get back the number in binary!

You will also notice the use of the slice() method with a negative number, as in slice(-2). slice(-2) means take only the last two characters of the string. We use this technique to ensure our outputted register values are always two characters long (four for the program counter), padded with zeros if necessary. This creates a consistent look.
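The padding trick is easy to try out in isolation. Here is a minimal sketch; the helper name toHex and its width parameter are my own invention for this example and are not part of the emulator.

```javascript
// Hypothetical helper: formats a number as hexadecimal, left-padded with
// zeros to the requested width (up to four digits in this sketch).
function toHex(number, width) {
  var padded = "0000" + number.toString(16);
  return padded.slice(-width);
}

console.log(toHex(10, 2));     // 0a
console.log(toHex(255, 2));    // ff
console.log(toHex(0x400, 4));  // 0400
console.log((5).toString(2));  // 101
```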

Memory State

As discussed earlier we will output the contents of memory starting at location 0.

Here is a snippet of code that will return the contents of memory as a String:

        tempmemstr = ""
        for (i = 0; i < 160; i++) {
          if ((i % 16) == 0) {
            labelstr = "";
            labelstr = labelstr + "0000" + i.toString(16);
            labelstr = labelstr.slice(-4);

            tempmemstr = tempmemstr + "\n" + labelstr;
          }
          currentByte = "00" + mymem.readMem(i).toString(16);        
          currentByte = currentByte.slice(-2);
          tempmemstr = tempmemstr + " " + currentByte;
        }

This snippet will return a string representing the contents of the first 160 bytes in memory.

The if statement in the code checks for every 16th byte. When one is found, a newline character (\n) is appended to the string, followed by a line address.
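If you want to experiment with the dump format outside the browser, the same loop can be wrapped in a function. This is just a sketch under my own assumptions: dumpMemory and its readByte callback are hypothetical names, with the callback standing in for mymem.readMem.

```javascript
// Sketch: builds the same dump as the snippet above, but as a function that
// takes any readMem-style callback (an assumption made so it can run alone).
function dumpMemory(readByte, length) {
  var lines = "";
  for (var i = 0; i < length; i++) {
    if ((i % 16) == 0) {
      // Start a new row with a four-digit address label.
      lines = lines + "\n" + ("0000" + i.toString(16)).slice(-4);
    }
    lines = lines + " " + ("00" + readByte(i).toString(16)).slice(-2);
  }
  return lines;
}

// A plain array plays the role of memory here.
var fakeMemory = [0xa9, 0x0a, 0x69, 0x0f];
console.log(dumpMemory(function (address) { return fakeMemory[address] || 0; }, 32));
```

Note that, just like the original snippet, the returned string starts with a newline character.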

Next Instruction Disassembled

Disassembling machine language is not hard. We can write a switch statement using the address mode as a selector. In each case we will then format the operand in the correct way.

What we are missing at the moment is a lookup table that will give us the applicable mnemonic for a given opcode. We can adjust our CreateSource program mentioned in my previous blog post to create this table for us.

Here is the resulting table:

const opCodeDesc = 
["BRK", "ORA", "", "", "", "ORA", "ASL", "", "PHP", "ORA", "ASL", "", "", "ORA", "ASL", "", 
"BPL", "ORA", "", "", "", "ORA", "ASL", "", "CLC", "ORA", "", "", "", "ORA", "ASL", "", 
"JSR", "AND", "", "", "BIT", "AND", "ROL", "", "PLP", "AND", "ROL", "", "BIT", "AND", "ROL", "", 
"BMI", "AND", "", "", "", "AND", "ROL", "", "SEC", "AND", "", "", "", "AND", "ROL", "", 
"RTI", "EOR", "", "", "", "EOR", "LSR", "", "PHA", "EOR", "LSR", "", "JMP", "EOR", "LSR", "", 
"BVC", "EOR", "", "", "", "EOR", "LSR", "", "CLI", "EOR", "", "", "", "EOR", "LSR", "", 
"RTS", "ADC", "", "", "", "ADC", "ROR", "", "PLA", "ADC", "ROR", "", "JMP", "ADC", "ROR", "", 
"BVS", "ADC", "", "", "", "ADC", "ROR", "", "SEI", "ADC", "", "", "", "ADC", "ROR", "", 
"", "STA", "", "", "STY", "STA", "STX", "", "DEY", "", "TXA", "", "STY", "STA", "STX", "", 
"BCC", "STA", "", "", "STY", "STA", "STX", "", "TYA", "STA", "TXS", "", "", "STA", "", "", 
"LDY", "LDA", "LDX", "", "LDY", "LDA", "LDX", "", "TAY", "LDA", "TAX", "", "LDY", "LDA", "LDX", "", 
"BCS", "LDA", "", "", "LDY", "LDA", "LDX", "", "CLV", "LDA", "TSX", "", "LDY", "LDA", "LDX", "", 
"CPY", "CMP", "", "", "CPY", "CMP", "DEC", "", "INY", "CMP", "DEX", "", "CPY", "CMP", "DEC", "", 
"BNE", "CMP", "", "", "", "CMP", "DEC", "", "CLD", "CMP", "", "", "", "CMP", "DEC", "", 
"CPX", "SBC", "", "", "CPX", "SBC", "INC", "", "INX", "SBC", "NOP", "", "CPX", "SBC", "INC", "", 
"BEQ", "SBC", "", "", "", "SBC", "INC", "", "SED", "SBC", "", "", "", "SBC", "INC", ""];

We now have enough information to start writing our disassemble method. We will call our method getDecodedStr. The logical place to add this method is within the cpu class, where it will have access to the pc register in order to know where to disassemble from.

Here is the resulting function:

  this.getDecodedStr = function () {
    opCode = localMem.readMem (pc);
    mode = addressModes[opCode];
    numArgs = instructionLengths[opCode] - 1;
    if (numArgs > 0) {
      argbyte1 = localMem.readMem (pc + 1);
    }

    if (numArgs > 1) {
      argbyte2 = localMem.readMem (pc + 2);
    }
    
    
    address = 0;
    addrStr = "";
    var result = getAsFourDigit(pc);
    result = result + " " + opCodeDesc[opCode] + " ";
    switch (mode) {
      case ADDRESS_MODE_ACCUMULATOR: return 0; 
      break;

      case ADDRESS_MODE_ABSOLUTE: addrStr = getAsFourDigit(argbyte2 * 256 + argbyte1);
        result = result + "$" + addrStr;
        return result;
      break;

      case ADDRESS_MODE_ABSOLUTE_X_INDEXED: addrStr = getAsFourDigit(argbyte2 * 256 + argbyte1);
        result = result + "$" + addrStr + ",X";
        return result;
      break;

      case ADDRESS_MODE_ABSOLUTE_Y_INDEXED: addrStr = getAsFourDigit(argbyte2 * 256 + argbyte1);
        result = result + "$" + addrStr + ",Y";
        return result;
      break;

      case ADDRESS_MODE_IMMEDIATE: addrStr = getAsTwoDigit(argbyte1);
        result = result + "#$" + addrStr;
        return result; 
      break;

      case ADDRESS_MODE_IMPLIED:
      break;

      case ADDRESS_MODE_INDIRECT:
        tempAddress = (argbyte2 * 256 + argbyte1);
        return (localMem.readMem(tempAddress + 1) * 256 + localMem.readMem(tempAddress));
      break;

      case ADDRESS_MODE_X_INDEXED_INDIRECT:
        addrStr = getAsTwoDigit(argbyte1);
        result = result + "($" + addrStr + ",X)";
        return result;
      break;

      case ADDRESS_MODE_INDIRECT_Y_INDEXED:
        addrStr = getAsTwoDigit(argbyte1);
        result = result + "($" + addrStr + "),Y";
        return result;
      break;

      case ADDRESS_MODE_RELATIVE:
      break;

      case ADDRESS_MODE_ZERO_PAGE:
        addrStr = getAsTwoDigit(argbyte1);
        result = result + "$" + addrStr;
        return result;
      break;

      case ADDRESS_MODE_ZERO_PAGE_X_INDEXED:
        addrStr = getAsTwoDigit(argbyte1);
        result = result + "$" + addrStr + ",X";
        return result;
      break;

      case ADDRESS_MODE_ZERO_PAGE_Y_INDEXED:
        addrStr = getAsTwoDigit(argbyte1);
        result = result + "$" + addrStr + ",Y";
        return result;
      break;

    }
  }

When looking into this function, you will see some similarities to our step function and calcEffectiveAdd function.

You will also see the use of the functions getAsTwoDigit and getAsFourDigit. These are glorified toString-hex functions that return the number as a string, padded with zeros where applicable. Here is the implementation of getAsTwoDigit and getAsFourDigit:

 function getAsTwoDigit(number) {
    var result = "00" + number.toString(16);
    result = result.slice(-2);
    return result;
  }

  function getAsFourDigit(number) {
    var result = "0000" + number.toString(16);
    result = result.slice(-4);
    return result;
  }

Putting it all together

Ok, let's put everything together.

Firstly, let's add a script tag to the end of our HTML as presented at the beginning of this post.

<!DOCTYPE html>
<html>
  <head>
    <title>6502 Emulator From Scratch</title>
    <script src="Memory.js"></script>
    <script src="Cpu.js"></script>
  </head>
  <body>
    <h1>6502 Emulator From Scratch</h1>
    <p>This is JavaScript Test</p>
<textarea id="registers" name="reg" rows="1" cols="60"></textarea>
<br/>
<textarea id="memory" name="mem" rows="15" cols="60"></textarea>
<br/>
<textarea id="diss" name="diss" rows="11" cols="60"></textarea>
<button>Step</button>

    <script language="JavaScript">
      var mymem = new memory();
      var mycpu = new cpu(mymem);
    </script>

  </body>
</html>

When compared to the HTML at the end of Part 1 of this series, you will see that I stripped out the hardcoded calls to step and the alert.

As mentioned previously, the button will invoke the step() function when you click it. So, we need to add an onclick attribute to our button element. Your button element will now look like this:

<button onclick="step()">Step</button>

Please note that this step function is not the step function on the cpu class. We still need to declare this step function in the script tag of our HTML page. One of the tasks of this function, however, will be to call the step function on the cpu class. For starters, our step function will look like this:

    <script language="JavaScript">
      var mymem = new memory();
      var mycpu = new cpu(mymem);
      function step() {
        mycpu.step();
      }

    </script>

This is not the only thing we want our step method to do. It should also populate our textareas with the current state of our emulator.

Let's start with our register textarea. First we should give JavaScript a handle to this element on the page. This is where the id attribute of the textarea element comes in handy:

var t = document.getElementById("registers");

We are now free to access all properties of the registers textarea. The property we are interested in is value. This property allows you to read or set the string contents of the textarea.

Therefore, to populate this textarea with the contents of the cpu registers you will issue:

t.value = mycpu.getDebugReg();

You need to do a similar exercise to populate the textarea showing next instruction to be executed:

        var ins = document.getElementById("diss");
        ins.value = mycpu.getDecodedStr();

Finally, we need to populate the memory dump textarea. We include the memory dump code snippet inline in our script tag:

        var m = document.getElementById("memory");
        tempmemstr = ""
        for (i = 0; i < 160; i++) {
          if ((i % 16) == 0) {
            labelstr = "";
            labelstr = labelstr + "0000" + i.toString(16);
            labelstr = labelstr.slice(-4);
            tempmemstr = tempmemstr + "\n" + labelstr;
          }
          currentByte = "00" + mymem.readMem(i).toString(16);        
          currentByte = currentByte.slice(-2);
          tempmemstr = tempmemstr + " " + currentByte;
        }
        m.value = tempmemstr;
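To show how the fragments fit together, here is one possible shape for the page-level step function. Treat it as a sketch under my own assumptions rather than the code in the tag: makeStep and dumpFirstBytes are hypothetical names, and I pass the document and cpu in as parameters only so that the sketch can be run outside a browser.

```javascript
// Sketch: one possible shape for the page-level step function. The element
// ids match the HTML above; dumpFirstBytes is a hypothetical stand-in for
// the inline memory dump loop.
function makeStep(doc, cpuObj, dumpFirstBytes) {
  return function step() {
    // Execute one instruction, then refresh all three textareas.
    cpuObj.step();
    doc.getElementById("registers").value = cpuObj.getDebugReg();
    doc.getElementById("diss").value = cpuObj.getDecodedStr();
    doc.getElementById("memory").value = dumpFirstBytes();
  };
}

// In the page this could be wired up as follows, assuming a function
// dumpMemoryString that wraps the loop above and returns tempmemstr:
//   var step = makeStep(document, mycpu, dumpMemoryString);
```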

In Summary

In this part we extended the html page of our emulator so that more state is visible. You can find the source code for this Part via this link: https://github.com/ovalcode/c64jscript/releases/tag/ch3

In the next part we will be implementing the Add with Carry and Subtract with Carry instructions in our emulator.

Till next time!

Saturday 16 April 2016

Part 2: 6502 Address modes

Foreword

We ended off the previous section with a very basic JavaScript 6502 emulator with two implemented instructions: Immediate LDA and zeropage STA.

In this section we will be covering the other address modes of the 6502.

There are about a dozen address modes supported on the 6502. Going into detail on each one can be quite overwhelming, so I will try to keep the description of each mode as short as possible. In each section I will also provide the JavaScript implementation code. Two lines of code can say a thousand words!

Once I have covered all the address modes I will implement all the address mode variants of the load and store instructions for our emulator.

I will end off this blog by running a test 6502 assembly program covering a big chunk of the address modes.

6502 Addressing Modes

If you study the address modes of the 6502, you will soon realise that about 80% of these modes have the following goal in common: figuring out an effective address.

This functionality is a very good candidate for moving into a method of its own, which helps to reduce clutter in the switch construct our emulator uses for decoding instructions.

This noble idea has a bit of a snag. In order for this method to do its job, it will need a lookup table to know which address mode is associated with a given opcode. Knowing that this table needs 256 entries to cater for all 8-bit combinations, building it by hand could end up being both tedious and error prone.

There is, however, nothing wrong with a bit of automation 😆. We can write a parser that takes the content of a website detailing the opcode versus address mode info and spits out the tables for us.

I happened to write such a parser last year while busy with my Java emulator. So, let's re-use it!

The source is available on my GitHub site:

https://github.com/ovalcode/c64jjs/blob/master/src/co/za/jjs/CreateSource.java

In order to use this parser you will need to copy and paste the contents of Appendix A of the website http://www.masswerk.at/6502/6502_instruction_set.html into a text file. Running the parser against this text file will output the arrays to standard output. Here is a snippet of what the output will look like:

Printing array Address Modes
5, 7, 0, 0, 0, 10, 10, 0, 5, 4, 0, 0, 0, 1, 1, 0, 
9, 8, 0, 0, 0, 11, 11, 0, 5, 3, 0, 0, 0, 2, 2, 0, 
1, 7, 0, 0, 10, 10, 10, 0, 5, 4, 0, 0, 1, 1, 1, 0, 
...
...
Printing array with Byte Lengths
1, 2, 0, 0, 0, 2, 2, 0, 1, 2, 1, 0, 0, 3, 3, 0, 
2, 2, 0, 0, 0, 2, 2, 0, 1, 3, 0, 0, 0, 3, 3, 0, 
3, 2, 0, 0, 2, 2, 2, 0, 1, 2, 1, 0, 3, 3, 3, 0, 
2, 2, 0, 0, 0, 2, 2, 0, 1, 3, 0, 0, 0, 3, 3, 0, 
....
....
Printing array with Instruction Cycles
7, 6, 0, 0, 0, 3, 5, 0, 3, 2, 2, 0, 0, 4, 6, 0, 
2, 5, 0, 0, 0, 4, 6, 0, 2, 4, 0, 0, 0, 4, 7, 0, 
6, 6, 0, 0, 3, 3, 5, 0, 4, 2, 2, 0, 4, 4, 6, 0, 
2, 5, 0, 0, 0, 4, 6, 0, 2, 4, 0, 0, 0, 4, 7, 0, 
....
....

You will see that, in addition to the Address Mode array, two other arrays are outputted as well. We will be making use of all three arrays soon.

The output of the Address Mode array is a bit confusing. With the array filled with numbers, it is not easy to tell which address mode is which. The address mode mapping can be found in the method getAddMode of the CreateSource class:

 public static int getAddMode (String line) {
  String strippedLine = line.substring(5, 19).trim();
  
  if (strippedLine.equals("accumulator"))
    return 0;
  else if (strippedLine.equals("absolute"))
    return 1;
  else if (strippedLine.equals("absolute,X"))
    return 2;
  else if (strippedLine.equals("absolute,Y"))
    return 3;
  else if (strippedLine.equals("immidiate"))
    return 4;
  else if (strippedLine.equals("implied"))
    return 5;
  else if (strippedLine.equals("indirect"))
    return 6;
  else if (strippedLine.equals("(indirect,X)"))
    return 7;
  else if (strippedLine.equals("(indirect),Y"))
    return 8;
  else if (strippedLine.equals("relative"))
    return 9;
  else if (strippedLine.equals("zeropage"))
    return 10;
  else if (strippedLine.equals("zeropage,X"))
    return 11;
  else if (strippedLine.equals("zeropage,Y"))
    return 12;
  
  return -1;

 }

To make our code more readable let's create constants in our CPU class for these Address modes:

  const ADDRESS_MODE_ACCUMULATOR = 0;
  const ADDRESS_MODE_ABSOLUTE = 1;
  const ADDRESS_MODE_ABSOLUTE_X_INDEXED = 2;
  const ADDRESS_MODE_ABSOLUTE_Y_INDEXED = 3;
  const ADDRESS_MODE_IMMEDIATE = 4;
  const ADDRESS_MODE_IMPLIED = 5;
  const ADDRESS_MODE_INDIRECT = 6;
  const ADDRESS_MODE_X_INDEXED_INDIRECT = 7;
  const ADDRESS_MODE_INDIRECT_Y_INDEXED = 8;
  const ADDRESS_MODE_RELATIVE = 9;
  const ADDRESS_MODE_ZERO_PAGE = 10;
  const ADDRESS_MODE_ZERO_PAGE_X_INDEXED = 11;
  const ADDRESS_MODE_ZERO_PAGE_Y_INDEXED = 12;
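
While checking the generated table by eye, it can help to translate the numbers back into names. The following is only a small sketch: ADDRESS_MODE_NAMES and describeModes are hypothetical helpers that are not part of the emulator, and the name strings are my own labels for the thirteen mode numbers listed above. Note that, judging by the printed output, unassigned opcodes also come out as 0, the same number as accumulator mode.

```javascript
// Hypothetical lookup table: the array index is the mode number
// used in the generated Address Modes table.
var ADDRESS_MODE_NAMES = [
  "accumulator", "absolute", "absolute,X", "absolute,Y", "immediate",
  "implied", "indirect", "(indirect,X)", "(indirect),Y", "relative",
  "zeropage", "zeropage,X", "zeropage,Y"
];

// Translate a row of mode numbers into readable names.
function describeModes(row) {
  return row.map(function (mode) {
    return ADDRESS_MODE_NAMES[mode];
  });
}

// First four entries of the first row printed by the parser (opcodes 00 to 03).
var firstOpcodes = describeModes([5, 7, 0, 0]);
```

Opcode 00 (BRK) comes out as implied and opcode 01 (ORA) as (indirect,X), which matches the masswerk table.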

We can now create an outline for the method that will figure out the effective address:

  function calculateEffevtiveAdd(mode, argbyte1, argbyte2) {
    var tempAddress = 0;
    switch (mode)
    {
    }
  }

Let's pause at the method signature for a moment. The first parameter is mode, so we will need to figure out the address mode outside of the method. argbyte1 and argbyte2 are the remaining bytes of the instruction. If the instruction to be decoded is a two byte instruction (i.e. a one byte opcode and a one byte operand), argbyte2 will be ignored.

We are now ready to discuss the different address modes. As we discuss each one, we will add it to our calculateEffevtiveAdd method.

Accumulator

With this mode, both the source and the destination are the accumulator. A typical example of an instruction using this mode is LSR A (Logical Shift Right).

No address is associated with this mode, so there is no need to add an entry to calculateEffevtiveAdd.

Absolute

An example of an absolute mode instruction is STA $2233. In this example the operand is memory location $2233.

Let us implement this address mode in our calculateEffevtiveAdd method.

Firstly, instructions of this mode are all three bytes long: a one byte opcode and a two byte operand.

The address in the operand is stored in low byte, high byte order. So, to take our example of STA $2233 again: argbyte1 will be $33 and argbyte2 will be $22. The implementation of this mode in our method will look like this:

      case ADDRESS_MODE_ABSOLUTE: return (argbyte2 * 256 + argbyte1);
      break;
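
To make the byte order concrete, here is a tiny stand-alone sketch of the same calculation for STA $2233. The helper name toAddress is made up for this illustration and does not exist in the emulator.

```javascript
// Combine the two operand bytes (low byte first) into a 16-bit address.
function toAddress(lowByte, highByte) {
  return highByte * 256 + lowByte;
}

// Operand bytes of STA $2233 as they appear in memory: $33 followed by $22.
var target = toAddress(0x33, 0x22);
var targetAsHex = target.toString(16); // "2233"
```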

Absolute, X-indexed

An example of an absolute, X-indexed instruction is STA $2233,X. The effective address is $2233 incremented by X.

Suppose X is 9. The operand would then be memory location $223C.

Again, this is a three byte instruction, with the address stored in low byte, high byte order.

      case ADDRESS_MODE_ABSOLUTE_X_INDEXED: tempAddress = (argbyte2 * 256 + argbyte1);
        tempAddress = tempAddress + x;
        return tempAddress;
      break;
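
One detail the snippet above does not handle is what happens when the sum runs past $FFFF. Addresses on the 6502 are 16 bits wide, so the result wraps around to the bottom of memory. The following sketch shows one way this could be handled; absoluteIndexed is a hypothetical helper, not part of the emulator code.

```javascript
// Absolute indexed addressing with the result kept within 16 bits.
function absoluteIndexed(lowByte, highByte, index) {
  var base = highByte * 256 + lowByte;
  return (base + index) & 0xffff;
}

var normalCase = absoluteIndexed(0x33, 0x22, 9);  // $2233 + 9 = $223C
var wrappedCase = absoluteIndexed(0xff, 0xff, 2); // $FFFF + 2 wraps to $0001
```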

Absolute, Y-indexed

This is the same as Absolute, X-indexed. In this case, however, the Y register is used instead of the X register. Implementation of this mode in our method:

      case ADDRESS_MODE_ABSOLUTE_Y_INDEXED: tempAddress = (argbyte2 * 256 + argbyte1);
        tempAddress = tempAddress + y;
        return tempAddress;
      break;

Immediate

An example of this mode of instruction is LDA #$25. In this example the accumulator is loaded with $25.

As you can see, for these instructions the value is provided by the operand and no address resolution is involved. So, nothing to implement for this mode in our method.

Implied

Instructions with address mode Implied are single byte instructions (i.e. only an opcode and no operand). For this mode, too, there is nothing to implement in our method.

Indirect

An example of an instruction using this mode is JMP ($1234). On the 6502, JMP is the only instruction that supports it. In this instance, the effective address is not $1234, but the address stored in locations $1234 and $1235. Again, the address in $1234 and $1235 is stored in low byte, high byte order. Implementation in our code:

       case ADDRESS_MODE_INDIRECT:
       tempAddress = (argbyte2 * 256 + argbyte1);
        return (localMem.ReadMem(tempAddress + 1) * 256 + localMem.ReadMem(tempAddress));
      break;

X-Indexed, indirect

An example of an instruction using this mode is LDA ($20,X). The effective address is stored at zero page location $20 + X. Any carry out of the low byte is ignored during the addition, so the lookup stays within the zero page. Implementation in our code:

      case ADDRESS_MODE_X_INDEXED_INDIRECT:
        tempAddress = (argbyte1 + x) & 0xff;
        return (localMem.ReadMem(tempAddress + 1) * 256 + localMem.ReadMem(tempAddress));
      break;
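
One corner case is worth pointing out. On a real 6502 both bytes of the pointer are read from the zero page, so if the pointer's low byte sits at $FF, the high byte comes from $00 and not from $100. The snippet above does not wrap the + 1. Below is a stand-alone sketch of how the wrap could be handled. It uses a plain Uint8Array named mem as memory, and xIndexedIndirect is a hypothetical helper, so treat it as an illustration rather than as the emulator's code.

```javascript
// (indirect,X): add X to the operand first, then read a two byte pointer
// from the zero page. Both pointer bytes wrap within $00 to $FF.
function xIndexedIndirect(mem, operand, x) {
  var pointer = (operand + x) & 0xff;
  var low = mem[pointer];
  var high = mem[(pointer + 1) & 0xff];
  return high * 256 + low;
}

var mem = new Uint8Array(65536);
mem[0x24] = 0x74; mem[0x25] = 0x20; // pointer at $24/$25 holds $2074
mem[0xff] = 0x34; mem[0x00] = 0x12; // pointer straddling the wrap holds $1234

var ordinaryPointer = xIndexedIndirect(mem, 0x20, 4); // $20 + 4 = $24
var wrappedPointer = xIndexedIndirect(mem, 0xfe, 1);  // $FE + 1 = $FF
```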

Indirect, Y-Indexed

An example of an instruction using this mode is STA ($20),Y. An address is looked up at locations $20 (low byte) and $21 (high byte), and register Y is then added to it.

      case ADDRESS_MODE_INDIRECT_Y_INDEXED:
        tempAddress = localMem.ReadMem(argbyte1 + 1) * 256 + localMem.ReadMem(argbyte1) + y;
        return tempAddress;
      break;
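
The difference from the X-indexed variant is the order of operations: here the pointer is read first and the index is added afterwards. A small worked example, again using a plain Uint8Array as stand-in memory and a hypothetical helper named indirectYIndexed (the masks for wrap-around are my addition, not part of the snippet above):

```javascript
// (indirect),Y: read a two byte pointer from the zero page, then add Y.
function indirectYIndexed(mem, operand, y) {
  var base = mem[(operand + 1) & 0xff] * 256 + mem[operand];
  return (base + y) & 0xffff;
}

var zeroPage = new Uint8Array(65536);
zeroPage[0x20] = 0x00; // low byte of the pointer
zeroPage[0x21] = 0x30; // high byte of the pointer, so $20/$21 holds $3000

var effective = indirectYIndexed(zeroPage, 0x20, 5); // $3000 + 5 = $3005
```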

Zero Page

Same as absolute, but the address is only one byte instead of two.

Relative

We will cover this address mode in a future section.

Zero-Page, X-Indexed

The effective address is the operand plus X. Only the low byte of the sum is kept.

      case ADDRESS_MODE_ZERO_PAGE_X_INDEXED:
        return (argbyte1 + x) & 0xff;
      break;

Zero-Page, Y-Indexed

Same as Zero-Page, X-Indexed, but Y register is used instead.

      case ADDRESS_MODE_ZERO_PAGE_Y_INDEXED:
        return (argbyte1 + y) & 0xff;
      break;
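
The & 0xff in both zero page indexed modes is what keeps the result inside the zero page. A quick sketch with made-up operand values (zeroPageIndexed is a hypothetical helper used only for this illustration):

```javascript
// Zero page indexed addressing: the sum is truncated to one byte.
function zeroPageIndexed(operand, index) {
  return (operand + index) & 0xff;
}

var insidePage = zeroPageIndexed(0x10, 0x05);  // $10 + $05 = $15
var wrappedPage = zeroPageIndexed(0xf0, 0x20); // $F0 + $20 = $110, kept as $10
```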

Putting it all together

After the previous section, our completed calculateEffevtiveAdd method looks like this:

  function calculateEffevtiveAdd(mode, argbyte1, argbyte2) {

    var tempAddress = 0;
    switch (mode)
    {
      case ADDRESS_MODE_ACCUMULATOR: return 0; 
      break;

      case ADDRESS_MODE_ABSOLUTE: return (argbyte2 * 256 + argbyte1);

      break;

      case ADDRESS_MODE_ABSOLUTE_X_INDEXED: tempAddress = (argbyte2 * 256 + argbyte1);
        tempAddress = tempAddress + x;
        return tempAddress;
      break;

      case ADDRESS_MODE_ABSOLUTE_Y_INDEXED: tempAddress = (argbyte2 * 256 + argbyte1);
        tempAddress = tempAddress + y;
        return tempAddress;

      break;

      case ADDRESS_MODE_IMMEDIATE: 

      break;

      case ADDRESS_MODE_IMPLIED:

      break;

      case ADDRESS_MODE_INDIRECT:
        tempAddress = (argbyte2 * 256 + argbyte1);
        return (localMem.ReadMem(tempAddress + 1) * 256 + localMem.ReadMem(tempAddress));
      break;

      case ADDRESS_MODE_X_INDEXED_INDIRECT:
        tempAddress = (argbyte1 + x) & 0xff;
        return (localMem.ReadMem(tempAddress + 1) * 256 + localMem.ReadMem(tempAddress));
      break;

      case ADDRESS_MODE_INDIRECT_Y_INDEXED:
        tempAddress = localMem.ReadMem(argbyte1 + 1) * 256 + localMem.ReadMem(argbyte1) + y;
        return tempAddress;
      break;

      case ADDRESS_MODE_RELATIVE:

      break;

      case ADDRESS_MODE_ZERO_PAGE:
         return argbyte1;

      break;

      case ADDRESS_MODE_ZERO_PAGE_X_INDEXED:
        return (argbyte1 + x) & 0xff;
      break;

      case ADDRESS_MODE_ZERO_PAGE_Y_INDEXED:
        return (argbyte1 + y) & 0xff;
      break;
    }
  }

We will need to write some code in our step() method to invoke this method:

    var iLen = instructionLengths[opcode];
    var arg1 = 0;
    var arg2 = 0;
    var effectiveAdrress = 0;
    if (iLen > 1) {
      arg1 = localMem.readMem(pc);
      pc = pc + 1;
    }    

    if (iLen > 2) {
      arg2 = localMem.readMem(pc);
      pc = pc + 1;
    }
    

    effectiveAdrress = calculateEffevtiveAdd(addressModes[opcode], arg1, arg2);

As you can see, we make use of the tables we created earlier to read just enough bytes for the arguments, depending on the opcode at hand, and adjust the program counter accordingly. This simplifies the switch construct for decoding instructions, since we no longer need to worry about adjusting the program counter in each and every case statement.
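
To see the operand fetching in isolation, here is a stand-alone sketch of the same idea. It is not the emulator's code: fetchOperands is a hypothetical function, memory is a plain Uint8Array, and the instruction length is passed in directly instead of being looked up in instructionLengths.

```javascript
// Read up to two operand bytes, starting at pc (which already points
// past the opcode), and report where the next instruction starts.
function fetchOperands(mem, pc, length) {
  var operands = { arg1: 0, arg2: 0, nextPc: pc };
  if (length > 1) {
    operands.arg1 = mem[operands.nextPc];
    operands.nextPc = operands.nextPc + 1;
  }
  if (length > 2) {
    operands.arg2 = mem[operands.nextPc];
    operands.nextPc = operands.nextPc + 1;
  }
  return operands;
}

// LDA $07D1 (ad d1 07) followed by LDA #$55 (a9 55).
var program = new Uint8Array([0xad, 0xd1, 0x07, 0xa9, 0x55]);
var absoluteOperands = fetchOperands(program, 1, 3);  // three byte instruction
var immediateOperands = fetchOperands(program, 4, 2); // two byte instruction
```

For the three byte instruction both argument bytes are filled in and the program counter moves on by two. For the two byte instruction only arg1 is read and arg2 stays zero.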

Let us have a go now at implementing all address mode variants of the LDA instruction:

/*LDA  Load Accumulator with Memory

     M -> A                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     immediate     LDA #oper     A9    2     2
     zeropage      LDA oper      A5    2     3
     zeropage,X    LDA oper,X    B5    2     4
     absolute      LDA oper      AD    3     4
     absolute,X    LDA oper,X    BD    3     4*
     absolute,Y    LDA oper,Y    B9    3     4*
     (indirect,X)  LDA (oper,X)  A1    2     6
     (indirect),Y  LDA (oper),Y  B1    2     5* */

      case 0xa9:
        acc = arg1;
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;
      case 0xA5:
      case 0xB5:
      case 0xAD:
      case 0xBD:
      case 0xB9:
      case 0xA1:
      case 0xB1:
        acc = localMem.readMem(effectiveAdrress);
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;

For reference, I have included the LDA snippet from Appendix A of the masswerk website as a comment.

Already we can see that our tables are giving tangible benefits. The opcodes A5, B5, AD, BD, B9, A1 and B1 share the same piece of code.

Now, let's have a go at LDX and LDY:

/*LDX  Load Index X with Memory

     M -> X                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     immediate     LDX #oper     A2    2     2
     zeropage      LDX oper      A6    2     3
     zeropage,Y    LDX oper,Y    B6    2     4
     absolute      LDX oper      AE    3     4
     absolute,Y    LDX oper,Y    BE    3     4**/

      case 0xA2:
        x = arg1;
        zeroflag = (x == 0) ? 1 : 0;
        negativeflag = ((x & 0x80) != 0) ? 1 : 0;
      break;

      case 0xA6:
      case 0xB6:
      case 0xAE:
      case 0xBE:
        x = localMem.readMem(effectiveAdrress);
        zeroflag = (x == 0) ? 1 : 0;
        negativeflag = ((x & 0x80) != 0) ? 1 : 0;
      break;


/*LDY  Load Index Y with Memory

     M -> Y                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     immediate     LDY #oper     A0    2     2
     zeropage      LDY oper      A4    2     3
     zeropage,X    LDY oper,X    B4    2     4
     absolute      LDY oper      AC    3     4
     absolute,X    LDY oper,X    BC    3     4* */


      case 0xA0:
        y = arg1;
        zeroflag = (y == 0) ? 1 : 0;
        negativeflag = ((y & 0x80) != 0) ? 1 : 0;
      break;

      case 0xA4:
      case 0xB4:
      case 0xAC:
      case 0xBC:
        y = localMem.readMem(effectiveAdrress);
        zeroflag = (y == 0) ? 1 : 0;
        negativeflag = ((y & 0x80) != 0) ? 1 : 0;
      break;

Now, that was more like a copy and paste exercise!
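
Since the same two flag lines are repeated in every load instruction, one option would be to move them into a small helper. The sketch below is only a suggestion: computeNZ is a hypothetical function that returns the two flag values instead of setting the private variables, so that it can be tried on its own.

```javascript
// Compute the zero and negative flags for an 8-bit value.
function computeNZ(value) {
  return {
    zero: (value === 0) ? 1 : 0,
    negative: ((value & 0x80) !== 0) ? 1 : 0
  };
}

var flagsForZero = computeNZ(0x00); // zero set, negative clear
var flagsForD0 = computeNZ(0xd0);   // bit 7 is set, so negative is set
var flagsFor55 = computeNZ(0x55);   // neither flag set
```

Inside the cpu class the result could then be copied into zeroflag and negativeflag after each load.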

Finally, let us give a shot at STA, STX, STY:

/*STA  Store Accumulator in Memory

     A -> M                           N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     zeropage      STA oper      85    2     3
     zeropage,X    STA oper,X    95    2     4
     absolute      STA oper      8D    3     4
     absolute,X    STA oper,X    9D    3     5
     absolute,Y    STA oper,Y    99    3     5
     (indirect,X)  STA (oper,X)  81    2     6
     (indirect),Y  STA (oper),Y  91    2     6  */

      case 0x85:
      case 0x95:
      case 0x8D:
      case 0x9D:
      case 0x99:
      case 0x81:
      case 0x91:
        localMem.writeMem(effectiveAdrress, acc);
      break;


/*STX  Store Index X in Memory

     X -> M                           N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     zeropage      STX oper      86    2     3
     zeropage,Y    STX oper,Y    96    2     4
     absolute      STX oper      8E    3     4  */

      case 0x86:
      case 0x96:
      case 0x8E:

        localMem.writeMem(effectiveAdrress, x);
      break;

/*STY  Store Index Y in Memory

     Y -> M                           N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     zeropage      STY oper      84    2     3
     zeropage,X    STY oper,X    94    2     4
     absolute      STY oper      8C    3     4  */

      case 0x84:
      case 0x94:
      case 0x8C:
        localMem.writeMem(effectiveAdrress, y);
      break;

Testing it all

We have written quite a bit of code for our emulator, so it's time for some testing. What we are most concerned about is whether we are more or less on track with the address mode implementation, so I wrote the following 6502 assembly program to test a bunch of them:

lda #$d0      ; a9 d0
ldx #$01      ; a2 01
sta $96       ; 85 96
lda #$07      ; a9 07
sta $96,X     ; 95 96
sta $99       ; 85 99
lda #$d1      ; a9 d1
sta $98       ; 85 98
lda #$55      ; a9 55
ldx #$02      ; a2 02
sta ($96,X)   ; 81 96
ldy #$02      ; a0 02
lda #$56      ; a9 56
sta ($96),y   ; 91 96
lda $07d1     ; ad d1 07
sta $07d1,x   ; 9d d1 07

As you can see, I have placed the machine code version of each instruction to the right. Again, we will hardcode our program within the memory class via the mainMem private variable:

  var mainMem = new Uint8Array ([0xa9, 0xd0, 0xa2, 0x01, 0x85, 0x96, 0xa9, 0x07, 0x95, 0x96, 0x85, 0x99, 0xa9, 0xd1, 0x85, 0x98, 0xa9, 0x55, 0xa2, 0x02, 0x81, 0x96, 0xa0, 0x02, 0xa9, 0x56, 0x91, 0x96, 0xad, 0xd1, 0x07, 0x9d, 0xd1, 0x07 ]);

This program consists of 16 instructions, so in index.html we hard code 16 calls to step() in our script block.
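
As an aside, the sixteen calls could also be written as a loop. The sketch below uses a stand-in object named stubCpu so that it runs on its own. In index.html you would pass mycpu instead. runSteps is a hypothetical helper.

```javascript
// Stand-in for the real cpu object, only here so the sketch runs on its own.
var stubCpu = {
  steps: 0,
  step: function () {
    this.steps = this.steps + 1;
  }
};

// Call step() a given number of times.
function runSteps(cpuObject, count) {
  for (var i = 0; i < count; i++) {
    cpuObject.step();
  }
}

runSteps(stubCpu, 16);
```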

After this program has executed, we expect memory location 2001 (hexadecimal 7D1) to contain the hexadecimal value 55, location 2002 the hexadecimal value 56 and location 2003 the hexadecimal value 55. So, we will add some alert statements that output the values of these locations to verify whether our program executed correctly on our emulator. With all these changes, your index.html should look like this:

<!DOCTYPE html>
<html>
  <head>
    <title>6502 Emulator From Scratch</title>
    <script src="Memory.js"></script>
    <script src="Cpu.js"></script>
  </head>
  <body>
    <h1>6502 Emulator From Scratch</h1>
    <p>This is JavaScript Test</p>
    <script language="JavaScript">
      var mymem = new memory();
      var mycpu = new cpu(mymem);
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();
      mycpu.step();

      alert(mymem.readMem(0x7d1));
      alert(mymem.readMem(0x7d2));
      alert(mymem.readMem(0x7d3));

    </script>
  </body>
</html>

Now, let me share my experience when I tried to run this assembly program. I used Google Chrome to run this test.

On my first attempt, only the HTML page opened, with no alert at all. So, it was time for me to do some JavaScript debugging! Luckily, Chrome is equipped with a JavaScript debugger.

To invoke the debugger, right click on the page and select Inspect:

Once the debugger opened, it immediately showed where the issue was:

Clicking on Cpu.js:118 opens up the file at the offending line:

And the error is within the calculateEffevtiveAdd method we wrote. In this case it is a capitalisation issue. In our Memory class we defined readMem with a lowercase r, but we call it with an uppercase R. I have done this in a couple of places within the calculateEffevtiveAdd method.

So, let's go and fix them all and hit refresh (the arrow next to the URL bar) to restart the application.

This time around our alerts do fire, but all three show undefined. So, let's go into the debugger again!

This time no error is reported, so we will need to set a breakpoint on the first alert. Click on the Sources tab and select index.html. If you now click on the line number of the first alert statement, the breakpoint will be set:

Now, hit refresh again to restart the emulator. The debugger will now stop at the breakpoint you have defined. What we are interested in at this point is the contents of the memory array defined within the memory object. At this level we can't really see anything meaningful. Stepping into the readMem function will give us this level of detail.

So, hit the Step-into button in the debugger.
The step-into button is the icon with the arrow pointing to the dot.

Once you step into the memory object, you can hover over any variable name with the mouse cursor, and the debugger will tell you the applicable variable's value.

If we inspect our mainMem object, the debugger conveniently shows the values of its elements for us. To our dismay, our array stops at element 33:

OK, on the one hand this is not much of a surprise. We have, after all, declared the mainMem array and assigned it our program, which is 34 bytes!

But an array in JavaScript is supposed to resize itself when you write to an element outside its current range. Such a resize would have been expected when writing to memory locations 2001, 2002 and 2003.

The answer to this puzzle comes from the fact that Uint8Array is not a true JavaScript array, but a typed array. There are some subtle differences between an ordinary JavaScript array and a typed array, as pointed out by the following website:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays

Firstly, calling Array.isArray() on a typed array will return false. Also, a typed array doesn't support all the methods provided by a normal array. Two of the methods a typed array doesn't support are push and pop.

Push is the method that allows you to append elements to an array. This gives us a clue that typed arrays don't really support resizing!
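
The difference is easy to reproduce in isolation. The following sketch uses made-up variable names and contrasts a typed array with an ordinary array when writing past the end:

```javascript
// A typed array has a fixed length: out of range writes are dropped.
var typedMem = new Uint8Array([0xa9, 0xd0]);
typedMem[2001] = 0x55;
var typedLength = typedMem.length; // still 2
var typedRead = typedMem[2001];    // undefined, just like our alerts

// An ordinary array grows on demand.
var ordinaryMem = [0xa9, 0xd0];
ordinaryMem[2001] = 0x55;
var ordinaryLength = ordinaryMem.length; // 2002

var isRealArray = Array.isArray(typedMem); // false
var hasPush = typeof typedMem.push;        // "undefined"
```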

With these limitations in mind, we will need to rethink how we declare our memory array. We will need to declare our array with the maximum size of 65536 bytes and then call set on the array to store our program:

  var mainMem = new Uint8Array(65536);

  mainMem.set ([0xa9, 0xd0, 0xa2, 0x01, 0x85, 0x96, 0xa9, 0x07, 0x95, 0x96, 0x85, 0x99, 0xa9, 0xd1, 0x85, 0x98, 0xa9, 0x55, 0xa2, 0x02, 0x81, 0x96, 0xa0, 0x02, 0xa9, 0x56, 0x91, 0x96, 0xad, 0xd1, 0x07, 0x9d, 0xd1, 0x07 ]);

When we call set, our array will still have 65536 elements afterwards. The contents of the first 34 bytes will just be replaced by our program.

If we now open up index.html again, the alerts will show 85, 86 and 85 in sequence, which are just the decimal equivalents of 0x55, 0x56 and 0x55.
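
Here is a small stand-alone sketch of this behaviour. The variable name ram is made up for the illustration. The second call also shows the optional offset argument of set, which we don't need in the emulator yet but which would allow a program to be loaded somewhere other than address 0.

```javascript
// 64 KB of zero-filled memory.
var ram = new Uint8Array(65536);

// Copy a few program bytes to the start of memory.
ram.set([0xa9, 0xd0, 0xa2, 0x01]);

// The optional second argument is the offset to start writing at.
ram.set([0x55], 0x7d1);

var ramLength = ram.length; // unchanged by set
```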

Yippeee!!! Our program works

In Summary

In this post we have implemented all address variants of the load and store instructions.

What will be covered next? What comes to mind is that our emulator currently doesn't give much output, apart from a couple of alerts that we need to add each time we execute a program.

So, in the next post I will try to improve on this limitation by implementing the following:

Single stepping: Provide a button on our index page, allowing you to single-step through a program
Showing contents of the cpu registers and memory after each single step

That's it for this post.

Till next time!

Friday 1 April 2016

Part 1: Humble Beginnings

Foreword

In my previous post I gave a brief introduction on what I am planning in this series of Blog posts on writing an emulator from scratch.

In this post, I will start writing initial code for the emulator with only two instructions implemented.

Keeping it this small will help keep the focus on figuring out what initial components are required for the emulator.

Writing a basic 6502 Assembler program and converting it to Machine Code

I think a good starting point would be to write a very simple assembly program and take it from there.

So, here it goes:

LDA #$20
STA $08

Next, let us see what this program does.

The first line, LDA #$20, instructs the CPU to LoaD the Accumulator with the hexadecimal value 20.

The second line, STA $08, instructs the CPU to STore the contents of the Accumulator in memory location 8.

Let us now try to convert this assembly listing to machine code.

There is a helpful resource on the internet that provides you with necessary info for this task:

http://www.masswerk.at/6502/6502_instruction_set.html

Scroll down to the LDA section. For your convenience, I have included it here:

LDA  Load Accumulator with Memory
     M -> A                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     immediate     LDA #oper     A9    2     2
     zeropage      LDA oper      A5    2     3
     zeropage,X    LDA oper,X    B5    2     4
     absolute      LDA oper      AD    3     4
     absolute,X    LDA oper,X    BD    3     4*
     absolute,Y    LDA oper,Y    B9    3     4*
     (indirect,X)  LDA (oper,X)  A1    2     6
     (indirect),Y  LDA (oper),Y  B1    2     5*

Let's briefly discuss the contents of this section. First, M->A explains the basic operation of the instruction: move memory to accumulator.

Next to M->A, an indication is given of which flags can be affected by this instruction. For this particular instruction there is a plus (+) below the N flag and the Z flag, meaning this instruction can potentially set the negative flag or the zero flag. If the number assigned to the accumulator is positive and non-zero, both the negative and zero flags will be cleared.

Next, a list of opcodes for the various addressing modes is provided. Addressing modes are an important part of how the 6502 operates, so in the next part of this series we will dedicate some time to explaining the different addressing modes of the 6502.

For now, we will only identify the address mode applicable to our assembly program. In our case, this is immediate mode, identified by the hash # in the assembly instruction. So the opcode of the first instruction is A9. The first line of our program therefore translates to:

A9 20

Next, let us try to locate the opcode for the second line of our program. Again, let's retrieve the section for STA from the above-mentioned website:

STA  Store Accumulator in Memory
     A -> M                           N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cyles
     --------------------------------------------
     zeropage      STA oper      85    2     3
     zeropage,X    STA oper,X    95    2     4
     absolute      STA oper      8D    3     4
     absolute,X    STA oper,X    9D    3     5
     absolute,Y    STA oper,Y    99    3     5
     (indirect,X)  STA (oper,X)  81    2     6
     (indirect),Y  STA (oper),Y  91    2     6

First, notice that no flags are affected by this instruction.

Now, from the list of opcodes there are two possible addressing modes that can fulfil the second line of our program: zeropage and absolute.

You can use zeropage if the applicable memory location falls within the first 256 locations of memory.

You will need absolute if the applicable memory location is at location 256 or above.

The difference between these two modes is that a zeropage instruction occupies two bytes in memory (the instruction opcode and a one byte memory location), while an absolute instruction takes up three bytes (the instruction opcode and two bytes for the memory location).
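
The choice between the two encodings can be sketched in a few lines of JavaScript. This is just an illustration of the rule described above: encodeSta is a hypothetical helper and not part of the emulator. For the absolute form the two address bytes are stored low byte first, which is how the 6502 expects them.

```javascript
// Hand-assemble STA for a given address, using the shorter zeropage
// form whenever the address fits in one byte.
function encodeSta(address) {
  if (address < 256) {
    return [0x85, address];                    // zeropage: opcode + 1 byte
  }
  return [0x8d, address & 0xff, address >> 8]; // absolute: opcode + low, high
}

var zeroPageForm = encodeSta(0x08);   // STA $08
var absoluteForm = encodeSta(0x2233); // STA $2233
```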

In our case, for simplicity, we will use zeropage addressing. So the instruction STA $08 will translate to the following:

85 08

Our machine language program is thus: A9 20 85 08

We can now start to write our emulator that will execute this snippet.

Designing

We start off with creating two classes for our emulator: CPU and Memory. We will create a JavaScript file for each one: Memory.js and Cpu.js

Let us focus on the memory part first. The contents of the memory we will implement as an array of numbers. To get the ball rolling, we will hardcode our machine language program to the memory array on memory object instantiation.

With this in mind, create the file Memory.js with a text editor and type the following:

function memory()

{
  var mainMem = new Uint8Array ([0xa9, 0x20, 0x85, 0x08, 0, 0, 0, 0, 0, 0]);
}

For the array we use the Uint8Array type. With this type each element is a single unsigned byte. This closely models the size of the storage elements on a 6502 system.

There is an assumption we are making at this point: our emulator will start executing instructions at memory location 0. This is not the case with a real 6502 CPU, but this assumption is good enough to get the ball rolling.

Back to our memory code. As it stands, the mainMem array is declared as a private variable. So, we will expose its contents to the other parts of the system with a getter and a setter. With this implemented, the memory class will look like this:

function memory()

{
  var mainMem = new Uint8Array ([0xa9, 0x20, 0x85, 0x08, 0, 0, 0, 0, 0, 0]);

  this.readMem = function (address) {
    return mainMem[address];
  }

  this.writeMem = function (address, byteval) {
    mainMem[address] = byteval;
  }

}

This is enough of memory class for now.

Next, we should create our cpu class. This will happen in Cpu.js

We start again with the private variables required for our object state. The first private variable that comes to mind for the cpu class is the accumulator register. The 6502 also has two similar registers: the X and Y registers.

We will also need a register that keeps track of where in memory we are at any point during execution. In the 6502, this is called the Program Counter (or PC for short). This register differs a bit from our A, X and Y registers in that it is a 16-bit register instead of an 8-bit register.

At this point our Cpu class will look like this:

function cpu() {
  var acc = 0;
  var x = 0;
  var y = 0;
  var pc = 0;
}

One thing we are still missing, is access for our CPU class to a memory object to retrieve instructions to execute.

The simplest way would be to create another private variable and instantiate there and then. However, there might be other components that might require access to the memory object. The best would therefore be to pass the memory object as a parameter when cpu object is created and assign it to a private variable. The following will do:

function cpu(memory) {
  var localMem = memory;

...

Next, we need to decide how we are going to code the logic for our emulator. For now, we will implement a step function in the CPU class. Each time you call this function, it will execute a single instruction. In this function we will do more or less the following:

Retrieve instruction opcode from memory at location pointed to by register PC
Create a switch statement to decode instruction, with opcode determining execution path
Remember to increment PC register each time you retrieve the next element from memory.

Keeping this in mind, our step function will look like this:

  this.step = function () {
    var opcode = localMem.readMem(pc);
    pc = pc + 1;
    switch (opcode)
    {
      case 0xa9:
        acc = localMem.readMem(pc);
        pc = pc + 1;
      break;

      case 0x85:
        var address = localMem.readMem(pc);
        localMem.writeMem(address, acc);
        pc = pc + 1;
      break;
    }
  }

One remaining thing is to implement the flag handling for the LDA instruction (i.e. opcode A9). We create two private variables for the zero and negative flags and modify the case statement for LDA as follows:

      case 0xa9:
        acc = localMem.readMem(pc);
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        pc = pc + 1;
      break;

I will give a quick explanation for those not familiar with the ? operator in the two assignments. The ?: construct is called the ternary operator. It is preceded by a condition. If the condition evaluates to true, the value before the colon is the result; otherwise the value after the colon is.

The assignment for zeroflag is fairly straightforward. There might be some doubts regarding the negativeflag assignment, so let me quickly explain. The 6502 uses the most significant bit of a register (the leftmost digit) as the negative indicator. If it is zero, the number is positive. If it is a one, the number is negative. When the value of a register changes, the 6502 updates the negative status flag to reflect whether the new number is positive or negative. I will cover how the 6502 works with negative numbers in more detail in a future post.
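
As a small preview, the bit test can be tried on its own. In the sketch below, isNegative and toSigned are hypothetical helpers that are not part of the emulator. toSigned shows the signed value that a byte represents when bit 7 is treated as the sign.

```javascript
// True when bit 7 (the most significant bit) of an 8-bit value is set.
function isNegative(value) {
  return (value & 0x80) !== 0;
}

// Interpret an 8-bit value (0 to 255) as a signed number (-128 to 127).
function toSigned(value) {
  return isNegative(value) ? value - 256 : value;
}

var signedFb = toSigned(0xfb); // bit 7 set: represents -5
var signed20 = toSigned(0x20); // bit 7 clear: stays 32
```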

Putting everything together

We have now written all the pieces of code needed to execute the assembly language program we wrote earlier in the article. It is now time to put everything together.

First, let us view our finished CPU class:

function cpu(memory) {
  var localMem = memory;
  var acc = 0;
  var x = 0;
  var y = 0;
  var pc = 0;
  var zeroflag = 0;
  var negativeflag = 0;

  this.step = function () {
    var opcode = localMem.readMem(pc);
    pc = pc + 1;
    switch (opcode)
    {
      case 0xa9:
        acc = localMem.readMem(pc);
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        pc = pc + 1;
      break;

      case 0x85:
        var address = localMem.readMem(pc);
        localMem.writeMem(address, acc);
        pc = pc + 1;
      break;
    }
  }
}

Next, it is important to remember that a browser can't just execute the JavaScript on its own. It needs an HTML page as a starting point. So here is a suitable HTML page for our JavaScript code:

<!DOCTYPE html>
<html>
  <head>
    <title>6502 Emulator From Scratch</title>
    <script src="Memory.js"></script>
    <script src="Cpu.js"></script>
  </head>
  <body>
    <h1>6502 Emulator From Scratch</h1>
    <p>This is JavaScript Test</p>
    <script language="JavaScript">
      var mymem = new memory();
      var mycpu = new cpu(mymem);
      mycpu.step();
      mycpu.step();
      alert(mymem.readMem(8));
    </script>
  </body>
</html>

Save this html file under the name index.html

As you can see, this html file also has an embedded script snippet. It creates an instance of a memory object and CPU object. On the created CPU object we call the step function twice.

The alert function call pops up a message box showing the contents of memory location 8. This will give us an indication of whether our program executed correctly.

Time to start up our emulator. Save all three files (index.html, Cpu.js and Memory.js) in the same folder and open index.html with a browser. If all goes well, you will see a message box popping up showing "32". This is the decimal equivalent of hexadecimal 20.

That is it for this Post!

In Summary

In this Post we created a very simple 6502 emulator in JavaScript.

In the next post I will be explaining the different address modes of the 6502. We will then end it off by extending our emulator to include all the Load instructions (LDA, LDX, LDY) and all the Store instructions (STA, STX, STY) together with all applicable address mode variants of these instructions.

Take care!