Nuts and Bolts of writing 8-bit emulator: May 2016

Monday, 30 May 2016

Part 11: Fixing the missing cursor

Foreword

In the previous blog I implemented a very simple emulation of the C64 screen.

We came to the point where it showed the welcome message, but no flashing cursor.

In this blog we are going to investigate why the cursor is not shown and fix it.

I will be taking a bit of a detour in this blog showing how to get info for an emulator from schematics and chip Data Sheets.

Although this is not strictly necessary for investigating something as simple as a missing flashing cursor, the technique will be really useful in future blogs.

Looking for a clue

So, we don't see a cursor. Where do we start to investigate this issue?

First prize would be a commented, disassembled version of the ROM code. There is indeed a resource on the internet that provides this:

http://www.ffd2.com/fridge/docs/c64-diss.html

We should now ask the browser to search for the keywords flash or cursor on the commented listing.

When I do the search I get many related hits, such as read/set XY cursor position, move cursor to previous line and so on.

Finally I found what I was looking for:

; normal IRQ interrupt

EA31   20 EA FF   JSR $FFEA   ; do clock
EA34   A5 CC      LDA $CC   ; flash cursor
EA36   D0 29      BNE $EA61
EA38   C6 CD      DEC $CD
EA3A   D0 25      BNE $EA61
EA3C   A9 14      LDA #$14
EA3E   85 CD      STA $CD
EA40   A4 D3      LDY $D3
EA42   46 CF      LSR $CF
EA44   AE 87 02   LDX $0287
EA47   B1 D1      LDA ($D1),Y
EA49   B0 11      BCS $EA5C
EA4B   E6 CF      INC $CF
EA4D   85 CE      STA $CE
EA4F   20 24 EA   JSR $EA24
EA52   B1 F3      LDA ($F3),Y
EA54   8D 87 02   STA $0287
EA57   AE 86 02   LDX $0286
EA5A   A5 CE      LDA $CE
EA5C   49 80      EOR #$80
EA5E   20 1C EA   JSR $EA1C   ; display cursor

I have highlighted important parts in bold. It is clear that the flashing of the cursor is part of the job of an IRQ handler.
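To get a feel for what this routine does, here is a simplified JavaScript model of the flashing logic as I read the listing. It is only a sketch: the function and variable names are made up for illustration, the check of $CC (cursor enabled) is left out, and so is the colour handling around $0287.

```javascript
// Simplified model of the cursor-flash part of the IRQ routine at $EA31.
// Names are illustrative; comments give the zero-page locations they mimic.
function makeCursorBlinker(screen, cursorIndex) {
  var countdown = 20;   // $CD: IRQs left until the next toggle (LDA #$14)
  var phase = 0;        // $CF: 0 = original character shown, 1 = reversed
  var saved = 0;        // $CE: character under the cursor

  return function onIrq() {
    countdown--;                          // DEC $CD
    if (countdown !== 0) return;          // BNE $EA61
    countdown = 20;                       // reload the countdown
    if (phase === 0) {
      saved = screen[cursorIndex];        // remember the original character
      phase = 1;
      screen[cursorIndex] = saved ^ 0x80; // EOR #$80: show it reversed
    } else {
      phase = 0;
      screen[cursorIndex] ^= 0x80;        // flip bit 7 again: back to normal
    }
  };
}

var screen = new Uint8Array(1000);
screen[0] = 0x01;                         // screen code for "A"
var irq = makeCursorBlinker(screen, 0);
for (var i = 0; i < 20; i++) irq();       // 20 interrupts later...
console.log(screen[0].toString(16));      // 81 (the reversed "A")
```

The point to take away is that nothing happens unless the routine is called over and over, which is exactly what an interrupt does.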

Aha! We haven't implemented interrupts yet in our emulator, though we already did something similar with the BRK instruction.

Together with this discovery comes a second question: When should the interrupts fire?

This is perhaps a more difficult question. Answering it requires some knowledge of the hardware peripheral registers on the C64.

Surely, there are hundreds of C64 memory maps available on the Internet that will provide you with this information, but in this blog I am going to show you another interesting way: Using schematics and DataSheets.

Isn't this overkill? Maybe, but in my experience with my Java emulator I found that there is some information that a C64 memory map simply doesn't give you, especially when you want to emulate hardware.

So, let's get snooping...

Making our own memory map

Let us start by finding out what is connected to the IRQ pin of the CPU. There are some schematics at the following link that will provide us with this information:

http://www.zimmers.net/anonftp/pub/cbm/schematics/computers/c64/

Here is a snippet from the schematic:

In this snippet I have highlighted in red what is connected to the IRQ pin of the CPU.

In this case we see that the IRQ pin of a 6526 CIA chip is connected to the IRQ pin of the CPU. What is nice about this schematic is that they also wrote on the CIA block which space in memory it is occupying.

Next we need to dig into a DataSheet for the 6526 CIA chip. Many DataSheet web sites will give you the original Datasheet as a scanned copy in PDF.

This DataSheet is written in paragraph form, so it is not very handy as a quick reference. So I tried to give a summarised version as follows:

0 PRA Peripheral Data Register
1 PRB Peripheral Data Register
2 DDRA Data Direction Register
3 DDRB Data Direction Register
4 Timer A Low
5 Timer A High
6 Timer B Low
7 Timer B High
8 TOD Tenths
9 TOD Seconds
A TOD Minutes
B TOD Hour
C SDR (Serial Data) 
D Interrupt -> IR 00 FLG SP Alarm TB TA
E Control Bit 0 = 1 -> Start Timer A
          Bit 1 = 1 -> Timer A output on PB6
          Bit 2 = 1 -> Toggle, 0 -> Pulse
          Bit 3 = 1 -> One Shot, 0 -> Continuous
          Bit 4 = 1 -> Force Load
          Bit 5 = 1 -> Timer A counts CNT signals
                  0 -> Timer A counts Ø2
          Bit 6 = Serial
          Bit 7 = 1 -> 50Hz, 0 -> 60Hz
F Same as E, except Bit 1 = Timer B output on PB7, and:
  CRB6 CRB5
   0    0   Timer B counts Ø2
   0    1   Timer B counts CNT
   1    0   Timer B counts Timer A underflows
   1    1   Timer B counts Timer A underflows while CNT high
          Bit 7 = 1 -> Writing to TOD registers sets the alarm
                  0 -> Writing to TOD registers sets the TOD

As you can see, the 6526 only has 16 registers. On the CIA block on the diagram, however, 256 memory locations are reserved for CIA 1. This means that in the address range DC00 to DCFF only the lowest four bits of the address are used to select a register, so the 16 registers repeat throughout the range.
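In code, picking the register from an address in this range boils down to masking off everything but the lowest four bits. A minimal sketch (ciaRegisterIndex is a made-up helper name, not something in our emulator yet):

```javascript
// Map any address in DC00-DCFF to one of the 16 CIA registers (0-15).
function ciaRegisterIndex(address) {
  return address & 0x0f;
}

console.log(ciaRegisterIndex(0xdc0d)); // 13 (the interrupt register)
console.log(ciaRegisterIndex(0xdc1d)); // 13 (same register, seen again 16 bytes later)
```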

Another thing also to take note of is that for some registers a read operation and a write operation will each access a different register. More on this to follow.

With all this information at our disposal, what is the next step? A good place to start is to find out which interrupts our emulator attempted to enable. So let's do a memory dump at DC00:

The register to look for is DC0D and its value is 81h. This means an interrupt is only enabled for timer A.
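How do we get from 81h to "timer A only"? The low five bits follow the order in the summary above (TA is bit 0). As I read the data sheet, bit 7 of a value written to this register says whether the selected sources are switched on (1) or off (0). The helper below is just an illustrative sketch of that decoding; its name and the source labels are my own.

```javascript
// Interrupt sources by bit number, lowest bit first.
var SOURCES = ["Timer A", "Timer B", "TOD alarm", "Serial port", "FLAG pin"];

// Decode a value written to the CIA interrupt register (register D).
function decodeInterruptMaskWrite(value) {
  var names = [];
  for (var bit = 0; bit < 5; bit++) {
    if (value & (1 << bit)) names.push(SOURCES[bit]);
  }
  return { enable: (value & 0x80) !== 0, sources: names };
}

var decoded = decodeInterruptMaskWrite(0x81);
console.log(decoded.enable, decoded.sources.join(",")); // true Timer A
```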

So, let's try to get all the settings for timer A. We will start at address DC0E. Bit 0 is set, so we know the ROM code attempted to start the timer.

The next important bit is bit three. We see it is set to 0. This means our timer is running in continuous mode. It will count down to zero, cause an interrupt and then start counting down again from a pre-defined value.

Next, we need to check bit 5. It is set to zero, which means the timer is counting Ø2 (phase 2 clock) pulses. The Ø2 signal runs at the same clock speed as the CPU, which is more or less 1 MHz.
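The bit checks we just did by hand can also be written down as a small decoder. This is only a sketch based on the register summary above; the function name and the field names are invented, and the value 0x01 is just an example input, not necessarily what your memory dump shows.

```javascript
// Decode the bits of the Timer A control register (register E)
// that matter for our investigation.
function decodeTimerAControl(value) {
  return {
    started:      (value & 0x01) !== 0, // bit 0: timer running
    outputOnPB6:  (value & 0x02) !== 0, // bit 1: timer output on PB6
    toggleOutput: (value & 0x04) !== 0, // bit 2: toggle (1) or pulse (0)
    oneShot:      (value & 0x08) !== 0, // bit 3: one shot (1) or continuous (0)
    countsCnt:    (value & 0x20) !== 0  // bit 5: CNT (1) or system clock (0)
  };
}

var settings = decodeTimerAControl(0x01);
console.log(settings.started, settings.oneShot, settings.countsCnt); // true false false
```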

Finally, let's look at the Timer A Low and Timer A High values. If we put these two values together, we get 4025h (16421 in decimal). So what does this value mean? We know our CIA is counting at 1 MHz, so an interrupt is triggered every 16421 clock cycles. Let's convert this to seconds:

16421/1000000 = 0.016421

This corresponds to more or less 60 Interrupts per second.
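The same arithmetic in JavaScript, in case you want to play with other timer values. The helper name is made up, and the 1 MHz clock is the same rounded figure used above.

```javascript
// Combine the two timer bytes into one 16-bit value.
function timerLatch(low, high) {
  return (high << 8) | low;
}

var clockHz = 1000000;                        // rounded CPU clock
var latch = timerLatch(0x25, 0x40);           // Timer A Low, Timer A High
var secondsPerInterrupt = latch / clockHz;
var interruptsPerSecond = clockHz / latch;

console.log(latch);                           // 16421
console.log(secondsPerInterrupt);             // 0.016421
console.log(interruptsPerSecond.toFixed(1));  // 60.9
```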

At last! We know now we should trigger an interrupt 60 times per second. This is very close to our batch interval. So for now it would be sufficient to trigger an interrupt once every time we execute the runBatch method.

There goes another hack! In a later blog post we will eventually come back and replace the hack with some proper code.

Implementing IRQs in our emulator

Firstly we need a way to inform our CPU to perform an interrupt. For this we will create a public method setInterrupt(). This method will merely set a private variable. So firstly, define the private variable:

  var localMem = memory;
  var acc = 0;
  var x = 0;
  var y = 0;
  var pc = 0x400;
  var sp = 0xff;
  var zeroflag = 0;
  var negativeflag = 0;
  var carryflag =0;
  var overflowflag =0; 
  var decimalflag = 0;
  var interruptflag = 1;
  var breakflag = 1;
  var interruptOcurred = 0;

And next, the public method:

    this.setInterrupt = function () {
      interruptOcurred = 1;
    }

Next, in our Cpu step method we should actually check this variable and trigger an interrupt if it is set. This trigger should only happen if interruptflag is 0:

  this.step = function () {
    if ((interruptOcurred == 1) && (interruptflag == 0)) {
      // interrupt handling will go here
    }
    var opcode = localMem.readMem(pc);
    pc = pc + 1;
    var iLen = instructionLengths[opcode];
    var arg1 = 0;

What do we put inside the if statement? We already wrote similar code when we implemented the BRK instruction. Let's refresh our memory:

      case 0x00:
        var tempVal = pc + 1;
        Push(tempVal >> 8);
        Push(tempVal & 0xff);
        Push(getStatusFlagsAsByte());
        interruptflag = 1;
        tempVal = localMem.readMem(0xffff) * 256;
        tempVal = tempVal + localMem.readMem(0xfffe);
        pc = tempVal;
      break;

We can almost copy and paste this source as is. There are just a couple of things we should be aware of:

Don't increment the program counter by one as done in the first line.
Remember to set interruptOcurred back to 0.
Be careful of the breakflag!

The last point warrants a special discussion.

As you remember, we implemented the BRK instruction while we were running the Klaus Test Suite. All the tests ran through fine when we left the break flag as one.

When implementing an IRQ, however, we should be more cautious. When an IRQ occurs, the break flag gets set briefly to 0, just to distinguish it from a BRK interrupt.

The key here is to ensure that the status byte being pushed onto the stack has the break flag set to zero. So, in our case it is sufficient to set the break flag to zero before doing the status flag push. Directly after the push we need to change the break flag back to 1.
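To make the difference concrete, here is a small sketch that packs the flags into a status byte the way the 6502 lays them out (N V 1 B D I Z C from bit 7 down to bit 0). The statusByte function is a hypothetical stand-in for our getStatusFlagsAsByte method, written so it can run on its own.

```javascript
// Pack individual flags (each 0 or 1) into a 6502 status byte.
// Bit 5 is always 1; bit 4 is the break flag.
function statusByte(f, breakBit) {
  return (f.negative << 7) | (f.overflow << 6) | (1 << 5) | (breakBit << 4) |
         (f.decimal << 3) | (f.interrupt << 2) | (f.zero << 1) | f.carry;
}

var flags = { negative: 0, overflow: 0, decimal: 0, interrupt: 0, zero: 1, carry: 1 };

var pushedByBrk = statusByte(flags, 1);
var pushedByIrq = statusByte(flags, 0);

console.log(pushedByBrk.toString(16)); // 33
console.log(pushedByIrq.toString(16)); // 23

// An interrupt handler can test bit 4 of the pushed byte to tell the two apart.
console.log((pushedByIrq & 0x10) === 0); // true
```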

With all this in mind the code will look as follows:

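    if ((interruptOcurred == 1) && (interruptflag == 0)) {
        interruptOcurred = 0;
        // Push the return address as is (no +1 as in BRK).
        Push(pc >> 8);
        Push(pc & 0xff);
        // The pushed status byte must have the break flag cleared.
        breakflag = 0;
        Push(getStatusFlagsAsByte());
        breakflag = 1;
        // Mask further IRQs and jump via the IRQ vector at FFFE/FFFF.
        interruptflag = 1;
        var tempVal = localMem.readMem(0xffff) * 256;
        tempVal = tempVal + localMem.readMem(0xfffe);
        pc = tempVal;
    }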
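If you want to check this sequence outside the emulator, the following stand-alone sketch does the same steps on a plain object and a Uint8Array. Everything here (enterIrq, the cpu object, the example program counter and vector values) is made up for the test and is not part of our Cpu class.

```javascript
// Stand-alone model of IRQ entry: push PC high, PC low, status (break
// bit cleared), set the interrupt flag and load PC from FFFE/FFFF.
function enterIrq(cpu, mem) {
  function push(value) {
    mem[0x100 + cpu.sp] = value & 0xff;
    cpu.sp = (cpu.sp - 1) & 0xff;
  }
  push(cpu.pc >> 8);
  push(cpu.pc & 0xff);
  push(cpu.status & ~0x10);        // break bit (bit 4) cleared in pushed copy
  cpu.status |= 0x04;              // set the interrupt-disable flag (bit 2)
  cpu.pc = mem[0xfffe] | (mem[0xffff] << 8);
}

var mem = new Uint8Array(65536);
mem[0xfffe] = 0x48;                // example vector, low byte
mem[0xffff] = 0xff;                // example vector, high byte
var cpu = { pc: 0xe5cd, sp: 0xff, status: 0x30 };

enterIrq(cpu, mem);
console.log(cpu.pc.toString(16));      // ff48
console.log(mem[0x1fd].toString(16));  // 20 (pushed status, break bit clear)
```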

What remains to be done is to cause an interrupt each time the runBatch method gets executed:

      function runBatch() {
        if (!running)
          return;
        myvideo.updateCanvas();
        var targetCycleCount =  mycpu.getCycleCount() + 20000;
        mycpu.setInterrupt();
        while (mycpu.getCycleCount() < targetCycleCount) { 
          mycpu.step();
          var blankingPeriodLow = targetCycleCount - 100;
          if ((mycpu.getCycleCount() >= blankingPeriodLow) & (mycpu.getCycleCount() <= targetCycleCount)) {
            mymem.writeMem(0xD012, 0);
          } else  {
            mymem.writeMem(0xD012, 1);
          }
          if (mycpu.getPc() == breakpoint) {
            stopEm();
            return;
          }
        }
      }

If we now rerun the emulator, we see a flashing cursor!

In Summary

In this blog we investigated why our emulator was not showing a flashing cursor. This investigation took us on a bit of a detour, finding information from data sheets and schematics.

In the end the solution was to implement interrupts in our emulator and to trigger an interrupt once every time the runBatch method is called.

In the next Blog we will see if we could simulate a key press in our emulator. This will be the first step in adding keyboard support to our JavaScript emulator.

Till next time!

Tuesday, 24 May 2016

Part 10: Emulating the C64 Screen

Foreword

We ended off the previous section by confirming that booting the C64 system did indeed write the welcome message to screen memory.

This confirmation involved lots of manual checking: working through a dump of screen memory and converting each hex number from screen code to ASCII.

In this post we will relieve ourselves from this manual checking by implementing screen rendering. This screen rendering is going to be very basic: Text mode only, monochrome and a hardcoded screen address of 400h.

Some Feedback

Since I wrote my last Blog I got two useful hints from Ed, one of the users at 6502.org.

The first thing Ed pointed out is that if you have Python installed on your pc, it is unnecessary to install a local webserver for development since Python comes bundled with one.

You invoke the bundled Python webserver from the command line as follows: python -m SimpleHTTPServer (on Python 3 the module is called http.server, so the command becomes python -m http.server).

This will kick off a webserver listening on port 8000, serving pages from the directory in which you invoked the command.

Lastly, Ed gave me some pointers on the use of GitHub Pages. With this functionality GitHub will serve content from one of your GitHub repos as web pages. Quite nice if you want to give users the ability to give your emulator a quick spin without installing anything.

I actually gave GitHub pages a try myself. If you have a moment please visit http://ovalcode.github.io/

You will instantly have access to our emulator in a browser with changes covered in this blog and the next planned blog.

Hooking up the Character ROM

When implementing C64 text mode rendering, one sensible question might arise: Where do you get hold of a font closely resembling the C64 characters displayed to the screen? The answer is of course via the character ROM.

The character ROM stores all the characters that the C64 can display as a set of images. Each image is an 8x8 pixel array where each pixel can be either set or transparent. Eight bytes are therefore sufficient to store each character image.

All character images in the character ROM are ordered by screen code. This means bytes 0 to 7 represent the image of screen code 0 (the @ sign), bytes 8 to 15 the image of screen code 1 (an A), bytes 16 to 23 the image of screen code 2 (a B), and so on.

Let's put this theory to the test by verifying the character image of screen code 1. First, let us retrieve the bytes from the character ROM by opening it in a hex editor:

From the hex dump, let's take these values and convert them to binary:

18 = 00011000
3C = 00111100
66 = 01100110
7E = 01111110
66 = 01100110
66 = 01100110
66 = 01100110
00 = 00000000

If you look closely, you can see the ones form an A. Cool!
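If squinting at binary numbers is not your thing, a few lines of JavaScript can draw the image as text. This is just a sketch: glyphRows is a made-up helper, and fakeRom is a tiny stand-in array holding only the eight bytes from the hex dump above at the position of screen code 1.

```javascript
// Turn the eight bytes of one character image into eight strings,
// using "#" for a set pixel and "." for a transparent one.
function glyphRows(charRom, screenCode) {
  var rows = [];
  for (var row = 0; row < 8; row++) {
    var bits = charRom[(screenCode << 3) + row]; // 8 bytes per screen code
    var line = "";
    for (var col = 0; col < 8; col++) {
      line += (bits & 0x80) ? "#" : ".";         // test the leftmost bit
      bits = (bits << 1) & 0xff;                 // move the next bit into place
    }
    rows.push(line);
  }
  return rows;
}

var fakeRom = new Uint8Array(16);
fakeRom.set([0x18, 0x3c, 0x66, 0x7e, 0x66, 0x66, 0x66, 0x00], 8);
var rows = glyphRows(fakeRom, 1);
console.log(rows.join("\n"));
// ...##...
// ..####..
// .##..##.
// .######.
// .##..##.
// .##..##.
// .##..##.
// ........
```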

Next, let's write some code that will give our emulator access to the character ROM. We will task our Memory class with this duty. First let's create a typed array for the contents of the character ROM:

  var mainMem = new Uint8Array(65536);
  var basicRom = new Uint8Array(8192);
  var kernalRom = new Uint8Array(8192);
  var charRom = new Uint8Array(4096);

Notice the character ROM is smaller than the other two ROMs. It is only 4KB.

Next, it is time to do another XMLHttpRequest song and dance for the character ROM:

//------------------------------------------------------------------------

var oReqChar = new XMLHttpRequest();
oReqChar.open("GET", "characters.bin", true);
oReqChar.responseType = "arraybuffer";

oReqChar.onload = function (oEvent) {
  var arrayBuffer = oReqChar.response; // Note: not oReq.responseText
  if (arrayBuffer) {
    charRom = new Uint8Array(arrayBuffer);
    downloadCompleted();
  }
};

oReqChar.send(null);


//------------------------------------------------------------------------

Don't forget to adjust the outstandingDownloads variable accordingly:

  var outstandingDownloads = 3;

We are almost done with the code changes to our Memory class. There is just one issue. Our Memory class in its current state has a flat memory model. So we can't really map it to a sensible area in Memory space for readMem to access.

To overcome this issue we will just, for now, implement a separate public method for accessing character ROM contents:

  this.readCharRom = function (address) {
    return charRom[address];
  }

We are done with changes to our memory class!

A Canvas to Write on

OK, let's start to write code to emulate the C64 screen.

The first question is: how do you output the screen to a web page?

For this, HTML5 provides us with an element called the Canvas. Let's add a canvas element right away to our index.html page:

<body>
    <h1>6502 Emulator From Scratch</h1>
    <p>This is JavaScript Test</p>
<canvas id="screen" width="320" height="200">

</canvas><br/>
<textarea id="registers" name="reg" rows="1" cols="60"></textarea>
<br/>
<textarea id="memory" name="mem" rows="15" cols="60"></textarea>

This will add a canvas just below the titles. The canvas is 320 pixels wide and 200 pixels high. These are the same dimensions as the drawable area of the C64 screen.

Now, how do you draw to this canvas? The w3schools.com website provides us with an example:

var c=document.getElementById("myCanvas");
var ctx=c.getContext("2d");
var imgData=ctx.createImageData(100,100);
for (var i=0;i<imgData.data.length;i+=4)
  {
  imgData.data[i+0]=255;
  imgData.data[i+1]=0;
  imgData.data[i+2]=0;
  imgData.data[i+3]=255;
  }
ctx.putImageData(imgData,10,10);

As you can see, we need to jump through a couple of hoops to get to the raw pixel data: canvas->getContext()->createImageData().data. data is your actual array of pixels. Note that each pixel is represented by four elements in the following order:

Red
Green
Blue
Alpha

The Alpha value controls the amount of transparency of the pixel. Most of the time we will specify a value of 255, meaning the pixel is not transparent at all.

After you have manipulated the pixels, remember to call putImageData so that the canvas is updated on the screen.
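To see how the four elements per pixel translate into array positions, here is a small sketch that works without a browser. It uses a plain Uint8ClampedArray, which is the same array type that the data property of an ImageData object uses. The setPixel helper is my own invention for illustration.

```javascript
// Write one RGBA pixel into a flat pixel array of the given width.
function setPixel(data, width, x, y, r, g, b, a) {
  var i = (y * width + x) * 4;   // four elements per pixel
  data[i + 0] = r;
  data[i + 1] = g;
  data[i + 2] = b;
  data[i + 3] = a;
}

var width = 4;
var height = 2;
var data = new Uint8ClampedArray(width * height * 4);

setPixel(data, width, 1, 1, 255, 0, 0, 255);  // opaque red at (1,1)
console.log(data.indexOf(255));               // 20, i.e. (1 * 4 + 1) * 4
```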

We now have enough to start coding. Our index.html file is getting more clunky by the day, so I will write this in a separate JavaScript file called Video.js. First, we need to remember to add an include in index.html:

  <head>
    <title>6502 Emulator From Scratch</title>
    <script src="Memory.js"></script>
    <script src="Cpu.js"></script>
    <script src="Video.js"></script>
  </head>

Now we create a skeleton for Video.js:

function video(mycanvas, mem) {
  var localMem = mem;
  var ctx = mycanvas.getContext("2d");
  this.updateCanvas = function() {
    var imgData = ctx.createImageData(320, 200);
  }
}

updateCanvas is the method we will invoke to update the canvas.

Now the question is what do we put inside updateCanvas? We can start off with a loop iterating through all the characters of screen memory:

    var currentScreenPos;
    for (currentScreenPos = 0; currentScreenPos < 1000; currentScreenPos++) {
      var screenCode = localMem.readMem(1024 + currentScreenPos);
    }

For each screenCode we can get all its image bytes in the character ROM:

    var currentScreenPos;
    for (currentScreenPos = 0; currentScreenPos < 1000; currentScreenPos++) {
      var screenCode = localMem.readMem(1024 + currentScreenPos);
      var currentRow;
      for (currentRow = 0; currentRow < 8; currentRow++) {
        var currentLine = localMem.readCharRom((screenCode << 3) + currentRow);
      }
    }

Each row of image data we also need to loop through to get hold of the individual pixels:

      var currentRow, currentCol;
      for (currentRow = 0; currentRow < 8; currentRow++) {
        var currentLine = localMem.readCharRom((screenCode << 3) + currentRow);
        for (currentCol = 0; currentCol < 8; currentCol++) {
          var pixelSet = (currentLine & 0x80) == 0x80;
          currentLine = currentLine << 1;
        }
      }

The pixelSet variable will therefore hold the state of each individual pixel. Note that pixelSet is a Boolean, so it is either true (pixel is set) or false (pixel is transparent).

Next, we should figure out where to write the pixel at hand on the canvas. We can write a small algorithm for this in pseudo code:

X-pixel Pos = (Number of characters to the left of screen) * 8 +currentCol
Y-pixel Pos = (Number of characters to the top of screen) * 8 + currentRow

One catch here is that we don't have a two-dimensional screen position (X and Y). We only have a linear address. We could easily determine the two-dimensional position by dividing the linear address by 40. The integer part of the division result is the number of characters to the top, and the remainder (i.e. the modulus) is the number of characters to the left.
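For reference, the division approach looks like this in JavaScript. The function name is made up, and this is only a sketch to illustrate the idea.

```javascript
// Convert a linear screen position (0-999) to a column and row
// on the 40 character wide screen.
function screenPosToXY(pos) {
  return {
    col: pos % 40,              // remainder: characters to the left
    row: Math.floor(pos / 40)   // integer part: characters to the top
  };
}

var p = screenPosToXY(41);
console.log(p.col, p.row);      // 1 1
var last = screenPosToXY(999);
console.log(last.col, last.row); // 39 24
```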

This would cause quite a number of multiplications and divisions per second. You might pay a performance penalty, so a better option is to keep track of the two-dimensional position in two separate variables:

    var currentScreenPos;
    var currentScreenX = 0;
    var currentScreenY = 0;    
    for (currentScreenPos = 0; currentScreenPos < 1000; currentScreenPos++) {
      var screenCode = localMem.readMem(1024 + currentScreenPos);
      if (currentScreenX == 40) {
        currentScreenX = 0;
        currentScreenY++;
      }     
    ...
      currentScreenX++;
    }

As you can see, we have adjusted our main loop a bit. We introduced two new variables, currentScreenX and currentScreenY. At the end of each loop iteration we increment currentScreenX. Remember our screen is 40 characters wide, so the range for currentScreenX is from 0 to 39. So once currentScreenX hits 40 it's time to reset currentScreenX and increment currentScreenY by one. Let's use these variables to calculate the pixel position:

      for (currentRow = 0; currentRow < 8; currentRow++) {
        var currentLine = localMem.readCharRom((screenCode << 3) + currentRow);
        for (currentCol = 0; currentCol < 8; currentCol++) {
          var pixelSet = (currentLine & 0x80) == 0x80;
          var pixelPosX = (currentScreenX << 3) + currentCol;
          var pixelPosY = (currentScreenY << 3) + currentRow;          
          currentLine = currentLine << 1;
        }
      }

We are almost finished coding. I am going to show you the remaining code and then I am going to explain it:

      for (currentRow = 0; currentRow < 8; currentRow++) {
        var currentLine = localMem.readCharRom((screenCode << 3) + currentRow);
        for (currentCol = 0; currentCol < 8; currentCol++) {
          var pixelSet = (currentLine & 0x80) == 0x80;
          var pixelPosX = (currentScreenX << 3) + currentCol;
          var pixelPosY = (currentScreenY << 3) + currentRow;    
          var posInBuffer = (pixelPosY * 320 + pixelPosX) << 2;
          if (pixelSet) {
            imgData.data[posInBuffer + 0] = 0;
            imgData.data[posInBuffer + 1] = 0;
            imgData.data[posInBuffer + 2] = 0;
            imgData.data[posInBuffer + 3] = 255;
          } else {
            imgData.data[posInBuffer + 0] = 255;
            imgData.data[posInBuffer + 1] = 255;
            imgData.data[posInBuffer + 2] = 255;
            imgData.data[posInBuffer + 3] = 255;

          }                
          currentLine = currentLine << 1;
        }
      }

First, notice the introduction of the variable posInBuffer. Since our data array is one dimensional, we need to convert our (X,Y) to a one dimensional address.

Also notice we are shifting the result left by two bits. This is equivalent to multiplying by 4 (remember each pixel consists of four elements).

Finally, we are writing a black pixel if pixel is set, otherwise a white one.

imgData now holds a prepared frame, ready for display. In the final line of our updateCanvas method we should remember to call putImageData in order for the new frame to be displayed:

  ctx.putImageData(imgData,0,0);
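If you would like to try the rendering logic without a browser, the sketch below pulls the loops together into one function that fills a plain pixel array instead of a canvas. It is not the updateCanvas method itself: renderTextScreen and the two reader callbacks are made up for illustration, and it uses the division approach for the screen position to keep the example short.

```javascript
// Render 1000 screen codes into a 320x200 RGBA pixel array.
// readScreenCode(pos) and readCharRom(address) are supplied by the caller.
function renderTextScreen(readScreenCode, readCharRom) {
  var pixels = new Uint8ClampedArray(320 * 200 * 4);
  for (var pos = 0; pos < 1000; pos++) {
    var screenCode = readScreenCode(pos);
    var charX = pos % 40;
    var charY = Math.floor(pos / 40);
    for (var row = 0; row < 8; row++) {
      var line = readCharRom((screenCode << 3) + row);
      for (var col = 0; col < 8; col++) {
        var shade = (line & 0x80) ? 0 : 255;   // black if set, else white
        var i = (((charY << 3) + row) * 320 + (charX << 3) + col) << 2;
        pixels[i + 0] = shade;
        pixels[i + 1] = shade;
        pixels[i + 2] = shade;
        pixels[i + 3] = 255;                   // fully opaque
        line = line << 1;
      }
    }
  }
  return pixels;
}

// Stand-in data: an "A" (screen code 1) at the top left, spaces elsewhere.
var testRom = new Uint8Array(4096);
testRom.set([0x18, 0x3c, 0x66, 0x7e, 0x66, 0x66, 0x66, 0x00], 8);
var frame = renderTextScreen(
  function (pos) { return pos === 0 ? 1 : 32; },
  function (address) { return testRom[address]; }
);
console.log(frame[12], frame[0]);  // 0 255 (pixel (3,0) is black, (0,0) is white)
```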

What remains to be done is to create an instance of this video class in index.html:

      var mymem = new memory(postInit);
      var mycpu = new cpu(mymem);
      var myvideo = new video(document.getElementById("screen"), mymem);
      var mytimer;
      var running = false;
      var breakpoint = 0;

Slowing down to a real C64

In the previous section we wrote code for drawing one frame of our emulated screen.

Now the question: when do we invoke updateCanvas? This question actually raises the issue of synchronisation.

Up to this point I was a bit absent-minded about making the emulator run at the real speed of the C64. It is now high time to do this.

First, let's decide to emulate the PAL version of the C64. The PAL version renders 50 frames per second, so we need to modify our runBatch method to accommodate this.

We will expand our runBatch method to not only run a batch of instructions, but also to render a frame. This means we should invoke our runBatch method 50 times a second. So our interval will need to change to 1/50th of a second, or 20 milliseconds:

      function startEm() {
        document.getElementById("btnStep").disabled = true;
        document.getElementById("btnRun").disabled = true;
        document.getElementById("btnStop").disabled = false;
        var myBreak = document.getElementById("breakpoint");
        breakpoint = parseInt(myBreak.value, 16);
        running = true;
        myTimer = setInterval(runBatch, 20);
      }

In order to synchronise our emulator to a real C64, we need to know how many instructions a real C64 would execute in 20 milliseconds. Well, actually instructions per second is a very crude measure: a 6502 instruction takes anything from 2 to 7 clock cycles to execute.

A better question to ask would be: How many C64 clock cycles tick by in 20 milliseconds? That is easy to calculate. We know the CPU clock speed of the C64 is 1 MHz (for now I am going to stick with the theoretical number and not go into PAL/NTSC differences). So you just do 1000000/50. This amounts to 20000 clock cycles per 20 milliseconds.

Currently our CPU doesn't keep count of the number of clock cycles that passed, so we need to implement this first.

We start off with a private variable in our CPU class called cycleCount:

  var localMem = memory;
  var acc = 0;
  var x = 0;
  var y = 0;
  var pc = 0x400;
  var sp = 0xff;
  var zeroflag = 0;
  var negativeflag = 0;
  var carryflag =0;
  var overflowflag =0; 
  var decimalflag = 0;
  var interruptflag = 1;
  var breakflag = 1;
  var cycleCount = 0;

And of course, we need a getter, so that our runBatch method knows what is cooking:

    this.getCycleCount = function() {
      return cycleCount;
    }

We update the cycleCount variable within the step method of the Cpu class:

  this.step = function () {
    var opcode = localMem.readMem(pc);
    pc = pc + 1;
    var iLen = instructionLengths[opcode];
    var arg1 = 0;
    var arg2 = 0;
    var effectiveAdrress = 0;
    cycleCount = cycleCount + instructionCycles[opcode];
  ...
  }

We are now ready to modify the runBatch method. I am going to show the whole method after the change and then discuss the changes:

      function runBatch() {
        if (!running)
          return;
        myvideo.updateCanvas();
        var targetCycleCount =  mycpu.getCycleCount() + 20000;
        while (mycpu.getCycleCount() < targetCycleCount) { 
          mycpu.step();
          var blankingPeriodLow = targetCycleCount - 100;
          if ((mycpu.getCycleCount() >= blankingPeriodLow) & (mycpu.getCycleCount() <= targetCycleCount)) {
            mymem.writeMem(0xD012, 0);
          } else  {
            mymem.writeMem(0xD012, 1);
          }
          if (mycpu.getPc() == breakpoint) {
            stopEm();
            return;
          }
        }
      }

Our for loop changed to a while loop, since it is now the responsibility of the Cpu class to increment the cycleCount variable. The while loop only checks against a target cycle count, which was set before the loop to 20000 clock cycles in the future.
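The cycle-counting loop can be tried in isolation with a fake CPU that does nothing except add up cycles. Everything in this sketch (makeFakeCpu, runFrame and the list of cycle costs) is invented for the experiment; only the getCycleCount and step names mirror our Cpu class.

```javascript
// A fake CPU whose "instructions" cost the given numbers of cycles in turn.
function makeFakeCpu(cycleCosts) {
  var cycleCount = 0;
  var next = 0;
  return {
    getCycleCount: function () { return cycleCount; },
    step: function () {
      cycleCount += cycleCosts[next];
      next = (next + 1) % cycleCosts.length;
    }
  };
}

// Run instructions until the target cycle count is reached.
// Returns the number of instructions executed.
function runFrame(cpu, cyclesPerFrame) {
  var target = cpu.getCycleCount() + cyclesPerFrame;
  var steps = 0;
  while (cpu.getCycleCount() < target) {
    cpu.step();
    steps++;
  }
  return steps;
}

var fakeCpu = makeFakeCpu([2, 3, 4, 7]);
var steps = runFrame(fakeCpu, 20000);
console.log(steps, fakeCpu.getCycleCount()); // 5000 20000
```

Note that the number of instructions per frame is not fixed. It depends entirely on how expensive the instructions are, which is why we count cycles rather than instructions.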

OK, let's put everything to the test. On testing, the speed was more or less as expected, and we see the familiar C64 welcome screen.

But, I don't see any flashing cursor!

I will leave this investigation on the missing cursor for the next Blog post.

In Summary

In this blog we covered writing code to emulate a very basic C64 screen. It still doesn't show a flashing cursor, but we will cover this in the next blog.

Till next time!

Saturday, 21 May 2016

Part 9: Booting the C64 System

Foreword

In the previous blog we successfully ran the Klaus Test Suite.

In this Blog we are going to boot the C64 system with all its code ROMS.

Loading the ROMs

The Commodore 64 has three ROMs:

Kernal
Basic
Character ROM

In order to boot the C64 system on our emulator, one of the first things you would do is to load at least two of the ROMs (i.e. Kernal and Basic), each into a Uint8Array of its own.

But, to populate each Uint8Array by means of a zillion line JavaScript array definition, as I did in my previous posts, is starting to sound very clunky :-)

Clearly, there must be a way JavaScript can help us to populate Uint8Arrays by giving it the paths to the binary ROM images. Indeed there is. There is a JavaScript class called XMLHttpRequest that can do this job for us.

XMLHttpRequest has a downside, however. It doesn't work at all when you browse web pages directly from your file system (like opening them up via Windows Explorer). This poses an issue for us, since up to now we have tested our emulator only via our local file systems.

The obvious solution to this issue is to always deploy to a website and then test. This is not very feasible if you just want to quickly test something in the emulator.

A better option would be to install a webserver on your local machine.

Any webserver would basically do, but I found that Tomcat works the best for me. You only need Java installed on your PC, and it is just a case of unzipping the zip file and starting it up.

In the next section I am going to give a quick crash course on installing Tomcat on your machine.

Installing Tomcat

As mentioned previously, to run Tomcat you need to have Java installed on your machine. If you open a Command Prompt and typing java -version gives you back a version number, you are good to go!

You can get Tomcat via the following link:

https://tomcat.apache.org/download-70.cgi

Download either the zip file for Windows or the .tar.gz for Linux.

The contents of the two files are more or less the same, but remember .tar.gz preserves file permissions, saving you from fiddling with chmod in Linux.

Unzip/Untar the file to any location on your local file system. Then browse to the bin folder via a Command Prompt.

Now, to start Tomcat in Linux you need to issue the command ./startup.sh. If all went well, you'll get a prompt stating that Tomcat started successfully.

For Windows you will use the command startup.bat instead.

Now, to see if Tomcat is working properly, open up a browser and type the following: localhost:8080. If the following page appears, you installed Tomcat successfully:

Installing your web app is just as easy. In the webapps folder create a folder and stick all the emulator files (e.g. index.html, Cpu.js and Memory.js) in it as shown here:

To invoke the emulator as per the above screenshot, type the following in the address bar of the browser: localhost:8080/emu/index.html

We now have a working base to continue with the rest of this blog.

Loading ROMS with XMLHttpRequest

Let us start off with an example from Mozilla.org showing how to load a binary file into an array with XMLHttpRequest:

var oReq = new XMLHttpRequest();
oReq.open("GET", "/myfile.png", true);
oReq.responseType = "arraybuffer";

oReq.onload = function (oEvent) {
  var arrayBuffer = oReq.response; // Note: not oReq.responseText
  if (arrayBuffer) {
    var byteArray = new Uint8Array(arrayBuffer);
    for (var i = 0; i < byteArray.byteLength; i++) {
      // do something with each byte in the array
    }
  }
};

oReq.send(null);

Let's dissect this code. First an instance of XMLHttpRequest is created.

The open method is then invoked on the object. This basically informs the object which file needs to be fetched. The third parameter of open states whether the request should be made asynchronously or synchronously: true means asynchronously, false means synchronously.

What is the difference between asynchronously and synchronously? When the request is made asynchronously, the request is sent to the webserver and the call immediately returns. So your JavaScript program will continue executing even though the whole file hasn't been received yet.

When a synchronous call is made, the method call will not return until the whole file is received.

In subsequent Blog posts I will be following the asynchronous way of thinking. The body that governs the JavScript standards (that is W3C) has a big drive to get rid of synchronous calls because it potentially makes web pages unresponsive. My original hopes was in synchronous calls, but after hitting a couple of brick walls, I had to adjust my thinking pattern to an asynchronous one :-(

Back to out snippet of code. How do we know that the file has finished loading? The XMLHttpRequest class defines a callback property called onload. When the file has finished loading it invokes the method you assigned to this property.

When your callback method is eventually executed, the response attribute of your request object will contain the data. You will see in the code that responseType is set to arraybuffer. This way the browser hands you the raw bytes as an ArrayBuffer, which you can wrap in a Uint8Array without doing any conversion work yourself.

Let us now modify this code snippet to fit our purpose. All this code is going to end up in Memory.js

First we need to declare the Uint8Arrays that will hold main memory and the data of the two ROMs:

  var mainMem = new Uint8Array(65536);
  var basicRom = new Uint8Array(8192);
  var kernalRom = new Uint8Array(8192);

Next we repeat the code snippet for each ROM:

var oReqBasic = new XMLHttpRequest();
oReqBasic.open("GET", "basic.bin", true);
oReqBasic.responseType = "arraybuffer";

oReqBasic.onload = function (oEvent) {
  var arrayBuffer = oReqBasic.response; // Note: not oReq.responseText
  if (arrayBuffer) {
    basicRom = new Uint8Array(arrayBuffer);
  }
};

oReqBasic.send(null);

//------------------------------------------------------------------------

var oReqKernal = new XMLHttpRequest();
oReqKernal.open("GET", "kernal.bin", true);
oReqKernal.responseType = "arraybuffer";

oReqKernal.onload = function (oEvent) {
  var arrayBuffer = oReqKernal.response; // Note: not oReq.responseText
  if (arrayBuffer) {
    kernalRom = new Uint8Array(arrayBuffer);
  }
};

oReqKernal.send(null);

This will take care of loading both ROMs into arrays. However, we still need a mechanism to know when both ROMs have loaded so we can kick off the emulator.

To cater for this need we will create a global variable called outstandingDownloads. The name speaks for itself. It will start off with an initial value of 2 (i.e. two ROMs to download):

  var outstandingDownloads = 2;

When each download finishes, it should decrement this variable. When the variable reaches 0 we can call a callback method:

function memory(allDownloadedCallback) {
  var mainMem = new Uint8Array(65536);
  var outstandingDownloads = 2;
  var basicRom = new Uint8Array(8192);
  var kernalRom = new Uint8Array(8192);

  function downloadCompleted() {
    outstandingDownloads--;
    if (outstandingDownloads == 0)
      allDownloadedCallback();
  }

  var oReqBasic = new XMLHttpRequest();
  oReqBasic.open("GET", "basic.bin", true);
  oReqBasic.responseType = "arraybuffer";

  oReqBasic.onload = function (oEvent) {
    var arrayBuffer = oReqBasic.response; // Note: not oReqBasic.responseText
    if (arrayBuffer) {
      basicRom = new Uint8Array(arrayBuffer);
      downloadCompleted();
    }
  };

  oReqBasic.send(null);

  //------------------------------------------------------------------------

  var oReqKernal = new XMLHttpRequest();
  oReqKernal.open("GET", "kernal.bin", true);
  oReqKernal.responseType = "arraybuffer";

  oReqKernal.onload = function (oEvent) {
    var arrayBuffer = oReqKernal.response; // Note: not oReqKernal.responseText
    if (arrayBuffer) {
      kernalRom = new Uint8Array(arrayBuffer);
      downloadCompleted();
    }
  };

  oReqKernal.send(null);


  this.readMem = function (address) {
    return mainMem[address];
  }

  this.writeMem = function (address, byteval) {
    mainMem[address] = byteval;
  }

}

So, here we first moved the outstandingDownloads variable manipulation into a method of its own. The callback method is supplied as a parameter when we instantiate the memory object.
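To see the counting idea in isolation, here is a minimal sketch that leaves XMLHttpRequest out of the picture. The helper name createCountdown is made up for this illustration and is not part of the emulator; it only assumes that each "download" reports completion by calling a function.

```javascript
// Hypothetical helper: calls onAllDone once, after `count` completions.
function createCountdown(count, onAllDone) {
  var outstanding = count;
  return function completed() {
    outstanding--;
    if (outstanding === 0) {
      onAllDone();
    }
  };
}

// Simulate two downloads finishing one after the other.
var started = false;
var downloadCompleted = createCountdown(2, function () {
  started = true; // the emulator would be kicked off here
});

downloadCompleted();             // first ROM arrived, still waiting
var startedAfterFirst = started; // still false at this point
downloadCompleted();             // second ROM arrived, callback fires
```

The order in which the two downloads finish does not matter; only the count does.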

As it stands, there is one remaining issue with our memory class: our ROM arrays are not mapped into the memory space. So, when you call readMem it will only return information from the mainMem array.

We will need to build some if statements into the readMem method:

  this.readMem = function (address) {
    if ((address >= 0xa000) && (address <= 0xbfff))
      return basicRom[address & 0x1fff];
    else if ((address >= 0xe000) && (address <= 0xffff))
      return kernalRom[address & 0x1fff];
    return mainMem[address];
  }

Remember, on a C64 you get the BASIC ROM in memory range A000 - BFFF and the KERNAL ROM in memory range E000 - FFFF. For all other memory accesses you can just return the contents from the mainMem array.

Something else that might look funny is the ANDing of the address with 1FFF for both ROMs. This operation just keeps the lower 13 bits of the address, so you stay within the bounds of the 8KB ROM array.

Do we need to modify writeMem as well? No. The memory model of the C64 is such that if you write a value to a memory location where there is ROM, the value goes to the RAM underneath, even though you will not be able to see the written value at that point in time.
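Both points, the 13-bit mask and the write going to RAM underneath the ROM, can be tried out with a small stand-alone sketch. The ROM arrays here are filled with placeholder bytes (0xAA and 0xEE) purely for illustration; the real emulator fills them from basic.bin and kernal.bin.

```javascript
// Stand-in ROM contents, assumed values for illustration only.
var mainMem = new Uint8Array(65536);
var basicRom = new Uint8Array(8192).fill(0xaa);
var kernalRom = new Uint8Array(8192).fill(0xee);

function readMem(address) {
  if (address >= 0xa000 && address <= 0xbfff)
    return basicRom[address & 0x1fff];
  if (address >= 0xe000 && address <= 0xffff)
    return kernalRom[address & 0x1fff];
  return mainMem[address];
}

function writeMem(address, byteval) {
  mainMem[address] = byteval; // always lands in RAM
}

// Masking keeps the lower 13 bits, i.e. the offset inside an 8KB ROM.
var firstBasicOffset = 0xa000 & 0x1fff;  // 0x0000
var lastBasicOffset = 0xbfff & 0x1fff;   // 0x1fff
var resetVectorOffset = 0xfffc & 0x1fff; // 0x1ffc

// A write "under" the ROM is stored in RAM, but a read still sees the ROM.
writeMem(0xa000, 0x42);
var visibleValue = readMem(0xa000); // ROM byte (0xaa in this sketch)
var hiddenValue = mainMem[0xa000];  // RAM byte (0x42)
```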

This concludes the required changes to the Memory class.

The only thing left to do is to create a callback method in index.html and pass it as a parameter when creating an instance of the memory class:

      var mymem = new memory(postInit);
      var mycpu = new cpu(mymem);
      var mytimer;
      var running = false;
      var breakpoint = 0;
...

     function postInit() {
     }

Booting

Before we start, let's consider a couple of points that must happen during boot:

When the emulator window opens, all buttons should be disabled until all ROMs have loaded. This is to prevent the user from firing off the emulator during this process
When all ROMs have loaded we can enable all buttons again

To implement the first point we adjust the properties of all buttons on the HTML page:

<textarea id="memory" name="mem" rows="15" cols="60"></textarea>
From Location:
<input type="text" id="frommem">
<button onclick="showMem()" disabled = "true">Refresh Dump</button>
<br/>
<textarea id="diss" name="diss" rows="11" cols="60"></textarea>
<button id="btnStep" disabled = "true" onclick="step()">Step</button>
<button id="btnRun" disabled = "true" onclick="startEm()">Run</button>
<button id="btnStop" disabled = "true" onclick="stopEm()">Stop</button>
<br/>

In our callback method we can enable the buttons once all the ROMs have loaded:

     function postInit() {
        document.getElementById("btnStep").disabled = false;
        document.getElementById("btnRun").disabled = false;
        document.getElementById("btnStop").disabled = false;
     }

When a 6502 starts executing after a RESET, one of the first things it does is load the program counter with the reset vector, which is stored at addresses FFFC (low byte) and FFFD (high byte). We also need to implement this in our emulator. We do this by implementing a method called reset in the CPU class:

    this.reset = function () {
      pc = localMem.readMem(0xfffc);
      pc = pc + localMem.readMem(0xfffd) * 256;
    }
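As a worked example of how the two vector bytes combine, here is a small sketch. The function name readVector and the fakeMem object are made up for this illustration. The byte values E2 and FC are what I would expect to find at FFFC/FFFD in a stock C64 KERNAL, but treat them as illustrative rather than something taken from this post.

```javascript
// Combine a low byte and a high byte into a 16-bit address (little-endian).
function readVector(readMem, address) {
  var low = readMem(address);
  var high = readMem(address + 1);
  return low + high * 256;
}

// Stand-in memory holding only the two vector bytes used here.
var fakeMem = {};
fakeMem[0xfffc] = 0xe2; // low byte (assumed value)
fakeMem[0xfffd] = 0xfc; // high byte (assumed value)

var resetAddress = readVector(function (a) { return fakeMem[a]; }, 0xfffc);
var resetAddressHex = resetAddress.toString(16); // "fce2"
```

The same helper would work for the IRQ/BRK vector at FFFE/FFFF.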

Now, one of the extra responsibilities of our callback method is also to call this reset method:

     function postInit() {
        document.getElementById("btnStep").disabled = false;
        document.getElementById("btnRun").disabled = false;
        document.getElementById("btnStop").disabled = false;
        mycpu.reset();
     }

Our C64 system would now be able to boot if we hit the run button.

Since we currently don't have an emulated screen, how are we going to get some meaningful output? Well, we know that the screen memory (text mode) of the C64 starts at 400h at boot up. So, we can stop execution at any time and inspect this location to check if we can spot the welcome message.

Doing the above I get the following:

There was definitely stuff happening. The screen is padded with 20h bytes. 20h is the screen code for a space. But no sign of a welcome message!

Let us continue stepping and see if we can find some clues. And, indeed we have a clue! The emulator is stuck in this loop:

ff5e LDA $d012
ff61 BNE $ff5e

The emulator is waiting for memory location D012 to change to a zero.

What is special about memory location D012? This is one of the locations within the memory space of the VIC-II display chip. This particular location is the raster counter, which keeps track of the line the raster beam is on at any point in time. So this code is probably waiting for the raster beam to be off screen before refreshing it.

Let's see if we can do a quick hack just to get out of this loop. In later posts we will eventually come back to implement the raster functionality properly. But for now we just put in a hack to see how far we can get.

The hack I am thinking of is this: every 10000 instructions, write the value 0 to location D012, and keep writing 0 to this location after each instruction for 30 instructions. For all other instructions, write 1 to this location:

      function runBatch() {
        if (!running)
          return;
        for (var i = 0; i < 100000; i++) {
          mycpu.step();
          if ((i % 10000) < 30) {
            mymem.writeMem(0xD012, 0);
          } else {
            mymem.writeMem(0xD012, 1);
          }
          if (mycpu.getPc() == breakpoint) {
            stopEm();
            break;
          }
        }
      }

We have more luck this time:

Towards the end of the row starting at address 420 you start to see non-space characters. For the record, the following website maps each screen code to its character: http://sta.c64.org/cbm64scr.html

Let's convert the first 10 screen codes, starting from memory location 42Ch, to characters:

42C 2A *
42D 2A *
42E 2A *
42F 2A *
430 20 (space)
431 03 C
432 0F O
433 0D M
434 0D M
435 0F O
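The same lookup can be done in code. The sketch below is only a partial mapping under the assumption that the default upper-case character set is active: screen code 00 is @, 01 to 1A are the letters A to Z, and 20 to 3F match ASCII. The function names are made up for this illustration, and all other codes are shown as "?".

```javascript
// Convert one C64 screen code to a printable character.
// Only codes 0x00-0x1A and 0x20-0x3F are handled in this sketch.
function screenCodeToChar(code) {
  if (code === 0x00) return "@";
  if (code >= 0x01 && code <= 0x1a) return String.fromCharCode(0x40 + code); // A-Z
  if (code >= 0x20 && code <= 0x3f) return String.fromCharCode(code);        // same as ASCII
  return "?";
}

function screenCodesToString(codes) {
  return codes.map(screenCodeToChar).join("");
}

// The ten bytes found from 42Ch onwards.
var firstTen = [0x2a, 0x2a, 0x2a, 0x2a, 0x20, 0x03, 0x0f, 0x0d, 0x0d, 0x0f];
var text = screenCodesToString(firstTen); // "**** COMMO"
```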

Looks more or less like what we expect. What I am more interested in at this point is whether our emulator calculated the number of free bytes correctly. This is represented by the sequence of bytes in the row starting at address 480:

Ok, looks like we are more or less on track with our C64 boot!

In Summary

In this blog we managed to boot the C64 with all its code ROMS.

In the next Blog we are going to emulate the C64 screen. It is going to be very basic though: Monochrome, text mode only, hardcoded to screen memory location 400h.

Till next time!

Friday, 13 May 2016

Part 8: Running a Test Suite

Foreword

In the previous section we implemented the remaining instructions of our CPU emulator.

In this section we will run our emulator against the Klaus Test Suite.

But, before we start, we need to extend the debugging functionality of our emulator a bit...

Extending Debug functionality

Currently the only way to run our emulator is to single step it. A very painful way to run the Klaus Test Suite!

So, it is obvious that we need to add a continuous run feature in our emulator. At the same time, we will also need a feature that will stop continuous execution at any time, allowing the user to single step again from that point onwards.

The stop functionality is important in running the Klaus Test Suite, as you will have no other way to evaluate at which point the tests failed.

Here is a list of two other nice debug features we will implement in this post:

Breakpoints: For a first round we will give the option to add a single breakpoint
Memory Dump: We will modify the existing functionality so that you can specify a different area in memory to view

Run/Stop Functionality

Implementation time! Let's start by adding a Run and a Stop button next to the Step button in index.html:

<button id="btnStep" onclick="step()">Step</button>
<button id="btnRun" onclick="startEm()">Run</button>
<button id="btnStop" onclick="stopEm()">Stop</button>

Before we head off and write code for these two methods, it is important to keep the threading model of JavaScript in the browser in mind.

The most important aspect of the threading model is that each page open in a browser can only have a single thread at any time. With this thread the page in the browser must do all its work: From rendering the content of the page to the screen, checking for button clicks, executing JavaScript code and so on.

This means that when a page is busy executing JavaScript, that page will appear unresponsive.

This single threaded nature has very important implications for our Run/Stop functionality. When our emulator is busy executing instructions continuously, it won't be able to accept the Stop command!

How do you get around this limitation? Well, JavaScript provides you with a method called setInterval(). To understand this method let's look at an example snippet of code:

myTimer = setInterval(runBatch, 200);

This method call basically says: run the method runBatch every 200 milliseconds. In the runBatch method you run a small batch of assembly instructions (e.g. calling step() a couple of times) and then exit. runBatch will be called again when 200 milliseconds have elapsed, and the whole process repeats.

This whole process will continue running, until at some point in time you call clearInterval(). More on this method in a moment.

So, the question is: how many 6502 instructions do you execute in a batch? For this blog post, it doesn't really matter, as long as the emulator web page stays reasonably responsive.

When I played with this method I started off with a batch of 100 instructions. After a while however, I found that the Test Suite takes ages to complete. So my patience ran out and I ended off with a batch of 100000 :-)

Obviously, in later blog posts where we will add a screen as output, we will want to get everything working at the expected speed of a C64. In this case you will need to tune the batch size and also the interval.

Back to our snippet of code. As you can see, the value returned by setInterval() gets assigned to a variable. So the question is: what does this variable contain? It contains an identifier (a handle) for the timer you have just created.

What can you do with this timer handle? This is where clearInterval() comes in, as I mentioned earlier on. When you call clearInterval() with the timer handle as a parameter, you stop the repeated execution of the runBatch method.

With all this information, we have enough to implement the Run/Stop functionality for the emulator:

      function startEm() {
        document.getElementById("btnStep").disabled = true;
        document.getElementById("btnRun").disabled = true;
        document.getElementById("btnStop").disabled = false;
        mytimer = setInterval(runBatch, 200);
      }

      function stopEm() {
        clearInterval(mytimer);
        displayEmuState();
        document.getElementById("btnStep").disabled = false;
        document.getElementById("btnRun").disabled = false;
        document.getElementById("btnStop").disabled = true;
      }

You will notice a new method, displayEmuState(). This is basically the display code in index.html's step method that I moved into a method of its own. I was busy with a bit of code cleanup for index.html. You can have a look at the GitHub zip file for this post to get an idea of what I was up to with index.html.

You will need to declare mytimer as a global variable:

      var mymem = new memory();
      var mycpu = new cpu(mymem);
      var mytimer;

Finally, let's implement runBatch:

      function runBatch() {
        for (var i = 0; i < 100; i++) {
          mycpu.step();
        }
      }

Very minimalistic.

Breakpoints

Let's implement the single breakpoint functionality I mentioned earlier.

At the bottom of the page, let's add an input field allowing the user to enter an address at which the emulator should break:

<button id="btnStep" onclick="step()">Step</button>
<button id="btnRun" onclick="startEm()">Run</button>
<button id="btnStop" onclick="stopEm()">Stop</button>
<br/>
Break at: <input type="text" id="breakpoint">

The runBatch method should check the program counter after each call to Cpu.step() to determine whether this address was reached.

There are two foreseeable issues that we need to resolve before implementing the check in runBatch.

Firstly, as our Cpu class is currently written, outsiders don't have access to the program counter. So, let's implement a getter in the Cpu class:

    this.getPc = function () {
      return pc;
    }

Let's discuss the second problem. The value of the text input field for the breakpoint address is a String. This means that the runBatch method needs to convert the String to an integer each time before doing the comparison with the program counter. Sounds like a bit of a waste!

We can do better. When clicking run the startEm() method can actually do this conversion once and assign it to a global variable. The runBatch method can then use the value of this global variable each time.

Here is what the global variable and the startEm() method look like:

      var breakpoint = 0;
...

      function startEm() {
        document.getElementById("btnStep").disabled = true;
        document.getElementById("btnRun").disabled = true;
        document.getElementById("btnStop").disabled = false;
        var myBreak = document.getElementById("breakpoint");
        breakpoint = parseInt(myBreak.value, 16);
        mytimer = setInterval(runBatch, 200);
      }
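It helps to know how parseInt behaves with a radix of 16, especially when the field is left empty. With the code as written, an empty field yields NaN, which never equals the program counter, so no breakpoint fires. The sketch below makes that explicit; parseBreakpoint and the choice of -1 as a "no breakpoint" value are assumptions for this illustration, not part of the emulator.

```javascript
// Parse a hexadecimal address typed by the user. Returns -1 when the field
// is empty, invalid or out of range, so it can never match a real PC value.
function parseBreakpoint(text) {
  var value = parseInt(text, 16);
  if (isNaN(value) || value < 0 || value > 0xffff) {
    return -1;
  }
  return value;
}

var fromHex = parseBreakpoint("36e5");  // 0x36e5
var fromEmpty = parseBreakpoint("");    // -1
var fromJunk = parseBreakpoint("xyz");  // -1
```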

We are now ready to implement the breakpoint checking in runBatch:

      function runBatch() {
        for (var i = 0; i < 100; i++) {
          mycpu.step();
          if (mycpu.getPc() == breakpoint) {
            stopEm();
            break;
          }
        }
      }

So, when the breakpoint is reached, stopEm() is called, which clears the interval timer and updates the state display of the emulator. The break exits the loop so that no further instructions are executed for the batch.

Ok, this is about it for implementing breakpoint functionality. Or is it really? When I used this breakpoint functionality to figure out why some test cases fail on the emulator, I stumbled across a weird anomaly: when the breakpoint fired, clicking Step wouldn't take you to the next instruction, but to an instruction ten or so further on from where you stopped.

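It was as if there was a pending timer event that was still queued in the JavaScript event queue. In a situation like this it is also worth checking that the variable handed to clearInterval() is spelled exactly like the one that received the return value of setInterval(), since JavaScript identifiers are case sensitive.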

Googling didn't reveal much info on this issue. There was one useful piece of info, though, suggesting the use of a global variable such as running. When you click stop, you set this variable to false, and runBatch checks this variable. If runBatch sees that running is false, it just exits. Here is what the changes look like:

      var running = false;

...

      function runBatch() {
        if (!running)
          return;
        for (var i = 0; i < 100000; i++) {
          mycpu.step();
          if (mycpu.getPc() == breakpoint) {
            stopEm();
            break;
          }
        }
      }

      function startEm() {
        document.getElementById("btnStep").disabled = true;
        document.getElementById("btnRun").disabled = true;
        document.getElementById("btnStop").disabled = false;
        var myBreak = document.getElementById("breakpoint");
        breakpoint = parseInt(myBreak.value, 16);
        running = true;
        mytimer = setInterval(runBatch, 200);
      }

      function stopEm() {
        running = false;
        clearInterval(mytimer);
        displayEmuState();

        document.getElementById("btnStep").disabled = false;
        document.getElementById("btnRun").disabled = false;
        document.getElementById("btnStop").disabled = true;

      }

Flexible Memory dump

Currently the emulator's memory dump functionality is limited to viewing memory from location 0. In this section we will add an input text field so that the user can choose to view a different area in memory.

Let's start off by adding the input field and button to index.html:

<textarea id="memory" name="mem" rows="15" cols="60"></textarea>
From Location:
<input type="text" id="frommem">
<button onclick="showMem()">Refresh Dump</button><br/>

showMem() is one of the methods that I wrote while doing the code cleanup of index.html. This method also needs to be adjusted to look at the frommem input field for the start address:

      function showMem() {
        var m = document.getElementById("memory");
        var location = document.getElementById("frommem");
        var locationInt = parseInt(location.value, 16);
        var tempmemstr = "";
        for (var i = locationInt; i < (160 + locationInt); i++) {
          if ((i % 16) == 0) {
            var labelstr = "0000" + i.toString(16);
            labelstr = labelstr.slice(-4);
            tempmemstr = tempmemstr + "\n" + labelstr;
          }
          var currentByte = "00" + mymem.readMem(i).toString(16);
          currentByte = currentByte.slice(-2);
          tempmemstr = tempmemstr + " " + currentByte;
        }
        m.value = tempmemstr;
      }
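The zero-padding trick used above (prefix with zeros, then keep the last few characters) can be pulled out into a small helper. The names toHex and dumpRow are hypothetical and only serve this illustration; the stand-in memory simply returns the low byte of each address.

```javascript
// Format a number as fixed-width lower-case hex, e.g. 0x400 with 4 digits gives "0400".
function toHex(value, digits) {
  var padded = "0".repeat(digits) + value.toString(16);
  return padded.slice(-digits);
}

// Build one 16-byte row of a memory dump from any readMem-style function.
function dumpRow(readMem, start) {
  var row = toHex(start, 4);
  for (var i = start; i < start + 16; i++) {
    row = row + " " + toHex(readMem(i), 2);
  }
  return row;
}

// Stand-in memory: every location returns the low byte of its own address.
var sampleRow = dumpRow(function (address) { return address & 0xff; }, 0x0400);
```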

We are now ready to prepare for running the Klaus Test Suite.

Preparing to run Test Suite

Firstly we need to get hold of the Klaus Test Suite. It is available via GitHub:

https://github.com/Klaus2m5/6502_65C02_functional_tests

The assembly files (i.e. the ones with the .a65 extension) need to be assembled with the as65 assembler.

For convenience, assembled binary files are also provided in the bin_files folder. These binaries are memory images that you copy straight to your 64KB emulated memory. I think I am going to take the convenience route :-)

You will also notice that, together with the binary files in the bin_files folder, there are .lst files. The assembler generates these listings when it assembles the source. The beauty of these listing files is that they keep all the comments together with the memory address of each instruction.

This info comes in very handy when a test fails at a particular address: you can look the address up in the listing file and almost immediately know what the failing test is about. Sadly, I only discovered this after I managed to run the Test Suite successfully :-(

The assembly listing states that you need to start executing at memory location $400. So we will assign this address to the program counter when the CPU class is created.

As in the previous posts, we will add the Test Suite binary as an array in the Memory class. To assist in this task, I wrote a quick Java program that takes the binary and spits out the contents as a JavaScript array. In the GitHub zip file for this post the Memory class will be populated with the Test Suite program.

While running the Test suite I also found it handy to have a disassembled listing of the binary at hand. This is nice for quick reference. There is a nice online tool that can assist you in the disassembly of the binary at the following link:

http://e-tradition.net/bytes/6502/disassembler.html

Just open the binary with a hex editor and copy the hex values from $400 till about $370D. You will see that the succeeding memory locations are filled with hex FFs.

Now paste the copied hex values in the left panel (called code) of the online disassembler, enter 400 as opcode start address and click disassemble. Your screen will look similar to the following:

We now have all the necessary resources to start running the Test Suite.

Running the Test Suite

With everything in place, let's hit run to kick off the Test Suite.

It doesn't start off so well. Right at the beginning I get a pop-up: Op code 216 not implemented. PC = 401

216 is D8 in hex, which is the CLD (clear decimal flag) instruction.

OK, I must admit I didn't consider BCD at all in the emulator up to now. To get past this error, let's implement the decimal flag as a private variable and implement the CLD and SED instructions. For now I will not worry about BCD addition or subtraction:

  var decimalflag = 0;
...
/*CLD  Clear Decimal Mode

     0 -> D                           N Z C I D V
                                      - - - - 0 -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       CLD           D8    1     2 */

       case 0xD8:
         decimalflag = 0;
       break;

/*SED  Set Decimal Flag

     1 -> D                           N Z C I D V
                                      - - - - 1 - 

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       SED           F8    1     2  */

      case 0xF8:
        decimalflag = 1;
      break;

With this added we run the test again. This time no alert pops up. I let it run for three minutes and then hit stop. At this point, this is what my browser window looks like:

Clicking step from this point onwards keeps jumping back to $5f0. A test has therefore failed. Let's investigate by looking at the surrounding code:

05E7   08         PHP
05E8   BA         TSX
05E9   E0 FE      CPX #$FE
05EB   D0 FE      BNE $05EB
05ED   68         PLA
05EE   C9 FF      CMP #$FF
05F0   D0 FE      BNE $05F0
05F2   BA         TSX
05F3   E0 FF      CPX #$FF
05F5   D0 FE      BNE $05F5

What is happening here? First the status register is pushed onto the stack, and then pulled into the accumulator. The accumulator is then compared with $FF.

So the applicable test tests that all flags of the status register are set.

From the above screen we see that the accumulator is $C3. This translates to 1100 0011 binary and means that in our case bits 2 - 5 of the status register were cleared. These bits represent the following flags:

bit 5: Ignored
bit 4: Break
bit 3: Decimal
bit 2: Interrupt
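To double-check the reading of $C3, a status byte can be split into its individual bits with a few shifts. This is a minimal sketch; decodeStatus and the property names are made up for this illustration and do not exist in the emulator.

```javascript
// Split a 6502 status byte into named bits (bit 7 down to bit 0).
function decodeStatus(value) {
  return {
    negative: (value >> 7) & 1,
    overflow: (value >> 6) & 1,
    unused: (value >> 5) & 1,
    brk: (value >> 4) & 1,
    decimal: (value >> 3) & 1,
    interrupt: (value >> 2) & 1,
    zero: (value >> 1) & 1,
    carry: value & 1
  };
}

var seen = decodeStatus(0xc3);   // the value found in the accumulator
var binary = (0xc3).toString(2); // "11000011"
```

For $C3 the four bits in the middle (unused, brk, decimal, interrupt) all come out as 0, which matches the list above.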

Let's see if we can get some more information about these flags.

From a couple of resources I found that bit 5 is always hardcoded to one.

Bit 4 (the break flag) is used during an interrupt to determine whether the interrupt was caused by an IRQ/NMI or by executing the BRK instruction. If the interrupt was due to an IRQ/NMI the flag will be cleared. In all other instances the BRK flag will be set. For normal operations it is therefore safe to assume a BRK flag value of 1.

Bit 3 (Decimal). Ok, in the previous fix we did implement SED and CLD, but we never adjusted the getStatusFlagsAsByte and setStatusFlagsAsByte methods!

Bit 2 (Interrupt flag). First of all, we didn't implement the SEI and CLI instructions. But our emulator didn't complain about these missing instructions, so it must have something to do with the initial state of the interrupt flag. Googling around, it looks like the interrupt disable flag is set on a fresh 6502 start.

With all this information, we make the following changes in the CPU class:

   var carryflag =0;
   var overflowflag =0; 
   var decimalflag = 0;
   var interruptflag = 1;
   var breakflag = 1;

...

     function getStatusFlagsAsByte() {

      var result = (negativeflag << 7) | (overflowflag << 6) | (1 << 5) | (breakflag << 4) | (decimalflag << 3) | (interruptflag << 2) | (zeroflag << 1) |
         (carryflag);
       return result;
     }

    function setStatusFlagsAsByte(value) {
      negativeflag = (value >> 7) & 1;
      overflowflag = (value >> 6) & 1;
      decimalflag = (value >> 3) & 1;
      interruptflag = (value >> 2) & 1;
      zeroflag = (value >> 1) & 1;
      carryflag = (value) & 1;
    }
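A quick way to sanity-check the packing is to feed it the two extreme cases. The sketch below uses a stand-alone packStatus function with a made-up parameter object instead of the emulator's private variables, so it is an illustration of the bit layout rather than the actual CPU code.

```javascript
// Pack individual flag values (0 or 1) into a status byte; bit 5 is always 1.
function packStatus(f) {
  return (f.negative << 7) | (f.overflow << 6) | (1 << 5) | (f.brk << 4) |
         (f.decimal << 3) | (f.interrupt << 2) | (f.zero << 1) | f.carry;
}

// Every flag set: this is the $FF the test at $05EE is looking for.
var allSet = packStatus({ negative: 1, overflow: 1, brk: 1, decimal: 1,
                          interrupt: 1, zero: 1, carry: 1 });

// Only the break flag set: bit 5 and bit 4 give $30.
var onlyFixed = packStatus({ negative: 0, overflow: 0, brk: 1, decimal: 0,
                             interrupt: 0, zero: 0, carry: 0 });
```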

With this fixed, let's run the emulator again. This time this alert pops up: Op code 234 not implemented. PC = 870

The NOP instruction is not implemented. So let's do that quickly:

/*NOP  No Operation
     ---                              N Z C I D V
                                      - - - - - -
     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       NOP           EA    1     2 */

     case 0xEA:
     break;

At the next run, there is a complaint about the BRK instruction not being implemented. This instruction triggers an interrupt, so let's implement BRK and RTI together:

/*BRK  Force Break

     interrupt,                       N Z C I D V
     push PC+2, push SR               - - - 1 - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       BRK           00    1     7 */
       
      case 0x00:
        interruptflag = 1;
        var tempVal = pc + 1;
        Push(tempVal >> 8);
        Push(tempVal & 0xff);
        Push(getStatusFlagsAsByte());
        tempVal = localMem.readMem(0xffff) * 256;
        tempVal = tempVal + localMem.readMem(0xfffe);
        pc = tempVal;
      break;

/*RTI  Return from Interrupt

     pull SR, pull PC                 N Z C I D V
                                      from stack

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       RTI           40    1     6 */
      
      case 0x40: 
        setStatusFlagsAsByte(Pop());
        var tempVal = Pop();
        tempVal = (Pop() << 8) + tempVal;
        pc = tempVal;
      break;

Next, the emulator gets stuck at $36E5. Let's look again at the surrounding code:

36C9   88         DEY
36CA   88         DEY
36CB   08         PHP
36CC   88         DEY
36CD   88         DEY
36CE   88         DEY
36CF   C9 42      CMP #$42
36D1   D0 FE      BNE $36D1
36D3   E0 52      CPX #$52
36D5   D0 FE      BNE $36D5
36D7   C0 48      CPY #$48
36D9   D0 FE      BNE $36D9
36DB   85 0A      STA $0A
36DD   86 0B      STX $0B
36DF   BA         TSX
36E0   BD 02 01   LDA $0102,X
36E3   C9 30      CMP #$30
36E5   D0 FE      BNE $36E5

Looking further down in the code, I see the code actually ends off with an RTI, so this code is part of an interrupt service routine.

Looking back in the code I copied, I see the stack pointer gets copied to the X register and then the value of memory location $102,X is compared with value $30.

From the debug window I figured out that the stack pointer and register X have the value $FB. So the effective memory location is $01FD. Since this is inside an interrupt service routine, we know the values pushed on the stack should be the return address and the value of the status register. So, looking at everything, I feel comfortable that the value we compare is a status register value. The value at $01FD is $34 and the Test Suite expects $30.

The two values differ by a single bit: bit 2, which is the interrupt disable flag. So, my emulator set the interrupt disable flag in the pushed value and the Test Suite expects this flag to be cleared. After some reasoning I established the following:

Calling BRK should set the I flag, but
this doesn't necessarily mean that the status register pushed onto the stack will have this bit set.
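The arithmetic behind the stack lookup and the one-bit difference is easy to verify. The variable names below are only for illustration.

```javascript
// Effective address of "LDA $0102,X" with X = $FB.
var effectiveAddress = (0x0102 + 0xfb) & 0xffff; // 0x01fd

// XOR highlights the bits in which the two status bytes differ.
var difference = 0x34 ^ 0x30;             // 0x04
var differingBit = Math.log2(difference); // 2, the interrupt disable flag
```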

Keeping this in mind, I actually discovered that I set the interrupt flag too early in the case statement of the BRK instruction. The setting should actually happen after the status byte is pushed onto the stack:

      case 0x00:
        var tempVal = pc + 1;
        Push(tempVal >> 8);
        Push(tempVal & 0xff);
        Push(getStatusFlagsAsByte());
        interruptflag = 1;
        tempVal = localMem.readMem(0xffff) * 256;
        tempVal = tempVal + localMem.readMem(0xfffe);
        pc = tempVal;
      break;
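The effect of the ordering can be shown with a tiny stand-alone model of the stack. Everything in this sketch (the starting program counter, the vector bytes, the reduced status byte) is a simplifying assumption for illustration; it is not the emulator's CPU class.

```javascript
// Tiny stand-in CPU state: just enough to show the order of operations in BRK.
var stack = new Uint8Array(256); // models page $01
var sp = 0xff;
var pc = 0x0402;                 // hypothetical value after fetching the BRK opcode
var interruptflag = 0;

function push(value) {
  stack[sp] = value & 0xff;
  sp = (sp - 1) & 0xff;
}

// Reduced status byte: only bit 5, the break flag and the interrupt flag.
function statusByte() {
  return (1 << 5) | (1 << 4) | (interruptflag << 2);
}

function brk(vectorLow, vectorHigh) {
  var returnAddress = pc + 1;
  push(returnAddress >> 8);
  push(returnAddress & 0xff);
  push(statusByte());  // pushed while I still has its old value
  interruptflag = 1;   // only now disable further interrupts
  pc = vectorLow + vectorHigh * 256;
}

brk(0xe5, 0x36);                // hypothetical vector bytes
var pushedStatus = stack[0xfd]; // 0x30: I was clear at the time of the push
```

After the call, the pushed status byte is $30 while the live interrupt flag is 1, which is the combination the Test Suite checks for.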

Next issue: SEI and CLI not implemented. So let's implement them:

/*CLI  Clear Interrupt Disable Bit

     0 -> I                           N Z C I D V
                                      - - - 0 - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       CLI           58    1     2 */

      case 0x58: 
        interruptflag = 0;        
      break;

/*SEI  Set Interrupt Disable Status

     1 -> I                           N Z C I D V
                                      - - - 1 - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       SEI           78    1     2 */

      case 0x78: 
        interruptflag = 1;                
      break;

The next failing point is at address $d23. Let's look again at the surrounding code:

0D13   8D 00 02   STA $0200
0D16   A2 01      LDX #$01
0D18   A9 FF      LDA #$FF
0D1A   48         PHA
0D1B   28         PLP
0D1C   9A         TXS
0D1D   08         PHP
0D1E   AD 01 01   LDA $0101
0D21   C9 FF      CMP #$FF
0D23   D0 FE      BNE $0D23

While debugging through this code, it seemed that the TXS affected some flags. Something just didn't seem right. I had another look at the datasheet from masswerk, and it actually stated that TXS should affect some flags:

 X to Stack Register

     X -> SP                          N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TXS           9A    1     2

But still, something didn't seem right. I consulted other sources and bingo! Flags shouldn't be affected by this instruction. I commented out the setting of flags for this instruction and restarted the Test Suite.
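
The failing test sequence can be replayed in isolation. The sketch below is a hypothetical helper of my own (`runTxsCheck` is not part of the emulator) that walks through the same steps as the code at $0D16 to $0D1E, once with a TXS that leaves the flags alone and once with the buggy variant that updates N and Z.

```javascript
// Sketch of the failing test sequence: PLP with $FF, TXS, PHP, read $0101.
function runTxsCheck(txsTouchesFlags) {
  const mem = new Uint8Array(0x200);
  let sp = 0xff;
  let status = 0;

  const push = (v) => { mem[0x100 | sp] = v & 0xff; sp = (sp - 1) & 0xff; };
  const pop = () => { sp = (sp + 1) & 0xff; return mem[0x100 | sp]; };

  const x = 0x01;      // LDX #$01
  push(0xff);          // LDA #$FF / PHA
  status = pop();      // PLP: every flag bit set
  sp = x;              // TXS
  if (txsTouchesFlags) {
    // The buggy behaviour: update Z (bit 1) and N (bit 7) from the value.
    status &= ~0x82;
    if (sp === 0) status |= 0x02;
    if (sp & 0x80) status |= 0x80;
  }
  push(status);        // PHP writes to $0101 because SP is now $01
  return mem[0x101];   // LDA $0101
}
```

With `txsTouchesFlags` set to `false` the value read back is $FF, which is what the CMP #$FF expects. With `true`, the N and Z bits are cleared and the comparison fails, which is exactly the endless loop at $0D23.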

Next, the emulator got stuck at address $22B3. Here is the surrounding code:

22A5   28         PLP
22A6   4A         LSR A
22A7   08         PHP
22A8   DD 19 02   CMP $0219,X
22AB   D0 FE      BNE $22AB
22AD   68         PLA
22AE   49 7C      EOR #$7C
22B0   DD 29 02   CMP $0229,X
22B3   D0 FE      BNE $22B3

From this code we can see that the test performs an LSR and checks the affected flags. I did some debugging and found that the EOR yields a result of $82. This gets compared to the value at effective address $022C, which is $02. It appears that after the LSR operation the negative flag is set. This is strange, because an LSR operation always shifts a zero into bit 7, yielding a positive number.

I looked carefully at the definition on the masswerk datasheet, and here we go: it states that the negative flag is unaffected. I again consulted other datasheets, and all of them say that the negative flag is set to 0 as part of the LSR operation.

I adjusted these instructions accordingly:

      case 0x4A: 
          carryflag = ((acc & 0x1) != 0) ? 1 : 0;
          acc = acc >> 1;
          acc = acc & 0xff;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = 0;
        break;
      case 0x46: 
      case 0x56: 
      case 0x4E: 
      case 0x5E: 
          var tempVal = localMem.readMem(effectiveAdrress);
          carryflag = ((tempVal & 0x1) != 0) ? 1 : 0;
          tempVal = tempVal >> 1;
          tempVal = tempVal & 0xff;
          zeroflag = (tempVal == 0) ? 1 : 0;
          localMem.writeMem(effectiveAdrress, tempVal);
          negativeflag = 0;
        break;
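
As a quick sanity check of the corrected behaviour, the flag logic can be pulled out into a small pure function. This is only a sketch for experimenting; `lsr` is a name I chose here and it returns the flags instead of writing to the emulator's variables.

```javascript
// Logical shift right of an 8-bit value, returning the result and flags.
function lsr(value) {
  const carry = value & 0x01;         // bit 0 falls out into the carry
  const result = (value >> 1) & 0xff; // a zero is shifted into bit 7
  return {
    result: result,
    carry: carry,
    zero: result === 0 ? 1 : 0,
    negative: 0,                      // bit 7 is always 0 after LSR
  };
}
```

For example, `lsr(0xff)` gives a result of 0x7f with carry 1, and `lsr(0x01)` gives 0 with both carry and zero set. In neither case can the negative flag end up set.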

I had two last issues to fix before all test cases succeeded. I will give a short summary of these two issues.

The first issue was that for SBC I applied two's complement to the operand instead of one's complement, so I had an off-by-one issue with SBC.
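
To make the off-by-one concrete, here is a minimal sketch of binary-mode SBC written as an addition of the one's complement. It is not the emulator's code: `sbcBinary` is a hypothetical helper, and decimal mode and the N and Z flags are left out.

```javascript
// Binary-mode SBC: A - M - (1 - C) is the same as A + (M XOR $FF) + C.
function sbcBinary(acc, operand, carry) {
  const inverted = operand ^ 0xff; // one's complement of the operand
  const sum = acc + inverted + carry;
  const result = sum & 0xff;
  return {
    result: result,
    carry: sum > 0xff ? 1 : 0,     // carry set means "no borrow"
    overflow: ((acc ^ result) & (inverted ^ result) & 0x80) !== 0 ? 1 : 0,
  };
}
```

With the carry set, `sbcBinary(0x50, 0x10, 1)` yields 0x40. If the operand were negated with two's complement (0xf0) and the carry still added, the same subtraction would come out as 0x41, one too many.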

Finally, I had to implement BCD mode addition and subtraction. For this I made use of code from my Java emulator and adapted it to JavaScript.
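
For readers who have not met BCD arithmetic before, the idea behind decimal-mode addition can be sketched as follows. This is not the code from my Java emulator, just a simplified illustration: `adcDecimal` is a made-up name, it assumes both inputs are valid BCD (each nibble 0 to 9), and it ignores the N, V and Z flags.

```javascript
// Decimal-mode addition on two packed BCD bytes plus carry-in.
function adcDecimal(acc, operand, carry) {
  let low = (acc & 0x0f) + (operand & 0x0f) + carry;
  if (low > 9) low += 6;   // skip the six non-decimal nibble codes
  let high = (acc >> 4) + (operand >> 4) + (low > 0x0f ? 1 : 0);
  if (high > 9) high += 6; // same adjustment for the tens digit
  return {
    result: ((high << 4) | (low & 0x0f)) & 0xff,
    carry: high > 0x0f ? 1 : 0,
  };
}
```

So `adcDecimal(0x19, 0x28, 0)` gives 0x47 (19 + 28 = 47), and `adcDecimal(0x99, 0x01, 0)` gives 0x00 with the carry set (99 + 1 = 100).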

In Summary

In this blog we started off by extending the Debugging functionality. We then ran the Klaus Test Suite.

As expected, there were a couple of issues that came out. In the end, however, we managed to run all tests.

In my next Blog we will be booting the C64 system with its ROMs.

Till next time!

Tuesday, 10 May 2016

Part 7: The Remaining Instructions

Foreword

In the previous Blog I covered the Stack and related operations.

In this post I will cover the remaining instructions that need to be implemented for our 6502 emulator.

So, we are almost done implementing the CPU part of our emulator! This is very exciting, since in the blog posts after this one I will cover running the Klaus Test Suite on the emulator and, after that, booting the C64 system with its ROMs.

Logical operations

The 6502 supports the following logical operations: AND, XOR (mnemonic EOR) and OR (mnemonic ORA).

Let's implement them:

/*AND  AND Memory with Accumulator

     A AND M -> A                     N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     immediate     AND #oper     29    2     2
     zeropage      AND oper      25    2     3
     zeropage,X    AND oper,X    35    2     4
     absolute      AND oper      2D    3     4
     absolute,X    AND oper,X    3D    3     4*
     absolute,Y    AND oper,Y    39    3     4*
     (indirect,X)  AND (oper,X)  21    2     6
     (indirect),Y  AND (oper),Y  31    2     5* */


      case 0x29:  acc = acc & arg1;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;
      case 0x25:
      case 0x35:
      case 0x2D:
      case 0x3D:
      case 0x39:
      case 0x21:
      case 0x31: acc = acc & localMem.readMem(effectiveAdrress);
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;



/*EOR  Exclusive-OR Memory with Accumulator

     A EOR M -> A                     N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     immediate     EOR #oper     49    2     2
     zeropage      EOR oper      45    2     3
     zeropage,X    EOR oper,X    55    2     4
     absolute      EOR oper      4D    3     4
     absolute,X    EOR oper,X    5D    3     4*
     absolute,Y    EOR oper,Y    59    3     4*
     (indirect,X)  EOR (oper,X)  41    2     6
     (indirect),Y  EOR (oper),Y  51    2     5* */

      case 0x49:  acc = acc ^ arg1;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;
      case 0x45:
      case 0x55:
      case 0x4D:
      case 0x5D:
      case 0x59:
      case 0x41:
      case 0x51: acc = acc ^ localMem.readMem(effectiveAdrress);
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;


/*ORA  OR Memory with Accumulator

     A OR M -> A                      N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     immediate     ORA #oper     09    2     2
     zeropage      ORA oper      05    2     3
     zeropage,X    ORA oper,X    15    2     4
     absolute      ORA oper      0D    3     4
     absolute,X    ORA oper,X    1D    3     4*
     absolute,Y    ORA oper,Y    19    3     4*
     (indirect,X)  ORA (oper,X)  01    2     6
     (indirect),Y  ORA (oper),Y  11    2     5* */

      case 0x09:  acc = acc | arg1;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;
      case 0x05:
      case 0x15:
      case 0x0D:
      case 0x1D:
      case 0x19:
      case 0x01:
      case 0x11: acc = acc | localMem.readMem(effectiveAdrress);
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;
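
All three operations follow the same pattern: combine the accumulator with the operand, then derive the zero and negative flags from the result. A small sketch with hypothetical helper names (`withNZ`, `and`, `eor`, `ora` are not part of the emulator) makes the pattern easy to try out:

```javascript
// Mask a value to 8 bits and derive the Z and N flags from it.
function withNZ(value) {
  const result = value & 0xff;
  return {
    result: result,
    zero: result === 0 ? 1 : 0,
    negative: (result & 0x80) !== 0 ? 1 : 0,
  };
}

const and = (acc, operand) => withNZ(acc & operand);
const eor = (acc, operand) => withNZ(acc ^ operand);
const ora = (acc, operand) => withNZ(acc | operand);
```

For instance, `and(0xf0, 0x0f)` gives 0 with the zero flag set, `eor(0xff, 0x7f)` gives 0x80 with the negative flag set, and `ora(0x01, 0x02)` gives 0x03 with both flags clear.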

Set and Clear Flag Instructions

The instructions for setting and clearing flags are as follows:

/*CLC  Clear Carry Flag

     0 -> C                           N Z C I D V
                                      - - 0 - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       CLC           18    1     2 */

      case 0x18: 
          carryflag = 0;
        break;



/*CLV  Clear Overflow Flag

     0 -> V                           N Z C I D V
                                      - - - - - 0

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       CLV           B8    1     2 */

      case 0xB8: 
          overflowflag = 0;
        break;


/*SEC  Set Carry Flag

     1 -> C                           N Z C I D V
                                      - - 1 - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       SEC           38    1     2 */

      case 0x38: 
          carryflag = 1;
        break;

Transfer Instructions

The implementation of the Transfer instructions is as follows:

/*TAX  Transfer Accumulator to Index X

     A -> X                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TAX           AA    1     2 */

      case 0xAA: 
          x = acc;
          zeroflag = (x == 0) ? 1 : 0;
          negativeflag = ((x & 0x80) != 0) ? 1 : 0;
        break;


/*TAY  Transfer Accumulator to Index Y

     A -> Y                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TAY           A8    1     2 */

      case 0xA8: 
          y = acc;
          zeroflag = (y == 0) ? 1 : 0;
          negativeflag = ((y & 0x80) != 0) ? 1 : 0;
        break;


/*TSX  Transfer Stack Pointer to Index X

     SP -> X                          N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TSX           BA    1     2 */

      case 0xBA: 
          x = sp;
          zeroflag = (x == 0) ? 1 : 0;
          negativeflag = ((x & 0x80) != 0) ? 1 : 0;
        break;



/*TXA  Transfer Index X to Accumulator

     X -> A                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TXA           8A    1     2 */

      case 0x8A: 
          acc = x;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;


/*TXS  Transfer Index X to Stack Register

     X -> SP                          N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TXS           9A    1     2 */

      case 0x9A: 
          sp = x;
          zeroflag = (sp == 0) ? 1 : 0;
          negativeflag = ((sp & 0x80) != 0) ? 1 : 0;
        break;



/*TYA  Transfer Index Y to Accumulator

     Y -> A                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       TYA           98    1     2 */

      case 0x98: 
          acc = y;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
        break;

The Bit Instruction

The BIT instruction can be described by the three flags it affects:

Negative flag: Set to the value of bit 7 of the contents at the effective memory address
Overflow flag: Set to the value of bit 6 of the contents at the effective memory address
Zero flag: Set if the result of ANDing the accumulator with the contents of the memory location is zero

Here is the implementation:

/*BIT  Test Bits in Memory with Accumulator

     bits 7 and 6 of operand are transferred to bit 7 and 6 of SR (N,V);
     the zeroflag is set to the result of operand AND accumulator.

     A AND M, M7 -> N, M6 -> V        N Z C I D V
                                     M7 + - - - M6

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     zeropage      BIT oper      24    2     3
     absolute      BIT oper      2C    3     4 */

      case 0x24: 
      case 0x2C: 
          var tempVal = localMem.readMem(effectiveAdrress);
          negativeflag = ((tempVal & 0x80) != 0) ? 1 : 0;
          overflowflag = ((tempVal & 0x40) != 0) ? 1 : 0;
          zeroflag = ((acc & tempVal) == 0) ? 1 : 0;
        break;
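
A worked example helps here, because BIT is the one instruction where the flags come from two different places. The following sketch uses a hypothetical pure function `bit` that returns the three flags instead of updating the emulator's variables:

```javascript
// BIT: N and V are copied from the operand, Z comes from acc AND operand.
function bit(acc, operand) {
  return {
    negative: (operand >> 7) & 1,           // bit 7 of the memory value
    overflow: (operand >> 6) & 1,           // bit 6 of the memory value
    zero: (acc & operand) === 0 ? 1 : 0,    // the accumulator is not changed
  };
}
```

With `bit(0x01, 0xc0)` all three flags are set: bits 7 and 6 of the operand are 1, and the AND of 0x01 and 0xc0 is zero. With `bit(0x40, 0x40)` only the overflow flag is set.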

Shifting Instructions

Here is the implementation of all the shift Instructions:

/*ASL  Shift Left One Bit (Memory or Accumulator)

     C <- [76543210] <- 0             N Z C I D V
                                      + + + - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     accumulator   ASL A         0A    1     2
     zeropage      ASL oper      06    2     5
     zeropage,X    ASL oper,X    16    2     6
     absolute      ASL oper      0E    3     6
     absolute,X    ASL oper,X    1E    3     7 */

      case 0x0A: 
          acc = acc << 1;
          carryflag = ((acc & 0x100) != 0) ? 1 : 0;
          acc = acc & 0xff;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
          zeroflag = (acc == 0) ? 1 : 0;
         break;
      case 0x06: 
      case 0x16: 
      case 0x0E: 
      case 0x1E: 
          var tempVal = localMem.readMem(effectiveAdrress);
          tempVal = tempVal << 1;
          carryflag = ((tempVal & 0x100) != 0) ? 1 : 0;
          tempVal = tempVal & 0xff;
          negativeflag = ((tempVal & 0x80) != 0) ? 1 : 0;
          zeroflag = (tempVal == 0) ? 1 : 0;
          localMem.writeMem(effectiveAdrress, tempVal);
        break;



/*LSR  Shift One Bit Right (Memory or Accumulator)

     0 -> [76543210] -> C             N Z C I D V
                                      - + + - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     accumulator   LSR A         4A    1     2
     zeropage      LSR oper      46    2     5
     zeropage,X    LSR oper,X    56    2     6
     absolute      LSR oper      4E    3     6
     absolute,X    LSR oper,X    5E    3     7 */

      case 0x4A: 
          carryflag = ((acc & 0x1) != 0) ? 1 : 0;
          acc = acc >> 1;
          acc = acc & 0xff;
          zeroflag = (acc == 0) ? 1 : 0;
        break;
      case 0x46: 
      case 0x56: 
      case 0x4E: 
      case 0x5E: 
          var tempVal = localMem.readMem(effectiveAdrress);
          carryflag = ((tempVal & 0x1) != 0) ? 1 : 0;
          tempVal = tempVal >> 1;
          tempVal = tempVal & 0xff;
          zeroflag = (tempVal == 0) ? 1 : 0;
          localMem.writeMem(effectiveAdrress, tempVal);
        break;


/*ROL  Rotate One Bit Left (Memory or Accumulator)

     C <- [76543210] <- C             N Z C I D V
                                      + + + - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     accumulator   ROL A         2A    1     2
     zeropage      ROL oper      26    2     5
     zeropage,X    ROL oper,X    36    2     6
     absolute      ROL oper      2E    3     6
     absolute,X    ROL oper,X    3E    3     7 */


      case 0x2A: 
          acc = acc << 1;
          acc = acc | carryflag;
          carryflag = ((acc & 0x100) != 0) ? 1 : 0;
          acc = acc & 0xff;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;

      case 0x26: 
      case 0x36: 
      case 0x2E: 
      case 0x3E: 
          var tempVal = localMem.readMem(effectiveAdrress);
          tempVal = tempVal << 1;
          tempVal = tempVal | carryflag;
          carryflag = ((tempVal & 0x100) != 0) ? 1 : 0;
          tempVal = tempVal & 0xff;
          zeroflag = (tempVal == 0) ? 1 : 0;
          negativeflag = ((tempVal & 0x80) != 0) ? 1 : 0;
          localMem.writeMem(effectiveAdrress,tempVal);
      break;

/*ROR  Rotate One Bit Right (Memory or Accumulator)

     C -> [76543210] -> C             N Z C I D V
                                      + + + - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     accumulator   ROR A         6A    1     2
     zeropage      ROR oper      66    2     5
     zeropage,X    ROR oper,X    76    2     6
     absolute      ROR oper      6E    3     6
     absolute,X    ROR oper,X    7E    3     7  */

      case 0x6A: 
          acc = acc | (carryflag << 8);
          carryflag = ((acc & 0x1) != 0) ? 1 : 0;
          acc = acc >> 1;
          acc = acc & 0xff;
          zeroflag = (acc == 0) ? 1 : 0;
          negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;
      case 0x66: 
      case 0x76: 
      case 0x6E: 
      case 0x7E: 
          var tempVal = localMem.readMem(effectiveAdrress);
          tempVal = tempVal | (carryflag << 8);
          carryflag = ((tempVal & 0x1) != 0) ? 1 : 0;
          tempVal = tempVal >> 1;
          tempVal = tempVal & 0xff;
          zeroflag = (tempVal == 0) ? 1 : 0;
          negativeflag = ((tempVal & 0x80) != 0) ? 1 : 0;
          localMem.writeMem(effectiveAdrress, tempVal);
      break;
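
The rotate instructions are easiest to understand as a rotation through nine bits: the eight data bits plus the carry. Here is a small sketch with hypothetical pure functions `rol` and `ror`, which take the carry as input and return the new carry alongside the result (the N and Z flags are left out for brevity):

```javascript
// Rotate left through carry: the old carry enters bit 0, bit 7 becomes the new carry.
function rol(value, carryIn) {
  const wide = (value << 1) | carryIn;
  return { result: wide & 0xff, carry: (wide >> 8) & 1 };
}

// Rotate right through carry: the old carry enters bit 7, bit 0 becomes the new carry.
function ror(value, carryIn) {
  const wide = value | (carryIn << 8);
  return { result: (wide >> 1) & 0xff, carry: wide & 1 };
}
```

For example, `rol(0x80, 0)` gives 0x00 with the carry set, and feeding that into `ror(0x00, 1)` brings back 0x80 with the carry clear. Nine ROLs in a row return any value and carry to where they started.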

This concludes the development for this blog post.

I will not be writing a test program for this blog post since I will be running the Klaus Test Suite in my next Blog. I'm sure we will find a couple of bugs when running this Test Suite!

One Final thought

I am just about done with this Blog Post. We have implemented all instructions we planned for and are ready to run a Test Suite.

One thing that would be nice is to know if we missed an instruction. For this purpose I am adding the following default selector in our switch construct:

      default: alert("Op code "+opcode+" not implemented. PC = "+pc.toString(16));

So, if the switch statement doesn't find any case statements matching the given opcode, it will execute the default label, which will pop up a message.
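
One detail to be aware of: the message above prints the opcode in decimal and the PC in hex. If you would rather see both in hex, so that they can be compared directly with a disassembly, a small formatting helper could look like this. `unimplementedOpcodeMessage` is a hypothetical name, not something in the emulator:

```javascript
// Build the "not implemented" message with opcode and PC both in hex.
function unimplementedOpcodeMessage(opcode, pc) {
  const hex = (value, width) =>
    value.toString(16).toUpperCase().padStart(width, "0");
  return "Op code $" + hex(opcode, 2) + " not implemented. PC = $" + hex(pc, 4);
}
```

The returned string can be passed to `alert` in the default label in the same way as before.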

In Summary

In this Blog we covered implementing the remaining instructions for our emulator.

In the next Post I will give a detailed account on what went right and what went wrong when I tried to run the Klaus Test Suite on the emulator.

Till Next Time!

Thursday, 5 May 2016

Part 6: The Stack and Related Operations

Foreword

In the previous post we covered the compare and branching instructions.

In this post we will cover the stack and related instructions.

Introduction to the stack

A real world example of a stack is a stack of receipts on a spike in a restaurant.

Consider the situation where you have 100 receipts on the spike and you need to get to the very first receipt you pushed onto the spike. This would mean pulling all 99 receipts above this receipt to get to it. Clearly this system is not very suitable for this kind of operation.

However, if you need to get to the last receipt you pushed onto the spike, the task is much simpler, since the last pushed receipt will always be on the top of the stack.

How is the stack implemented on the 6502? Firstly, the stack is located in page 1 of memory, that is, memory locations $100 to $1FF.

Also, on the 6502 the stack grows downwards. This means from location $1FF towards location $100.

How does the 6502 keep track of where we are on the stack? The 6502 contains a register called the Stack Pointer (SP). This register is also 8 bits in size, so the ninth bit of the stack address (which is always one) is implied.

The most basic operations of the stack are push and pop. A push puts a byte of data on the stack and decrements the stack pointer (remember, the stack grows downwards). A pop increments the stack pointer and retrieves a byte of data from the stack.

Implementing the stack

Ok. Let's implement the stack in our Emulator and implement its basic operations, push and pop.

First, we need to create a private variable in our Cpu class for the stack pointer:

  var sp = 0xff;

As you can see, we initialise this register with 0xff from the start. This is because the stack starts to grow at memory address $1FF.

Next, let's implement the push operation:

    function Push(value) {
      localMem.writeMem((sp | 0x100), value);
      sp--;
      sp = sp & 0xff;
    }

And, finally the pop:

    function Pop() {
      sp++;
      sp = sp & 0xff;
      var result = localMem.readMem(sp | 0x100);
      return result;
    }

Note that in the pop we increment SP before the memory operation, the opposite of what the push does. This is because the stack pointer always points to the next free location, not to the last byte pushed. The last byte pushed sits one address higher, at SP + 1.
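
To convince yourself of the last-in, first-out behaviour, the two functions can be tried out on their own. The sketch below wraps them in a hypothetical `demoStack` function with a plain `Uint8Array` standing in for the memory class, so it runs without the rest of the emulator:

```javascript
// Standalone demo of push and pop on page 1 ($100 to $1FF).
function demoStack() {
  const mem = new Uint8Array(0x10000);
  let sp = 0xff;

  function push(value) {
    mem[sp | 0x100] = value & 0xff;
    sp = (sp - 1) & 0xff;
  }

  function pop() {
    sp = (sp + 1) & 0xff;
    return mem[sp | 0x100];
  }

  push(0x11);                   // stored at $1FF
  push(0x22);                   // stored at $1FE
  const spAfterPushes = sp;     // now points at the free slot $1FD
  const first = pop();          // last byte pushed comes back first
  const second = pop();
  return { spAfterPushes, first, second, spAtEnd: sp, at1ff: mem[0x1ff], at1fe: mem[0x1fe] };
}
```

After two pushes the stack pointer has moved from 0xff down to 0xfd, the pops return 0x22 and then 0x11, and the stack pointer ends up back at 0xff.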

Implementing Push and Pop opcodes

Time to implement the Push and Pop opcodes.

Two of these opcodes require special mention: PHP (Push Processor Status on Stack) and PLP (Pull Processor Status from Stack). These two opcodes work with all the status flags as a single byte. Currently this is not how we implement status flags in our emulator. To help us out, I will create two convenience methods that retrieve and set the flags as a single byte.

Firstly, we need to know which flag each bit in the status byte represents. We also get this info from the masswerk website:

SR Flags (bit 7 to bit 0):

N .... Negative
V .... Overflow
- .... ignored
B .... Break
D .... Decimal (use BCD for arithmetics)
I .... Interrupt (IRQ disable)
Z .... Zero
C .... Carry

For the moment we will not worry about the Break, Decimal or Interrupt flags. We will implement them as required.

Next, let's create a method for retrieving the status flags as a byte:

    function getStatusFlagsAsByte() {
      var result = (negativeflag << 7) | (overflowflag << 6) | (zeroflag << 1) |
        (carryflag);
      return result;
    }

Here is the method to set the status flags with a byte:

    function setStatusFlagsAsByte(value) {
      negativeflag = (value >> 7) & 1;
      overflowflag = (value >> 6) & 1;
      zeroflag = (value >> 1) & 1;
      carryflag = (value) & 1;
    }
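
A quick round trip shows how the bit positions line up. The sketch below uses hypothetical pure versions of the two methods (`packFlags` and `unpackFlags` are names of my own) and, like the methods above, only covers the four flags implemented so far:

```javascript
// Pack N, V, Z and C into their bit positions in the status byte.
function packFlags(flags) {
  return (flags.negative << 7) | (flags.overflow << 6) |
    (flags.zero << 1) | flags.carry;
}

// Extract N, V, Z and C from a status byte.
function unpackFlags(value) {
  return {
    negative: (value >> 7) & 1,
    overflow: (value >> 6) & 1,
    zero: (value >> 1) & 1,
    carry: value & 1,
  };
}
```

With negative, zero and carry set, `packFlags` returns 0x83 (binary 10000011). Going the other way, `unpackFlags(0x41)` reports overflow and carry set and the other two clear.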

With this implemented, it is now straightforward to implement the push and pop instructions:

/*PHA  Push Accumulator on Stack

     push A                           N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PHA           48    1     3 */

      case 0x48:
        Push(acc);
      break;


/*PHP  Push Processor Status on Stack

     push SR                          N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PHP           08    1     3 */

      case 0x08:
        Push(getStatusFlagsAsByte());
      break;


/*PLA  Pull Accumulator from Stack

     pull A                           N Z C I D V
                                      + + - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PLA           68    1     4 */

      case 0x68:
        acc = Pop();
        zeroflag = (acc == 0) ? 1 : 0;
        negativeflag = ((acc & 0x80) != 0) ? 1 : 0;
      break;



/*PLP  Pull Processor Status from Stack

     pull SR                          N Z C I D V
                                      from stack

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       PLP           28    1     4 */

      case 0x28:
        setStatusFlagsAsByte(Pop());
      break;

JSR and RTS

The JSR (Jump to Subroutine) instruction is almost the same as the Jump (JMP) instruction. The only difference is that JSR remembers the address it was called from, so that when RTS (Return from Subroutine) is called, it jumps back to this address.

Let's have a look at the definition of JSR:

JSR  Jump to New Location Saving Return Address

     push (PC+2),                     N Z C I D V
     (PC+1) -> PCL                    - - - - - -
     (PC+2) -> PCH

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     absolute      JSR oper      20    3     6

From this we see that we do not store the address of the next instruction, but that of the preceding memory location (i.e. the third byte of the JSR instruction). This is something to keep in mind when executing the RTS instruction.

Something that is not clear from this definition is the order in which the address is pushed onto the stack. Googling this question doesn't give you a straight answer. However, after some digging I found this link:

http://nesdev.com/6502.txt

At the end of this page it gives some snippets of code from the VICE emulator showing how each instruction is implemented. For JSR, here is the snippet:

/* JSR */
    PC--;
    PUSH((PC >> 8) & 0xff); /* Push return address onto the stack. */
    PUSH(PC & 0xff);
    PC = (src);

The first line, PC-- means don't use the address of the next instruction, but the preceding byte. This is as expected as we discussed earlier.

The next two lines give us the answer to our question. First the high byte is pushed, and then the low byte. Not very little-endian like :-) However, when RTS executes, the pops return the parts of the return address in the reverse order, low byte first, which is little-endian!

Let's implement JSR:

/*JSR  Jump to New Location Saving Return Address

     push (PC+2),                     N Z C I D V
     (PC+1) -> PCL                    - - - - - -
     (PC+2) -> PCH

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     absolute      JSR oper      20    3     6 */

      case 0x20:
        var tempVal = pc - 1;
        Push((tempVal >> 8) & 0xff);
        Push(tempVal & 0xff);
        pc = effectiveAdrress;
      break;

Let us also implement RTS:

/*RTS  Return from Subroutine

     pull PC, PC+1 -> PC              N Z C I D V
                                      - - - - - -

     addressing    assembler    opc  bytes  cycles
     --------------------------------------------
     implied       RTS           60    1     6 */

      case 0x60:
        var tempVal = Pop();
        tempVal = tempVal + Pop() * 256;
        pc = tempVal + 1;
      break;
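
The interplay between the two instructions can be traced with a few lines of standalone code. This is only a sketch under assumed values: `demoJsrRts` is a hypothetical function, and the JSR is imagined to sit at address $C005, so that after fetching its three bytes the program counter is $C008.

```javascript
// Trace a JSR followed by an RTS, using the same push order as above.
function demoJsrRts() {
  const mem = new Uint8Array(0x10000);
  let sp = 0xff;
  const push = (v) => { mem[0x100 | sp] = v & 0xff; sp = (sp - 1) & 0xff; };
  const pop = () => { sp = (sp + 1) & 0xff; return mem[0x100 | sp]; };

  // JSR: pc has already moved past the three bytes of the instruction.
  let pc = 0xc008;
  const ret = pc - 1;          // $C007, the third byte of the JSR
  push((ret >> 8) & 0xff);     // high byte first, lands at $1FF
  push(ret & 0xff);            // low byte second, lands at $1FE
  pc = 0xd000;                 // assumed subroutine address

  const stackHigh = mem[0x1ff];
  const stackLow = mem[0x1fe];

  // RTS: low byte comes off first, then the high byte, then add one.
  const lo = pop();
  const hi = pop();
  pc = ((hi << 8) | lo) + 1;

  return { stackHigh, stackLow, pc, sp };
}
```

The stack holds $C0 at $1FF and $07 at $1FE while the subroutine runs, and after the RTS the program counter is back at $C008, the instruction following the JSR.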

Testing

As usual we end off a post with a small test program:

0000 LDA #$52   A9 52
0002 PHA        48
0003 LDA #$07   A9 07
0005 JSR $000A  20 0A 00
0008 PLA        68
0009 BRK        00
000A SBC #$06   E9 06
000C RTS        60

Here is the program as a byte array to copy to the memory class:

0xA9, 0x52,
0x48,
0xA9, 0x07,
0x20, 0x0A, 0x00,
0x68,
0x00,
0xE9, 0x06,
0x60
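
If you want to check by hand what this program should do, here is a deliberately tiny interpreter that knows only the seven opcodes used above. It is a sketch, not the emulator from this series: `runTestProgram` is a hypothetical name, the carry flag is assumed to start cleared, SBC works in binary mode only, and the N and Z flags are not tracked.

```javascript
// Run the test program and record the accumulator after every instruction.
function runTestProgram(program) {
  const mem = new Uint8Array(0x10000);
  mem.set(program, 0);
  let pc = 0, acc = 0, sp = 0xff, carry = 0;
  const accTrace = [];

  const push = (v) => { mem[0x100 | sp] = v & 0xff; sp = (sp - 1) & 0xff; };
  const pop = () => { sp = (sp + 1) & 0xff; return mem[0x100 | sp]; };

  for (let steps = 0; steps < 100; steps++) {
    const opcode = mem[pc++];
    switch (opcode) {
      case 0xa9: acc = mem[pc++]; break;            // LDA #imm
      case 0x48: push(acc); break;                  // PHA
      case 0x68: acc = pop(); break;                // PLA
      case 0x20: {                                  // JSR abs
        const target = mem[pc] | (mem[pc + 1] << 8);
        const ret = pc + 1;                         // third byte of the JSR
        push(ret >> 8);
        push(ret & 0xff);
        pc = target;
        break;
      }
      case 0x60: {                                  // RTS
        const lo = pop();
        const hi = pop();
        pc = ((hi << 8) | lo) + 1;
        break;
      }
      case 0xe9: {                                  // SBC #imm, binary mode
        const sum = acc + (mem[pc++] ^ 0xff) + carry;
        carry = sum > 0xff ? 1 : 0;
        acc = sum & 0xff;
        break;
      }
      case 0x00:                                    // BRK: stop the demo here
        return { acc, sp, carry, accTrace };
      default:
        throw new Error("Unexpected opcode " + opcode.toString(16));
    }
    accTrace.push(acc);
  }
  throw new Error("Program did not reach BRK");
}

const testProgram = [0xa9, 0x52, 0x48, 0xa9, 0x07, 0x20, 0x0a, 0x00,
  0x68, 0x00, 0xe9, 0x06, 0x60];
const testResult = runTestProgram(testProgram);
```

Under these assumptions the SBC inside the subroutine leaves 0 in the accumulator (7 minus 6, minus one more for the cleared carry), and the PLA after the return restores $52. The stack pointer ends up back at $FF, which shows that the pushes and pops are balanced.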

I also adjusted getDebugReg() in the Cpu class to return the contents of the stack pointer as well. You can get all of this in the ch6 tag on GitHub.

In Summary

In this post we covered the stack and related instructions.

In the next post I will cover the remaining instructions we still need to implement in our emulator. That will be my final post on implementing 6502 instructions in our emulator.

With the majority of instructions implemented, I will then write a post in which I give the Klaus Test Suite a spin on our emulator and fix the bugs it brings forth.

This is all for this post.

Till next time!