From: regnirps@aol.com (Regnirps)
Newsgroups: comp.sys.apple2
Subject: ARM and 6502
Date: 21 May 1998 07:10:02 GMT
Message-ID: <1998052107100200.DAA08978@ladder01.news.aol.com>
Organization: AOL http://www.aol.com

ARM Simulation of 65C02

Name some of the 16 registers available. 14 are general purpose, number 14 is the link register and 15 is the program counter.

STACK      RN 4    ; The 65C02 8 bit stack pointer
XREGISTER  RN 5    ; The 8 bit X register
YREGISTER  RN 6    ; The 8 bit Y register
ACCUM      RN 7    ; The 8 bit accumulator
PC02       RN 8    ; The 16 bit program counter
ESTATUS    RN 9    ; The 8 bit status register
ECODE      RN 10   ; Simulator's pointer into 6502 code.
EP         RN 11   ; Simulator's pointer to base of code table.
ETemp      RN 3    ; Calculate offset into code table.
INTERPRET  RN 2    ; Holds address of interpreter.
CRAM       RN 12   ; Base of 65C02 RAM
SP         RN 13   ; ARM Stack Pointer
LK         RN 14   ; Link
PC         RN 15   ; Program Counter

           ADR INTERPRET,EXECUTE   ; Load address of interpreter.

The main idea is to treat 65C02 instructions as tokens, as in BASIC or Java or some forms of Forth. The instruction byte is used as an 8 bit index to the code we want to run in order to emulate that instruction. If the instruction is left shifted by 6 bits (multiplied by 64), the shifted value points to a unique 64 byte section of memory. This is enough room for 16 ARM instructions, so the total emulator space will be 64 x 256 or 16 kBytes, plus a few bytes for the interpreter, which is less than 10 instructions in its simplest form.

From here on you simply dissect the actions of the instruction and do them as if you were the "microcode" inside a 65C02.
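The slot arithmetic above is easy to check on paper, but here is a minimal sketch of it in Python for anyone who wants to play with the numbers. The function names and the example table base of 0x8000 are made up for illustration; only the shift by 6, the 256 opcodes and the 4 byte ARM instruction size come from the scheme described in the post.

```python
SLOT_SHIFT = 6                      # left shift by 6 = multiply by 64
SLOT_BYTES = 1 << SLOT_SHIFT        # 64 bytes reserved per 65C02 opcode
ARM_INSTRUCTION_BYTES = 4           # every ARM instruction is one 32 bit word
OPCODE_COUNT = 256                  # one slot for each possible opcode byte


def handler_address(table_base, opcode):
    """Address of the emulation code for one opcode (hypothetical helper)."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    return table_base + (opcode << SLOT_SHIFT)


def instructions_per_slot():
    """How many ARM instructions fit in one slot."""
    return SLOT_BYTES // ARM_INSTRUCTION_BYTES


def table_bytes():
    """Total size of the code table."""
    return OPCODE_COUNT * SLOT_BYTES


if __name__ == "__main__":
    # 0x8000 is an arbitrary example base, not a value from the post.
    print(hex(handler_address(0x8000, 0xA1)))
    print(instructions_per_slot(), table_bytes())
```

Opcodes whose emulation does not fit in 16 ARM instructions would need to branch out of their slot to a longer routine; that case is not modelled here.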

Let's say the 6502 program counter points to some location that contains an instruction -- always true at this point in the code. We fetch the byte and multiply by 64 to get the offset of the code that simulates this instruction. That will be a fetch, a left shift by 6 and a jump to that location.

EXECUTE  LDRB ETemp,[ECODE],#1        ; Fetch opcode, post increment ECODE
         ADD  PC,EP,ETemp,LSL #6      ; Jump to EP + opcode * 64

B is ARMish for Branch, but it only takes a label, so the jump is made by adding the shifted opcode to the table base and writing the result to PC.

Let's try one of the hardest first. Say the instruction is the pre-indexed indirect load,

         LDA (ZP,X)

where ZP is an immediate zero page address. Op-code and data are

         A1 ZP

LDA (ZP,X) says add ZP to X, and the result points to a 16 bit address in zero page which points to the data to load. The A1 has already been used to get the address of the code below, so ECODE points to the value ZP. Since an LDA trashes the contents of A, we can use A (ACCUM) as a temporary location for calculating the final address.

; The code we just branched to is:
EMA1     LDRB ACCUM,[ECODE],#1        ; ECODE points to ZP. Post inc ECODE by 1.
         ADD  ACCUM,ACCUM,XREGISTER   ; ACCUM = X + ZP (wrap past $FF ignored)
         ADD  ACCUM,ACCUM,CRAM        ; Turn it into an ARM address in 65C02 RAM.
         LDRB ETemp,[ACCUM],#1        ; Low byte of Zpage data, post inc ACCUM
         LDRB ACCUM,[ACCUM]           ; ACCUM = high byte.
         ORR  ACCUM,ETemp,ACCUM,LSL #8 ; Put them together.
         LDRB ACCUM,[CRAM,ACCUM]      ; ACCUM = (ZP + X)
         MOVS ETemp,ACCUM,LSL #24     ; Set N and Z from the 8 bit result.
         MOV  ESTATUS,PC              ; Move ARM status bits to our status reg.
         MOV  PC,INTERPRET            ; Go do another instruction.

10 ARM instructions here and 2 for the interpreter (and I probably did this the long way) gives 12 ARM instructions. The 6502 takes 6 cycles, for a 12/6 ratio.

Note that if we have a spare register, we can keep the address of the interpreter, called EXECUTE above, in it and go directly to the interpreter at the end of an instruction group with

         MOV  PC,INTERPRET

This puts the goto in the local code instead of the interpreter and can be more flexible for some things but less for others. But it saves one ARM instruction every time. I don't know what it does to the pipeline.

How about a simpler load from an absolute zero page address, like LDA ZP?
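
For anyone who wants to check the addressing logic of these two loads without an ARM assembler, here is a rough model in Python. It is only a sketch, not the simulator itself: the class and function names are invented, the attributes just mirror the register aliases in the post, and the dispatch is a Python dictionary rather than a shift into a code table. Unlike the ARM listing, it masks ZP + X to 8 bits the way a real 6502 does. The opcodes A1 and A5 and the N and Z bit positions are the standard 6502 ones.

```python
OP_LDA_IZX = 0xA1   # LDA (ZP,X)
OP_LDA_ZP = 0xA5    # LDA ZP
FLAG_N = 0x80       # negative flag, bit 7 of the status register
FLAG_Z = 0x02       # zero flag, bit 1 of the status register


class Cpu:
    """Hypothetical model of the simulator state."""

    def __init__(self, ram, pc=0):
        self.ram = ram          # plays the role of the memory at CRAM
        self.ecode = pc         # plays the role of ECODE
        self.accum = 0
        self.xregister = 0
        self.estatus = 0

    def fetch(self):
        """Read the byte at ECODE and post increment, like LDRB ...,#1."""
        byte = self.ram[self.ecode]
        self.ecode = (self.ecode + 1) & 0xFFFF
        return byte

    def set_nz(self, value):
        """Update only N and Z from an 8 bit result."""
        self.estatus &= ~(FLAG_N | FLAG_Z) & 0xFF
        if value == 0:
            self.estatus |= FLAG_Z
        if value & 0x80:
            self.estatus |= FLAG_N


def lda_izx(cpu):
    zp = (cpu.fetch() + cpu.xregister) & 0xFF   # ZP + X, wrapped in zero page
    low = cpu.ram[zp]
    high = cpu.ram[(zp + 1) & 0xFF]
    cpu.accum = cpu.ram[low | (high << 8)]      # load through the pointer
    cpu.set_nz(cpu.accum)


def lda_zp(cpu):
    cpu.accum = cpu.ram[cpu.fetch()]            # operand is the address itself
    cpu.set_nz(cpu.accum)


HANDLERS = {OP_LDA_IZX: lda_izx, OP_LDA_ZP: lda_zp}


def step(cpu):
    """One trip through the interpreter: fetch the opcode and dispatch."""
    HANDLERS[cpu.fetch()](cpu)


def demo():
    ram = bytearray(0x10000)
    ram[0x0200:0x0204] = bytes([0xA1, 0x10, 0xA5, 0x20])  # LDA ($10,X) / LDA $20
    ram[0x14], ram[0x15] = 0x00, 0x30                     # pointer at $14 -> $3000
    ram[0x3000] = 0x80
    cpu = Cpu(ram, pc=0x0200)
    cpu.xregister = 4
    step(cpu)
    first = (cpu.accum, cpu.estatus)
    step(cpu)
    second = (cpu.accum, cpu.estatus)
    return first, second
```

Calling demo() runs both loads once: the first picks up the byte at $3000 through the pointer at $14, the second reads $20 directly.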
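
         LDRB ACCUM,[ECODE],#1        ; ECODE points to ZP. Post inc ECODE by 1.
         LDRB ACCUM,[CRAM,ACCUM]      ; ACCUM = byte at zero page address ZP.
         MOVS ETemp,ACCUM,LSL #24     ; Set N and Z from the 8 bit result.
         MOV  ESTATUS,PC              ; Move ARM status bits to our status reg.
         MOV  PC,INTERPRET            ; Go do another instruction.

5 ARM instructions here and 2 from the interpreter = 7. The 6502 takes 3 cycles (I don't have a 65C02 book handy, but some instructions are shorter in the C02), for a 7/3 ratio.

Well, if this keeps up, we get roughly a 2/1 ratio, which implies:

          40 MHz ARM7500 =  20 MHz 6502
          60 MHz ARM7500 =  30 MHz 6502
         200 MHz SA1100  = 100 MHz 6502

I'm sure a good ARM programmer can do better.

Charlie Springer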