Subject: Re: Q: Timing: How many clock-cycles to print one char on screen?
Newsgroups: comp.sys.apple2
From: dempson@actrix.gen.nz (David Empson)
Date: Tue, 3 Oct 2000 03:02:49 +1300
Message-ID: <1ehwll4.s6wwzpro10krN%dempson@actrix.gen.nz>
References:
Organization: Empsoft
User-Agent: MacSOUP/2.4.2
NNTP-Posting-Host: 202.49.157.176
X-Original-NNTP-Posting-Host: 202.49.157.176
X-Trace: 3 Oct 2000 03:13:52 NZST, 202.49.157.176
Lines: 114

Holger Picker wrote:

> I've got another (hardware-)question concerning the AppleII: What is the
> exact timing of the 6502 (65C02)? I guess there are differences between
> the NTSC and the PAL-version (well, so it was with the CBM 64), and some
> Apples (especially the AppleIIc) may have had 2 Mhz (or more).

The original Apple ][ and subsequent NTSC derivatives (IIe, IIc) run at
a master clock frequency of 14.31818 MHz. This is divided by 14 to get
the CPU clock frequency, which is 1.0227... MHz, and various other
internal clock signals are also derived from the same source.

There is a complication: to comply with NTSC video timing, the Apple ][
stretches every 65th CPU clock cycle by two cycles of the 14 MHz clock.

The PAL versions of the Apple IIe and IIc run at a slightly slower
master clock frequency: 14.25 MHz. I don't think they need the
stretched cycle.

The Apple IIgs has two timing subsystems. One of them, which includes
the video generation, is exactly the same as the NTSC IIe (14.31818
MHz). The CPU, ROM and "fast" RAM operate from a different clock, and
can run at a higher speed (up to 2.8 MHz), but they must synchronize
with the "slow" clock (1.0227 MHz) whenever the CPU reads or writes RAM
or I/O locations in the "slow" part of the system.

The Apple IIc+ is just like an Apple IIc as far as bus speed goes, but
it has a built-in 4 MHz accelerator which allows code to execute faster
for cached read operations.

There are various other accelerators available for every model, which
cause the execution speed of code to vary.
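
The NTSC clock arithmetic above can be sketched in a few lines of
Python. This is only an illustration derived from the figures in this
post (14.31818 MHz master clock, divide by 14, every 65th cycle
stretched by two master-clock ticks); the function and constant names
are made up for the example, and the "effective" rate is computed from
those figures, not measured.

```python
# Sketch of the NTSC Apple II clock arithmetic (names are illustrative).
MASTER_HZ = 14_318_180      # 14.31818 MHz master clock
TICKS_PER_CYCLE = 14        # a normal CPU cycle lasts 14 master ticks
STRETCH_TICKS = 2           # every 65th CPU cycle is 2 ticks longer
CYCLES_PER_GROUP = 65

def nominal_cpu_hz(master_hz=MASTER_HZ):
    """CPU clock if every cycle were exactly 14 master ticks."""
    return master_hz / TICKS_PER_CYCLE

def effective_cpu_hz(master_hz=MASTER_HZ):
    """Average CPU clock once the stretched cycle is accounted for."""
    # 64 normal cycles plus one stretched cycle = 65*14 + 2 = 912 ticks.
    ticks_per_group = CYCLES_PER_GROUP * TICKS_PER_CYCLE + STRETCH_TICKS
    return master_hz * CYCLES_PER_GROUP / ticks_per_group

print(f"nominal:   {nominal_cpu_hz() / 1e6:.4f} MHz")
print(f"effective: {effective_cpu_hz() / 1e6:.4f} MHz")
```

The nominal figure comes out at about 1.0227 MHz and the average with
the stretched cycle at about 1.0205 MHz, so an emulator that counts
cycles against real time would want the second number.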

> To put it in a nutshell: How many (processor-)clockcycles does it normally
> take to print one char (textmode) on screen?

A memory read or write on the 6502 always takes one clock cycle.

The total time required to execute an instruction depends on the number
of cycles that instruction takes. This can usually be calculated by
looking at the number of bytes in the instruction and adding one for
each data access, but some instructions also require one or more
"internal operation" cycles.

The minimum instruction execution time is two cycles and the maximum is
7 for a 6502 (longer for a 65816, as used on the IIgs; the 65C02 may
also have some 8 cycle instructions in a rather limited case).

In the case of a write to screen memory, the fastest addressing mode
would be absolute. This requires reading a three byte instruction and
storing the contents of a register into screen memory (four cycles
total).

Most code would be more likely to use an indexed mode to write to
screen memory, which adds at least one CPU cycle. You also have the
overhead of fetching or calculating the byte to be written, and the
complicated addressing scheme of the screen lines tends to slow things
down somewhat (table lookup can level this out).

If the characters were being output one at a time by calling the COUT
routine in the monitor, then the output rate would be significantly
slower.

The actual cycle which writes the character to screen memory always
runs at 1.0227 MHz.

Assuming an unaccelerated NTSC IIe using an unrolled loop to write the
same character to every text screen location (i.e. lots of STA absolute
instructions), you could write one character every 4 CPU cycles, i.e.
about 255,000 characters per second.

On an accelerated machine, only the write cycle to screen memory would
be forced to slow down. Most of the other cycles required to execute
each instruction would be able to run at the speed of the accelerator,
depending on caching issues.
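
The throughput estimate above can be reproduced with a short Python
sketch. It assumes an unaccelerated NTSC machine, ignores the stretched
cycle, and counts only the store instruction itself (no loop overhead,
no fetching of the byte to write). The cycle counts are the standard
6502 timings for STA; the names and the example operand addresses are
illustrative only.

```python
# Rough upper bound on characters written per second (illustrative names).
CPU_HZ = 14_318_180 / 14   # about 1.0227 MHz; stretched cycle ignored

# Standard 6502 cycle counts for STA in a few addressing modes.
STA_CYCLES = {
    "absolute": 4,     # e.g. STA $0400   (3 instruction bytes + 1 write)
    "absolute,X": 5,   # e.g. STA $0400,X (indexing adds one cycle)
    "(zp),Y": 6,       # e.g. STA ($28),Y
}

TEXT_SCREEN_CHARS = 40 * 24  # 40-column text screen

def chars_per_second(cycles_per_char, cpu_hz=CPU_HZ):
    """Upper bound on characters stored per second."""
    return cpu_hz / cycles_per_char

def full_screen_cycles(cycles_per_char):
    """CPU cycles to store one byte to every 40-column text position."""
    return TEXT_SCREEN_CHARS * cycles_per_char

for mode, cycles in STA_CYCLES.items():
    rate = chars_per_second(cycles)
    print(f"STA {mode:<11} {cycles} cycles  ~{rate:,.0f} chars/s")
```

With STA absolute this gives a little over 255,000 characters per
second, matching the figure above; real code with indexing and data
fetches will be noticeably slower.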

> Or how many cycles will it take to plot one graphic byte (= 7 hires
> pixel) on screen?

Same as the text screen: the memory write itself requires one CPU clock
cycle, and the fastest CPU instruction which could do this requires 4
CPU cycles.

> I could need that as I plan to include emulating of the video-bus byte
> (read $c050 etc as found in e.g. 'Drol').

This is reading back the value which was last fetched by the video
hardware for output to the display. This has no direct connection with
the CPU writing to video memory.

The Apple ][ memory ("slow" memory in the case of the IIgs) is accessed
twice every CPU cycle: the CPU gets to access it in one half, and the
video display gets to access it in the other half. In the case of
IIe/IIc/IIgs double resolution modes (80 column text, double hi-res
graphics), two memory bytes are fetched simultaneously from main and
auxiliary memory.

Reading certain I/O locations has the side effect of reading the last
video byte value, because these I/O locations do not drive the data bus
for a read access, allowing the CPU to see the value which is still
floating there from the previous video access.

To make use of this, the video bytes would either have to be unique on
the screen, or the program would have to synchronize with the vertical
blanking interval so that it could identify which part of the screen it
is looking at.

There is no way the program could keep up with every video byte, since
it can't read them faster than every fourth CPU cycle, and much more
slowly if it wants to do anything with the byte.

> I also could need precise information on the actual amount of video lines
> (visible and non-visible) for emulating the display output. Thanks a lot
> in advance.

If you need this much detail, I'd recommend buying a book such as Jim
Sather's "Understanding the Apple IIe", which is probably still
available from Byte Works.

-- 
David Empson
dempson@actrix.gen.nz
Snail mail: P O Box 27-103, Wellington, New Zealand