FIRMWARE

<< Hardware   ·    Home   ·   Character Set >>

The routine which generates all SVGA signals and supports the PS/2 keyboard and UART is located in the OC1 (Output Compare 1 module) interrupt. The user can use other interrupt sources, but they must have a lower priority level.

SVGA timings for the 800x600 resolution are shown on the drawing (1 pixel = 25 ns). Here are the detailed timings for the 60 Hz refresh rate:

Horizontal timing:

Pixel frequency: 40 MHz
Horizontal frequency: 37.88 kHz
Visible area: 800 pixels (20us)
Front porch: 40 pixels (1us)
Sync pulse: 128 pixels (3.2us)
Back porch: 88 pixels (2.2us)

Vertical timing:

Vertical frequency: 60 Hz
Visible area: 600 lines (15.84ms)
Front porch: 1 line (26.4us)
Sync pulse: 4 lines (105.6us)
Back porch: 23 lines (607.2us)
Whole frame: 628 lines (16.58ms)
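
The figures above can be cross-checked with a few lines of arithmetic. The sketch below is plain Python that runs on a PC, not part of the firmware, and the variable names are chosen here for illustration only.

```python
# Cross-check of the 800x600 @ 60 Hz timing figures listed above.
PIXEL_CLOCK_HZ = 40_000_000

h_visible, h_front, h_sync, h_back = 800, 40, 128, 88   # in pixels
v_visible, v_front, v_sync, v_back = 600, 1, 4, 23      # in lines

pixels_per_line = h_visible + h_front + h_sync + h_back
lines_per_frame = v_visible + v_front + v_sync + v_back

line_time_us = pixels_per_line / PIXEL_CLOCK_HZ * 1e6
h_freq_khz = PIXEL_CLOCK_HZ / pixels_per_line / 1e3
frame_time_ms = lines_per_frame * line_time_us / 1e3
v_freq_hz = 1e3 / frame_time_ms

print(f"{pixels_per_line} pixels/line, {line_time_us:.1f} us/line, {h_freq_khz:.2f} kHz")
print(f"{lines_per_frame} lines/frame, {frame_time_ms:.2f} ms/frame, {v_freq_hz:.2f} Hz")
```

The computed vertical frequency comes out slightly above the nominal 60 Hz, which is normal for this video mode.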

Dot clock for 800x600 resolution @ 60 Hz vertical frequency is exactly 40 MHz, which makes it very convenient for signal generation by PIC @ 80 MHz, as the instruction cycle frequency is 40 MHz. So, each pixel on the screen corresponds to one instruction cycle of the microcontroller.

It would be easy to generate video in software in graphic mode at the full 800x600 resolution with 256 colours, but there is one problem: the amount of internal Video RAM required for that application. You would need 480 Kbytes, but the PIC with the largest internal RAM at this moment is the dsPIC33FJ256xx710, with only 30 Kbytes of RAM. Although it is possible to add external RAM, it would not be usable, as the access would be too slow.

If you base your project on text mode only, it will decrease the number of possible applications, but it can be realized with a small amount of video memory. In this project, it needs just 60x25=1500 bytes for character storage, and the same quantity for attribute storage (colour and blinking bits). So the whole video memory takes only 3K of internal Data RAM. About 1K more is used for housekeeping. If the demo software is enabled, it will need an additional 1.3K of Data RAM.
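
To put the two memory figures side by side, here is a small PC-side calculation (Python, names chosen only for this illustration). It uses just the numbers quoted above: one byte per pixel for 256 colours, and one character byte plus one attribute byte per text cell.

```python
# Memory budget: full graphic mode versus the text mode used in this project.
graphic_bytes = 800 * 600 * 1      # 256 colours = 1 byte per pixel

columns, rows = 60, 25
char_bytes = columns * rows        # one character code per cell
attr_bytes = columns * rows        # one attribute byte per cell
text_bytes = char_bytes + attr_bytes

print(graphic_bytes)               # 480000
print(text_bytes)                  # 3000
print(graphic_bytes // text_bytes) # 160
```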

The problem with video firmware which supports text mode is that it is much more complicated. It has not only to read the byte and move it to the port, but also to "pass it through" the character generator, combine the result with the attribute bytes (ink and paper colour), keep track of the character row counter, turn the output on and off for blinking, and draw the cursor - all that in real time.

Of course, all of that cannot be done in only one instruction for each screen pixel, so the dot clock had to be lowered. In this application, one character pixel is actually two screen pixels wide, which gives the final resolution of 400x300 pixels. Still, the requirements for the software are high, and the explanation of the algorithm might be a little confusing. If you are not interested in the details, skip it and jump to the next page. You do not have to understand the theory of operation to embed this project in your application. The only significant thing is that each character in video memory is represented by two bytes: the first one contains the ASCII character, and the second one is the attribute byte for that character. The format of those 2-byte locations is represented on the next drawing. Video memory contains a total of 1500 such words.
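
As a rough illustration of that layout, the following Python sketch models the video memory on a PC. The bit positions (ink in bits 0-2, ink blink in bit 3, paper in bits 4-6, paper blink in bit 7) are taken from the description of STEP 1 further down this page. The row-major cell ordering and all function names are assumptions made for this example, not something taken from the firmware.

```python
COLUMNS, ROWS = 60, 25

def cell_offset(row, col):
    """Byte offset of the character byte of a cell; its attribute byte is at +1.
    Row-major ordering is an assumption of this sketch."""
    if not (0 <= row < ROWS and 0 <= col < COLUMNS):
        raise ValueError("cell outside the 60x25 screen")
    return 2 * (row * COLUMNS + col)

def make_attribute(ink, paper, ink_blink=False, paper_blink=False):
    """Pack 3-bit ink and paper colours plus the two blink bits into one byte."""
    return (ink & 7) | (int(ink_blink) << 3) | ((paper & 7) << 4) | (int(paper_blink) << 7)

def put_char(vram, row, col, ch, attribute):
    """Write one character and its attribute into the modelled video memory."""
    off = cell_offset(row, col)
    vram[off] = ord(ch) & 0xFF      # even location: character code
    vram[off + 1] = attribute       # odd location: attribute byte

vram = bytearray(2 * COLUMNS * ROWS)    # 1500 two-byte words = 3000 bytes
put_char(vram, 0, 1, "A", make_attribute(ink=7, paper=1, ink_blink=True))
print(len(vram), hex(vram[2]), hex(vram[3]))
```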

THEORY OF OPERATION

The video signal generation for each text row is performed in four steps:

1. NEXT COLOUR SETUP. The first step starts even before the scanning of the row begins, in scan lines 22 and 24 of the previous text row. Here, the colour attributes for each character of the next text row are prepared in the 60-byte part of the line buffer. This task is represented by green areas on the text row drawing.

In the simplified example represented on the drawing at the right side, the odd locations from the video memory are conditionally (depending on the blink state) transferred to LINE BUFFER 2 or LINE BUFFER 3 (those buffers are used alternately, as the routine reads from one buffer while it writes to the other).

2. CHARACTER SETUP. When the new frame generation starts, the LINE BUFFER 2 (or LINE BUFFER 3) writing is completed, and the preparation of pixels for the first two scan lines begins (yellow areas on the text row drawing). This must be completed before the actual pixel generation starts.

Even locations from the video memory are translated in the character generator and written to LINE BUFFER 1. The high byte of address for character generator is actually the scan line counter divided by two.

3. CURSOR SETUP. This routine adds cursor pixels on the areas on the screen which are defined by cursor1 and cursor2 X and Y positions, with colour defined by cursor1 and cursor2 colours and cursor blinking attributes. This is represented by pink areas on the text row drawing.

4. VIDEO OUTPUT GENERATION. This is the main part of the routine, which combines data in "character pixel" line buffer and "colour" line buffer to generate the VGA video signal. This is represented by the cyan areas on the text row drawing.
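
Which of the four steps runs in which scan line can be summarized in a short PC-side sketch. It uses only the scan line ranges given in the step headings on this page (lines 22 and 24 for colour setup, odd lines for character setup, even lines 6...20 for cursor setup, all lines for video output). Numbering the scan lines of one text row from 1 to 24 is an assumption of this sketch, and the function name is made up for the illustration.

```python
def tasks_for_scan_line(n):
    """Return the steps performed in scan line n (1..24) of one text row."""
    if not 1 <= n <= 24:
        raise ValueError("a text row has 24 scan lines")
    tasks = []
    if n % 2 == 1:
        tasks.append("character setup")      # STEP 2: all odd scan lines
    if n % 2 == 0 and 6 <= n <= 20:
        tasks.append("cursor setup")         # STEP 3: even scan lines 6...20
    if n in (22, 24):
        tasks.append("next colour setup")    # STEP 1: scan lines 22 and 24 only
    tasks.append("video output")             # STEP 4: all scan lines
    return tasks

for n in (1, 6, 22):
    print(n, tasks_for_scan_line(n))
```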

STEP 1   Next colour setup (scan lines 22 and 24 only)

The first step presets the 60-byte portion of the line buffer with the colour data. It picks the bytes from the odd locations of the video memory (high bytes of regular words), which are reserved for attribute storage, and writes them to the special 60-byte line buffer.

As there is not enough time to process all 60 bytes in the single scan line slice reserved for that job (indicated by green areas on the text row drawing), the task is divided into two parts. The first one, contained in the subroutine LINE4, presets the first 30 locations of the line buffer, and LINE5 presets the rest of the line buffer. In addition, there are two 60-byte line buffers for temporary attribute storage - one is at LINE_BUFFER+60 (we shall name it Line Buffer 2 here), and the other one is at LINE_BUFFER+120 (Line Buffer 3). The reason is that the whole line buffer has to be ready when the text row begins, and the program has to preset the next row while the current one is being read; so, the two line buffers are written and read alternately, and the bit FLAG,#14 decides which one shall be read and which one written. The routine toggles that bit after each frame generation.
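
The alternating use of the two colour buffers is the classic "ping-pong" arrangement. A minimal PC-side model in Python could look like this; the offsets 60 and 120 come from the text above, while the class, the method names and the choice of which flag value selects which buffer are assumptions made only for this sketch.

```python
LINE_BUFFER_2 = 60     # offset from LINE_BUFFER, as given in the text
LINE_BUFFER_3 = 120    # offset from LINE_BUFFER, as given in the text

class ColourBuffers:
    """Model of the two colour line buffers selected by one flag bit."""

    def __init__(self):
        self.flag14 = 0                      # stands in for bit FLAG,#14

    def read_offset(self):
        # Polarity (0 selects Line Buffer 2 for reading) is an assumption.
        return LINE_BUFFER_3 if self.flag14 else LINE_BUFFER_2

    def write_offset(self):
        return LINE_BUFFER_2 if self.flag14 else LINE_BUFFER_3

    def toggle(self):
        self.flag14 ^= 1                     # swap the roles of the buffers

buffers = ColourBuffers()
before = (buffers.read_offset(), buffers.write_offset())
buffers.toggle()
after = (buffers.read_offset(), buffers.write_offset())
print(before, after)
```

The point of the arrangement is that the reader and the writer never touch the same buffer at the same time.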

This routine does not only move data from the attribute (odd) bytes of the video memory to the line buffer, but also processes blinking for both ink (foreground) and paper (background). The RGB ink (foreground) bits (0, 1 and 2) are copied to the line buffer, but they are reset to 0,0,0 if the blink bit (3) is set and the blinking counter output (FLAG,#11) is set. That is why the blink bits (3 and 7) from the video memory are represented as 0 in the line buffers - they are already "embedded" in the RGB bits and they are not needed any more. In the same step, the same operation is done with the upper nibble of the byte, with the RGB paper (background) bits 4, 5 and 6 and the blink bit 7. The whole operation is significantly sped up by using the lookup table BLINKTAB, which contains the outputs for all possible inputs of this logical operation.
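
The logical operation described above can be written down as a small PC-side model in Python. The function implements exactly the rule from the text (colour bits are cleared when the corresponding blink bit and the blinking counter output are both set, and bits 3 and 7 are always 0 in the result). How the real BLINKTAB is organized in memory is not described here, so the two-dimensional table below is only an assumed layout for the illustration.

```python
def blink_filter(attribute, blink_phase):
    """Return the colour byte that goes into the line buffer."""
    ink = attribute & 0x07                   # RGB ink bits 0, 1, 2
    paper = attribute & 0x70                 # RGB paper bits 4, 5, 6
    if blink_phase and (attribute & 0x08):   # ink blink bit 3
        ink = 0
    if blink_phase and (attribute & 0x80):   # paper blink bit 7
        paper = 0
    return ink | paper                       # bits 3 and 7 are always 0 here

# Assumed layout: one 256-entry table per state of the blinking counter output.
BLINKTAB = [[blink_filter(a, phase) for a in range(256)] for phase in (0, 1)]

print(hex(BLINKTAB[0][0x1F]), hex(BLINKTAB[1][0x1F]))   # 0x17 0x10
```

With the table in place, the per-character work is reduced to a single indexed read instead of several tests and masks.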

Note that Line Buffer 1 is built at the beginning of each odd scan line (12 times during one text row generation, each time with a new scan line read from the character generator), but Line Buffers 2 and 3 are built only once per text row, as they remain the same for all scan lines. Actually, not both buffers but only one of them - the one which is not used for video signal generation at that moment.

The first step is realized at the end of each text row generation, but its results will be used in the next text row. It is represented by the green areas on the next drawing.

 

STEP 2   Character setup (all odd scan lines)

This step reads B&W character from video memory, translates it through the character generator (only for the required row, placed in WREG1H) and puts the output from the character generator in Line Buffer 1 (LINE_BUF...LINE_BUF+59). Only two instructions are needed for each byte:

                                                                                  
   mov.b    [w3+0],w1    ; in the next cycles, it will read from w3+2, w3+4...    
   tblrdl.b [w1],[w5++]  ; read character generator and put pixels in line buffer 
                                                                                  
where:

w3 = video memory read pointer
w1 = ASCII byte
w5 = line buffer write pointer
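
What those two instructions do for all 60 characters can be modelled on a PC as follows. The address formation (character row in the high byte, ASCII code in the low byte) follows the description above. The character generator contents in this sketch are dummy values, NOT the real font, and the function and variable names are made up for the illustration.

```python
COLUMNS = 60
CHAR_ROWS = 12    # 12 character rows per text row, each shown on two scan lines

def character_setup(vram_row, chargen, char_row):
    """Fill Line Buffer 1 with the pixel bytes for one character row.

    vram_row -- 120 bytes of video memory (character, attribute, character...)
    chargen  -- table indexed by (char_row << 8) | ascii_code
    """
    line_buffer_1 = bytearray(COLUMNS)
    for col in range(COLUMNS):
        ascii_code = vram_row[2 * col]            # even locations only
        address = (char_row << 8) | ascii_code    # high byte selects the row
        line_buffer_1[col] = chargen[address]
    return line_buffer_1

# Dummy character generator (arbitrary 6-bit patterns, not the real font).
chargen = bytes((addr + (addr >> 8)) & 0x3F for addr in range(CHAR_ROWS * 256))

vram_row = bytearray(2 * COLUMNS)
vram_row[0::2] = b"A" * COLUMNS                   # a row full of the letter A
buf = character_setup(vram_row, chargen, char_row=3)
print(len(buf), buf[0])
```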

There is no time for looping, so the whole sequence of 60 bytes is realized in a string of 120 instructions. There is one more problem which could slow down the operation - the RAW (Read After Write) dependency. This problem is solved by using two sets of registers alternately, to make some kind of pipeline. The final routine looks like:

                                                                                  
   mov.b    [w3+0],w2    ; read byte 1 from video memory (fill the queue)
   mov.b    [w3+2],w1    ; read byte 2 from video memory
   tblrdl.b [w2],[w5++]  ; read character set for byte 1 and write to line buffer
   mov.b    [w3+4],w2    ; read byte 3 from video memory
   tblrdl.b [w1],[w5++]  ; read character set for byte 2 and write to line buffer
   mov.b    [w3+6],w1    ; read byte 4 from video memory
   tblrdl.b [w2],[w5++]  ; read character set for byte 3 and write to line buffer
   mov.b    [w3+8],w2    ; read byte 5 from video memory
   tblrdl.b [w1],[w5++]  ; read character set for byte 4 and write to line buffer
                         ; ...
                                                                                  

...and so on, until byte 60 is translated and written into the line buffer.
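
Nobody wants to type such an unrolled sequence by hand. As an illustration, the following Python script (run on a PC, not part of the firmware) prints the same interleaved pattern for all 60 bytes. The function name is made up for this sketch, and whether the original listing was generated this way, with assembler macros, or written by hand is not known.

```python
def unrolled_character_setup(n_chars=60):
    """Generate the interleaved mov.b / tblrdl.b listing for n_chars bytes."""
    regs = ("w2", "w1")                       # the two registers used in turn
    lines = [f"mov.b    [w3+0],{regs[0]}"]    # fill the queue with byte 1
    for i in range(1, n_chars):
        # read the next byte, then translate the byte that was read before it
        lines.append(f"mov.b    [w3+{2 * i}],{regs[i % 2]}")
        lines.append(f"tblrdl.b [{regs[(i - 1) % 2]}],[w5++]")
    # drain the queue: translate the last byte
    lines.append(f"tblrdl.b [{regs[(n_chars - 1) % 2]}],[w5++]")
    return lines

listing = unrolled_character_setup()
print(len(listing))            # 120
print("\n".join(listing[:5]))
```

Each tblrdl.b uses the register that was loaded two instructions earlier, never the one loaded by the instruction just before it, which is the whole point of the interleaving.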

All this is done at the beginning of subroutine LINE1 (yellow areas on the drawing), before the actual video signal output starts. It should be noted that this step has to be done only at the beginning of odd lines, as the same line buffer contents will be used in the next scan line (note that one character pixel is represented by 2x2 screen pixels).

 

STEP 3   Cursor setup (even scan lines 6...20)

This step writes the graphic line for two cursors in both line buffers, one for colour (Line Buffer 2 or Line Buffer 3, prepared in STEP 1) and the other one for data (Line Buffer 1, which was prepared in STEP 2). Subroutine LINE2 does this, but as the program calls it only at the beginning of even lines, when one (odd) scan line of the character graphics has already been displayed without the cursor, the cursor will be visible only in even lines. This makes the cursors pseudo-transparent, as the block does not cover all lines of the character. This is represented by pink areas on the previous drawing.

 

STEP 4   Video output generation (all scan lines)

This is the most important and critical step, and it is represented by cyan areas on the previous drawing. Each dot (RGB ink or RGB paper) is created in two instruction cycles. Those "magic" instructions are:

                                                                                   
   and.b   w1,[w2],w3     ; mask out all except the target bit from Line Buffer 1  
   and.b   w4,[w3],[w5]   ; ...if it is =0, translate it to paper colour           
                          ; ...if it is =1, translate it to ink colour             
                          ; ...and write it to the output port addressed by w5     
                                                                                   

where:

w1 = 0b00100000 for the leftmost pixel in the character
         0b00010000 for the next pixel
         0b00001000 for the next pixel
         0b00000100 for the next pixel
         0b00000010 for the next pixel
         0b00000001 for the rightmost pixel in the character (which is used only for frame pseudographics)
w2 = Line Buffer 1 (for character data) pointer. This pointer will be post-incremented only after the last (6th) pixel output.
w3 = Byte from Line Buffer 1, used for translation via BITTAB (the high byte is already preset to the page in internal RAM where the table is located).
w4 = Colour (bits 012= Ink, bits 456=Paper), taken from Line Buffer 2 or Line Buffer 3
w5 = Literal address of output latch, used as the output port for video signal out (e.g. #LATB).

Only the rightmost pixel (bit 0 in the character set) is represented as three screen pixels (1.5 character pixels), as it is displayed during the execution of three instructions. The 3rd "extra" instruction in the routine reads the colour byte from the "attribute portion" of the line buffer (Line Buffer 2 or Line Buffer 3). That makes the rightmost pixel 50% wider than the other ones, but it does not affect the character shape, as this pixel is used only by the pseudographics frame characters, and all the ASCII characters use this pixel only for horizontal spacing. So, the character spacing is actually 150% of one character pixel.

There is the same problem with the execution speed as in STEP 2. As the program has to be highly optimized for speed, there is no time for looping, so the whole routine must be located in one string, about 800 instructions long. Also, the same problem with RAW is present, so there are two sets of registers used alternately here.
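
The length of that instruction string follows from the per-pixel costs given above: two instruction cycles for each of the first five pixels of a character, and three for the rightmost one. A quick PC-side check in Python (constant names chosen for this illustration) shows that 60 characters fit just inside the 20 us visible area:

```python
CYCLE_NS = 25            # one instruction cycle = one screen pixel
PIXELS_PER_CHAR = 6
CYCLES_PER_PIXEL = 2     # the two and.b instructions
CYCLES_LAST_PIXEL = 3    # rightmost pixel: one extra instruction
COLUMNS = 60

cycles_per_char = (PIXELS_PER_CHAR - 1) * CYCLES_PER_PIXEL + CYCLES_LAST_PIXEL
cycles_per_line = COLUMNS * cycles_per_char
active_time_us = cycles_per_line * CYCLE_NS / 1000

print(cycles_per_char)   # 13
print(cycles_per_line)   # 780
print(active_time_us)    # 19.5
```

So the pixel output alone accounts for 780 of the instructions in the string.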

Literals 0b00100000 to 0b00000001, which are represented in w1 in our example, are actually loaded into six registers, w9...w13. This seems like very "bad economy" in the use of microcontroller resources, but it was the only way to maintain the required speed, as there was no time to load or shift their contents in real time. Anyhow, there are no consequences "visible" from the user's program, as all those registers are saved on the stack at the beginning of the interrupt routine.

In order to correct the uneven delays caused by interrupt latency, the routine aligns its flow with TMR2 contents at the beginning of each scan line (#hor_align is the literal which defines the horizontal position of text):

                                                          
   neg      TMR2,WREG      ; get -TMR2                    
   add      #hor_align,w0                                 
   repeat   w0                                            
   nop                     ; loop #hor_align-TMR2 cycles  
                                                          

After this sequence, the program flow is synchronized with TMR2. As the OCx (which is driven by TMR2) generates the horizontal sync, the video signal is now synchronized with the program flow and the video signal is jitter-free.
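
Why this removes the jitter can be seen from a small PC-side model in Python. It assumes that TMR2 advances by one count per instruction cycle and that the sequence itself costs a fixed number of cycles. Both the constant FIXED_OVERHEAD and the value used for hor_align are hypothetical numbers picked only for this illustration.

```python
FIXED_OVERHEAD = 4    # hypothetical constant cost of the alignment sequence

def resume_cycle(tmr2_when_read, hor_align):
    """Timer count at which the program continues after the padding."""
    padding = hor_align - tmr2_when_read      # what neg + add compute
    if padding < 0:
        raise ValueError("interrupt latency exceeded hor_align")
    return tmr2_when_read + FIXED_OVERHEAD + padding

hor_align = 40        # hypothetical value
# Different interrupt latencies mean different TMR2 readings...
resume = {resume_cycle(t, hor_align) for t in range(10, 20)}
print(resume)         # {44}
```

Whatever the latency was, the late arrival and the shorter padding cancel out, so the program always continues at the same timer count.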

 

This drawing explains the pixel generation by the two and.b instructions in STEP 4. In the first instruction cycle, the mask literal is ANDed with the Line Buffer 1 contents, and the result is written to w3 (red in this drawing). In the program, registers w6 and w7 are used alternately (because of the RAW dependency problem) and the high byte of both registers is preset to the BITTAB page in RAM (this table is aligned to 0x100). The result of this operation is that the byte addressed by w3 will have the ink bits (0, 1 and 2) set if the currently processed bit (5...0) in Line Buffer 1 is set, and the paper bits (4, 5 and 6) set if the processed bit is reset. So, in the next AND operation, this output will pass the "ink" bits to the output port pins if the bit was set, or the "paper" bits if it was reset.

The lookup table BITTAB is created in Data RAM during the initialization process, and, although it occupies 34 bytes, only 7 of them are actually employed (one for "no bit set", and six for "bit N set" in the current character row pixels). All other bytes are never accessed by the routine, and a smart (and crazy enough) programmer can even use them as general-purpose Data RAM, if the RAM space becomes critical.
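
The two AND operations and the table can be modelled on a PC in a few lines of Python. The table contents follow the description above (paper bits 4, 5 and 6 for "no bit set", ink bits 0, 1 and 2 for each single-bit index). The function names and the example colour are assumptions of this sketch, and the unused entries are simply left at zero here.

```python
INK_MASK = 0x07       # bits 0, 1, 2
PAPER_MASK = 0x70     # bits 4, 5, 6

BITTAB = bytearray(34)            # same size as quoted in the text
BITTAB[0] = PAPER_MASK            # "no bit set": select the paper colour
for bit in range(6):
    BITTAB[1 << bit] = INK_MASK   # "bit N set": select the ink colour

def pixel_output(pixels, colour, mask):
    """Model of the two and.b instructions for one character pixel."""
    index = mask & pixels              # 1st and.b: isolate the target bit
    return colour & BITTAB[index]      # 2nd and.b: keep ink or paper bits

def render_cell(pixels, colour):
    """Port values for the six pixels of one character row, left to right."""
    return [pixel_output(pixels, colour, 0x20 >> n) for n in range(6)]

# Character row 0b101000 with ink = 7 and paper = 4 (colour byte 0x47).
print([hex(v) for v in render_cell(0b101000, 0x47)])
print(sum(1 for b in BITTAB if b))     # 7 entries are actually used
```

The first print shows ink values (0x7) for the two set bits and paper values (0x40) for the rest, with the ink and paper colours appearing on separate bit groups of the port, as described above.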

 
