A practical PIC based On Screen Display / Video Time Inserter
This project was inspired by the Pico OSD, to whose author I owe my thanks.
My first attempt, using a 16F5x, could output pixels (using 'ROTL PORT') at about half the 'normal' AV pixel rate, i.e. 6.25MHz (with over-clocking to 25MHz). Only by adding lots of external circuits (shift register etc.) was it possible to get 12MHz, i.e. anything like the PAL AV 'full speed' pixel rate (13.846 MHz).
To achieve 12MHz (or better) pixels using a PIC alone, we have to move away from the 16Fxx series (which is OSC limited to 20MHz, or about 25MHz with over-clocking) to the 18F series. The PIC 18F14K50, for example, supports a CPU CLK of at least 12MHz (to achieve 13.846 MHz means over-clocking**), whilst the PIC 18F13K22 is capable of 16MHz CPU CLK.
**If USB is to be supported by the PIC18F14K50 we can't over-clock it. For USB the primary crystal OSC has to be 12MHz. This is divided by 2 to get 6.0MHz (for low-speed USB, 1.5Mbps) and is also fed to a x4 PLL to get 48MHz (for full-speed USB, 12Mbps), whilst the CPU then runs at a max of 48/4 = 12MHz.
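The clock arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the x4 PLL is fed from the 12MHz crystal and that the CPU clock is always OSC/4.

```python
# Sketch of the USB clock chain described above (all values in MHz).
# Assumption: the x4 PLL is fed directly from the crystal, CPU clk = OSC/4.
XTAL_MHZ = 12.0
PLL_MULT = 4
OSC_PER_CPU_CLK = 4


def usb_clock_chain(xtal_mhz):
    """Return (low-speed USB clock, PLL output, CPU clock) in MHz."""
    low_speed_clk = xtal_mhz / 2          # 6 MHz for 1.5 Mbps USB
    pll_out = xtal_mhz * PLL_MULT         # 48 MHz for 12 Mbps USB
    cpu_clk = pll_out / OSC_PER_CPU_CLK   # instruction clock
    return low_speed_clk, pll_out, cpu_clk


def osc_needed_for_cpu_clk(cpu_clk_mhz):
    """OSC frequency (MHz) needed for a given CPU clock."""
    return cpu_clk_mhz * OSC_PER_CPU_CLK


print(usb_clock_chain(XTAL_MHZ))
print(osc_needed_for_cpu_clk(13.846))
```

The second call shows why the 'full speed' pixel rate means over-clocking: a 13.846MHz CPU clock needs an OSC of about 55.4MHz, well above the 48MHz that the USB configuration provides.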
One advantage of the PIC18Fxx is the on-board Timer (Timer1) that can be used as a RTC (when the secondary OSC circuit is being driven with an external 32.768 kHz Xtal).
This 'solves' the 'time count' issue. So long as the 18Fx time is 'set' by some means after each power-on (e.g. manually by switches, from a PC by serial link or USB, or from some other time source such as a GPS chip), Timer1 can be relied on to maintain sub-second accuracy without the need for other methods (such as counting video frames).
The 18F13K22 lets you use its OSC circuit with a 32.768 kHz Xtal whilst running the CPU from the internal 16MHz clk (x4 PLL, then /4 for the CPU clk). This eliminates the need for a second Xtal.
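As a rough illustration of why a 32.768 kHz Xtal suits a 16 bit timer, the sketch below works out the overflow period and steps a software clock once per second. It assumes a 1:1 prescaler; the preload value and the tick() helper are illustrative only, not taken from any actual firmware.

```python
# Sketch: Timer1 as an RTC driven by a 32.768 kHz crystal.
# Assumptions: 16-bit counter, 1:1 prescaler; helper names are illustrative.
XTAL_HZ = 32_768
TIMER_MODULUS = 1 << 16  # a 16-bit timer rolls over after 65536 counts


def overflow_period_s(preload=0):
    """Seconds from loading 'preload' until the timer overflows."""
    return (TIMER_MODULUS - preload) / XTAL_HZ


def tick(hours, minutes, seconds):
    """Advance a 24-hour clock by one second (call once per overflow)."""
    seconds += 1
    if seconds == 60:
        seconds, minutes = 0, minutes + 1
    if minutes == 60:
        minutes, hours = 0, hours + 1
    if hours == 24:
        hours = 0
    return hours, minutes, seconds


print(overflow_period_s())        # free-running: one overflow every 2 s
print(overflow_period_s(0x8000))  # preloaded to half scale: every 1 s
print(tick(23, 59, 59))
```

Preloading the counter to half scale on every overflow gives one interrupt per second, which is all the time display needs.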
The PIC18F14K50 has :-
8bit CPU (16 bit Instruction set)
8k Instruction space (16kB)
768 bytes of Register space (RAM - of which 256 can be used as the USB buffer)
256 bytes of EEPROM
ECCP (Capture/Compare, PWM max 4)
MSSP (SPI/I2C) - SPI Master mode max. data rate is OSC/4 (the data sheet says max OSC 64 MHz = 16.00 Mbps)
USART (RS485/RS232/LIN2.0, max. 115,200 baud)
An 'S/R latch'
A-D Converter (10bit with internal 1.024 Vref) - max speed approx 500ksps
2x Analogue comparator
Vref Out
Output pins with 25mA source/sink, programmable pull-up, interrupt on change
New instructions
The PIC18Fx instruction set is 'enhanced' with a number of improvements. Whilst some are quite useful, others take 2 CLK cycles to execute so are of little help when speed is the issue.
Register Indirect Addressing (INDF, FSR pointer) now supports 3 pointers, each of 12 bits, which can be incremented/decremented, and even allows the 'effective address' to be offset by the current value in the Accumulator.
Data Tables, using the 'Return with data' instruction, 'cost' the usual single instruction (which is now 16 bits, i.e. 2 bytes in size) per entry. So, to allow 'packing' of 2 data bytes per word, there is a new 'Table Read' instruction. This takes 2 CPU cycles to 'read' 1 byte from the Table (in program address space) and copy it to the TABLAT (Table Latch) register, and can be faster over-all as the Table pointer can be incremented/decremented as part of the instruction.
Another new instruction allows you to copy one register direct to another. However, it requires 2 instruction words and thus takes exactly the same instruction space and CLK count as a "Reg1->Acc, Acc->Reg2" pair, but does mean you can now copy data without affecting the Acc.
There is still no 'load byte value to register' instruction (you still have to go via the Acc).
Because of the larger address space, most Jump and Call instructions are now at least** 2 words (i.e. take 2 CPU CLKs to fetch), which makes it even harder to use subroutines or loops for high speed output.
** The 'page addressing' system means you may well have to set up 'page pointers' before changing the program flow.
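To put the table 'packing' point in numbers, here is a small sketch comparing the program space used by a 'Return with data' table (one 16 bit word per data byte) with a packed table read via 'Table Read' (two data bytes per word). The 64 character by 7 row font size is purely hypothetical.

```python
# Sketch: program-space cost of a data table, in bytes.
# Assumption: the font size (64 characters x 7 rows) is made up for illustration.
WORD_BYTES = 2  # each PIC18 instruction word is 16 bits


def return_with_data_table_bytes(n_data_bytes):
    """One instruction word per data byte."""
    return n_data_bytes * WORD_BYTES


def packed_table_bytes(n_data_bytes):
    """Two data bytes per word, rounded up to a whole word."""
    return ((n_data_bytes + 1) // 2) * WORD_BYTES


FONT_BYTES = 64 * 7
print(return_with_data_table_bytes(FONT_BYTES))
print(packed_table_bytes(FONT_BYTES))
```

The packed table halves the space used, at the price of the 2 CPU cycles per byte noted above.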
PIC18Fx pixel output
The 'target' is to 'beat' the previous 16F5x 'solution' of 6MHz pixels (when overclocked to 24MHz, so CPU Clk will be 24/4 = 6MHz).
The PIC18Fx MSSP (SPI/I2C) circuit (in SPI Master mode) can achieve a max. data rate of OSC/4. This means the 48MHz device can achieve 12Mbps (12MHz pixels) and the 64MHz device 16Mbps (16MHz pixels).
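The OSC/4 rule translates directly into pixel width. The sketch below (helper names are illustrative) gives the pixel rate, the duration of one pixel, and how many pixel periods fit into one 64uS scan line. Note that the last figure covers the whole line period, including sync and blanking, so fewer pixels are actually visible.

```python
# Sketch: pixel timing when pixels are shifted out at OSC/4.
# Assumption: one pixel per SPI bit; LINE_US is the full PAL line period.
LINE_US = 64.0


def pixel_rate_mhz(osc_mhz):
    """SPI master maximum data rate, i.e. the pixel clock."""
    return osc_mhz / 4


def pixel_ns(osc_mhz):
    """Duration of one pixel in nanoseconds."""
    return 1000.0 / pixel_rate_mhz(osc_mhz)


def pixel_periods_per_line(osc_mhz):
    """Pixel periods in one complete 64 us line (not all are visible)."""
    return round(LINE_US * pixel_rate_mhz(osc_mhz))


for osc in (24, 48, 64):
    print(osc, pixel_rate_mhz(osc), pixel_ns(osc), pixel_periods_per_line(osc))
```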
HSync synchronisation
To ensure the superimposed text is 'stable' it is necessary to detect the Vertical and Horizontal 'sync'. Whilst most people will immediately order up an LM1881 (or similar), the PIC internal comparator circuit is quite adequate for this task.
All that's needed is to program the internal voltage reference module with the video '0v clip' (black or 'blanking') level. The PAL AV video specification is 0v (black) to 0.7v (peak white) with sync from 0v to -0.3v, so any voltage below the 'clip' level is a 'sync' pulse. (NOTE, however, that digital CCTV cameras typically set sync = 0v, black = 0.3v, white = 1v, and rely on the receiver to 'recover' the 'blanking level' clip voltage.)
When outputting at 6MHz (half speed == double width pixels) we don't have to care about 'odd/even' fields (i.e. the 'double width' pixels are 'painted' in both fields, thus giving a 'double height' result). However, 'full speed' pixels need to be output 'interleaved' (7 pixels high = 4 pixels in one field + 3 in the other).
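A minimal sketch of the 'interleaved' case: a 7 row character is split so that alternate rows go to alternate fields, giving the 4 + 3 split mentioned above. The glyph bitmap is a made-up 5x7 letter 'A', and which field carries the first row depends on where the text is positioned, so the field naming here is an assumption.

```python
# Sketch: splitting a 7-row glyph across the two interlaced fields.
# Assumptions: the bitmap is a hypothetical 5x7 'A'; row 0 lands in "field A".
GLYPH_A = [
    0b01110,
    0b10001,
    0b10001,
    0b11111,
    0b10001,
    0b10001,
    0b10001,
]


def split_fields(rows):
    """Return (rows for field A, rows for field B), taking alternate rows."""
    return rows[0::2], rows[1::2]


field_a, field_b = split_fields(GLYPH_A)
print(len(field_a), len(field_b))
```

In the double height case no split is needed: the same row is simply painted in both fields.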
We set up the PIC comparator to trigger an interrupt each time the video voltage falls below the clip level, and again when it returns to the clip level. This allows us to measure the width of the Sync pulses. A normal 'line sync' (Horizontal sync) is about 4uS long (64 CPU clk at 16MHz), whilst a 'long sync' (Vertical sync) is about 30uS (480 CPU clk at 16MHz).
The full vertical sync (i.e. the frame start) sequence can be found in the PAL specification. It is a series of 'long' and 'short' sync pulses. All fields have five long syncs (at 28-30uS each). To detect these, one of the PIC's timers will be used (another is used as the RTC (Real Time Clock) and the last one counts scan lines).
The exact sequence of short/long/short sync pulses indicates the 'field'. An 'odd field' starts with 6 short, followed by 5 long, and then 5 short, whilst an 'even field' starts with 5 short, then the 5 long, and finally 4 short. Both short and long sync 'lines' are one half the normal scan line time (i.e. 64uS/2 = 32uS, 512 CPU clk), so the 'gap' between long sync pulses is only 2-4uS (32-64 CPU clk) whilst the gap between short syncs is 30uS (480 CPU clk).
If all characters use 'double line high' pixels (which implies double width), the Interrupt routine only needs to distinguish between the 'long sync' and 'any other sync' (i.e. we don't care if it's an odd or even field). When 5 long syncs are seen, this indicates a 'frame start', after which 'any' syncs can be counted. When the correct line count position is reached (somewhere within the 305 visible scan lines) the Interrupt routine will initiate the 'pixel paint' code, which waits for some number of CPU (OSC/4) clocks (in 64uS there will be about 1,000 of them) and then outputs the pixels.
By changing a PIC pin from the 'high Z' state to 'output drive' mode, we can impose either 0v or +1v pixels (i.e. black or white solid pixels) as 'superimposed' text over the existing video image background.
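The 'long sync versus any other sync' logic can be modelled off-line before committing it to an interrupt routine. The sketch below is a Python model, not PIC code: the 20uS threshold, the class name and the target count are all assumptions chosen for illustration. Pulse widths are given in CPU clocks at 16MHz, as in the text.

```python
# Sketch: classify sync pulses by width and find the line to paint on.
# Assumptions: 16 MHz CPU clk; the 20 us threshold and all names are hypothetical.
CPU_CLK_MHZ = 16
LONG_SYNC_MIN_CLK = 20 * CPU_CLK_MHZ  # anything wider than ~20 us is 'long'


def is_long_sync(width_clk):
    """True for a vertical ('long') sync, False for any other sync."""
    return width_clk >= LONG_SYNC_MIN_CLK


class FrameTracker:
    """Counts syncs after a run of 5 long syncs marks the frame start."""

    def __init__(self, target_count):
        self.target_count = target_count
        self.long_run = 0
        self.count = None  # None until a frame start has been seen

    def on_sync(self, width_clk):
        """Feed one measured pulse; return True when painting should start."""
        if is_long_sync(width_clk):
            self.long_run += 1
            if self.long_run == 5:
                self.count = 0  # frame start: begin counting again
            return False
        self.long_run = 0
        if self.count is None:
            return False
        self.count += 1
        return self.count == self.target_count


# One 'odd field' start: 6 short (~2 us), 5 long (~30 us), 5 short, then lines.
pulses = [32] * 6 + [480] * 5 + [32] * 5 + [64] * 20
tracker = FrameTracker(target_count=10)
hits = [i for i, w in enumerate(pulses) if tracker.on_sync(w)]
print(hits)
```

Because every sync after the frame start is counted, including the short ones, the target count is in 'syncs seen' rather than true scan lines. That is good enough when odd and even fields are treated alike.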
To stay within the PAL spec, a 'white' pixel should be no more than 0.7v and black no less than 0. There is no problem with the blacks - but to achieve 64MHz the PIC supply (Vdd) must be at least 3v - and thus a logic '1' output will be approaching that level. Fortunately all video displays incorporate 'input clamp diodes' that will prevent an 'over voltage' damaging their input amplifiers, however it's not 'good practice' to be driving a 0.7v spec. line at 3v or above ...
So to drive 'white' pixels it's a 'good idea' to fit a series resistor. This can be calculated using the PAL AV spec. '75 ohms input impedance' of the display (i.e. assume that the video line is loaded with 75 ohms to Gnd, so your series Resistor forms a 'divider'). At 3v Vdd, R series will be about 220 ohms.
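The divider calculation can be written out as below. It is a sketch under simplifying assumptions: the PIC output is taken to reach Vdd exactly with no output resistance of its own, and the line is loaded only by the 75 ohm display input. The function names are illustrative.

```python
# Sketch: series resistor for a 0.7 V 'white' level into a 75 ohm load.
# Assumptions: logic '1' equals Vdd, ideal output, 75 ohm load to Gnd only.
LOAD_OHMS = 75.0


def series_resistor(vdd, v_white=0.7, load=LOAD_OHMS):
    """Series resistance that divides vdd down to v_white across the load."""
    return load * (vdd - v_white) / v_white


def divider_output(vdd, r_series, load=LOAD_OHMS):
    """Voltage across the load for a given series resistor."""
    return vdd * load / (r_series + load)


print(round(series_resistor(3.0), 1))        # ideal value at 3 V Vdd
print(round(divider_output(3.0, 220.0), 3))  # level with a 220 ohm resistor
```

At 3v Vdd the ideal value works out at roughly 246 ohms. A 220 ohm resistor gives a white level of about 0.76v, slightly above the 0.7v figure.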
Note that all the 'character pixels output' is done within the Interrupt routine, so the 'main-line code' only has to set up the character 'shape' pointers, monitor the user 'time setup' switches (and monitor the serial comms link for new instructions from any external controller) - and maintain the RTC clock.