How a 3:2 Pulldown Cadence Works When Converting Film to Video (Telecine)
Much of the “HD” content we see is not actually shot in HD. *Gasp!* I know, those dirty liars. It is shot on film, which has very high resolution but doesn’t run at the frame rate your home theatre does. A long time ago the US picked 60 Hz for its AC power. When television was introduced, it was decided that the picture would update at 60 Hz as well, so that it would not pick up interference from the power or give you a nasty headache from the flicker of the lights.
I’m oversimplifying, but only a little. There are technical reasons why 60 Hz is really 59.94 Hz, but for the purpose of this discussion they aren’t relevant.
Film is shot at 24 frames per second, but so that it lines up with TV it is slowed slightly to 23.976 frames per second, which is 4/5ths of TV’s frame rate of 29.97. That is half of the 59.94 I just told you was the rate at which the picture updates on your TV, and both numbers are true. In SD your TV updates one field 59.94 times per second; in 720p and 1080p it updates one full frame 59.94 times per second; and in 1080i it updates one field 59.94 times per second. Two fields make a frame.
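To make that arithmetic concrete, here is a small Python sketch. It assumes the exact NTSC-family rates are the usual x/1001 fractions (23.976, 29.97 and 59.94 are their rounded forms); the constant and function names are just for illustration.

```python
from fractions import Fraction

# Exact NTSC-family rates; the familiar decimals are roundings of x/1001.
FILM_FPS = Fraction(24000, 1001)    # ~23.976 frames per second
VIDEO_FPS = Fraction(30000, 1001)   # ~29.97 frames per second
FIELD_RATE = 2 * VIDEO_FPS          # ~59.94 fields per second


def fields_per_film_frame():
    """Average number of video fields each film frame has to occupy."""
    return FIELD_RATE / FILM_FPS


print(round(float(FILM_FPS), 3), round(float(VIDEO_FPS), 2), round(float(FIELD_RATE), 2))
print(FILM_FPS / VIDEO_FPS)          # the 4/5 ratio
print(fields_per_film_frame() * 4)   # fields needed for every 4 film frames
```

Each film frame needs 2.5 fields on average. You can't show half a field, which is why the conversion works on groups of 4 frames and 10 fields.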
When doing 3:2 pulldown (the process of converting film to video) in standard definition, you have to convert 4 frames into 10 fields. The first logical step is simply to take the 4 frames and divide them into their 8 fields.
Presto! We have 8 fields, so we are most of the way there.
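As a rough sketch of that first step, the snippet below splits four toy frames into their eight fields. It assumes a frame is just a list of scan lines and that line 0 (the topmost line) belongs to the top field; the frame labels A to D and the `split_fields` helper are made up for the example.

```python
def split_fields(frame_rows):
    """Split one progressive frame (a list of scan lines) into two fields.

    Assumption: line 0 is the topmost line and belongs to the top field.
    """
    top = frame_rows[0::2]      # lines 0, 2, 4, ...
    bottom = frame_rows[1::2]   # lines 1, 3, 5, ...
    return top, bottom


# Four tiny 4-line "film frames" labelled A-D, purely for illustration.
film = {name: [f"{name}{line}" for line in range(4)] for name in "ABCD"}

fields = []
for name, rows in film.items():
    top, bottom = split_fields(rows)
    fields.append((name, "bottom", bottom))   # bottom field listed first
    fields.append((name, "top", top))

print(len(fields))   # 8
```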
In NTSC, fields are presented bottom field first. For the purpose of this discussion, the bottom fields are represented as the “black” fields. Using the 8 fields from above, we create one new frame by repeating two of the fields.
Because each frame of film is a single moment in time but each field has a specific spatial placement, when we “repeat” a field we don’t repeat it right away: the repeat has to wait for the next field slot of the same type. Remembering that the bottom field displays before the top field, we display frames 1 and 2 for two fields each, then frame 3 for three fields, and frame 4 for three fields.
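The field order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the film frames are labelled A to D, the hold pattern is the 2, 2, 3, 3 order given in the text, and `pulldown_fields` is a made-up helper name.

```python
def pulldown_fields(frames, cadence=(2, 2, 3, 3), first_parity="bottom"):
    """Return a list of (frame, parity) pairs, one per output field.

    cadence[i] is how many fields frame i is held for. Parity simply
    alternates from one field slot to the next, so a repeated field
    reappears two slots later, on the same set of lines.
    """
    other = {"bottom": "top", "top": "bottom"}
    out = []
    parity = first_parity
    for frame, hold in zip(frames, cadence):
        for _ in range(hold):
            out.append((frame, parity))
            parity = other[parity]
    return out


fields = pulldown_fields("ABCD")
print(" ".join(f"{frame}{parity[0]}" for frame, parity in fields))
# Ab At Bb Bt Cb Ct Cb Dt Db Dt
```

In the output, `b` and `t` stand for bottom and top. The repeated bottom field of C and the repeated top field of D each come back two slots after their first appearance. The cadence is a parameter, so other hold orders can be tried the same way.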
“Folded” into frames, you see the following. Notice that the 4th frame is what is referred to as dirty: it is made up of two half frames, one field from each of two different film frames. This is not visible on an interlaced display.
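A quick way to see where the dirty frame lands is to pair the ten fields up two at a time. The sketch below assumes we already know which film frame (A to D) each field came from, and uses the field order described above; `fold_into_frames` is a hypothetical helper.

```python
def fold_into_frames(field_sources):
    """Pair consecutive fields into video frames and flag mixed ones.

    field_sources lists the film frame each field came from, in display
    order. A video frame is "dirty" when its two fields come from
    different film frames.
    """
    frames = []
    for i in range(0, len(field_sources) - 1, 2):
        first, second = field_sources[i], field_sources[i + 1]
        frames.append((first, second, first != second))
    return frames


# Field order from the 2, 2, 3, 3 cadence described above.
sequence = list("AABBCCCDDD")
for number, (first, second, dirty) in enumerate(fold_into_frames(sequence), 1):
    print(number, first + second, "dirty" if dirty else "clean")
```

Only video frame 4 mixes two film frames (C and D); the other four are clean.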
Why does all this matter? If you don’t use the proper cadence, bad things can happen. For example, if you don’t use the correct top or bottom field you can spatially displace content, which results in the picture bouncing up and down in a regular pattern. With the wrong cadence, displays that up-convert to 1080p, convert 480i to 720p, or even down-convert 1080i to 720p may not know which fields to match to reconstruct a given frame.
If the above example is an SD signal and we want to display it at 720p60, we would throw out the dirty frame and present frames 1 and 2 twice each and frames 3 and 4 three times each. This gives us the highest-resolution image possible without distorting the amount of time each frame is on screen any more than we have to. A cheap upconverter won’t detect the cadence and will instead present each field as a frame. This results in 60 frames per second that are each created from 640×240 rather than 640×480, effectively half the resolution the image could have.
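Here is a toy comparison of the two approaches. It is a sketch, not how any particular upconverter works: it assumes the source film frame of every field is already known (a real device has to work that out by comparing fields), and the function names and the 480-line default are illustrative.

```python
from itertools import groupby


def cadence_aware_60p(field_sources, lines_per_frame=480):
    """Rebuild each film frame from its matching fields.

    Returns (film_frame, times_shown, source_lines) for each film frame.
    Dirty pairings are never shown, and each frame is shown once for
    every field slot it occupied.
    """
    return [(source, len(list(run)), lines_per_frame)
            for source, run in groupby(field_sources)]


def naive_field_to_frame_60p(field_sources, lines_per_frame=480):
    """Cheap approach: stretch every single field into a frame of its own."""
    return [(source, 1, lines_per_frame // 2) for source in field_sources]


sequence = "AABBCCCDDD"   # field order from the 2, 2, 3, 3 cadence above
print(cadence_aware_60p(sequence))
print(naive_field_to_frame_60p(sequence)[:3])
```

Both versions produce ten output frames for every four film frames, but only the cadence-aware one builds them from all 480 lines.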