Accurate video deinterlacing
Pixop Deinterlacer is our AI filter for deinterlacing video with higher perceived visual quality than classic algorithms. It significantly reduces the aliasing artifacts that appear when fine details are deinterlaced. In a nutshell, this filter can reduce:
- Spatial aliasing
- Moiré patterns
- Interline twitter (shimmering)
Our deep convolutional neural network (CNN) architecture uses a combination of spatial and temporal filtering, learning how to spatially deinterlace frames and then optimally combine the effects of motion and temporal imperfections to generate the deinterlaced output.
During the learning phase, the CNN is presented with tens of thousands of pairs of image patches, each consisting of an artificially degraded patch and its pristine counterpart. These degradations have been carefully engineered to resemble the deinterlacing artifacts commonly found in video.
We performed extensive validation on the trained model using several different video sources to ensure that the output is consistently attractive to the end-user.
How it works
Video is processed in blocks of two interlaced frames. Each interlaced frame carries two fields, so every input block yields an enhanced block of four deinterlaced frames, produced via inference using our pre-trained neural network model as shown in the diagram below:
This type of multi-frame approach is common among deinterlacing algorithms as it allows better filtering to be achieved for regions in a frame with little or no motion.
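To make the two-in, four-out block structure concrete, here is a minimal Python sketch. It uses naive line doubling ("bob"), a classic baseline, and is not the Pixop neural network. The function names and the list-of-rows frame representation are assumptions made for illustration only.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of rows) into (top_field, bottom_field)."""
    # The top field holds rows 0, 2, 4, ...; the bottom field holds rows 1, 3, 5, ...
    return frame[0::2], frame[1::2]


def line_double(field, height):
    """Rebuild a full-height frame by repeating each field row (naive bob)."""
    return [list(field[min(y // 2, len(field) - 1)]) for y in range(height)]


def deinterlace_block(interlaced_frames, top_field_first=True):
    """Turn a block of interlaced frames into twice as many progressive frames."""
    output = []
    for frame in interlaced_frames:
        top, bottom = split_fields(frame)
        # Emit the fields in temporal order, one progressive frame per field.
        ordered = (top, bottom) if top_field_first else (bottom, top)
        for field in ordered:
            output.append(line_double(field, len(frame)))
    return output


# Two tiny 4-row interlaced frames in, four progressive frames out.
frame_a = [[1, 1], [2, 2], [3, 3], [4, 4]]
frame_b = [[5, 5], [6, 6], [7, 7], [8, 8]]
block = deinterlace_block([frame_a, frame_b])
```

A purely spatial method like this discards half the vertical detail in every output frame, which is why multi-frame approaches can do better in static regions.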
A comparison to other deinterlacers
In November 2020, we tested Pixop Deinterlacer's performance against three other algorithms on the 15-second pedestrian_area sequence, which is part of Derf's Test Media Collection at Xiph.org.
Initially, the source video was downscaled and cropped from 1080p HD to 720x576 pixels via FFmpeg to produce a ground-truth baseline. From this ground-truth SD video, we then created an interlaced version using the FFmpeg filter "tinterlace=interleave_top", which produces top-field-first interlaced output.
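The interleaving step can be sketched in a few lines of Python. This is an illustration of the general idea behind the interleave_top mode (upper field from the first frame of each pair, lower field from the second), not FFmpeg's actual implementation. The function names and the list-of-rows frame representation are hypothetical.

```python
def interleave_top(frames):
    """Weave consecutive pairs of progressive frames into interlaced frames.

    Even rows (the top field) come from the first frame of each pair and
    odd rows (the bottom field) from the second, so the output keeps the
    original height at half the frame rate.
    """
    interlaced = []
    for i in range(0, len(frames) - 1, 2):
        first, second = frames[i], frames[i + 1]
        woven = [first[y] if y % 2 == 0 else second[y] for y in range(len(first))]
        interlaced.append(woven)
    return interlaced


def interlaced_index(progressive_index):
    """Map a 1-based progressive frame number to its 1-based interlaced frame."""
    return (progressive_index + 1) // 2


frame_a = [["a0"], ["a1"], ["a2"], ["a3"]]
frame_b = [["b0"], ["b1"], ["b2"], ["b3"]]
woven = interleave_top([frame_a, frame_b])
```

With 1-based numbering, progressive frames 2k-1 and 2k end up in interlaced frame k.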
We then ran four deinterlacers on the interlaced version:
- YADIF (yadif), Bob Weaver (bwdif) and Weston Three-Field (w3fdif): video filters built into FFmpeg 4.3-2, run with default parameters; output encoded via lossless FFV1
- Pixop Deinterlacer: production model available in our web app; output encoded via H.264 @ 37.2 Mbps
The resulting scores against the ground truth (PSNR / SSIM) were:
- YADIF: 42.33 dB / 0.987
- Bob Weaver: 43.75 dB / 0.990
- Weston Three-Field: 42.36 dB / 0.985
- Pixop Deinterlacer: 45.19 dB / 0.993
In this test, Pixop Deinterlacer is clearly the most accurate performer of the four, both in terms of PSNR and SSIM (higher numbers are better).
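For readers unfamiliar with PSNR, the following Python sketch shows the standard definition for 8-bit samples: PSNR = 10 * log10(MAX^2 / MSE). It is not necessarily how the scores above were computed. The choice of color channels and the way per-frame values were averaged are not stated here, so treat those aspects as open. The function name and the list-of-rows image representation are illustrative assumptions.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for ref_px, test_px in zip(ref_row, test_row):
            squared_error += (ref_px - test_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


reference = [[100, 100], [100, 100]]
off_by_one = [[101, 99], [101, 99]]  # every sample differs by 1, so MSE = 1
score = psnr(reference, off_by_one)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.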
Here are two examples of original frames (143/144 and 293/294) in consecutive pairs along with the interlaced version:
Example 1: Ground truth frames 143 and 144, interlaced frame 72
Example 2: Ground truth frames 293 and 294, interlaced frame 147
For each of these frames, here are zoomed-in 75x150-pixel cropouts of the original ("ground truth") and the interlaced frame, along with the results of applying YADIF, Bob Weaver, Weston Three-Field and Pixop Deinterlacer, respectively.
Frame 72 (interlaced) / 143 (progressive) cropouts
Frame 72 (interlaced) / 144 (progressive) cropouts
Frame 147 (interlaced) / 293 (progressive) cropouts
Frame 147 (interlaced) / 294 (progressive) cropouts
With the Pixop Deinterlacer, you get a desirable mix of accuracy, sharpness and anti-aliasing of fine vertical details, as demonstrated in these cropouts.
What type of video material is the Pixop Deinterlacer best suited for?
The Pixop Deinterlacer is suited to interlaced video from any source, as long as the file metadata correctly states the field order.
- The maximum output resolution is currently UHD 4K (3840x2160 pixels).
- Determination of field order relies on the file metadata being consistent with how the video was actually interlaced.
- Camera motion reduces the anti-aliasing properties of the Pixop Deinterlacer.
- For optimal results, don't apply any filtering during the interlacing process.
Expected runtime performance, stated in frames per second (FPS), with the time to process 1 hour of PAL video from 50i to 50p in parentheses:
- SD - 90 FPS (33 minutes)
- HD - 12 FPS (4 hours)
- 4K - 2.6 FPS (19 hours)
Note that these numbers assume H.264 encoding. Very CPU-intensive codecs such as H.265, or extremely high bitrates, will reduce throughput.
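The processing times above follow from simple arithmetic. The sketch below assumes that the FPS figures count output frames and that 50p output means 50 frames per second of source material. The function name is illustrative.

```python
def processing_time_hours(duration_hours, fps, output_rate=50):
    """Estimate wall-clock hours needed to deinterlace a clip.

    duration_hours: length of the source video in hours
    fps: processing throughput in output frames per second
    output_rate: output frames per second of video (50 for 50i to 50p)
    """
    total_frames = duration_hours * 3600 * output_rate
    return total_frames / fps / 3600


sd_minutes = processing_time_hours(1, 90) * 60  # about 33 minutes
hd_hours = processing_time_hours(1, 12)         # about 4.2 hours
uhd_hours = processing_time_hours(1, 2.6)       # about 19.2 hours
```

The same function can be used to estimate turnaround for clips of other lengths, for example `processing_time_hours(0.25, 12)` for a 15-minute HD clip.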