Typically, slow motion is achieved by capturing each film frame at a rate much faster than it will be played back. When replayed at normal speed, time appears to move more slowly. The term for creating a slow-motion film is overcranking, which refers to hand-cranking an early camera at a faster rate than normal (i.e. faster than 24 frames per second). Slow motion can also be achieved by playing normally recorded footage at a slower speed; this technique is more often applied to video subjected to instant replay than to film. A third technique that is becoming common with current post-processing software (programs like Twixtor and Kandao's tools) is to fabricate digitally interpolated frames that transition smoothly between the frames that were actually shot.
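The slowdown factor from overcranking is simply the ratio of the capture rate to the playback rate. A minimal sketch (the 120 fps and 24 fps figures are illustrative, not from any particular camera):

```python
def slowdown_factor(capture_fps: float, playback_fps: float) -> float:
    """How many times slower the action appears when high-frame-rate
    footage is played back at a normal rate."""
    return capture_fps / playback_fps

# 120 fps footage played back at 24 fps appears 5x slower.
print(slowdown_factor(120, 24))  # 5.0
```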
Before we can understand how slow motion works, we need to unravel what exactly refresh rate is, and to do that, we need to know how displays work. There’s a lot of technical detail involved here, but at its most basic level, a display works by showing you a series of images, or “frames”, one after another. The “refresh rate” is how many times the image is updated per second: a 60Hz display refreshes its image 60 times a second. This is too fast for your brain to track each individual frame, so it is tricked into perceiving a moving image rather than a series of still frames.
A higher refresh rate means more images are shown in the same amount of time, which shrinks the gap between individual frames and makes any movement between them look smoother. While it’s not something you’re likely to consciously notice, most people can feel some difference between refresh rates: the whole experience feels more responsive, as the display seems to react more quickly to your commands.
Smartphones are getting more and more powerful, but with last generation’s hardware still holding its own, the jump from generation to generation doesn’t seem as great as it once did. Where are manufacturers to go when a new phone doesn’t feel more powerful than last year’s device? One alternative is to make it feel smoother and more responsive — and a great way to do that is to increase the refresh rate of its display.
Refresh rate sounds similar to your graphics processor’s frame rate, and that’s because the two are related. Frame rate is measured in frames per second, or “fps”, and it describes how quickly a graphics processor can render and deliver individual images to your display. You’ll need a monitor with a refresh rate of at least 120Hz to show all 120 of those frames each second. However, while refresh rate is related to fps, it’s not the same thing: the refresh rate is a property of the monitor itself, while the frame rate is how quickly your graphics processor sends new images to the monitor.
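The relationship can be captured in a deliberately simplified model: the panel can show at most as many distinct frames per second as its refresh rate, and the GPU supplies at most its frame rate (this sketch ignores tearing and variable-refresh-rate technologies, which complicate the real picture):

```python
def displayed_fps(gpu_fps: float, refresh_hz: float) -> float:
    """Distinct frames per second the viewer actually sees:
    capped by whichever of the GPU or the panel is slower."""
    return min(gpu_fps, refresh_hz)

print(displayed_fps(200, 60))   # 60  - a 60Hz panel caps the output
print(displayed_fps(120, 144))  # 120 - here the GPU is the bottleneck
```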
But a higher refresh rate isn’t just about day-to-day usability. Gaming performance is one of the biggest beneficiaries of a higher refresh rate. A display with a higher refresh rate also has lower input lag: the time between an action being triggered and the result appearing on the display. A standard 60Hz display cannot have an input lag lower than 16.67ms, because that’s how long each refresh takes, while a 120Hz display can reach 8.33ms, as it refreshes twice as often.
It’s easy to spot a poorly edited video in which an editor has taken standard-frame-rate footage and simply changed its speed in Adobe Premiere Pro or Final Cut Pro X: the footage looks choppy and awful.
Kandao, maker of the Obsidian and QooCam lines of 360 and VR cameras, has just released software that enables video shooters to take normal video and slow it down by up to 10X while still looking smooth. The catch is that the software only works with Kandao cameras — for now.
Kandao’s software addresses this by using machine learning to fill in the gaps in the footage so that it doesn’t look choppy. Depending on the frame rate the video was originally shot at, you can get some remarkable results, such as footage slowed down to look as if it was shot with a camera capable of 1200 fps.
The magic, Kandao says, comes from its use of machine learning and neural networks, which create smoother interpolated frames than existing software such as Twixtor, which relies on optical flow technology.
NVIDIA “Super SloMo” Makes Video Smooth
All of the most recent cameras have raised the bar when it comes to frame rates; it’s common these days to have access to a camera that shoots 120fps in Full HD. There is, however, a set of limitations to deal with, such as thermal, buffer, and memory-card constraints — key factors that make a true super slow-motion camera extremely expensive.
But again, the advancement of technology seems to be on the content creators’ side, as NVIDIA has developed a software algorithm that applies super slow motion to any kind of footage, all through the power of machine learning. Here’s how you can pull this off on your own.
Basically, in deep learning, you write an algorithm and then train it on example data so that it learns to recognize patterns, improving as the process is repeated.
In this case, the software needs to analyze the movement in the clips and slow it down while creating interpolated sub-frames so that playback remains smooth. When using this software, keep in mind that a 17-second clip took 12 minutes to convert with CUDA acceleration, and 6 hours without.
NVIDIA researchers have developed a deep learning-based system that can produce a high-quality slow-motion video from a standard (30 fps) video clip. In comparison with manual slow-motion results, the NVIDIA demonstration video shows far superior smoothness. The technique generates intermediate frames to achieve the super slow-motion effect and, as it can generate an indefinite number of such intermediate frames, there is no limit to how slow videos can be made to go.
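That “indefinite number of intermediate frames” amounts to choosing how many timestamps to sample between each pair of input frames. A small sketch of evenly spaced sampling, which is the usual choice for variable-length interpolation (the frame counts here are illustrative, not taken from the paper):

```python
def intermediate_timestamps(n):
    """Evenly spaced timestamps t in (0, 1) at which n intermediate
    frames are synthesized between consecutive input frames I0 and I1."""
    return [i / (n + 1) for i in range(1, n + 1)]

# 7 synthesized frames per pair turn 30 fps input into 240 fps output.
print(intermediate_timestamps(7))  # [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```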
The paper Super SloMo: High-Quality Estimation of Multiple Intermediate Frames for Video Interpolation, along with NVIDIA’s presentation at the Conference on Computer Vision and Pattern Recognition (CVPR), is the latest research from NVIDIA on such AI-empowered video transformation techniques.
CVPR spotlight video: https://people.cs.umass.edu/~hzjiang/projects/superslomo/superslomo_cvpr18_spotlight_v4.mp4
The paper introduces an end-to-end convolutional neural network for variable-length multi-frame video interpolation, which generates intermediate frame(s) between two consecutive frames to form both spatially and temporally coherent video sequences.
To address the challenge of generating multiple intermediate video frames, researchers first computed bi-directional optical flow between the input images using a U-Net architecture. The flows were then linearly combined at each time step to approximate the intermediate bi-directional optical flows. Although these approximate flows work well in locally smooth regions, they can produce artifacts around motion boundaries. To address this, an additional U-Net is employed to refine the approximated flow and predict soft visibility maps. The two input images are then warped and linearly fused to form each intermediate frame. To avoid artifacts, the team applies the visibility maps to the warped images before fusion, excluding the contribution of occluded pixels to the interpolated intermediate frames.
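The linear flow combination and visibility-weighted fusion described above can be sketched per pixel. The coefficients follow the time-weighted linear combination used in the Super SloMo paper, but note this is an illustrative simplification: plain scalars stand in for full flow fields, warped images, and the U-Net predictions.

```python
def approx_intermediate_flows(t, f01, f10):
    """Linearly combine the bi-directional flows F_0->1 (f01) and
    F_1->0 (f10) to approximate the flows from time t back to
    frames 0 and 1, per the paper's time-weighted combination."""
    f_t0 = -(1 - t) * t * f01 + t * t * f10
    f_t1 = (1 - t) ** 2 * f01 - t * (1 - t) * f10
    return f_t0, f_t1

def fuse(t, warped0, warped1, v_t0, v_t1):
    """Visibility-weighted linear fusion of the two warped inputs;
    v_t0 / v_t1 are the soft visibility weights (here plain scalars)
    that down-weight occluded pixels before blending."""
    num = (1 - t) * v_t0 * warped0 + t * v_t1 * warped1
    den = (1 - t) * v_t0 + t * v_t1
    return num / den

# Midpoint frame (t = 0.5) with both pixels fully visible: a simple average.
print(fuse(0.5, 100.0, 120.0, 1.0, 1.0))  # 110.0
```

If one pixel is occluded in a source frame (its visibility weight near 0), the fusion falls back to the other warped frame, which is exactly how the visibility maps suppress ghosting around motion boundaries.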
The NVIDIA multi-frame approach outperforms state-of-the-art single-frame methods on the Middlebury, UCF101, Slowflow, and high-frame-rate Sintel datasets. The paper Super SloMo: High-Quality Estimation of Multiple Intermediate Frames for Video Interpolation is on arXiv.
- Article by Satya Anirudh, 3rd year, Electronics and Communications Engineering