Artificial Intelligence in Video

bhobba · Mar 31, 2024

Behind the scenes, artificial intelligence usually makes use of what is known as a Neural Network:
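
As a rough sketch of what a neural network computes, the snippet below runs a forward pass through a tiny two-layer network. The weights are hand-picked assumptions chosen for illustration (a real network learns them from data), and the function names are hypothetical.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs plus a bias per output."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Non-linearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

def sigmoid(v):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def forward(inputs):
    # Hypothetical hand-picked weights; training would normally find these.
    hidden = relu(dense(inputs, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]))
    output = dense(hidden, [[1.0, 1.0]], [-0.5])
    return sigmoid(output[0])

# With these weights the output exceeds 0.5 only when the two inputs differ.
for pair in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(pair, round(forward(pair), 3))
```

The hidden layer with its non-linearity is what lets the network represent this "inputs differ" rule, which a single weighted sum cannot.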

In image applications, an implementation called a Convolutional Neural Network is often used:
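
The core operation of a convolutional layer can be sketched in a few lines: a small kernel slides over the image and produces one output value per position. The kernel below is a fixed Prewitt-style edge detector used only for illustration; in a CNN the kernel values are learned, and the toy image is an assumption.

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Vertical-edge kernel: responds where brightness increases from left to right.
EDGE_KERNEL = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

# Toy 4x5 image: dark on the left, bright on the right.
IMAGE = [[0, 0, 0, 9, 9],
         [0, 0, 0, 9, 9],
         [0, 0, 0, 9, 9],
         [0, 0, 0, 9, 9]]

feature_map = conv2d(IMAGE, EDGE_KERNEL)
print(feature_map)  # [[0, 27, 27], [0, 27, 27]]
```

The output is zero over the flat dark region and large wherever the kernel window straddles the dark-to-bright edge.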

In particular, for image super-resolution, a Generative Adversarial Network (GAN) is often used:

These form the basis of modern super-resolution:

For those who are interested in the details, see:
https://arxiv.org/abs/2204.13620

But things move on. One idea is to use a CNN to down-scale the image first, then use super-resolution to recover the original image from the down-scaled version. One example is TAD-TAU:
https://openaccess.thecvf.com/conte...k-Aware_Image_Downscaling_ECCV_2018_paper.pdf
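
To make the down-scale then up-scale round trip concrete, here is a minimal sketch that uses fixed operators as stand-ins: 2x2 block averaging for the downscaler and pixel repetition for the upscaler. This is not the TAD-TAU method, where both stages are learned networks; the function names and the toy image are assumptions. The point is only to show the pipeline and how the reconstruction error is measured.

```python
def downscale2x(image):
    """Halve each dimension by averaging 2x2 blocks (assumes even width and height)."""
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]

def upscale2x(image):
    """Double each dimension by repeating every pixel in a 2x2 block."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def mse(a, b):
    """Mean squared error between two images of the same size."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

ORIGINAL = [[10, 10, 50, 60],
            [10, 10, 70, 80],
            [20, 40, 30, 30],
            [60, 80, 30, 30]]

small = downscale2x(ORIGINAL)        # 2x2 image
restored = upscale2x(small)          # back to 4x4
round_trip_error = mse(ORIGINAL, restored)
print(small, round_trip_error)
```

Flat blocks survive the round trip exactly, while blocks with detail do not. A learned downscaler and upscaler trained together aim to shrink that error.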

This is an example of an important AI concept - the Autoencoder:
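
The autoencoder idea can be sketched with a deliberately tiny example: an encoder squeezes each 2-D point into a single number, a decoder expands it back, and both are trained to minimise the reconstruction error. The data, starting weights, learning rate and step count below are all assumptions chosen for illustration; real image autoencoders use deep non-linear networks.

```python
# Toy data: 2-D points that all lie on the line y = 2x, so a single number
# (the position along the line) is enough to describe each point.
DATA = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]

def encode(x, w):
    """Compress a 2-D point to a single code value."""
    return w[0] * x[0] + w[1] * x[1]

def decode(z, v):
    """Expand the code value back to a 2-D point."""
    return (v[0] * z, v[1] * z)

def loss(w, v):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in DATA:
        x_hat = decode(encode(x, w), v)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(DATA)

def train(w, v, lr=0.02, steps=2000):
    """Plain batch gradient descent on the reconstruction error."""
    w, v = list(w), list(v)
    n = len(DATA)
    for _ in range(steps):
        gw, gv = [0.0, 0.0], [0.0, 0.0]
        for x in DATA:
            z = encode(x, w)
            x_hat = decode(z, v)
            e = (x_hat[0] - x[0], x_hat[1] - x[1])
            dz = 2.0 * (e[0] * v[0] + e[1] * v[1])  # d(error)/dz for this sample
            for i in range(2):
                gv[i] += 2.0 * e[i] * z / n
                gw[i] += dz * x[i] / n
        for i in range(2):
            w[i] -= lr * gw[i]
            v[i] -= lr * gv[i]
    return w, v

W0, V0 = [0.5, 0.1], [0.1, 0.5]  # arbitrary starting weights
initial_loss = loss(W0, V0)
W, V = train(W0, V0)
final_loss = loss(W, V)
print(initial_loss, final_loss)
```

Down-scaling followed by super-resolution has the same shape: the down-scaled image plays the role of the code, and the super-resolution network plays the role of the decoder.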

Again, things move on. The approach has since been improved: it is simpler, performs better, allows down-scaling and super-resolution by arbitrary amounts, and can even encode the colour of an image in a resulting black-and-white image:
https://arxiv.org/pdf/2201.12576

So far, super-resolution has been described as working from a single lower-resolution image, but it can also be done using a sequence of images from a video:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4088133

A way was needed to quantify how close a super-resolved image is to the original as perceived by a human being, and SSIM was invented for that purpose. However, further work has been done on this, and a newer measure called VMAF, developed and used a lot by Netflix, has largely replaced it:
https://visionular.ai/vmaf-ssim-psnr-quality-metrics/
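
For a feel of how such quality measures work, here is a minimal sketch of PSNR and a simplified SSIM. The SSIM here is computed once over the whole image, which is an assumption made for brevity: the standard metric averages the same formula over small local windows. VMAF is not sketched because it fuses several features with a trained model. The function names and test images are hypothetical.

```python
import math

def _flatten(image):
    return [float(p) for row in image for p in row]

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    x, y = _flatten(ref), _flatten(test)
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def global_ssim(ref, test, peak=255.0):
    """SSIM formula applied to the whole image as one window (simplification)."""
    x, y = _flatten(ref), _flatten(test)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2  # standard stabilising constants
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

REFERENCE = [[0, 255], [255, 0]]
DEGRADED = [[10, 245], [245, 10]]  # same pattern with reduced contrast

print(psnr(REFERENCE, DEGRADED))
print(global_ssim(REFERENCE, DEGRADED))
```

PSNR looks only at per-pixel error, while SSIM compares means, variances and covariance, which is why it tracks perceived structure more closely.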

Image super-resolution is one of many proposals for reducing the bit rate of high-resolution images. Another comes from iSIZE (recently acquired by Sony), which preprocesses an image so that it is more efficient to encode, yet still scores a high VMAF:
https://discovery.ucl.ac.uk/10152967/1/SMPTE_v9_RPS.pdf

It produces substantial reductions in the bit rate of 8K video:
https://8kassociation.com/industry-info/8k-news/pre-encoding-8k-with-isize-bitsave/

A lot of ideas and concepts have been introduced in this post. If the reader has not seen them before, it may take a while to get up to speed, as with anything new. However, they form the basis of my proposed method for an all-AI video codec.

My next post will be an overview of current video codecs, including EVC baseline, which forms the basis of the AI codec.

Thanks
Bill

reid666 · Apr 11, 2024

very interesting
