Machine learning for per-title encoding

Silhavy, Daniel; Chen, Anita; Krauss, Christopher; Nguyen, Anh Tu; Müller, Christoph; Arbanowski, Stefan; Steglich, Stephan; Bassbouss, Louay

doi:10.5594/JMI.2022.3154836

2022

Journal Article

Abstract

Video streaming content varies in terms of complexity and requires title-specific encoding settings to achieve a certain visual quality. Classic 'one-size-fits-all' encoding ladders ignore video-specific characteristics and apply the same encoding settings across all video files. In the worst-case scenario, this approach can lead to quality impairments, encoding artifacts, or unnecessarily large media files. A per-title encoding solution has the potential to significantly decrease the storage and delivery costs of video streams while improving the perceptual quality of the video. Conventional per-title encoding solutions typically require a large number of test encodes, resulting in high computational times and costs. In this article, we describe a solution that implements the conventional per-title encoding approach and uses its resulting data for machine learning-based improvements. By applying supervised, multivariate regression algorithms like random forest regression, multilayer perceptron (MLP), and support vector regression, we can predict video quality metric (VMAF) values. These video quality metric values are the foundation for deriving the optimal encoding ladder. As a result, the test encodes are eliminated while preserving the benefits of conventional per-title encoding.
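As an illustration of the last step in the abstract, the following minimal sketch shows one way an encoding ladder could be derived once VMAF values have been predicted. It is not the authors' method: the selection rule (highest predicted VMAF per bitrate, then dropping rungs that add less than a minimum gain), the function name `derive_ladder`, the `min_gain` parameter, and all scores in `PREDICTED_VMAF` are assumptions made up for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical predicted VMAF scores keyed by (height, bitrate_kbps).
# In a real pipeline these would come from a trained regressor;
# the numbers here are invented purely for illustration.
PREDICTED_VMAF: Dict[Tuple[int, int], float] = {
    (360, 500): 62.0, (720, 500): 55.0, (1080, 500): 48.0,
    (360, 1500): 70.0, (720, 1500): 78.0, (1080, 1500): 74.0,
    (360, 3000): 72.0, (720, 3000): 86.0, (1080, 3000): 90.0,
    (360, 6000): 73.0, (720, 6000): 88.0, (1080, 6000): 95.0,
}


def derive_ladder(
    predicted: Dict[Tuple[int, int], float],
    min_gain: float = 2.0,
) -> List[Tuple[int, int, float]]:
    """Return ladder rungs as (bitrate_kbps, height, predicted_vmaf).

    Assumed rule: for each bitrate keep the resolution with the highest
    predicted VMAF, then walk the bitrates in ascending order and keep a
    rung only if it improves on the previously kept rung by min_gain.
    """
    # Step 1: best resolution for every candidate bitrate.
    best_per_bitrate: Dict[int, Tuple[int, float]] = {}
    for (height, bitrate), vmaf in predicted.items():
        current = best_per_bitrate.get(bitrate)
        if current is None or vmaf > current[1]:
            best_per_bitrate[bitrate] = (height, vmaf)

    # Step 2: drop rungs whose quality gain is too small to justify them.
    ladder: List[Tuple[int, int, float]] = []
    for bitrate in sorted(best_per_bitrate):
        height, vmaf = best_per_bitrate[bitrate]
        if not ladder or vmaf - ladder[-1][2] >= min_gain:
            ladder.append((bitrate, height, vmaf))
    return ladder


if __name__ == "__main__":
    for bitrate, height, vmaf in derive_ladder(PREDICTED_VMAF):
        print(f"{bitrate:>5} kbps  {height}p  predicted VMAF {vmaf:.1f}")
```

Raising `min_gain` yields a shorter ladder with fewer, more clearly separated quality levels, which is one possible way to trade storage cost against granularity.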

Journal

SMPTE Motion Imaging Journal
