Packet loss recovery and control for voice transmission over the Internet

Rights: Under Copyright
Author: Sanneck, H.
Dates listed in the record: 2022-03-06; 31.07.2002; 2000
URL: https://publica.fraunhofer.de/handle/publica/274047
DOI: 10.24406/publica-fhg-274047

Abstract (German original, translated): Packet-switched networks such as the Internet, which operate on the "best effort" principle, offer no means of guaranteeing the delivery of packets for real-time services such as voice. Packet loss therefore degrades the quality of service. For voice transmission, the following effects arise: on the one hand, even short sequences of lost packets (burst losses) can impair the receiver's ability to conceal the packet losses, so that the speech signal is perceived as interrupted. On the other hand, individual packets of the data stream can be particularly vulnerable to loss because they contain information that is decisive for the speech quality perceived at the receiver. First, a model is developed that is based on the number of consecutively lost packets. This model makes it possible to describe the loss distribution within a data stream. The packet loss metrics derived from the model are then combined with methods of objective speech quality measurement. Within this framework, we find that low-compression speech coders ("sample-based codecs", PCM) with loss concealment can bridge isolated packet losses well. Burst losses, by contrast, have a strong negative influence on speech quality. For high-compression coders ("frame-based codecs", G.729), on the one hand the effects of packet loss are amplified by the memory of the decoder filters. On the other hand, such coding methods make loss concealment easier to perform, because the state information within the decoder can be extrapolated. In contrast to the low-compression coders, however, the quality of the loss concealment can break down at transitions between speech regions. Mechanisms are then presented that can distinguish between packets within a voice data stream (flow) in order to minimize the effects of packet loss. We refer to these methods as "intra-flow" packet loss recovery and control. In the end systems (end-to-end), loss-sensitive packets are identified (at the sender) and losses are concealed (at the receiver). Support mechanisms in the network nodes (hop-by-hop) then make it possible to avoid losses of packets identified as more important at the expense of less important packets of the same data stream. Since the same network resources would have to be spent on both kinds of packet, a gain in speech quality is possible. It is shown that this gain is significant, while the network service, viewed over longer periods of time, can be regarded as practically equivalent to a "best effort" service.

Abstract: "Best effort" packet-switched networks, like the Internet, do not offer a reliable transmission of packets to applications with real-time constraints such as voice. Thus, the loss of packets impairs the application-level utility. For voice this utility impairment is twofold: on the one hand, even short bursts of lost packets may significantly decrease the ability of the receiver to conceal the packet loss, so that the speech signal play-out is interrupted. On the other hand, some packets may be particularly sensitive to loss as they carry more important information in terms of user perception than other packets. We first develop an end-to-end model based on loss run-lengths with which we can describe the loss distribution within a flow. The packet-level metrics derived from the model are then linked to user-level objective speech quality metrics.
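As an illustration of what packet-level metrics based on loss run-lengths can look like, the following minimal sketch counts loss bursts in a per-packet loss trace and derives three simple figures from them. It is not taken from the thesis: the 0/1 trace encoding, the function names, and the choice of metrics are assumptions made for this example.

```python
from collections import Counter
from itertools import groupby


def loss_run_lengths(trace):
    """Map burst length k to the number of loss bursts of exactly k packets.

    `trace` holds one flag per packet in sending order
    (1 = lost, 0 = received); this encoding is an assumption of the sketch.
    """
    return Counter(
        sum(1 for _ in run)           # length of the current run
        for lost, run in groupby(trace)
        if lost                       # keep only runs of lost packets
    )


def loss_metrics(trace):
    """Derive simple packet-level metrics from the loss run-lengths."""
    runs = loss_run_lengths(trace)
    n_packets = len(trace)
    n_lost = sum(k * count for k, count in runs.items())
    n_bursts = sum(runs.values())
    # Every burst of length k contains k - 1 losses that directly
    # follow another loss.
    n_lost_after_loss = n_lost - n_bursts
    return {
        # Fraction of all packets that were lost.
        "unconditional_loss": n_lost / n_packets if n_packets else 0.0,
        # Fraction of lost packets whose predecessor was also lost.
        "loss_after_loss": n_lost_after_loss / n_lost if n_lost else 0.0,
        # Average number of packets per loss burst.
        "mean_burst_length": n_lost / n_bursts if n_bursts else 0.0,
    }
```

Two traces with the same overall loss rate can differ strongly in `loss_after_loss` and `mean_burst_length`, which is the kind of distinction between isolated and burst losses that the abstract refers to.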
Using this framework, we find that for low-compressing sample-based codecs (PCM) with loss concealment, isolated packet losses can be concealed well, whereas burst losses have a higher negative perceptual impact. For high-compressing frame-based codecs (G.729), on the one hand the impact of loss is amplified through error propagation caused by the decoder filter memories; on the other hand, such coding schemes help to perform loss concealment by extrapolation of decoder state. In contrast to sample-based codecs, however, we show that the concealment performance may "break" at transitions within the speech signal. We then propose mechanisms which differentiate between packets within a voice data flow to minimize the impact of packet loss. We designate these methods as "intra-flow" loss recovery and control. At the end-to-end level, identification of packets sensitive to loss (sender) as well as loss concealment (receiver) takes place. Hop-by-hop support schemes then allow trading the loss of one packet, which is considered more important, against the loss of another packet of the same flow which is of lower importance. As both packets incur the same network transmission cost, a gain in perceived quality is obtainable.
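The trade described above can be pictured with a toy queue. The sketch below is not the PLoP or DiffRED algorithm from the thesis; it is a deliberately simple illustration, and the `Packet` fields, the `important` marking, and the `IntraFlowQueue` class are all hypothetical. When the queue is full, exactly one packet is lost either way, but an arriving important packet may take the place of a queued, less important packet of the same flow.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    flow: str         # flow identifier (hypothetical)
    seq: int          # sequence number within the flow
    important: bool   # hypothetical sender-side importance marking


class IntraFlowQueue:
    """Bounded FIFO that shifts overflow losses within a single flow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = []

    def enqueue(self, packet):
        if len(self.queue) < self.capacity:
            self.queue.append(packet)
            return
        # Queue is full: one packet has to be dropped.
        if packet.important:
            # Look for a queued, less important packet of the same flow.
            victim = next(
                (q for q in self.queue
                 if q.flow == packet.flow and not q.important),
                None,
            )
            if victim is not None:
                self.queue.remove(victim)
                self.dropped.append(victim)
                self.queue.append(packet)
                return
        # No suitable victim: behave like a plain drop-tail queue.
        self.dropped.append(packet)
```

In this sketch each overflow still costs the flow exactly one packet, so the number of losses matches that of a drop-tail queue; only the choice of which packet is lost changes.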
We show that significant speech quality improvements can be achieved while still maintaining a network service which is virtually identical to best effort in the long term.

Table of contents:
Abstract
Zusammenfassung (German summary)
Acknowledgments
List of Figures
List of Tables
1 Introduction
  1.1 Motivation and Scope
  1.2 Approach
  1.3 Thesis Outline and Methodology
2 Basics
  2.1 Digital voice communication
    2.1.1 Speech production
    2.1.2 Digitization
    2.1.3 Coding / compression
    2.1.4 Speech quality / intelligibility
  2.2 Voice transmission over packet-switched networks
    2.2.1 Quality impairments
    2.2.2 Sender / receiver structure
    2.2.3 The Internet conferencing architecture
3 Related Work
  3.1 End-to-End loss recovery
    3.1.1 Impact of the choice of transmission parameters
    3.1.2 Mechanisms involving sender and receiver
    3.1.3 Receiver-only mechanisms: loss concealment
  3.2 Hop-by-Hop loss control
    3.2.1 Local approach: queue management
    3.2.2 Distributed approaches
  3.3 Combined end-to-end and hop-by-hop approaches
    3.3.1 Implicit cooperation
    3.3.2 Explicit cooperation
4 Evaluation Models and Metrics
  4.1 Packet-level loss models and metrics
    4.1.1 General Markov model
    4.1.2 Loss run-length model with unlimited state space
    4.1.3 Loss run-length model with limited state space
    4.1.4 Gilbert model
    4.1.5 No-loss run-length model with limited state space
    4.1.6 Composite metrics
    4.1.7 Parameter computation
    4.1.8 Application of the metrics
  4.2 User-level speech quality metrics
    4.2.1 Objective quality measurement
    4.2.2 Subjective testing
  4.3 Relating speech quality to packet-level metrics
  4.4 Packet-level traffic model and topology
  4.5 Conclusions
5 End-to-End-Only Loss Recovery
  5.1 Sample-based codecs
    5.1.1 Approach
    5.1.2 Adaptive Packetization / Concealment (AP/C)
    5.1.3 Results
    5.1.4 Discussion
    5.1.5 Implementation of AP/C and FEC into an Internet audio tool
  5.2 Frame-based codecs
    5.2.1 AP/C for frame-based codecs
    5.2.2 Approach
    5.2.3 G.729 frame loss concealment
    5.2.4 Speech Property-Based Forward Error Correction (SPB-FEC)
    5.2.5 Results
  5.3 Conclusions
6 Intra-Flow Hop-by-Hop Loss Control
  6.1 Approach
    6.1.1 Design options
  6.2 Implicit cooperation: the Predictive Loss Pattern (PLoP) algorithm
    6.2.1 Drop profiles
    6.2.2 Description of the algorithm
    6.2.3 Properties
    6.2.4 Results
  6.3 Explicit cooperation: the Differential RED (DiffRED) algorithm
    6.3.1 Description of the algorithm
    6.3.2 Results
  6.4 Comparison between PLoP and DiffRED
    6.4.1 Results
  6.5 Conclusions
7 Combined End-to-End and Hop-by-Hop Loss Recovery and Control
  7.1 Implicit cooperation: Hop-by-Hop support for AP/C
  7.2 Explicit cooperation: Speech Property-Based Packet Marking
    7.2.1 A simple end-to-end model for DiffRED
    7.2.2 Simulation description
    7.2.3 Results
  7.3 Conclusions
8 Conclusions
A Acronyms
Bibliography

Language: en
Keywords: voice over IP; Internet telephony; packet loss; packet loss handling; speech quality measurement; queue management; loss recovery; quality measurement; differentiated service
Classification: 004
Type: doctoral thesis