Our researchers’ pioneering techniques for measuring the speech and video quality of media transmitted over the internet have improved streaming services globally.
A new time-alignment method for measuring quality, developed in collaboration with Psytechnics, was incorporated into Microsoft’s Skype for Business product, and by 2016 was being used to measure the voice quality for about 100 million users globally, thereby improving the quality of teleconferencing.
The same time-alignment technique was critical to a new video quality measurement standard, which has been used worldwide since 2008 to improve video delivery services and is still in use today.
The challenge
The advent of the internet as a mainstream medium has led to a revolution in digital media. Mass media transmission began moving from traditional broadcast TV and telephone systems to the internet around the year 2000, and internet delivery is now commonplace.
In internet streaming, data is split into individual packets that may be subject to delays. This was a problem for existing measurement systems that relied upon the received signal being exactly aligned with the original.
Previous techniques to measure speech quality required either adding disturbing signals during the communication or measuring the communication system before or after the actual conversation. This meant it was not possible to measure the quality of a conversation while the conversation was in progress – something essential for internet streaming.
Completely new methods were therefore needed to measure the quality of speech and video transmitted over the internet.
What we did
Our researchers investigated the problem of time alignment when measuring the quality of media transmitted over the internet.
They addressed the issue by pioneering practical statistical methods to align the media, allowing speech and video quality to be measured over the packet-based transmission medium of the internet. The technique builds a histogram of audio or video events and uses it to compare the received signal against earlier and later frames of the original to find the optimal alignment point.
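The histogram idea can be illustrated with a minimal sketch. This is not the researchers' published method, only one plausible reading of it under stated assumptions: each signal has already been reduced to one feature value per frame (for example frame energy), each short window of the received signal votes for the lag at which it best matches the original, and the most common lag is taken as the alignment point. The function name `estimate_delay` and the parameters `window` and `max_lag` are hypothetical.

```python
from collections import Counter


def estimate_delay(reference, degraded, window=8, max_lag=5):
    """Estimate how many frames `degraded` lags behind `reference`.

    Both inputs are sequences of per-frame feature values (an assumption
    of this sketch). A positive result means the degraded signal is late.
    Returns the winning lag and the full histogram of votes.
    """
    votes = Counter()
    for start in range(len(degraded) - window + 1):
        target = degraded[start:start + window]
        best_lag, best_err = None, None
        # Compare this window against earlier and later reference frames.
        for lag in range(-max_lag, max_lag + 1):
            ref_start = start - lag
            if ref_start < 0 or ref_start + window > len(reference):
                continue  # this lag would read outside the reference
            candidate = reference[ref_start:ref_start + window]
            err = sum((a - b) ** 2 for a, b in zip(target, candidate))
            if best_err is None or err < best_err:
                best_lag, best_err = lag, err
        if best_lag is not None:
            votes[best_lag] += 1  # one vote per window
    if not votes:
        raise ValueError("signals are too short for the chosen window")
    # The histogram peak is the lag most windows agree on.
    return votes.most_common(1)[0][0], votes
```

Voting over many windows, rather than trusting a single comparison, means a few windows corrupted by lost or late packets do not change the overall estimate. A practical system would also need to handle delay that varies during the call and silent stretches that match equally well at every lag, which this sketch does not attempt.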
In collaboration with Psytechnics and BT, the team developed a non-intrusive technique for measuring speech quality by modelling the human vocal tract and comparing the measured speech with this model to discriminate between real speech and errors in the transmission.
This means that speech quality can be determined simply by monitoring the transmitted voice data while the conversation is in progress, without disturbing the conversation.