[Matroska-devel] Opus in Matroska
ashwood at msn.com
Fri Sep 21 01:21:09 CEST 2012
[removed all the content I'm replying to, there was simply too much, and I
address things very much out of order]
I'm going to begin with a viewpoint, and then go from there. In the longer
term it is important to have gapless playback, but in the near term I don't
think it's necessary. I'm also going to just call it all preprocessing
because it is all about the delays. Let's look at some of the main usage
Genuine video almost always has a few frames at the beginning that are
throwaway anyway. Having an audio codec that has no sound for 3 frames
(~90 ms) won't pose a viewing problem.
Almost all songs, mixes, etc. have a built-in cue delay. This goes back to
the tape recording days, but has been carried over. This is almost always >
During a video seek a black screen should be displayed to signal the cut
anyway (although very often a frozen one is used in computer decoding), so
showing this black screen for 3 frames (~90 ms) is still reasonable. An
audio delay of 100 ms is perceptible but ignored by the human mind. It is
the same delay we all experience on a daily basis when communicating over
about 110 feet (33 meters), and we never notice it.
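
As a rough check of the figures above, here is a small sketch. The speed of
sound (343 m/s, dry air at about 20 C) and the 30 fps frame rate are my
assumptions, not values from the discussion; the function names are just
for illustration.

```python
# Assumed constant: speed of sound in dry air at roughly 20 C.
SPEED_OF_SOUND_M_S = 343.0


def acoustic_delay_ms(distance_m: float) -> float:
    """Time for sound to travel distance_m, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0


def frames_to_ms(frames: int, fps: float) -> float:
    """Duration of a number of video frames at a given frame rate."""
    return frames / fps * 1000.0


if __name__ == "__main__":
    # 33 meters of air comes out a little under 100 ms.
    print(round(acoustic_delay_ms(33.0), 1))
    # 3 frames at an assumed 30 fps.
    print(round(frames_to_ms(3, 30.0), 1))
```

Under these assumptions both numbers land in the 90 to 100 ms range, which
is the order of magnitude being argued about here.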
In a music seek a delay is necessary to avoid popping the speaker anyway.
While this can be 1 sample (1/44100 second), typical lengths are much longer
for human comfort. Saying that these are now 80 ms is not a real problem. A
roll-up volume change should be applied anyway for human comfort, and the
roll-up provides plenty of time for the codec to preprocess.
So I contend that from the actual usage standpoint, the addition of 80 ms
where the codec seems to be just sitting around does not form a real
While this is an issue that should be addressed at the next significant
update (v4) it can be addressed along with the other minor issues that have
been found. This also gives time to consider whether this will be an anomaly
for Opus alone, or if other codecs will be developed using the techniques as
So to be specific:
Opus stores sound to be played at time T as being at time T, there is no
When playback begins: Opus codec requests audio sample T+0. Opus codec
processes but provides no audio, only a series of samples all 0.
After preprocessing: Opus codec requests sample T+n for playback at time
After seek: Opus returns to playback begin state with new T.
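
To make the sequence above concrete, here is a toy model in Python. It is
only a sketch of the behaviour I am describing, not the real Opus API: the
class name, the list standing in for the stored audio, and the value of n
are all hypothetical, and I am assuming that sample T+n is played at time
T+n.

```python
class PrerollDecoder:
    """Toy model of the described behaviour; not the real Opus API."""

    def __init__(self, source, preroll_samples):
        self.source = source            # stored audio, index == play time
        self.preroll = preroll_samples  # the 'n' above (assumed value)
        self.seek(0)

    def seek(self, t):
        """Playback begin or seek: return to the begin state with new T."""
        self.start = t
        self.consumed = 0

    def read(self, count):
        """Return up to count output samples from the current position."""
        out = []
        for _ in range(count):
            pos = self.start + self.consumed
            if pos >= len(self.source):
                break
            # While preprocessing, the input is consumed but only zeros
            # come out; afterwards the sample stored at pos is played.
            out.append(0 if self.consumed < self.preroll else self.source[pos])
            self.consumed += 1
        return out
```

With a stored stream of 1..10 and n = 3, reading five samples from the
start gives three zeros followed by the samples stored at times 3 and 4,
and a seek simply repeats the silent stretch from the new T.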
The encoder (the person, not necessarily the program) needs to be aware of
the anomaly to correct for it at the beginning of the video, probably
through a sound roll-up (no sound for the first 100 ms, then bringing the
sound from 0 to full volume over the next 100 ms). The viewer does not even
need to be aware.
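
The roll-up could be sketched as a gain envelope like the one below. The
linear ramp shape and the function names are my assumptions; only the two
100 ms figures come from the description above.

```python
def rollup_gain(t_ms, silent_ms=100.0, ramp_ms=100.0):
    """Gain at time t_ms: silence, then a linear ramp (assumed) to 1.0."""
    if t_ms < silent_ms:
        return 0.0
    if t_ms >= silent_ms + ramp_ms:
        return 1.0
    return (t_ms - silent_ms) / ramp_ms


def apply_rollup(samples, sample_rate, silent_ms=100.0, ramp_ms=100.0):
    """Scale each sample by the roll-up gain for its playback time."""
    return [
        s * rollup_gain(i * 1000.0 / sample_rate, silent_ms, ramp_ms)
        for i, s in enumerate(samples)
    ]
```

For example, rollup_gain(150) is 0.5, halfway up the ramp. The silent part
covers the stretch where the codec is still preprocessing, so nothing
audible is lost.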
The problems that become known during usage can then be fixed in Matroska v4.