[Matroska-devel] Opus in Matroska

Ralph Giles giles at thaumas.net
Fri Sep 14 18:57:16 CEST 2012

On 12-09-14 8:51 AM, Moritz Bunkus wrote:

> Out of curiosity, which two elements are you talking about? And where
> would you put them? Let's start talking specifics here so that we can
> reach a proposal for those new elements.


As I understand the state of the discussion, there are a couple of issues.

The biggest one is pre-roll after seek. The output of the Opus decoder
takes a little while to converge. This is no issue when playback starts
at the beginning of the stream because the encoder and decoder start
from the same state. After a seek, however, that state doesn't match,
and immediate output of the decoder will be corrupt until all the
predictors have converged. It might sound ok most of the time, but it's
possible to get very loud pops and squeals: no fun at all. The Ogg
encapsulation recommends discarding the first 80 ms (3840 samples at
48 kHz) out of the decoder after seek.
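
To make the arithmetic concrete, here is a minimal sketch of what a player might compute on seek. It assumes the decoder's 48 kHz output rate and positions counted in samples; the function name is hypothetical and not taken from any existing library.

```python
OPUS_RATE_HZ = 48000  # assumed decoder output rate
PRE_ROLL_MS = 80      # value recommended by the Ogg encapsulation


def pre_roll_plan(target_sample, pre_roll_ms=PRE_ROLL_MS, rate_hz=OPUS_RATE_HZ):
    """Return (decode_start, discard) for a seek to target_sample.

    decode_start is the sample position to begin decoding from and
    discard is how many decoded samples to drop before output starts.
    A real player would round decode_start down to a packet boundary.
    """
    pre_roll = pre_roll_ms * rate_hz // 1000
    # Near the start of the stream there is less than a full pre-roll
    # available, so clamp at zero and discard only what was decoded.
    decode_start = max(0, target_sample - pre_roll)
    return decode_start, target_sample - decode_start
```

For a seek to the one-second mark, this starts decoding 3840 samples early and throws exactly those samples away.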

Rolling-intra video streams have exactly the same issue. There's never
really a keyframe, but to seek and start playback at frame n, you have
to start decoding at frame n-30 (or whatever) to get correct output.

This is something the container needs to signal to players and splitters
so they can do the right thing. I'd suggest a new, optional element
under TrackEntry which defines the required pre-roll. Call it
Track::TrackEntry::PreRoll, SeekPreRoll, SeekPreSkip, something like
that, maybe?

The next issue is the 'preskip' field in the CodecPrivate data. This
is different from the pre-roll skip after seek. It's a count of samples
to discard from the start of the stream to correct for algorithmic delay
in the encoder so there's no phase shift between input and output, and
it should be applied *before* calculating timestamps.

The trimming can be handled by the decoder wrapper, but the container
needs to report the timestamps correctly, which means there needs to be
a way to deal with initial blocks with apparently negative timestamps.
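
A small sketch of why the first blocks end up with negative timestamps once pre-skip is subtracted. The function name, the 20 ms (960-sample) frame size and the pre-skip value of 312 are illustrative assumptions, not values mandated by any spec.

```python
from fractions import Fraction


def block_timestamps(frame_sizes, preskip, rate_hz=48000):
    """Return the start time in seconds of each block.

    frame_sizes lists the number of decoded samples per block.
    The pre-skip is subtracted before converting to time, so blocks
    that lie entirely or partly inside it start before zero.
    """
    timestamps = []
    position = 0  # samples produced by the decoder so far
    for size in frame_sizes:
        timestamps.append(Fraction(position - preskip, rate_hz))
        position += size
    return timestamps


# Three 20 ms blocks with an assumed pre-skip of 312 samples:
# the first block starts at -312/48000 s, i.e. 6.5 ms before zero.
ts = block_timestamps([960, 960, 960], preskip=312)
```

Only the first block is negative here, but a larger pre-skip would push several initial blocks below zero, which is the case the container has to be able to express.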

One option is to use the Timecode field in the SimpleBlock structure. Is
that generally properly supported by players? Unfortunately it's a
signed int16, while the CodecPrivate field is an _unsigned_ int16, so we
can't represent the full range that way.

Otherwise, maybe a TrackEntry::TimestampOffset makes sense?

Those are the two issues I was thinking of. Trimming the end of the
stream is also a problem. At least, I don't know how to do that in
Matroska. Like most block-transform codecs, Opus will produce output
slightly longer than the original input, so it must be trimmed against
the exact duration signalled by the container. This is important for
gapless playback.

