[Matroska-devel] Opus audio codec

Ralph Giles giles at thaumas.net
Wed Dec 14 21:27:49 CET 2011


I'm interested in adding support for the IETF Opus audio codec in Matroska.

Significant features, from the container point of view:

- Opus codes everything at 48 kHz
- The decoder can generate either mono or stereo from any stream
- Compressed frames are variable length, and the length of each frame
must be signaled to the decoder
- There's a 'multistream' extension for doing surround.

Since there's little extra signalling necessary, a simple approach would be:

  CodecID is A_OPUS
  SamplingFrequency is always 48000
  Channels is 1 or 2, based on what the muxer thinks is most
appropriate. When in doubt, use 2?
  CodecPrivate is empty

That works great if you don't want more than two channels. The
'multistream' mode packs multiple mono/stereo streams into each frame,
but requires that the container signal the number of streams and how
they're coupled. I can see two ways to make that work:

1) Define a new channel mapping element. The spec mentions a
ChannelPositions element, but doesn't define the format for it.
However, we need more than just speaker positions. To decode
multistream Opus we need to know, for each stream packed into the
frames, whether to decode it as a coupled stereo pair (e.g.
REAR_LEFT+REAR_RIGHT) or as an isolated mono channel (e.g. LFE) and
how those map to the actual output channels.

The multistream packing is designed so a single mono or coupled-stereo
multistream frame is the same as a non-multistream frame, so for
non-surround uses, this element can just be omitted.

Pros: straightforward?
Cons: probably not useful for other codecs
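
To make the requirement in option 1 concrete, here is a minimal sketch of
the information such an element would have to carry. Everything in it is
hypothetical: the StreamLayout type, the route() helper and the channel
labels other than REAR_LEFT, REAR_RIGHT and LFE are made up for
illustration and are not part of any Matroska or Opus spec.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StreamLayout:
    """One elementary stream packed into a multistream frame (hypothetical)."""
    coupled: bool        # True: decode as a stereo pair, False: decode as mono
    outputs: List[str]   # output channel label for each decoded channel


def decoded_channel_count(streams: List[StreamLayout]) -> int:
    """Total number of channels produced by decoding every stream."""
    return sum(2 if s.coupled else 1 for s in streams)


def route(streams: List[StreamLayout]) -> List[Tuple[int, int, str]]:
    """Return (stream index, decoded channel index, output label) triples."""
    routing = []
    for i, s in enumerate(streams):
        expected = 2 if s.coupled else 1
        if len(s.outputs) != expected:
            raise ValueError(f"stream {i}: expected {expected} output labels")
        for j, label in enumerate(s.outputs):
            routing.append((i, j, label))
    return routing


# Illustrative 5.1 layout: two coupled pairs plus two isolated mono streams.
SURROUND_5_1 = [
    StreamLayout(True, ["FRONT_LEFT", "FRONT_RIGHT"]),
    StreamLayout(True, ["REAR_LEFT", "REAR_RIGHT"]),
    StreamLayout(False, ["FRONT_CENTER"]),
    StreamLayout(False, ["LFE"]),
]
```

With a plain mono or stereo track the list would hold a single entry,
which is why the element could be omitted in that case.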

2) Copy the headers used in the Ogg encapsulation into CodecPrivate,
similar to what is done for A_VORBIS and V_THEORA. The Ogg header
already defines a binary format for the channel mapping, along with a
bunch of other things.

Pros: lossless transmux between Ogg and Matroska, possible code sharing
Cons: more complex, codec-specific code
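
As a rough idea of the codec-specific code option 2 implies, here is a
sketch of a parser for the Ogg identification header. It assumes the
field layout of the "OpusHead" packet as later published in RFC 7845;
the draft current at the time of writing may differ in detail. The
OpusHead tuple and parse_opus_head() are names invented for this example.

```python
import struct
from typing import NamedTuple, Tuple


class OpusHead(NamedTuple):
    version: int
    channels: int
    pre_skip: int                 # samples at 48 kHz to discard at start
    input_sample_rate: int        # informational only
    output_gain_q8: int           # signed, dB in Q7.8 fixed point
    mapping_family: int
    stream_count: int
    coupled_count: int
    channel_mapping: Tuple[int, ...]


def parse_opus_head(data: bytes) -> OpusHead:
    """Parse an identification header (layout assumed from RFC 7845)."""
    if len(data) < 19 or data[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    # Fixed part after the magic: little-endian, no padding, 11 bytes.
    version, channels, pre_skip, rate, gain, family = struct.unpack_from(
        "<BBHIhB", data, 8)
    if family == 0:
        # No explicit table: a single stream, coupled only if stereo.
        if channels not in (1, 2):
            raise ValueError("mapping family 0 allows 1 or 2 channels")
        return OpusHead(version, channels, pre_skip, rate, gain, family,
                        1, channels - 1, tuple(range(channels)))
    if len(data) < 21 + channels:
        raise ValueError("truncated channel mapping table")
    streams, coupled = data[19], data[20]
    mapping = tuple(data[21:21 + channels])
    return OpusHead(version, channels, pre_skip, rate, gain, family,
                    streams, coupled, mapping)
```

A Matroska demuxer taking this route would run the same parser over the
CodecPrivate bytes instead of the first Ogg packet.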

For example, the Ogg header has a required gain field that the demuxer
is supposed to pass down the decode pipeline to where it can be applied,
similar to ReplayGain tags. I worry that supporting those fields
correctly will be painful.
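
For a sense of what "applying" the gain means at the far end of the
pipeline, here is a small sketch. It assumes the gain is stored as a
signed 16-bit value in Q7.8 decibels, as in RFC 7845, and that samples
are floats in [-1.0, 1.0]; both function names are made up for this
example.

```python
from typing import List


def gain_to_scale(gain_q8: int) -> float:
    """Convert a Q7.8 dB gain into a linear amplitude factor."""
    # gain_q8 / 256 is the gain in dB; amplitude scales as 10^(dB / 20).
    return 10.0 ** (gain_q8 / (20.0 * 256.0))


def apply_gain(samples: List[float], gain_q8: int) -> List[float]:
    """Scale decoded samples and clamp them to the [-1.0, 1.0] range."""
    scale = gain_to_scale(gain_q8)
    return [max(-1.0, min(1.0, s * scale)) for s in samples]
```

The arithmetic is trivial; the painful part is carrying the value from
the demuxer to whichever stage can do the scaling.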

Draft specs, for reference:


Comments? Alternate proposals?

