[Matroska-devel] Opus in Matroska Cont.

Frank Galligan frankgalligan at gmail.com
Thu May 23 00:23:33 CEST 2013


Hello all,

I have changed my position and am now in favor of option 2.1 from the wiki
[1], which I think is in line with what Ralph and Mosu were advocating. My
biggest concern with 2.1 was the unknown ramifications of timeshifting all
the samples. As it turns out, I didn't need to worry, because they are
already timeshifted: Vorbis is shifted by 128 samples and AAC by 1024 (with
FFmpeg at least). So encoders/muxers are already doing this, but not
explicitly representing it in the Matroska file. I think Ralph mentioned
that earlier.

So I'm advocating 2.1, i.e. adding a PreSkip element to the TrackEntry
element. PreSkip would be a non-mandatory unsigned integer with a default
value of 0. I agree with Mosu that PreSkip units should be samples for
audio. If we choose another resolution, I just want to make sure we can
convert exactly to samples.
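
A minimal sketch of why exact conversion to samples matters, assuming the
alternative resolution is nanoseconds and a 44.1 kHz track. The function
names are hypothetical, and the 1024-sample value is simply the AAC shift
mentioned above. With truncating integer math in both directions, a sample
is lost on the round trip; rounding to nearest on the way back recovers it.

```python
NS_PER_SECOND = 1_000_000_000

def samples_to_ns_floor(samples, sample_rate):
    # Store a sample count as whole nanoseconds, truncating the fraction.
    return samples * NS_PER_SECOND // sample_rate

def ns_to_samples_floor(ns, sample_rate):
    # Convert back with truncation; this can come up one sample short.
    return ns * sample_rate // NS_PER_SECOND

def ns_to_samples_round(ns, sample_rate):
    # Convert back rounding to nearest, using integer math only.
    return (ns * sample_rate + NS_PER_SECOND // 2) // NS_PER_SECOND

stored_ns = samples_to_ns_floor(1024, 44100)     # 1024 samples is not a whole number of ns
print(stored_ns)
print(ns_to_samples_floor(stored_ns, 44100))     # truncation drops a sample
print(ns_to_samples_round(stored_ns, 44100))     # rounding recovers 1024
```

Storing PreSkip directly in samples avoids having to specify a rounding
rule at all.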

I would also like to propose adding a new element, PostPadding, to the
TrackEntry element. PostPadding is the number of samples that are added by
the encoder to the end of the stream. PostPadding would be a non-mandatory
unsigned integer with a default value of 0. PostPadding units would match
PreSkip units.

With these two new elements, encoded Matroska files should be able
to accurately represent the duration of the source samples.
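
As a rough illustration of how a player might use the two proposed
elements, assuming both are expressed in samples: the source length is the
decoded length minus PreSkip and PostPadding. The helper names and the
numbers in the example (312 samples of pre-skip, 960-sample frames, a
1000-sample source) are assumptions chosen for illustration, not values
from the proposal.

```python
def source_sample_count(decoded_samples, pre_skip=0, post_padding=0):
    # Number of samples that belong to the original source signal.
    count = decoded_samples - pre_skip - post_padding
    if count < 0:
        raise ValueError("PreSkip + PostPadding exceeds decoded length")
    return count

def trim(decoded, pre_skip=0, post_padding=0):
    # Drop the first pre_skip samples and the last post_padding samples.
    end = len(decoded) - post_padding
    return decoded[pre_skip:end]

# Hypothetical case: a 1000-sample source plus 312 samples of pre-skip
# needs 1312 samples, which takes two 960-sample frames (1920 decoded
# samples), leaving 1920 - 1312 = 608 samples of padding at the end.
decoded = list(range(1920))
print(source_sample_count(len(decoded), pre_skip=312, post_padding=608))
print(len(trim(decoded, pre_skip=312, post_padding=608)))
```

With the default value of 0 for both elements, the decoded output is
returned unchanged, which matches how existing files are played today.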

Frank

[1] https://wiki.xiph.org/MatroskaOpus


On Wed, May 22, 2013 at 3:15 PM, Frank Galligan <frankgalligan at gmail.com> wrote:

> I just realized I sent a reply only to Ralph on 4/12. I'm copying the
> reply below, but I have since changed my position. I will follow up in
> another email.
>
> I updated the wiki (https://wiki.xiph.org/MatroskaOpus) with options that
> I have seen for handling pre-skip.
>
>
> On Fri, Apr 12, 2013 at 12:15 PM, Ralph Giles <giles at thaumas.net> wrote:
>
> On 13-04-12 10:35 AM, Frank Galligan wrote:
>>
>> >     First, the number of samples to be skipped is not an integer
>> >     multiple of compressed packets, so this isn't actually possible
>> >     without clipping valid audio from the start of the stream.
>> >
>> > Ahh, this was one of my earlier questions.
>> >
>> > [...]
>> > I don't think we should worry about a decoder that ignores CodecPrivate.
>> > Current decoders must handle CodecPrivate, so I think we can treat
>> > decoders that ignore CodecPrivate as broken.
>>
>> Well sure, but they're less broken with Opus than with e.g. Vorbis,
>> which can't work at all. For mono and stereo Opus files, the only thing
>> that needs to be signaled outside the data stream is exactly the preskip
>> value. So ignoring CodecPrivate for Opus is no worse than not
>> implementing any hypothetical preskip element.
>>
> The output gain too.
>
>
>>
>> >     Having to feed and then discard output from special data in
>> >     CodecPrivate is moving away from a general container-level
>> >     solution to this requirement,
>> >
>> > I agree.
>> >
>> >     which is generally useful for other codecs as well, to
>> >     implement trimming.
>> >
>> > Yes, I was leaning to include the data in CodecPrivate so we didn't need
>> > to change all players to handle this feature, as we could potentially
>> > hide it in the specific decoder.
>>
>> Where is the specific decoder going to live though? Are you planning to
>> distribute a wrapper which accepts the CodecPrivate data? Do you
>> think we should add that to the libopus API?
>>
> If we added it to the libopus API that would be the easiest. Otherwise
> it would definitely have to be some type of wrapper on top of libopus. We
> had to do something like this for Vorbis.
>
>
>>
>> > If we truly think this will be useful
>> > to other codecs (currently or in the future) then we can try and
>> > generalize this feature.
>>
>> Maybe it's helpful to think of video here. If I press 'record' in the
>> middle of a WebRTC session, how do we represent the start point which
>> won't generally fall on a keyframe?
>>
> This has been handled for years already. Those frames are not marked as
> keyframes, so players have thrown out those frames until the first
> keyframe (or rendered garbage).
>
> I wouldn't recommend changing that to something that adds a pre-skip value
> to the Track header, because the recorder would then have to delay writing
> data until it sees the first keyframe. If we instead decided on setting
> the invisible flag or TimeToDiscard, the recorder could start writing to
> disk right away.
>
>>
>> > Maybe we can add an element to the Block element, TimeToDiscard in
>> > nanoseconds. A value of -1 would not render the whole Block, which would
>> > have the same effect as setting the invisible bit. Otherwise the
>> > player would need to discard TimeToDiscard time. This should satisfy
>> > "preskip data does not have to be an integer multiple of compressed
>> > packets", while also preserving the property that the timestamp of the
>> > Block matches the timestamp of the playback position.
>>
>> Or under the TrackEntry element? Since it only happens at the start of
>> the track in the use cases I can think of.
>>
> I was trying to generalize it further so future codecs could take
> advantage of "decoded data, that may have a duration attached to it,
> but should not be rendered", within any part of the stream. If we did this
> we could change how we handle VP8 altref frames (highly doubt most players
> would though).
>
>>
>> I think you're still missing part of why the Ogg mapping shifts the
>> timestamp though. Part of what pre-skip is for is to account for
>> algorithmic delay. The encoder has some. If the original input isn't 48
>> kHz, then it went through a resampler, which can also have some. So
>> shifting the timecode is _necessary_ for sync. Without it, a peak in the
>> output won't align with a peak in the input.
>>
> I understand it, but I don't think the timeshift needs to be muxed
> into Matroska files (actually I think this will have major consequences
> later as we are fundamentally changing how time is handled within
> Matroska). Adding duration to the pre-skip data was a design choice.
> The algorithmic delay (and any other data) could have been easily handled
> within the codec if the bitstream was defined differently.
>
> I'm not advocating that players/decoders skip decoding the pre-skip data.
> I understand that the output may not align with the input if the decoder
> is not primed with pre-skip data, although it will align after SeekPreRoll
> time has passed. I'm just trying to come up with a solution that does not
> offset all timestamps within the file, as no other codec (that I know of)
> has done this. At a minimum this will force all muxers/demuxers to
> handle their timing differently. But I think this will actually cause
> problems later that we are not currently thinking of.
>
> I think we can generalize the pre-skip data by adding the TimeToDiscard (or
> SamplesToDiscard, DataToDiscard) to the Block or to the TrackEntry (but
> I think it will be cleaner if it is added to the Block) and still keep the
> timestamps == playback position. Am I mistaken?
>