[Matroska-devel] Opus in Matroska Cont.

Steve Lhomme slhomme at matroska.org
Thu May 23 09:35:44 CEST 2013


Hi guys,

Glad we're back at this. I saw all the I/O talks on WebM/VP9 and had the
feeling this Opus decision was slowing things down. So we should try to
finalize a solution soon.

I'm not too keen on forcing all demuxers to handle a new element.
But since it's only for Opus, players that add support for Opus
might as well add support for this element too. Plus it's not too much
work to add a shift in the pipeline (at least in the frameworks I know). It
will just be a bit more work than dropping the codec library in there
and plugging it into the framework. But it seems to be the only way to make
it work properly for all use cases.

About the unit, there is currently nothing in Matroska that is accurate to
a sample. On the other hand, any other value (average time units) would not
make sense for this. If the value is passed to the codec, then it's codec
specific. If the value is only used by the playback framework, then sample
accuracy may not be needed: we don't have it for audio sync anyway (unless
timecodescale values are carefully picked), and a value in timecodescale
units would be enough. If we change the timecodescale for more accuracy in
the future, this value will benefit from it too.
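
To make the unit question concrete, here is a rough sketch (not taken from
any spec) of what happens when a sample count is stored in timecodescale
units and converted back. The function names and the 312-sample pre-skip at
48 kHz are assumptions chosen only for illustration; the 1,000,000 ns value
is the default TimecodeScale.

```python
NS_PER_SECOND = 1_000_000_000


def samples_to_ticks(samples, sample_rate, timecode_scale_ns):
    """Convert a sample count to timecodescale ticks, rounding to nearest."""
    num = samples * NS_PER_SECOND
    den = sample_rate * timecode_scale_ns
    return (2 * num + den) // (2 * den)


def ticks_to_samples(ticks, sample_rate, timecode_scale_ns):
    """Convert timecodescale ticks back to samples, rounding to nearest."""
    num = ticks * timecode_scale_ns * sample_rate
    return (2 * num + NS_PER_SECOND) // (2 * NS_PER_SECOND)


def round_trips(samples, sample_rate, timecode_scale_ns):
    """True if the sample count survives a trip through ticks unchanged."""
    ticks = samples_to_ticks(samples, sample_rate, timecode_scale_ns)
    return ticks_to_samples(ticks, sample_rate, timecode_scale_ns) == samples


# Hypothetical pre-skip of 312 samples at 48 kHz (6.5 ms).
print(round_trips(312, 48_000, 1_000_000))  # default 1 ms ticks
print(round_trips(312, 48_000, 1))          # 1 ns ticks
```

With 1 ms ticks the 6.5 ms value lands on a whole millisecond and comes back
as a different sample count; with ticks finer than one sample period the
count can be recovered by rounding.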

About PostPadding, since it's only for the last Block, why not just add it
to the BlockGroup of that last Block? That information is useless everywhere
else.
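
For reference, a minimal sketch of how a player might apply the two values
once it has them, assuming both are expressed in samples. The names
trim_decoded and source_duration_samples are hypothetical and not part of
any existing API; where the values are stored (TrackEntry or the last
BlockGroup) does not change the arithmetic.

```python
def trim_decoded(samples, pre_skip=0, post_padding=0):
    """Drop pre_skip samples from the start and post_padding from the end."""
    if pre_skip < 0 or post_padding < 0:
        raise ValueError("pre_skip and post_padding must be non-negative")
    end = len(samples) - post_padding
    # Clamp so an over-long skip yields an empty result, not a wrapped slice.
    return samples[pre_skip:max(end, pre_skip)]


def source_duration_samples(total_decoded, pre_skip=0, post_padding=0):
    """Number of samples that belong to the original source audio."""
    return max(total_decoded - pre_skip - post_padding, 0)


decoded = list(range(10))  # stand-in for decoded PCM samples
print(trim_decoded(decoded, pre_skip=3, post_padding=2))
print(source_duration_samples(len(decoded), pre_skip=3, post_padding=2))
```

Both values default to 0, matching the proposed defaults, so a track without
either element is played back untouched.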


On Thu, May 23, 2013 at 12:23 AM, Frank Galligan <frankgalligan at gmail.com> wrote:

> Hello all,
>
> I have changed my position and I'm in favor of 2.1 from the wiki [1],
> which I think is in line with what Ralph and Mosu were advocating. One of
> the biggest issues I had with 2.1 was that I was worried about the unknown
> ramifications of timeshifting all the samples. Well, as it turns out,
> I didn't really need to worry, as they are already timeshifted. Vorbis is
> shifted 128 samples and AAC is shifted by 1024 (with FFmpeg at least). So
> encoders/muxers are already doing this, but not explicitly
> representing it in the Matroska file. I think Ralph mentioned that
> earlier.
>
> So I'm advocating 2.1, i.e. adding a PreSkip element to the TrackEntry
> element. PreSkip would be a non-mandatory unsigned integer with a default
> value of 0. I agree with Mosu that PreSkip units should be samples wrt
> audio. If we choose another resolution, I just want to make sure we can
> convert exactly to samples.
>
> I would also like to propose adding a new element, PostPadding, to the
> TrackEntry element. PostPadding is the number of samples that are added by
> the encoder to the end of the stream. PostPadding would be a non-mandatory
> unsigned integer with a default value of 0. PostPadding units would match
> PreSkip units.
>
> With these 2 new elements, encoded Matroska files should be able
> to accurately represent the duration of the source samples.
>
> Frank
>
> [1] https://wiki.xiph.org/MatroskaOpus
>
>
> On Wed, May 22, 2013 at 3:15 PM, Frank Galligan <frankgalligan at gmail.com> wrote:
>
>> I just realized I sent a reply only to Ralph on 4/12. I'm copying the
>> reply below, but I have since changed my position. I will follow up in
>> another email.
>>
>> I updated the wiki (https://wiki.xiph.org/MatroskaOpus) with options
>> that I have seen for handling pre-skip.
>>
>>
>> On Fri, Apr 12, 2013 at 12:15 PM, Ralph Giles <giles at thaumas.net> wrote:
>>
>>> On 13-04-12 10:35 AM, Frank Galligan wrote:
>>>
>>> >     First, the number of samples to be skipped is not an integer
>>> >     multiple of compressed packets, so this isn't actually possible
>>> >     without clipping valid audio from the start of the stream.
>>> >
>>> > Ahh, this was one of my earlier questions.
>>> >
>>> > [...]
>>> > I don't think we should worry about a decoder that ignores
>>> > CodecPrivate. Current decoders must handle CodecPrivate, so I think
>>> > we can treat decoders that ignore CodecPrivate as broken.
>>>
>>> Well sure, but they're less broken with Opus than with e.g. Vorbis,
>>> which can't work at all. For mono and stereo Opus files, the only thing
>>> that needs to be signaled outside the data stream is exactly the preskip
>>> value. So ignoring CodecPrivate for Opus is no worse than not
>>> implementing any hypothetical preskip element.
>>>
>> The output gain too.
>>
>>
>>>
>>> >     Having to feed and then discard output from special data in
>>> >     CodecPrivate is moving away from a general container-level
>>> >     solution to this requirement,
>>> >
>>> > I agree.
>>> >
>>> >     which is generally useful for other codecs as well, to
>>> >     implement trimming.
>>> >
>>> > Yes, I was leaning to include the data in CodecPrivate so we didn't
>>> > need to change all players to handle this feature, as we could
>>> > potentially hide it in the specific decoder.
>>>
>>> Where is the specific decoder going to live though? Are you planning to
>>> distribute a wrapper which accepts the CodecPrivate data? Do you
>>> think we should add that to the libopus API?
>>>
>> If we added it to the libopus API that would be the easiest. Otherwise
>> it would definitely have to be some type of wrapper on top of libopus. We
>> had to do something like this for Vorbis.
>>
>>
>>>
>>> > If we truly think this will be useful
>>> > to other codecs (currently or in the future) then we can try and
>>> > generalize this feature.
>>>
>>> Maybe it's helpful to think of video here. If I press 'record' in the
>>> middle of a WebRTC session, how do we represent the start point which
>>> won't generally fall on a keyframe?
>>>
>> This has been handled for years already. Those frames are not marked as
>> keyframes, so players have thrown them out until the first
>> keyframe (or rendered garbage).
>>
>> I wouldn't recommend changing that to something that adds a pre-skip
>> value to the Track header, because the recorder would have to delay
>> writing data until it sees the first keyframe. If we instead
>> decided on setting the invisible flag or TimeToDiscard, then
>> the recorder could start writing to disk right away.
>>
>>>
>>> > Maybe we can add an element to the Block element, TimeToDiscard in
>>> > nanoseconds. A value of -1 would not render the whole Block, which
>>> > would have the same effect as setting the invisible bit. Otherwise the
>>> > player would need to discard TimeToDiscard time. This should satisfy
>>> > "preskip data does not have to be an integer multiple of compressed
>>> > packets", while also ensuring that the timestamp of the Block matches
>>> > the timestamp of the playback position.
>>>
>>> Or under the TrackEntry element? Since it only happens at the start of
>>> the track in the use cases I can think of.
>>>
>> I was trying to generalize it further so future codecs could take
>> advantage of "decoded data, that may have a duration attached to it,
>> but should not be rendered", within any part of the stream. If we did this
>> we could change how we handle VP8 altref frames (highly doubt most players
>> would though).
>>
>>>
>>> I think you're still missing part of why the Ogg mapping shifts the
>>> timestamp though. Part of what pre-skip is for is to account for
>>> algorithmic delay. The encoder has some. If the original input isn't 48
>>> kHz, then it went through a resampler, which can also have some. So
>>> shifting the timecode is _necessary_ for sync. Without it, a peak in the
>>> output won't align with a peak in the input.
>>>
>> I understand it, but I don't think the timeshift needs to be muxed
>> into Matroska files (actually I think this will have major consequences
>> later, as we are fundamentally changing how time is handled within
>> Matroska). Adding duration to the pre-skip data was a design choice.
>> The algorithmic delay (and any other data) could have been easily handled
>> within the codec if the bitstream was defined differently.
>>
>> I'm not advocating that players/decoders skip decoding the pre-skip data. I
>> understand that the output may not align with the input if the decoder is
>> not primed with pre-skip data, although it will align after SeekPreRoll time
>> has passed. I'm just trying to come up with a solution that does not offset
>> all timestamps within the file, as no other codec (that I know of) has done
>> this. At a minimum, this will force all muxers/demuxers to
>> handle their timing differently. But I think it will actually cause
>> problems later that we are not currently thinking of.
>>
>> I think we can generalize the pre-skip data by adding the TimeToDiscard (or
>> SamplesToDiscard, DataToDiscard) to the Block or to the TrackEntry (but
>> I think it will be cleaner if it is added to the Block) and still keep the
>> timestamps == playback position. Am I mistaken?
>>
>>
>>
>



-- 
Steve Lhomme
Matroska association Chairman