[Matroska-devel] Several (minor) issues or underspecified areas in the MKV spec

Steve Lhomme slhomme at matroska.org
Sun Oct 25 09:10:37 CET 2015


On 24/10/2015 15:40, Dave Rice wrote:
>
>> On Oct 24, 2015, at 9:28 AM, Steve Lhomme <slhomme at matroska.org
>> <mailto:slhomme at matroska.org>> wrote:
>>
>> 2015-10-24 1:44 GMT+02:00 Michael Bradshaw <mjbshaw at google.com
>> <mailto:mjbshaw at google.com>>:
>>> On Mon, Oct 12, 2015 at 10:10 AM, Michael Bradshaw
>>> <mjbshaw at google.com <mailto:mjbshaw at google.com>>
>>> wrote:
>>>>
>>>> On Sun, Oct 11, 2015 at 12:13 AM, Steve Lhomme <slhomme at matroska.org
>>>> <mailto:slhomme at matroska.org>>
>>>> wrote:
>>>>>
>>>>> 2015-10-05 18:07 GMT+02:00 Michael Bradshaw <mjbshaw at google.com
>>>>> <mailto:mjbshaw at google.com>>:
>>>>>> On Sun, Oct 4, 2015 at 6:43 AM, Steve Lhomme <slhomme at matroska.org
>>>>>> <mailto:slhomme at matroska.org>>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Sep 29, 2015 03:04, "Michael Bradshaw" <mjbshaw at google.com
>>>>>>> <mailto:mjbshaw at google.com>> wrote:
>>>>>>>
>>>>>>>> What’s the point of default values for non-mandatory elements in the
>>>>>>>> MKV
>>>>>>>> spec? Why not make them mandatory if they have a default value?
>>>>>>>
>>>>>>> You could set the element in the file if you want, but if it's the
>>>>>>> default
>>>>>>> value (likely a very common value) you don't have to write it, the
>>>>>>> semantic
>>>>>>> reader will get it.
>>>>>>
>>>>>>
>>>>>> But is there any real difference between a non-mandatory element
>>>>>> with a
>>>>>> default value, and a mandatory element with a default value? The two
>>>>>> seem
>>>>>> effectively equivalent. If there's no meaningful difference, why not
>>>>>> just
>>>>>> make any element with a default value a mandatory element?
>>>>>
>>>>> In the case of mandatory, you don't even have to write it in the file at
>>>>> all. Whenever you read the parent element, either it's there and you
>>>>> read the value (if there's one) or you assume you have the default
>>>>> value.
>>>>>
>>>>> If it's not mandatory and not written, then you do not have the
>>>>> element and its value. So if you want the element you have to write at
>>>>> least the ID and a size of 0. Each time you want it.
>>>>
>>>>
>>>> Interesting. The current EBML spec says signed/unsigned integers, dates,
>>>> and float elements have the value 0 if their octet length is zero[1]. To
>>>> make sure I'm understanding you correctly, you're saying that optional
>>>> elements can have a default value (specified by the document spec)
>>>> that is
>>>> used if their octet length is zero (even if their default value is
>>>> not 0),
>>>> correct? If that's the case, could this be clarified in the EBML/MKV
>>>> specs?
>>>> I don't see that behavior mentioned in the MKV spec, and the EBML
>>>> spec makes
>>>> it sound like these elements always get the value of 0 (with no
>>>> indication
>>>> of other default values being possible).
>>>>
>>>> [1]:
>>>> https://github.com/Matroska-Org/ebml-specification/blob/master/specification.markdown#ebml-element-types
>>>
>>>
>>> Any clarifications on this point?
>>
>> This is something that Dave added and that was not specified in the
>> original spec. I don't remember if it was debated or not to end up
>> like that.
>
> Yes there was some discussion on the PR, see this comment and the
> discussion around it.

I see it now. I'm glad I'm still coherent with what I said back then:
"In general the code was designed to not write the value if the value is 
the default value, regardless of the type. That means size 0 is always 
allowed for elements with a default value."
https://github.com/Matroska-Org/ebml-specification/pull/17#discussion_r34864627

Following the thread it's clear that the reference code (libebml) didn't 
follow this logic and had 0 as the default value for integer/float based 
types.
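
To make the two readings concrete, here is a minimal sketch of how a
parser could resolve an unsigned integer under each of them. The names
(ParsedUInt, ElementSpec, ReadOriginalIntent, ReadZeroMeansZero) are
hypothetical and are not libebml API; the handling of absent elements
follows the statements in this thread and is an assumption, not spec text.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of what a parser found in the parent element.
struct ParsedUInt {
    bool present;                  // was the element ID found at all?
    std::vector<uint8_t> payload;  // data octets; empty means size 0
};

// Hypothetical subset of the element definition from the document spec.
struct ElementSpec {
    bool mandatory;
    std::optional<uint64_t> default_value;
};

// Big-endian decode of up to 8 data octets.
uint64_t DecodeUInt(const std::vector<uint8_t>& payload) {
    assert(payload.size() <= 8);
    uint64_t v = 0;
    for (uint8_t b : payload) v = (v << 8) | b;
    return v;
}

// Reading A (original intent): size 0 means "the default value".
// An absent non-mandatory element yields no value at all.
std::optional<uint64_t> ReadOriginalIntent(const ParsedUInt& e,
                                           const ElementSpec& spec) {
    if (!e.present) {
        if (spec.mandatory) return spec.default_value;
        return std::nullopt;
    }
    if (e.payload.empty()) return spec.default_value.value_or(0);
    return DecodeUInt(e.payload);
}

// Reading B (zero-length is always 0): the default is only used when
// the element is absent, as in Dave's CueBlockNumber question.
std::optional<uint64_t> ReadZeroMeansZero(const ParsedUInt& e,
                                          const ElementSpec& spec) {
    if (!e.present) return spec.default_value;
    if (e.payload.empty()) return 0;
    return DecodeUInt(e.payload);
}
```

With TargetTypeValue (default 50, not mandatory) written with size 0,
reading A gives 50 and reading B gives 0, which is the mismatch
discussed here.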


> My understanding was that zero-length values default to zero, but elements
> that are non-present and mandatory use their default value (if no default
> value and the element is mandatory and non-present, I suppose it's an
> error).
>
>> That would break Matroska parsers as CueBlockNumber has a default
>> value of 1 and is not mandatory. (CueRefNumber too but is deprecated)
>> TargetTypeValue has a default value of 50.
>
> Is this right?
> CueBlockNumber element is present but with data size of zero indicates
> a CueBlockNumber of 0.
> CueBlockNumber element is not present indicates a CueBlockNumber of 1
> (default value).
>
>> An element like Language has a default value of "eng" and is not
>> mandatory.
>>
>> Dave, can we fix this ?
>
> Sure. Though I may need some clarification on how this thread can
> resolve with the discussion at
> https://github.com/Matroska-Org/ebml-specification/pull/17#discussion_r34961723.
> Dave Rice

Now I remember this thread. It's clear that the code doesn't follow the 
original intent with default values. For an element that has a default 
value but is not mandatory, the default value becomes useless.

I think it's still possible to fix that. We need to go through the 
elements that are problematic and see the real life impact of that.

- Tags.Tag.Targets.TargetTypeValue = 50
"A number to indicate the logical level of the target (see TargetType)."

I think that's the most problematic as it's potentially used by many 
apps. I don't think there are many apps that can write Matroska tags 
though. I think mkvtoolnix and ffmpeg are among the few. For those we 
need to check whether they write a 0 length when the element has the 
default value, when the value is 0, or never. Or more precisely, whether 
they ever write that element with 0 length, as 0 is not a known value.

http://www.matroska.org/technical/specs/tagging/index.html#targettypes

At least recent mkvtoolnix writes the element in full. Also looking at 
the libebml code:
uint64 EbmlSInteger::UpdateSize(bool bWithDefault, bool /* bForceRender */)
{
   if (!bWithDefault && IsDefaultValue())
     return 0;

So we use a 0 length if we opt for not writing data for defaults (as the 
original specs intended). That code also confirms that there's a 
mismatch between how default values are written and how they are read in 
mkvmerge (see discussion on github), while reading is left to the 
interpretation of the parser when the element is not present. This 
libebml code makes it impossible to write a 0 length when the value is 0 
but it's not the default value. And that code has not changed since the 
early days of libebml. So we can assume most files produced were always 
consistent with the original intent of saving space for default values.
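
As a rough illustration of the writer-side rule described above, here is
a simplified sketch. It is not the libebml implementation: UIntElement
and DataSize are hypothetical names, it uses an unsigned value where the
quoted snippet is the signed class, and the "minimal number of octets"
rule for non-default values is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an integer element that has a default value.
struct UIntElement {
    uint64_t value;
    uint64_t default_value;
    bool IsDefaultValue() const { return value == default_value; }
};

// Number of data octets the writer would emit for the element.
// Same shape as the quoted UpdateSize: a default value is written with
// size 0 unless the caller asks for defaults to be written in full.
uint64_t DataSize(const UIntElement& e, bool with_default) {
    if (!with_default && e.IsDefaultValue()) return 0;
    // Otherwise use the smallest octet count that holds the value,
    // with at least one octet even when the value is 0.
    uint64_t size = 1;
    for (uint64_t v = e.value >> 8; v != 0; v >>= 8) ++size;
    return size;
}
```

In this sketch a TargetTypeValue of 50 (the default) gets size 0 when
defaults are not written, while a value of 0 with a default of 50 always
gets one octet, so a 0 length never stands for a non-default 0.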

- Cues.CuePoint.CueTrackPositions.CueBlockNumber = 1
"Number of the Block in the specified Cluster."

I doubt this one is used. It's meant for referencing the exact block 
position when seeking. Even if it was written, I don't think any player 
uses it when seeking since they need the more generic mechanism to work 
anyway.

The default value of 1 means the first block, though that's not written 
in the specs. We could change it to 0 to also mean the first block; that 
could help.

- Cues.CuePoint.CueTrackPositions.CueReference.CueRefNumber = 1
"Number of the referenced Block of Track X in the specified Cluster."

It's deprecated and likely was never used.
