[Matroska-devel] Several (minor) issues or underspecified areas in the MKV spec

Steve Lhomme slhomme at matroska.org
Sun Oct 11 09:54:07 CEST 2015

2015-10-06 15:49 GMT+02:00 Michael Bradshaw <mjbshaw at google.com>:
> On Mon, Oct 5, 2015 at 10:15 AM, Dave Rice <dave at dericed.com> wrote:
>> On Oct 5, 2015, at 12:47 PM, Michael Bradshaw <mjbshaw at google.com> wrote:
>> How should an EBMLMaxSizeLength > 8 be handled if it occurs after the
>> element that needs it (specific edge case: DocType has a size length of 9,
>> but DocType occurs before EBMLMaxSizeLength in the header; how should that
>> be handled?) (alternate edge case: a Void element occurring in (or before)
>> an EBML element with a size length > 8 and occurring before
>> EBMLMaxSizeLength). Should the spec explicitly require parsers to parse as
>> if EBMLMaxSizeLength is 8 unless and until explicitly told otherwise?
>> Maybe the documentation for EBMLMaxSizeLength should be clarified:
>> EBMLMaxSizeLength=8 does not mean that the payload of the EBML elements is
>> limited to 8 bytes; it means that the size value of the EBML Element itself
>> is restricted to 8 bytes. I believe that an 8-byte size statement provides
>> something like 72 petabytes. I hope there are no docTypes greater than 72
>> petabytes in length ;).
> Yeah, I know EBMLMaxSizeLength refers to the length (in bytes) of the size
> value, and this is where some of that "extremely unlikely to happen but
> still in the realm of possible" applies :). That said, since the size isn't
> required to be trimmed of unnecessary leading bytes (i.e. "5 can be coded
> 0x000000000005 or 0x0005 or 0x05"), it's totally permissible for the encoder
> to set EBMLMaxSizeLength=10 and have some sizes that use all 10 bytes, even
> if the values they store could easily fit in fewer than 8 bytes. For files
> like these, I think it's worth clarifying this part of the spec.

I don't see how that case is undefined. If size values in the EBML Stream
(as opposed to the header) can be 10 bytes long and your parser can handle
that (i.e. it read the EBML header, read that size length and didn't bail
out with "error: unsupported EBML format"), then if it finds a 10-octet
size value, it can read it. Even if the value is 5 in the end.
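
A minimal sketch of that behaviour, assuming the usual EBML size coding
(leading zero bits, a marker bit, then 7 data bits per octet). The names
read_size and max_size_length are hypothetical and not taken from any
existing library; the reserved all-ones "unknown size" value is not treated
specially here.

```python
def read_size(data: bytes, pos: int = 0, max_size_length: int = 8):
    """Decode one EBML element size; return (value, width_in_octets).

    max_size_length stands in for the EBMLMaxSizeLength value the parser
    accepted from the EBML header (8 is the default assumed here).
    """
    # Count leading zero bits; they can span whole octets once width > 8.
    zeros = 0
    i = pos
    while i < len(data) and data[i] == 0:
        zeros += 8
        i += 1
        if zeros >= max_size_length:
            raise ValueError("unsupported EBML format: size length too large")
    if i >= len(data):
        raise ValueError("truncated size")
    zeros += 8 - data[i].bit_length()

    # Width in octets is the number of leading zero bits plus one.
    width = zeros + 1
    if width > max_size_length:
        raise ValueError("unsupported EBML format: size length %d" % width)

    raw = data[pos:pos + width]
    if len(raw) < width:
        raise ValueError("truncated size")

    # Drop the marker bit, which sits just above the 7 * width data bits.
    value = int.from_bytes(raw, "big") & ((1 << (7 * width)) - 1)
    return value, width


if __name__ == "__main__":
    # The value 5 coded on 1, 2 and 10 octets.
    one = bytes([0x85])
    two = bytes([0x40, 0x05])
    ten = bytes([0x00, 0x40] + [0] * 7 + [0x05])
    print(read_size(one))
    print(read_size(two))
    print(read_size(ten, max_size_length=10))
```

With max_size_length left at 8, the 10-octet form is rejected up front; with
10 it decodes to the same value as the 1-octet form, which is the point
above: the width of the size field and the value it carries are independent.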

>> The EBML spec says that the Reserved ID (all bits set to 1) is the only ID
>> that may change the Length Descriptor (the count of leading zeroes + 1).
>> What exactly does it mean to "change the Length Descriptor?" Does this mean
>> a Length Descriptor can be > 4 (even if EBMLMaxIDLength = 4) iff the ID is
>> the Reserved ID?
>> Good question, though I'm not sure of the answer; this is an older part of
>> the EBML spec that pre-dates my work on it. Some related discussions on this
>> are here: https://github.com/Matroska-Org/ebml-specification/pull/15
> Who would be good to ask for clarification? If we can't figure out exactly
> what it means, would it make more sense to just remove it from the spec?
> Thanks!
> --Michael

Steve Lhomme
Matroska association Chairman
