[Matroska-devel] Several (minor) issues or underspecified areas in the MKV spec

Michael Bradshaw mjbshaw at google.com
Mon Oct 12 19:10:08 CEST 2015

On Sun, Oct 11, 2015 at 12:43 AM, Steve Lhomme <slhomme at matroska.org> wrote:

> 2015-10-05 18:47 GMT+02:00 Michael Bradshaw <mjbshaw at google.com>:
> > How should a EBMLMaxSizeLength > 8 be handled if it occurs after the
> > element that needs it (specific edge case: DocType has a size length of
> > 9, but DocType occurs before EBMLMaxSizeLength in the header; how should
> > that be handled?) (alternate edge case: a Void element occurring in (or
> > before) an EBML element with a size length is > 8 and occurring before
> > EBMLMaxSizeLength). Should the spec explicitly require parsers to parse
> > as if EBMLMaxSizeLength is 8 unless and until explicitly told otherwise?
> > Do the limitations of EBMLMaxSizeLength apply to the document
> > immediately?
> The values in the EBML Header describe what the EBML parser will need
> to parse the EBML Stream. On the other hand it should always be safe
> to read the EBML Header even if your parser cannot handle the Stream
> due to internal limitations. So we may define in the EBML specs that,
> for the EBML Header, the ID Length must not be longer than 4 and the
> Size Length may never be more than 8, maybe even 4 (I'd favor 4).

It would be great if that could be mentioned in the EBML spec (and I'd
favor 4 as well).
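
As a rough sketch of what such a rule could look like in a parser (Python;
the function name and the header limit of 4 are assumptions for
illustration only, nothing here is in the current spec):

```python
# Hypothetical limit for elements inside the EBML Header, as favored in
# this thread; the spec does not currently state it.
HEADER_MAX_SIZE_LENGTH = 4


def read_vint_size(buf, pos=0, max_len=8):
    """Decode an element size VINT at buf[pos]; return (value, length).

    The raw value is returned as-is; the all-ones "unknown size" case is
    not treated specially in this sketch.
    """
    if pos >= len(buf):
        raise ValueError("no data at offset %d" % pos)
    first = buf[pos]
    if first == 0:
        # No marker bit in the first byte, so the length is at least 9.
        raise ValueError("size length is greater than 8")
    # Length = number of leading zero bits in the first byte, plus one.
    length = 9 - first.bit_length()
    if length > max_len:
        raise ValueError("size length %d exceeds limit %d" % (length, max_len))
    if pos + length > len(buf):
        raise ValueError("truncated size field")
    # Drop the marker bit, then append the remaining bytes big-endian.
    value = first & (0xFF >> length)
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, length
```

A reader would pass max_len=HEADER_MAX_SIZE_LENGTH while inside the EBML
Header and switch to the declared EBMLMaxSizeLength only once the whole
header has been read, so the position of EBMLMaxSizeLength within the
header would no longer matter.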

> > Shouldn't EBMLMaxIDLength have a range of > 3 (given that the EBML
> > element has an ID length of 4)?
> Not necessarily, as small EBML Doctypes may not need that much and
> favor saving container space. As said above, the values in the EBML
> Header describe the Doctype, the EBML Stream. Not the EBML Header
> itself. We should definitely clarify that in the specs.

This too would be great to have in the specs.

> > Typo in the EBML spec in the Length definition for the Binary data type:
> > “A Master-element” should be “A Byte Element”
> Which document? I could not find this.

The spec at:

It's in the portion of the EBML spec that defines the Length for the
"Element Data Type: Binary".

> > The EBML spec says that the Reserved ID (all bits set to 1) is the only
> > ID that may change the Length Descriptor (the count of leading zeroes +
> > 1). What exactly does it mean to "change the Length Descriptor?" Does
> > this mean a Length Descriptor can be > 4 (even if EBMLMaxIDLength = 4) if
> > the ID is the Reserved ID?
> I think it doesn't make sense as it is. I think it refers to the fact
> that IDs should always be coded in their lowest form. But when all set
> to 1 or 0, there's no default size.

I personally don't think it makes sense to consider IDs with all VINT_DATA
bits set to 1 as being the same ID. I consider them all as distinct IDs
(which makes sense if you think of reading them like an unsigned
variable-sized integer; they're IDs that represent the integer values 0,
127, 16383, 2097151, and 268435455). I read the EBML spec as reserving
these IDs as a means of forward compatibility, setting aside 5 IDs* for
future revisions to the spec. Should future revisions to EBML require new
elements, they can safely draw from these 5 IDs.

If that's not the purpose of these reserved IDs, then I don't think it
makes sense to reserve them in the first place (and documents should be
free to use them, especially the very valuable Class A elements).
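
To make the arithmetic behind the non-zero values above concrete, here is
a small Python sketch (the helper name is made up for illustration) that
sets every VINT_DATA bit to 1 for ID lengths 1 through 4 and shows both
the integer value and the encoded bytes with the marker bit included:

```python
def all_ones_id(length):
    """Return (value, encoded bytes) for an ID whose VINT_DATA bits are all 1."""
    data_bits = 7 * length            # each byte contributes 7 data bits
    value = (1 << data_bits) - 1      # all data bits set to 1
    marker = 1 << data_bits           # VINT_MARKER sits just above the data
    return value, (marker | value).to_bytes(length, "big")


for n in range(1, 5):
    value, raw = all_ones_id(n)
    print(n, value, raw.hex())
```

Read as plain unsigned integers, the four values differ, which is the
sense in which I treat them as distinct IDs.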

*The IDs (including the VINT_WIDTH and VINT_MARKER bits):
(etc. for longer IDs)