[Matroska-devel] Representation of payload for SeekHead entries

Steve Lhomme slhomme at matroska.org
Sat Nov 19 14:23:36 CET 2011

PS: Also as you can see in the specs, IDs are only supported up to 4
bytes and not 8.

IDs are not to be seen as integers but as a whole "marker". The
UTF-8-like encoding is just there to tell the size of the ID. There is
no reason to use more bytes for a given ID, even though some smart
parsers may be able to handle it (libebml can do it, but only up to 4
bytes).

Now to confuse you even more, all IDs have been chosen so that their
"integer value" (like you use in your code) does not collide with that
of another element of a different size.
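
To make the ID/size distinction concrete, here is a minimal Python sketch
(not libebml or WebM parser code; the names vint_length, read_id and
read_size are made up for illustration). It assumes the usual EBML
convention that the number of leading zero bits in the first byte gives
the length. An ID is returned as the raw bytes with the length marker
kept, while a size has the marker stripped.

```python
def vint_length(first_byte: int) -> int:
    """Length signalled by the leading bits: 1xxxxxxx -> 1, 01xxxxxx -> 2, ..."""
    if first_byte == 0:
        raise ValueError("length marker would need more than 8 bytes")
    return 8 - first_byte.bit_length() + 1


def read_id(data: bytes, max_id_length: int = 4) -> bytes:
    """Return the element ID exactly as stored, length marker included."""
    if not data:
        raise ValueError("no data")
    n = vint_length(data[0])
    if n > max_id_length:
        raise ValueError(f"ID of {n} bytes exceeds the limit of {max_id_length}")
    if len(data) < n:
        raise ValueError("truncated ID")
    return data[:n]


def read_size(data: bytes) -> int:
    """Return the element size as an integer, length marker stripped."""
    if not data:
        raise ValueError("no data")
    n = vint_length(data[0])
    if len(data) < n:
        raise ValueError("truncated size")
    value = data[0] & (0xFF >> n)  # drop the marker bits of the first byte
    for b in data[1:n]:
        value = (value << 8) | b
    return value


CLUSTER_ID = bytes.fromhex("1F43B675")

# The same byte 0x81 means different things as an ID and as a size.
print(read_id(b"\x81").hex(), read_size(b"\x81"))

# As IDs, "40 01" and "81" stay distinct because they are compared as bytes.
print(read_id(b"\x40\x01") == read_id(b"\x81"))

# A SeekID payload can be compared byte for byte with a known ID.
print(read_id(CLUSTER_ID + b"\x00") == CLUSTER_ID)
```

Because read_id never converts to an integer, no normalization step is
needed: an 8-byte spelling of the Cluster ID is simply rejected by the
4-byte limit rather than being mapped onto the 4-byte form.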

On Sat, Nov 19, 2011 at 2:17 PM, Steve Lhomme <slhomme at matroska.org> wrote:
> OK, here I am. I haven't read the whole thread but I think I
> understand where the confusion is about the Seek ID.
> It's in binary, so the content format has nothing to do with how you
> interpret an integer, even though by some stretching you know it
> represents an integer.
> But in Matroska and EBML in general, Element IDs are not exactly
> integers. They are 1, 2, 3 or 4 bytes. Unlike integers you can't use a
> "4 bytes" to represent a "2 bytes" ID. I think most parsers would not
> be able to cope with it.
> So in the end the ID is stored in the exact same binary form it
> appears in the file/stream. That makes it a lot easier to compare.
> On Fri, Nov 18, 2011 at 9:58 PM, Matthew Heaney
> <matthewjheaney at gmail.com> wrote:
>> On Fri, Nov 18, 2011 at 3:39 PM, Moritz Bunkus <moritz at bunkus.org> wrote:
>>> The only thing that still comes to mind is that reading an element ID
>>> and an element size is not the same, nor is the result the same from
>>> the application's perspective (the "application" being the layer above
>>> EBML, in this case Matroska). If an EBML parser reads a single byte
>>> "0x81" as an element ID then it has to pass "0x81" to the layer above.
>>> If it reads that same single byte "0x81" as the element's size then it
>>> only passes "0x01" to the layer above.
>> Well in my case, it passes 0x01 to the layer above.
>>> a) The byte sequence "0x40 01" represents a different EBML ID than the
>>> byte sequence "0x81" does or
>>> b) An EBML parser has to normalize element IDs to their shortest
>>> possible representation before passing it upstream in which case "0x40
>>> 01" and "0x81" would be the same ID.
>> This (b) is my assumption.  (The WebM parser passes 0x01 upstream.)
>>> If the WebM parser already normalizes upon reading then I'd say just
>>> leave it like it is. Accept as many weird cases as possible but only
>>> write the byte sequences explicitly listed in the specs.
>> Agreed.
>>>> Can the value for a Cluster ID be represented in the stream using
>>>> more than 4 bytes?  Forget about what the Matroska spec says.  Is it
>>>> valid, for example, for a Cluster ID to be represented as "0x01 00 00
>>>> 00 0F 34 B6 75", if the EBML header says that element IDs are 8 bytes
>>>> or less?
>>> Valid to what? Either I should forget about the specs in which case I
>>> don't have any basis to decide whether or not something is valid or I
>>> can say it is valid (or not) according to the specs ;) Just
>>> nitpicking.
>> There is no such thing as "according to the specs".  Specs don't exist
>> in some Platonic realm: they are written and interpreted by humans,
>> and so there can be ambiguity in their meaning and interpretation.
>> My argument (perhaps incorrect) is that the values listed in the spec
>> itself are non-normalized, and that in an actual file, an ID having
>> any representation consistent with the max length value in the EBML
>> header is valid.  IMHO it would be dangerous for a parser to make any
>> other assumption, but that's just me.  8^)
>> Thanks for the info.
>> Regards,
>> Matt
>> <mailto:matthewjheaney at google.com>
> --
> Steve Lhomme
> Matroska association Chairman

Steve Lhomme
Matroska association Chairman
