[Matroska-devel] Representation of payload for SeekHead entries

Steve Lhomme slhomme at matroska.org
Sat Nov 19 14:23:36 CET 2011


PS: Also, as you can see in the specs, IDs are only supported up to 4
bytes, not 8.

IDs are not to be seen as integers but as a whole "marker". The UTF-8
like encoding is just there to tell the size of the ID. There is no
reason to use more bytes than necessary for a given ID, even though
some smart parsers may be able to handle it (libebml can, but only up
to 4 bytes).
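
To illustrate the "marker" point, here is a minimal Python sketch of
reading an ID so that the length bits stay part of the value. The names
read_element_id and max_id_length are made up for this example, and it
is not meant to mirror how libebml does it.

```python
def read_element_id(data: bytes, offset: int = 0, max_id_length: int = 4) -> bytes:
    """Return the raw bytes of the EBML Element ID starting at data[offset].

    The position of the first set bit in the first byte gives the length
    (1xxxxxxx = 1 byte, 01xxxxxx = 2 bytes, ...). The marker bit is NOT
    stripped: the ID is handed back exactly as it appears in the stream.
    """
    first = data[offset]
    length = 1
    mask = 0x80
    # Walk the leading zero bits until the length marker is found.
    while length <= max_id_length and not (first & mask):
        mask >>= 1
        length += 1
    if length > max_id_length:
        raise ValueError("ID is longer than %d bytes" % max_id_length)
    if offset + length > len(data):
        raise ValueError("truncated ID")
    return data[offset:offset + length]


# A SeekID payload can then be compared byte for byte with an ID read
# from the stream (0x1F43B675 is the Cluster ID).
seek_id_payload = bytes.fromhex("1F43B675")
stream = bytes.fromhex("1F43B675FF")
is_cluster = read_element_id(stream) == seek_id_payload
```

With the default max_id_length of 4, the 8-byte form asked about
earlier in the thread is rejected rather than normalized.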

Now to confuse you even more, all IDs have been chosen so that their
"integer value" (like you use in your code) does not collide with
that of another element of a different size.
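
One way to see this, as a rough Python sketch: if the length marker bit
is kept when the ID bytes are turned into an integer, IDs of different
lengths land in disjoint ranges, whereas stripping the marker (as is
done for element sizes) makes "0x81" and "0x40 01" both come out as 1.
The helper names below are hypothetical, and reserved values are not
handled.

```python
def as_id_integer(raw: bytes) -> int:
    """Integer view of an ID with the length marker bit kept."""
    return int.from_bytes(raw, "big")


def as_size_integer(raw: bytes) -> int:
    """Integer view with the length marker bit cleared, as for a size."""
    data_bits = 7 * len(raw)  # each byte contributes 7 data bits
    return int.from_bytes(raw, "big") & ((1 << data_bits) - 1)


def id_value_range(length: int) -> range:
    """Integer values a `length`-byte ID can take when the marker is kept."""
    marker = 1 << (7 * length)  # 0x80, 0x4000, 0x200000, 0x10000000
    return range(marker, 2 * marker)


# Same bytes, two readings.
one_byte = b"\x81"
two_bytes = b"\x40\x01"
ids = (as_id_integer(one_byte), as_id_integer(two_bytes))        # 0x81 and 0x4001
sizes = (as_size_integer(one_byte), as_size_integer(two_bytes))  # 1 and 1

# With the marker kept, the ranges for 1 to 4 byte IDs never overlap.
ranges = [id_value_range(n) for n in range(1, 5)]
disjoint = all(ranges[i].stop <= ranges[i + 1].start for i in range(3))
```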

On Sat, Nov 19, 2011 at 2:17 PM, Steve Lhomme <slhomme at matroska.org> wrote:
> OK, here I am. I haven't read the whole thread but I think I
> understand where the confusion is about the Seek ID.
>
> It's in binary, so the content format has nothing to do with how you
> interpret an integer, even though by some stretching you know it
> represents an integer.
>
> But in Matroska and EBML in general, Element IDs are not exactly
> integers. They are 1, 2, 3 or 4 bytes long. Unlike with integers, you
> can't use a 4-byte form to represent a 2-byte ID. I think most parsers
> would not be able to cope with it.
>
> So in the end the ID is stored in the exact same binary form it
> appears in the file/stream. That makes it a lot easier to compare.
>
> On Fri, Nov 18, 2011 at 9:58 PM, Matthew Heaney
> <matthewjheaney at gmail.com> wrote:
>> On Fri, Nov 18, 2011 at 3:39 PM, Moritz Bunkus <moritz at bunkus.org> wrote:
>>>
>>> The only thing that still comes to mind is that reading an element ID
>>> and an element size is not the same, nor is the result the same from
>>> the application's perspective (the "application" being the layer above
>>> EBML, in this case Matroska). If an EBML parser reads a single byte
>>> "0x81" as an element ID then it has to pass "0x81" to the layer above.
>>> If it reads that same single byte "0x81" as the element's size then it
>>> only passes "0x01" to the layer above.
>>
>> Well in my case, it passes 0x01 to the layer above.
>>
>>
>>> a) The byte sequence "0x40 01" represents a different EBML ID than the
>>> byte sequence "0x81" does or
>>
>>> b) An EBML parser has to normalize element IDs to their shortest
>>> possible representation before passing it upstream in which case "0x40
>>> 01" and "0x81" would be the same ID.
>>
>> This (b) is my assumption.  (The WebM parser passes 0x01 upstream.)
>>
>>
>>> If the WebM parser already normalizes upon reading then I'd say just
>>> leave it like it is. Accept as many weird cases as possible but only
>>> write the byte sequences explicitly listed in the specs.
>>
>> Agreed.
>>
>>
>>>> Can the value for a Cluster ID be represented in the stream using
>>>> more than 4 bytes?  Forget about what the Matroska spec says.  Is it
>>>> valid, for example, for a Cluster ID to be represented as "0x01 00 00
>>>> 00 0F 43 B6 75", if the EBML header says that element IDs are 8 bytes
>>>> or less?
>>>
>>> Valid to what? Either I should forget about the specs in which case I
>>> don't have any basis to decide whether or not something is valid or I
>>> can say it is valid (or not) according to the specs ;) Just
>>> nitpicking.
>>
>> There is no such thing as "according to the specs".  Specs don't exist
>> in some Platonic realm: they are written and interpreted by humans,
>> and so there can be ambiguity in their meaning and interpretation.
>>
>> My argument (perhaps incorrect) is that the values listed in the spec
>> itself are non-normalized, and that in an actual file, an ID having
>> any representation consistent with the max length value in the EBML
>> header is valid.  IMHO it would be dangerous for a parser to make any
>> other assumption, but that's just me.  8^)
>>
>> Thanks for the info.
>>
>> Regards,
>> Matt
>>
>> <mailto:matthewjheaney at google.com>
>>
>
>
>
> --
> Steve Lhomme
> Matroska association Chairman
>



-- 
Steve Lhomme
Matroska association Chairman


