[Matroska-devel] Representation of payload for SeekHead entries

Matthew Heaney matthewjheaney at gmail.com
Fri Nov 18 21:58:44 CET 2011


On Fri, Nov 18, 2011 at 3:39 PM, Moritz Bunkus <moritz at bunkus.org> wrote:
>
> The only thing that still comes to mind is that reading an element ID
> and an element size is not the same, nor is the result the same from
> the application's perspective (the "application" being the layer above
> EBML, in this case Matroska). If an EBML parser reads a single byte
> "0x81" as an element ID then it has to pass "0x81" to the layer above.
> If it reads that same single byte "0x81" as the element's size then it
> only passes "0x01" to the layer above.

Well in my case, it passes 0x01 to the layer above.
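
To make the distinction concrete, here is a minimal sketch of the two ways
of reading the same byte. The function name read_vint and its keep_marker
flag are hypothetical and are not taken from any actual parser; the sketch
only assumes the usual EBML rule that the leading zero bits of the first
byte give the width.

```python
def read_vint(data, keep_marker):
    """Read one EBML variable-length integer from the start of data.

    Returns (value, length_in_bytes). With keep_marker=True the length
    marker bit stays in the value (the way IDs are listed in the specs);
    with keep_marker=False it is cleared (the way sizes are read).
    """
    first = data[0]
    if first == 0:
        raise ValueError("widths above 8 bytes are not handled in this sketch")
    # Count leading zero bits of the first byte to find the width.
    length = 1
    mask = 0x80
    while not first & mask:
        mask >>= 1
        length += 1
    if len(data) < length:
        raise ValueError("truncated input")
    # Either keep the first byte whole, or clear the marker bit.
    value = first if keep_marker else first & (mask - 1)
    for byte in data[1:length]:
        value = (value << 8) | byte
    return value, length
```

With this sketch, read_vint(b"\x81", keep_marker=True) yields the value
0x81, while read_vint(b"\x81", keep_marker=False) yields 0x01.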


> a) The byte sequence "0x40 01" represents a different EBML ID than the
> byte sequence "0x81" does or

> b) An EBML parser has to normalize element IDs to their shortest
> possible representation before passing it upstream in which case "0x40
> 01" and "0x81" would be the same ID.

This (b) is my assumption.  (The WebM parser passes 0x01 upstream.)
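
Under assumption (b), comparing IDs comes down to comparing their values
with the length marker removed. A rough sketch of that idea follows; the
helper name id_value is made up for illustration and this is not the WebM
parser's actual code.

```python
def id_value(raw):
    """Return the value of a complete encoded ID with its marker stripped.

    raw must hold exactly the bytes of one ID (1 to 8 bytes).
    """
    width = len(raw)
    if not 1 <= width <= 8:
        raise ValueError("unsupported ID width")
    # The marker is the single 1 bit that terminates the leading zeros.
    marker = 0x80 >> (width - 1)
    # All bits above the marker must be zero, and the marker must be set.
    if raw[0] & ~(marker - 1) & 0xFF != marker:
        raise ValueError("length marker does not match the number of bytes")
    value = int.from_bytes(raw, "big")
    # Clear the marker bit, which sits in the most significant byte.
    return value & ~(marker << (8 * (width - 1)))
```

Here id_value(b"\x40\x01") and id_value(b"\x81") both come out as 1, so
the two byte sequences would be treated as the same ID.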


> If the WebM parser already normalizes upon reading then I'd say just
> leave it like it is. Accept as many weird cases as possible but only
> write the byte sequences explicitly listed in the specs.

Agreed.


>> Can the value for a Cluster ID be represented in the stream using
>> more than 4 bytes?  Forget about what the Matroska spec says.  Is it
>> valid, for example, for a Cluster ID to be represented as "0x01 00 00
>> 00 0F 43 B6 75", if the EBML header says that element IDs are 8 bytes
>> or less?
>
> Valid to what? Either I should forget about the specs in which case I
> don't have any basis to decide whether or not something is valid or I
> can say it is valid (or not) according to the specs ;) Just
> nitpicking.

There is no such thing as "according to the specs".  Specs don't exist
in some Platonic realm: they are written and interpreted by humans,
and so there can be ambiguity in their meaning and interpretation.

My argument (perhaps incorrect) is that the values listed in the spec
itself are non-normalized, and that in an actual file, an ID having
any representation consistent with the max length value in the EBML
header is valid.  IMHO it would be dangerous for a parser to make any
other assumption, but that's just me.  8^)
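
As a sketch of what that argument implies for a reader (not a definitive
implementation): accept an ID of any width up to the maximum from the EBML
header and compare on the stripped value. The name parse_id and the
max_id_length parameter are hypothetical; max_id_length stands in for the
max ID length value read from the EBML header. The example uses the
Cluster ID, which the Matroska spec lists as 0x1F43B675.

```python
def parse_id(data, max_id_length=4):
    """Parse one element ID from the start of data.

    Accepts any width up to max_id_length bytes and returns
    (value_without_marker, length_in_bytes).
    """
    first = data[0]
    length = 1
    mask = 0x80
    # Walk down the first byte until the marker bit is found.
    while mask and not first & mask:
        mask >>= 1
        length += 1
    if mask == 0 or length > max_id_length:
        raise ValueError("ID is longer than the header allows")
    if len(data) < length:
        raise ValueError("truncated input")
    value = first & (mask - 1)  # drop the marker bit
    for byte in data[1:length]:
        value = (value << 8) | byte
    return value, length


CLUSTER = 0x0F43B675  # Cluster ID 0x1F43B675 with the marker bit cleared
short_form = bytes.fromhex("1F43B675")
long_form = bytes.fromhex("010000000F43B675")
```

With max_id_length=8, both short_form and long_form parse to the value
CLUSTER (with lengths 4 and 8); with the default of 4, the long form is
rejected.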

Thanks for the info.

Regards,
Matt

<mailto:matthewjheaney at google.com>


