[Matroska-devel] EBML Schema

Dave Rice dave at dericed.com
Fri Aug 28 16:52:02 CEST 2015


> On Aug 28, 2015, at 2:50 AM, Moritz Bunkus <moritz at bunkus.org> wrote:
> Hey,
> I have no objections, however I don't know a lot about XML schemas in
> the first place (neither about DTDs, to be honest).

Honestly, I know a lot more about XML Schemas than I do about DTDs. As Wikipedia mentions at https://en.wikipedia.org/wiki/Document_type_definition, DTDs have largely been superseded by XML Schemas, so at this point I think XML Schemas may be the more familiar analogy to use.

I think XML Schemas also have more in common with specdata.xml than DTDs do. Schemas use the <element> node and have maxOccurs and minOccurs attributes (specdata expresses semantically the same thing with mandatory and multiple), and both have a similar declaration of element type, element name and element description. In fact, I think a semantically equivalent version of specdata.xml could be written as an XML Schema.
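
To make the mandatory/multiple comparison concrete, here is a minimal Python sketch of how the two specdata flags could map onto minOccurs and maxOccurs. It assumes the flags are written as mandatory="1" and multiple="1" and that a missing flag means "0"; the helper name to_xsd_element and the sample element are only illustrative and are not part of any existing tool.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

def to_xsd_element(spec_el):
    """Build an xs:element carrying the occurrence bounds of a specdata-style element."""
    # Assumption: "1" means the flag is set; anything else (or absent) means unset.
    mandatory = spec_el.get("mandatory") == "1"
    multiple = spec_el.get("multiple") == "1"
    xsd = ET.Element(f"{{{XS}}}element")
    xsd.set("name", spec_el.get("name"))
    xsd.set("minOccurs", "1" if mandatory else "0")
    xsd.set("maxOccurs", "unbounded" if multiple else "1")
    return xsd

# Illustrative input only, not copied from specdata.xml.
sample = ET.fromstring('<element name="Example" level="1" mandatory="1" multiple="1"/>')
converted = to_xsd_element(sample)
optional_single = to_xsd_element(ET.fromstring('<element name="Other" level="1"/>'))
```

Type, description and nesting are left out on purpose; the sketch only covers the occurrence constraints discussed above.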

XML Schemas also offer a few advantages for machine-readable expressions; for instance, an XML Schema can mandate a particular pattern or regex for a value.
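
As a rough illustration of what such a pattern constraint buys a validator, the Python sketch below checks a value against a regex the way an xs:pattern facet would. The rule itself (three lowercase ASCII letters) is a made-up example, not something taken from the Matroska or EBML specs.

```python
import re

# Hypothetical constraint: the value must be exactly three lowercase ASCII letters.
# In an XSD this would be an xs:pattern facet on a restricted xs:string.
EXAMPLE_PATTERN = re.compile(r"[a-z]{3}")

def matches_pattern(value: str) -> bool:
    # XSD patterns apply to the whole value, so fullmatch (not search) is the closer analogue.
    return EXAMPLE_PATTERN.fullmatch(value) is not None
```

With this rule, matches_pattern("eng") is accepted while matches_pattern("english") is rejected.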

>> I propose the specdata.xml file here
>> https://github.com/Matroska-Org/foundation-source/blob/master/spectool/specdata.xml
>> is a good basis for the consideration of an EBML Schema. From what I
>> can see, specdata.xml is an expression of the EBML + Matroska
>> specifications to support automated creation of documentation, but the
>> structure of this already shares a lot of similarity to XML Schemas.
> For both documentation (e.g. the table on the matroska.org specs page is
> generated from this file) and code (libMatroska's class hierarchy is
> generated automatically from this file) actually.

Does specdata.xml play a role in mkvalidate? I'm thinking of the potential for an ebmlvalidator to which you could provide an EBML Schema in order to validate a particular EBML docType.

>> Is there a preference in handling the standardization of Matroska:
>> documenting it in a similar fashion to our work in the EBML spec or to
>> define what an EBML Schema is and consider matroska an expression of
>> it?
> I'm not sure whether or not I understand the implications. But my gut
> feeling is that having a definition for an EBML Schema would benefit
> other formats than Matroska, too, therefore the latter seems the way to
> go.

I have the same feeling:
- document EBML as a specification that includes rules for defining a docType in the form of an EBML Schema
- write an EBML Schema (an updated specdata.xml) for Matroska and maybe WebM

>> Are some changes to specdata.xml acceptable? Such as a filename change
>> or changing the name of the <table> element or of some attributes?
> Well, like I said above the specdata.xml is used for generating both
> documentation and code. Both should stay viable. If changes to it are
> made then the accompanying tools must be updated as well.
>> Neither the current EBML specs nor the specdata.xml specifically refer
>> to the hierarchical arrangement of the elements, but this could be
>> presumed by their ordering. For instance, could any level 3 element be
>> a child of any level 2 Master-element? I presume not, but I don't
>> think it's clear anywhere what parent-child relationships are
>> feasible. Possibly specdata.xml and/or the EBML Schema Definition
>> could define the relationship between levels of related elements
>> similar to how an XML Schema (XSD) does.
> So far it is understood that an element not marked as a global element
> must only occur as a child of its parent. Its parent is the last element
> located before the child element in the specdata file with a lower level
> than the child element. Or something like that.

This will need some documentation. That's how I've understood the mkv spec as well, but the definition of how an EBML Schema works should be explicit about this.
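
To check that I read the rule the same way, here is a minimal Python sketch of it: walking the elements in file order, each element's parent is the nearest preceding element with a lower level. The function name resolve_parents and the (name, level) input shape are my own assumptions, and the sketch ignores global elements and does not check that the parent is actually a Master-element.

```python
def resolve_parents(elements):
    """Map each element name to its implied parent (or None for the top level).

    elements: list of (name, level) tuples in the order they appear in the file.
    """
    parents = {}
    open_ancestors = []  # chain of (name, level) with strictly increasing levels
    for name, level in elements:
        # Drop anything at the same or a deeper level; it cannot be this element's parent.
        while open_ancestors and open_ancestors[-1][1] >= level:
            open_ancestors.pop()
        parents[name] = open_ancestors[-1][0] if open_ancestors else None
        open_ancestors.append((name, level))
    return parents

# A small excerpt in specdata-like order, with levels as I understand them.
sample = [
    ("Segment", 0),
    ("Info", 1),
    ("Title", 2),
    ("Tracks", 1),
    ("TrackEntry", 2),
    ("TrackNumber", 3),
]
parents = resolve_parents(sample)
```

Here Title resolves to Info and TrackNumber to TrackEntry, which shows how much the hierarchy currently depends on ordering alone.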


Best Regards,
Dave Rice
