[Matroska-devel] State of the libebml / libmatroska API and codebase

Moritz Bunkus moritz at bunkus.org
Sun Sep 16 11:14:40 CEST 2012


Note: I'm not the original author, but MKVToolNix relies heavily on
both libraries and I've done bug fixes and other things on the

On Sun, Sep 16, 2012 at 10:52 AM, Arnavion <arnavion at gmail.com> wrote:

> Apart from the technical specs on the website about how EBML and Matroska
> files are laid out, and the documentation in the headers, is there no other
> documentation?


> The code seems to be... severely broken in places.

That is correct. I only correct stuff that I stumble upon during the
development of MKVToolNix. The only other application outside of the
Matroska project itself that uses libebml & libmatroska that I know of
is VLC. That's also the reason no one is working on those libraries:
almost no one uses them.

> Is my understanding correct? If so, then what exactly is the
> user of this API expected to do w.r.t. the argument he passes in and the
> element he gets back?

Quite possible.

> I understand that the API is deliberately low-level, but are there no
> abstractions for this? No GetTracks method, for example?

You're correct, it is low-level, and no, there are no higher-level functions.

> (Coding standards nitpick) libmatroska's headers have "using namespace
> libebml;" in them, which means any file which includes a libmatroska header
> can use libebml classes without prefixing the namespace. Headers should not
> have "using namespace" declarations.

Absolutely correct, but uncorrectable without breaking backwards compatibility.
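
To illustrate why such a directive is both a problem and hard to remove, here is a minimal, self-contained sketch. It does not use the real headers: `toy_ebml`, `app`, `Element`, `leaked_name` and `local_name` are hypothetical names made up for the example, and the "header" is simulated inside a single file.

```cpp
#include <string>

// Hypothetical stand-in for a library namespace (not the real libebml).
namespace toy_ebml {
struct Element {
    std::string name() const { return "toy_ebml::Element"; }
};
}  // namespace toy_ebml

// What a header containing a using-directive effectively does to every
// file that includes it:
using namespace toy_ebml;

// User code after the include: `Element` resolves without a prefix,
// although this code never asked for that.
std::string leaked_name() { return Element{}.name(); }

namespace app {
// A type with the same name in the user's own namespace.
struct Element {
    std::string name() const { return "app::Element"; }
};

// Inside `app` the local type hides the leaked one, so the same spelling
// now refers to a different class.
std::string local_name() { return Element{}.name(); }
}  // namespace app
```

Code written like `leaked_name()` stops compiling as soon as the directive is taken out of the header, which is the backwards-compatibility cost mentioned above.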

> None of the functions have remotely correct documentation.


> If one is to take functions such as FindNextID and FindNextElement at face value, what
> then is the purpose of SkipData?

SkipData skips the element's data (meaning it seeks in the file) after
the header of an element has been read. For example, FindNextID only
reads the EBML ID and the length field but not the data (otherwise you
would suck the whole segment into memory if you were to read the
segment's ID). SkipData can then be called if you're not interested in
an ID that was found.
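
The split between "read the header" and "skip the data" can be sketched without libebml at all. The following is only an illustration on an in-memory buffer, assuming the usual EBML layout of a variable-length ID followed by a variable-length size; `ToyHeader`, `vint_length`, `read_header` and `skip_data` are hypothetical names and not part of the libebml API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy element header: ID, declared payload size, and where the payload starts.
struct ToyHeader {
    std::uint32_t id;
    std::uint64_t data_size;
    std::size_t data_offset;
};

// Length in bytes of a variable-length integer, derived from its first byte
// (1 for 1xxxxxxx, 2 for 01xxxxxx, ...). Returns 0 if the first byte is 0x00.
static int vint_length(std::uint8_t first) {
    for (int len = 1; len <= 8; ++len)
        if (first & (0x80u >> (len - 1))) return len;
    return 0;
}

// Reads only the ID and the size field at `pos`; the payload is never touched.
bool read_header(const std::vector<std::uint8_t>& buf, std::size_t pos, ToyHeader& out) {
    if (pos >= buf.size()) return false;
    const int id_len = vint_length(buf[pos]);
    if (id_len == 0 || id_len > 4 || pos + id_len >= buf.size()) return false;
    std::uint32_t id = 0;
    for (int i = 0; i < id_len; ++i) id = (id << 8) | buf[pos + i];  // IDs keep the marker bit
    pos += id_len;

    const int size_len = vint_length(buf[pos]);
    if (size_len == 0 || pos + size_len > buf.size()) return false;
    std::uint64_t size = buf[pos] & (0xFFu >> size_len);  // sizes drop the marker bit
    for (int i = 1; i < size_len; ++i) size = (size << 8) | buf[pos + i];

    out = ToyHeader{id, size, pos + size_len};
    return true;
}

// "Skipping" just moves the read position past the payload; on a real file
// this would be a seek instead of a read.
std::size_t skip_data(const ToyHeader& h) {
    return h.data_offset + static_cast<std::size_t>(h.data_size);
}
```

A caller would loop over `read_header`, look at `id`, and either parse the payload or continue at `skip_data(header)`; the payload bytes are only read for elements the caller cares about.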

> What is the state of these two projects?

Pretty much unmaintained. I only fix issues that I come upon, and
libmatroska receives new classes for newly added Matroska elements.
That's about all.

Two weeks ago I added some more convenience functions in libebml
(GetValue(), SetValue(), SetValueUTF8() for EbmlUnicodeString) so that
you don't have to use the "operator ...()" functions provided by the
classes for storing values. Those changes are in SVN only so far, and
I'm still in the process of transforming MKVToolNix accordingly.
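
The difference between the two styles can be sketched with a toy class. This is not the libebml implementation and the real signatures may differ; `ToyUInteger`, `via_operators` and `via_accessors` are hypothetical, and only the accessor names `GetValue`/`SetValue` are taken from the text above.

```cpp
#include <cstdint>

// Hypothetical toy value class offering both access styles.
class ToyUInteger {
public:
    // Operator style: assignment from a raw value and implicit conversion back.
    ToyUInteger& operator=(std::uint64_t v) { value_ = v; return *this; }
    operator std::uint64_t() const { return value_; }

    // Named accessors: the intent is visible at the call site.
    ToyUInteger& SetValue(std::uint64_t v) { value_ = v; return *this; }
    std::uint64_t GetValue() const { return value_; }

private:
    std::uint64_t value_ = 0;
};

// Stores and reads a value through the operators.
std::uint64_t via_operators() {
    ToyUInteger n;
    n = 42;
    return static_cast<std::uint64_t>(n);
}

// Stores and reads the same value through the named accessors.
std::uint64_t via_accessors() {
    ToyUInteger n;
    n.SetValue(42);
    return n.GetValue();
}
```

Both functions do the same thing; the accessor version avoids the cast and reads more plainly when the object is reached through a pointer or reference.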

> These TODOs and broken behavior have survived for a long time; will they ever be fixed,

Highly unlikely. In particular, anything whose fix would break API
compatibility will never get fixed, because that would break existing
applications.

> or are users expected to work around them as it seems they have been doing so far?


> My apologies that this sounds so inflammatory. I'm just bewildered that these
> libraries are in this state.

No problem at all, you're completely right.

However, there is light at the end of the tunnel. In this case this
means that there are alternatives:

- Steve Lhomme has re-written libebml and libmatroska as pure C
libraries called libebml2 and libmatroska2. His intention was to
maintain those actively, though I don't know what their state is at
the moment.
- There's also Haali's Matroska access library that he uses in his
splitter. His library (written in C if I'm not mistaken) is available
to people who email him; see http://haali.su/mkv/
- You could also decide to use another more high-level framework like
gstreamer or ffmpeg. Both contain Matroska readers.

Kind regards,
