[Matroska-devel] 2 problems in libebml

Cyrius suiryc at yahoo.com
Sat Jan 10 13:02:21 CET 2004


Hi

People are experiencing problems with files produced by the latest
versions of mkvmerge. This seems to be related to a couple of problems
inside libebml that I spotted while debugging.

First let's see how those files are written by mkvmerge (using mkvinfo):

+ EBML head at 0
+ Segment at 24
|+ Seek head at 36
|+ EbmlVoid (size: 4029) at 103
|+ Segment information at 4135
| + Muxing application: libebml v0.6.3 + libmatroska v0.6.2 at 4140
| + Writing application: mkvmerge v0.8.1 at 4178
| + Duration: 2.520s at 4196
| + Date: Sat Jan 10 09:40:56 2004 UTC at 4203
| + Segment UID: 0x16 0x69 0x29 0x3e 0x06 0x0c 0x03 0x28 0xa6 0xb0 0xa0 0x4f 0x47 0x52 0xd4 0x75 at 4214
|+ Segment tracks at 4233
|+ EbmlVoid (size: 1024) at 4398
|+ Cluster at 5425
| + Cluster timecode: 0.000s at 5432
| + Block group at 5435
...
|+ Cues at 569825
|+ Seek head at 569859

The problem comes from the EbmlVoid written just after the KaxTracks
element. I guess Mosu added this void element in recent versions, since
there were no problems before.

So what goes wrong inside VirtualDubMod because of this element, and
because of libebml (there is a workaround that could be used in the VDM
code, but I think that libebml is mostly guilty here :p)?
Everything is fine up to KaxTracks. Then I ask libebml to give me the
next element (in the loop processing KaxTracks). As I can't know
beforehand whether the next element will be inside KaxTracks or at an
upper level, I use the size of KaxTracks as the maximum size for the
next element to get (MaxDataSize), and tell it not to return dummy
elements (AllowDummyElt=false).
So the first problem arises: libebml finds the EbmlVoid element, but
this element's size is 1024 bytes, while the size of KaxTracks is around
150 bytes. Since an EbmlVoid can be found at any level, libebml thinks
it is inside KaxTracks (while actually it is at the same level as
KaxTracks), and of course libebml is unhappy to see that the size of
this element (1024) doesn't fit inside the maximum size I gave (~150
bytes, i.e. the size of KaxTracks).

So libebml thinks that this EbmlVoid is not a valid element, decides
not to return it to me (VDM), and keeps on searching for the next
element.
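
To make the first problem concrete, here is a minimal sketch of the kind
of size test described above. It is only a model: FoundElement and
acceptElement are hypothetical names, not libebml code. The exemptGlobal
flag models the fix suggested further down in this mail (don't test
multi-level elements against the maximum size).

```cpp
#include <cstdint>

// Hypothetical model of the size test; none of these names come from libebml.
struct FoundElement {
    bool isGlobal;           // true for elements allowed at any level, e.g. EbmlVoid
    std::uint64_t dataSize;  // coded data size of the element
};

// Returns true when the reader would accept the element.
// exemptGlobal=false models the behaviour described in this mail;
// exemptGlobal=true models the suggested fix.
bool acceptElement(const FoundElement& e, std::uint64_t maxDataSize,
                   bool exemptGlobal) {
    if (exemptGlobal && e.isGlobal) {
        return true;  // level unknown, so the parent's size proves nothing
    }
    return e.dataSize <= maxDataSize;
}
```

With the numbers from the dump, a 1024-byte EbmlVoid tested against a
maximum of about 150 bytes is rejected when exemptGlobal is false and
accepted when it is true.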

And here is where the second problem arises (yeah, all would have been
fine if there was no other problem :>). libebml searches for the next
element ... the problem is that an EbmlVoid generally contains only
0x00 bytes ;). When libebml thinks it has found an element (generally
when reading the bytes corresponding to the EbmlVoid coded size), it
then tries to get the size of the element ... and the 0x00 bytes make it
read 8 bytes from the file (note that the current libebml code doesn't
somehow 'flush' this buffer afterwards).
libebml then keeps on trying to get the next EBML element ...
Finally the code finds the KaxCluster :)
The problem is that the code has already buffered more than 8 bytes. In
other words, libebml has already read more than the EBML ID and the
coded size of the element, and the code doesn't seek back in the file.
So when you ask libebml again for the next element, it won't start
right inside the KaxCluster but a few bytes off, and thus you don't get
the first element inside the cluster (the first element generally being
the timecode).
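
As an illustration of why 0x00 bytes are so awkward here, this is a
small sketch of how an EBML coded size is decoded in general: the
position of the first set bit in the first byte gives the length (1 to 8
bytes). The function names are hypothetical and this is not the libebml
implementation. A 0x00 first byte carries no length marker at all, so a
reader that doesn't reject it early can end up consuming the full 8
bytes, as described above.

```cpp
#include <cstddef>
#include <cstdint>

// Length in bytes (1..8) announced by the first byte of an EBML coded size,
// or 0 when the byte carries no length marker (as with 0x00).
int codedSizeLength(std::uint8_t first) {
    std::uint8_t mask = 0x80;
    for (int len = 1; len <= 8; ++len, mask >>= 1) {
        if (first & mask) return len;
    }
    return 0;
}

// Decodes a coded size starting at buf; returns false if no valid size
// starts here or if fewer than the announced number of bytes are available.
bool decodeCodedSize(const std::uint8_t* buf, std::size_t avail,
                     std::uint64_t& value) {
    const int len = codedSizeLength(avail > 0 ? buf[0] : 0);
    if (len == 0 || static_cast<std::size_t>(len) > avail) return false;
    value = buf[0] & (0xFFu >> len);  // drop the length marker bit
    for (int i = 1; i < len; ++i) value = (value << 8) | buf[i];
    return true;
}
```

Assuming the EbmlVoid size is coded on two bytes (which the offsets in
the dump suggest: 1 ID byte + 2 size bytes + 1024 data bytes = 1027),
the bytes 0x44 0x00 decode to 1024, while a run of 0x00 bytes never
decodes to a valid size.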

To sum up, there are 2 problems:
1. with multi-level elements, when specifying a maximum size for the
element to find, if the size of the multi-level element (which could
actually belong to an upper level) is bigger than the specified size,
then libebml b0rks
2. when searching for the next EBML element, libebml sometimes buffers
too much data ahead and doesn't seek back in the file when it finds a
valid EBML element (so the very first bytes of this element are lost).


I guess that the size of multi-level elements shouldn't be tested
against the maximum size specified, since there is no way to know the
level of such elements.
I don't think that buffering too much data is a real problem for 2.
Indeed, this only happens when the code couldn't find a valid element
right ahead. But the code should seek back in the file if it buffered
too much data ;)


Best regards
Cyrius
