[Matroska-devel] Question about Test Suite

Steve Lhomme slhomme at matroska.org
Thu Jul 30 08:05:37 CEST 2015


When encountering invalid data you should keep reading until you find
something valid for that level, reach the end of the current Master, or
hit the start of an upper-level element.
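
As a rough illustration of that rule (not how libebml or any other parser
actually does it), here is a minimal Python sketch. The names read_id,
resync, level_ids and upper_ids are hypothetical, and the sketch assumes
Matroska's limit of four bytes per EBML ID. It scans byte by byte and
stops on the first of the three conditions above.

```python
def read_id(buf, pos, end):
    """Return (id_value, id_length) for the EBML ID at pos, or None.

    Assumption: IDs are at most 4 bytes, so a first byte below 0x10
    (including 0x00) cannot start an ID.
    """
    if pos >= end:
        return None
    first = buf[pos]
    if first < 0x10:
        return None
    # The position of the first set bit gives the ID length.
    length = 1
    mask = 0x80
    while not first & mask:
        mask >>= 1
        length += 1
    if pos + length > end:
        return None
    # The length marker bit is kept as part of the ID value.
    return int.from_bytes(buf[pos:pos + length], "big"), length


def resync(buf, pos, master_end, level_ids, upper_ids):
    """Scan forward from pos until one of three things happens.

    Returns ("element", offset) for an ID valid at the current level,
    ("upper", offset) for the start of an upper-level element, or
    ("end", master_end) when the current Master is exhausted.
    """
    while pos < master_end:
        parsed = read_id(buf, pos, master_end)
        if parsed is not None:
            value, _ = parsed
            if value in level_ids:
                return ("element", pos)
            if value in upper_ids:
                return ("upper", pos)
        pos += 1
    return ("end", master_end)
```

A byte-by-byte scan like this can match an ID by accident inside payload
data, so a real parser would likely also check that the length following
the candidate ID is plausible before trusting it.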

Also note that invalid data is different from unknown data. Unknown
data has a valid EBML ID + Length but is not known to your EBML
semantic. You should ignore and skip such data.
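
To make the invalid/unknown distinction concrete, here is a small Python
sketch. The helper names read_vint and classify and the known_ids set are
made up for illustration, and it assumes at most 4 bytes for an ID and 8
bytes for a Length. It ignores the reserved "unknown size" Length values.
An element that parses as ID + Length and fits inside its parent is
skipped when the ID is not recognized. Anything else is treated as
invalid.

```python
def read_vint(buf, pos, max_len, keep_marker):
    """Read an EBML variable-length integer; return (value, length) or None."""
    if pos >= len(buf) or buf[pos] == 0:
        return None
    first = buf[pos]
    # Count leading zero bits of the first byte to get the total length.
    length = 1
    mask = 0x80
    while not first & mask:
        mask >>= 1
        length += 1
    if length > max_len or pos + length > len(buf):
        return None
    value = int.from_bytes(buf[pos:pos + length], "big")
    if not keep_marker:
        # Lengths drop the marker bit; IDs keep it.
        value &= (1 << (7 * length)) - 1
    return value, length


def classify(buf, pos, end, known_ids):
    """Classify the element at pos as "known", "unknown" or "invalid".

    For "known" and "unknown" the second item is the offset just past
    the element, i.e. where to continue reading. For "invalid" it is
    pos itself, where a resync would have to start.
    """
    id_part = read_vint(buf, pos, 4, keep_marker=True)
    if id_part is None:
        return ("invalid", pos)
    elem_id, id_len = id_part
    size_part = read_vint(buf, pos + id_len, 8, keep_marker=False)
    if size_part is None:
        return ("invalid", pos)
    size, size_len = size_part
    next_pos = pos + id_len + size_len + size
    if next_pos > end:
        # The element would run past the end of its parent Master.
        return ("invalid", pos)
    if elem_id in known_ids:
        return ("known", next_pos)
    return ("unknown", next_pos)
```

With this split, an unrecognized but well-formed element costs nothing
more than a seek to next_pos, while only the "invalid" case triggers the
search for the next valid ID.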

On Mon, Jul 27, 2015 at 9:36 PM, Vignesh Venkatasubramanian
<vigneshv at google.com> wrote:
> On Mon, Jul 27, 2015 at 12:14 PM, wm4 <nfxjfg at googlemail.com> wrote:
>> On Mon, 27 Jul 2015 09:58:40 -0700
>> Vignesh Venkatasubramanian <vigneshv at google.com> wrote:
>>
>>> On Fri, Jul 24, 2015 at 12:00 AM, Moritz Bunkus <moritz at bunkus.org> wrote:
>>> > Hey,
>>> >
>>> >> How does mkvinfo get to 451452 to begin reading the next element
>>> >> instead?
>>> >
>>> > mkvinfo uses libebml's EbmlMaster::Read() function for reading a cluster
>>> > en bloc and then outputs which elements libebml has found.
>>> >
>>> > EbmlMaster::Read() uses EbmlElement::FindNextElement() under the hood
>>> > for finding the next ID. And that function reads as many bytes as
>>> > needed for forming a valid ID.
>>> >
>>> > So if the byte at position 451451 happens to be 0 (which I haven't
>>> > verified, mind you) then it will probably be skipped as an EBML ID's
>>> > first byte is never 0. Similar reasoning for other values for the
>>> > first byte, e.g. anything smaller than 0x10 would be invalid as EBML
>>> > IDs can be at most four bytes long, and the position of the first bit
>>> > set to 1 determines the size of the EBML ID.
>>>
>>> Thanks for the explanation. But the problem is, the byte at position
>>> 451451 is 0xEA which is a valid EBML ID right? How does mkvinfo skip
>>> just that byte? FYI, here are the 100 bytes starting from position
>>> 451417: http://pastebin.com/jKxthKwF
>>>
>>> I ran the file through ffmpeg's matroska parser. That resyncs to the
>>> next level 1 element as soon as it sees the invalid track number (i.e.
>>> position 451518) by doing a byte-by-byte search. So it merely ignores
>>> the value at 451451 as it is not a valid level 1 id.
>>
>> Yes, it probably tries "resyncing" by trying the next cluster. Some
>> examples:
>>
>> http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/matroskadec.c;h=b54679811854ba7fb42225bf086d8b737d0f2d4a;hb=HEAD#l661
>>
>> https://github.com/mpv-player/mpv/blob/master/demux/ebml.c#L225
>>
>> http://git.1f0.de/gitweb?p=ffmpeg.git;a=blob;f=libavformat/MatroskaParser.c;h=094fc4f282562702ddafb8fc5765e6facd7ce28c;hb=8b3deb5221b265723fc356add5d06a066bc8e50e#l2482
>>
>
> At what point should I give up and try re-syncing? As soon as I see
> the invalid track number 87 (at position 451418)? It looks like that's
> what ffmpeg does, whereas mkvinfo does not re-sync to the next Cluster
> at all. It somehow manages to show the other elements (albeit as
> unknown/dummy IDs), which I think should be the intended behavior rather
> than re-syncing to the next Cluster.
>
>> _______________________________________________
>> Matroska-devel mailing list
>> Matroska-devel at lists.matroska.org
>> http://lists.matroska.org/cgi-bin/mailman/listinfo/matroska-devel
>> Read Matroska-Devel on GMane: http://dir.gmane.org/gmane.comp.multimedia.matroska.devel
>
>
>
> --
> Vignesh



-- 
Steve Lhomme
Matroska association Chairman

