[Matroska-devel] Question about Test Suite

Vignesh Venkatasubramanian vigneshv at google.com
Fri Jul 31 19:20:37 CEST 2015


On Wed, Jul 29, 2015 at 11:05 PM, Steve Lhomme <slhomme at matroska.org> wrote:
> When encountering invalid data you should read until you find something
> valid for that level, reach the end of the current Master, or hit the
> start of an upper-level element.
>
> Also note that invalid data is different than unknown data. Unknown
> data would be valid EBML ID + Length but not known to your EBML
> semantic. You should ignore and skip such data.

Right, that's exactly my point. When you begin parsing at 451451, you
see 0xEA, which is a valid EBML ID (albeit unknown). So the parser
would naturally try to read the length of the unknown EBML element
from position 451452 until it finds a valid length. Correct? But
that's not what mkvinfo seems to do here. Somehow it skips 451451 and
parses the next EBML ID at 451452.
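
To make that expectation concrete, here is a minimal sketch of reading one
element header (ID followed by size) at a given offset. The function names
and the sample bytes are made up for illustration; they are not libebml's
API and not the bytes from the file under discussion. Unknown-size
(all-ones) lengths are not handled.

```python
def vint_length(first_byte: int, max_len: int) -> int:
    """Length in bytes signalled by the leading 1 bit, or 0 if none within max_len."""
    for length in range(1, max_len + 1):
        if first_byte & (0x80 >> (length - 1)):
            return length
    return 0


def read_element_header(buf: bytes, pos: int):
    """Return (element_id, size, header_len) for the element at pos, or None."""
    if pos >= len(buf):
        return None
    id_len = vint_length(buf[pos], 4)            # IDs are at most 4 bytes
    if id_len == 0 or pos + id_len >= len(buf):
        return None
    # The ID keeps its length marker bit.
    elem_id = int.from_bytes(buf[pos:pos + id_len], "big")
    size_pos = pos + id_len
    size_len = vint_length(buf[size_pos], 8)     # sizes are at most 8 bytes
    if size_len == 0 or size_pos + size_len > len(buf):
        return None
    raw = int.from_bytes(buf[size_pos:size_pos + size_len], "big")
    size = raw & ((1 << (7 * size_len)) - 1)     # strip the length marker bit
    return elem_id, size, id_len + size_len


# Hypothetical bytes: a one-byte ID 0xEA, then a one-byte size of 2.
sample = bytes([0xEA, 0x82, 0xAA, 0xBB])
header = read_element_header(sample, 0)
```

With these sample bytes the byte after 0xEA is consumed as the size, so a
parser following this logic would next look for an ID at offset 4, not at
offset 1.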

>
> On Mon, Jul 27, 2015 at 9:36 PM, Vignesh Venkatasubramanian
> <vigneshv at google.com> wrote:
>> On Mon, Jul 27, 2015 at 12:14 PM, wm4 <nfxjfg at googlemail.com> wrote:
>>> On Mon, 27 Jul 2015 09:58:40 -0700
>>> Vignesh Venkatasubramanian <vigneshv at google.com> wrote:
>>>
>>>> On Fri, Jul 24, 2015 at 12:00 AM, Moritz Bunkus <moritz at bunkus.org> wrote:
>>>> > Hey,
>>>> >
>>>> >> How does mkvinfo get to 451452 to begin reading the next element
>>>> >> instead?
>>>> >
>>>> > mkvinfo uses libebml's EbmlMaster::Read() function for reading a cluster
>>>> > en bloc and then outputs which elements libebml has found.
>>>> >
>>>> > EbmlMaster::Read() uses EbmlElement::FindNextElement() under the hood
>>>> > for finding the next ID. And that function reads as many bytes as
>>>> > needed for forming a valid ID.
>>>> >
>>>> > So if the byte at position 451451 happens to be 0 (which I haven't
>>>> > verified, mind you) then it will probably be skipped as an EBML ID's
>>>> > first byte is never 0. Similar reasoning applies to other values of
>>>> > the first byte, e.g. anything smaller than 0x10 would be invalid as
>>>> > EBML IDs can be at most four bytes long, and the position of the first
>>>> > bit set to 1 determines the size of the EBML ID.
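
The first-byte rule described above can be sketched as follows. The
function name is hypothetical and this is not libebml's code; it only
checks the length marker and does not reject reserved ID values.

```python
def ebml_id_length(first_byte: int) -> int:
    """Return the EBML ID length (1-4) implied by its first byte, or 0 if invalid."""
    for length in range(1, 5):
        # An ID of `length` bytes has its first set bit at position `length`
        # counted from the most significant bit.
        if first_byte & (0x80 >> (length - 1)):
            return length
    return 0  # 0x00-0x0F: no marker bit within the top four bits


for b in (0x00, 0x08, 0x1A, 0xEA):
    print(hex(b), ebml_id_length(b))
```

Under this rule 0x00 and 0x08 cannot start an ID, 0x1A starts a four-byte
ID, and 0xEA is a complete one-byte ID.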
>>>>
>>>> Thanks for the explanation. But the problem is, the byte at position
>>>> 451451 is 0xEA, which is a valid EBML ID, right? How does mkvinfo skip
>>>> just that byte? FYI, here are the 100 bytes starting from position
>>>> 451417: http://pastebin.com/jKxthKwF
>>>>
>>>> I ran the file through ffmpeg's matroska parser. That resyncs to the
>>>> next level 1 element as soon as it sees the invalid track number (i.e.
>>>> position 451418) by doing a byte-by-byte search. So it merely ignores
>>>> the value at 451451 as it is not a valid level 1 ID.
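
A byte-by-byte search of this kind can be sketched as below. This is an
illustration of the idea only, not ffmpeg's actual code; the function name
is hypothetical, and the set lists the four-byte IDs of the usual Matroska
level 1 elements.

```python
LEVEL1_IDS = {
    0x114D9B74,  # SeekHead
    0x1549A966,  # Info
    0x1654AE6B,  # Tracks
    0x1F43B675,  # Cluster
    0x1C53BB6B,  # Cues
    0x1941A469,  # Attachments
    0x1043A770,  # Chapters
    0x1254C367,  # Tags
}


def resync(buf: bytes, pos: int) -> int:
    """Return the offset of the first level 1 ID at or after pos, or -1 if none."""
    window = 0
    for i in range(pos, len(buf)):
        # Slide a 32-bit window over the data one byte at a time.
        window = ((window << 8) | buf[i]) & 0xFFFFFFFF
        if i - pos >= 3 and window in LEVEL1_IDS:
            return i - 3
    return -1


# Hypothetical data: three junk bytes (including 0xEA), then a Cluster ID.
data = bytes([0xEA, 0x00, 0x12]) + bytes.fromhex("1F43B675") + b"\x00"
offset = resync(data, 0)
```

In such a scan a lone 0xEA is simply passed over, because only the listed
four-byte IDs can end the search.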
>>>
>>> Yes, it probably tries "resyncing" by trying the next cluster. Some
>>> examples:
>>>
>>> http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/matroskadec.c;h=b54679811854ba7fb42225bf086d8b737d0f2d4a;hb=HEAD#l661
>>>
>>> https://github.com/mpv-player/mpv/blob/master/demux/ebml.c#L225
>>>
>>> http://git.1f0.de/gitweb?p=ffmpeg.git;a=blob;f=libavformat/MatroskaParser.c;h=094fc4f282562702ddafb8fc5765e6facd7ce28c;hb=8b3deb5221b265723fc356add5d06a066bc8e50e#l2482
>>>
>>
>> At what point should I give up and try re-syncing? As soon as I see
>> the invalid track number 87 (at position 451418)? It looks like that's
>> what ffmpeg does. Whereas mkvinfo does not re-sync to the next Cluster
>> at all. It somehow manages to show the other elements (albeit with
>> unknown/dummy IDs), which I think should be the intended behavior rather
>> than resync'ing to the next Cluster.
>>
>>> _______________________________________________
>>> Matroska-devel mailing list
>>> Matroska-devel at lists.matroska.org
>>> http://lists.matroska.org/cgi-bin/mailman/listinfo/matroska-devel
>>> Read Matroska-Devel on GMane: http://dir.gmane.org/gmane.comp.multimedia.matroska.devel
>>
>>
>>
>> --
>> Vignesh
>
>
>
> --
> Steve Lhomme
> Matroska association Chairman



-- 
Vignesh

