[Matroska-devel] Hi, question about the MKV tags

Santiago Jimeno sjimeno at ya.com
Sat Feb 5 17:28:25 CET 2011


First, I should clarify a few things:
I am writing a tag editor for all existing tag systems, and Matroska tags
are only one of them.
The link I posted to MatroskaFileList contains only an extract (some classes)
of the whole editor.
I have no special interest in Matroska tags. I have the same interest in
them as in any other tag system (I have no preferences).

I wrote to explain, rightly or wrongly, what I proposed to improve tag
editing. But the decision belongs to the Matroska team, who are the ones who
may want greater success for the Tags. If Matroska Tags are successful I will
keep editing support in my program, whatever the rules are. If they are not
successful I will remove it. For your peace of mind, I have already commented
on the problems of FLAC and WMA-ASF. WMA-ASF writes tags and pictures in 5
different file blocks. A complete example of complexity.
None of my messages contained personal criticism, and my opinions were based
on data from my experience. I have no need to put up with personal, unargued
comments from someone like this Boris, who doesn't know me at all.
For that reason, if you have any interest in continuing to talk about this
topic I suggest that we switch to private messages.

As for your message: I have no problem whether Tags or Attachments are at the
front or at the end. But in practice they are in the middle of the file, and
this is what causes big empty spaces and other problems. So if they are
really to be at the front, it will be necessary to take them out of the
Segment and put them in the Header block to make them independent of the
Cluster blocks. But this would change the current structure of the container,
breaking backward compatibility or at least requiring important changes. For
that reason I insisted on putting them at the end (which does not modify the
container structure), and not out of any other interest. But in spite of the
flexibility, the best thing for both reading and writing is to always find
the blocks in the same place. If Tags stay in the middle of the file, it will
be necessary to give them appropriate padding, as with SeekHead. But what is
the appropriate padding for adding pictures in the middle of the Segment? One
megabyte or more?
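
To make the padding question concrete, here is a rough sketch of the decision
an editor faces when the Tags element grows. It is only an illustration: the
function name and the example sizes are hypothetical, and it assumes the
region reserved for Tags (the element plus any Void right behind it) is known
in bytes. The one format fact it relies on is that the smallest possible EBML
Void element is 2 bytes (a 1-byte ID, 0xEC, plus a 1-byte size).

```python
MIN_VOID = 2  # smallest EBML Void element: 1-byte ID (0xEC) + 1-byte size


def plan_tag_rewrite(reserved, new_size):
    """Decide where a rewritten Tags element can go.

    reserved -- bytes currently taken by the old Tags element plus the
                Void padding that directly follows it
    new_size -- bytes of the re-encoded Tags element

    Returns (action, void_bytes): void_bytes is the Void left behind in
    the middle of the file after the rewrite.
    """
    leftover = reserved - new_size
    if leftover == 0:
        return ("in_place", 0)          # exact fit, no padding left
    if leftover >= MIN_VOID:
        return ("in_place", leftover)   # remaining gap becomes a smaller Void
    # Either the new element is too big, or the gap is a single byte that
    # cannot be expressed as a Void element: move the element to the end
    # and turn the whole old region into Void.
    return ("move_to_end", reserved)


if __name__ == "__main__":
    one_mib = 1024 * 1024
    reserved = 2_000 + one_mib          # hypothetical: 2 kB of tags + 1 MiB Void
    for picture in (300_000, 1_500_000):
        print(picture, plan_tag_rewrite(reserved, 2_000 + picture))
```

With these made-up numbers a 300 kB picture still fits in place, while a
1.5 MB one forces a move to the end and leaves the whole reserved region as
Void, which is the situation I described for cover_art.mkv.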

There is no problem writing SeekHead if there is Void space behind it. All
tag systems require some kind of modification when being edited (block
sizes, SeekHead or whatever they need), and in some cases changing a single
word means writing sizes in multiple places. But anyway, writing SeekHead is
not complex. I change its content dynamically during editing. Then, when
finishing, it's only necessary to write the Seek block in its place (if there
is Void space as in mkvmerge, or even a smaller Void size).
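
As a sketch of what "writing the Seek block in its place" can look like: the
code below rebuilds a SeekHead in memory and fills the rest of the reserved
region with a Void element, so the region can be overwritten without moving
anything else. This is not the code from my editor. The element IDs are the
ones from the Matroska specification; the helper names, the positions and the
4096-byte reserved size are assumptions chosen for the example.

```python
# Element IDs from the Matroska specification.
SEEKHEAD_ID = bytes.fromhex("114D9B74")
SEEK_ID = bytes.fromhex("4DBB")
SEEKID_ID = bytes.fromhex("53AB")
SEEKPOSITION_ID = bytes.fromhex("53AC")
VOID_ID = bytes.fromhex("EC")
TAGS_ID = bytes.fromhex("1254C367")
ATTACHMENTS_ID = bytes.fromhex("1941A469")


def encode_size(value, length=None):
    """Encode an EBML data size; the all-ones pattern is reserved (unknown)."""
    if length is None:
        length = 1
        while length < 8 and value > (1 << (7 * length)) - 2:
            length += 1
    if not 1 <= length <= 8 or value > (1 << (7 * length)) - 2:
        raise ValueError("size does not fit in the requested length")
    return ((1 << (7 * length)) | value).to_bytes(length, "big")


def element(element_id, payload):
    return element_id + encode_size(len(payload)) + payload


def encode_uint(value):
    return value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")


def build_seekhead(entries):
    """entries: (element ID, position relative to the Segment data start)."""
    seeks = b"".join(
        element(SEEK_ID,
                element(SEEKID_ID, eid)
                + element(SEEKPOSITION_ID, encode_uint(pos)))
        for eid, pos in entries)
    return element(SEEKHEAD_ID, seeks)


def make_void(total):
    """Build a Void element occupying exactly `total` bytes (total >= 2)."""
    for length in range(1, 9):
        payload = total - 1 - length
        if 0 <= payload <= (1 << (7 * length)) - 2:
            return VOID_ID + encode_size(payload, length) + bytes(payload)
    raise ValueError("cannot build a Void element of %d bytes" % total)


def seekhead_in_place(entries, reserved):
    """Return exactly `reserved` bytes: the new SeekHead plus Void padding."""
    head = build_seekhead(entries)
    gap = reserved - len(head)
    if gap == 0:
        return head
    if gap < 2:
        raise ValueError("reserved space too small for SeekHead plus Void")
    return head + make_void(gap)


if __name__ == "__main__":
    # Hypothetical positions after moving Tags and Attachments.
    entries = [(TAGS_ID, 5000), (ATTACHMENTS_ID, 4_000_000)]
    region = seekhead_in_place(entries, 4096)
    print(len(region), region[:8].hex())
```

The returned bytes have the same length as the reserved region, so they can
be written over the old SeekHead and its Void with a single seek and write.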

As for LYRICS, there are two possibilities. One is to show them in real time,
which would be a subtitle in a player. The other, since my program is not a
player, is the one I am considering: adding them as a TagString in a
SimpleTag with TargetType = 30 (track).
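
A small sketch of what such a tag could look like, written as XML. Two things
here are assumptions, not settled facts from this thread: the XML element
names (Tags/Tag/Targets/Simple/Name/String) follow the tag file layout used
by the mkvmerge tools as I understand it, and LYRICS is only the tag name I
am proposing, not an official one. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def lyrics_tag_xml(lyrics, target_type_value=30):
    """Build a tag document holding the lyrics as one string tag.

    target_type_value=30 is the track-level target discussed above.
    Element names are assumed to match the mkvmerge XML tag layout.
    """
    tags = ET.Element("Tags")
    tag = ET.SubElement(tags, "Tag")
    targets = ET.SubElement(tag, "Targets")
    ET.SubElement(targets, "TargetTypeValue").text = str(target_type_value)
    simple = ET.SubElement(tag, "Simple")
    ET.SubElement(simple, "Name").text = "LYRICS"   # proposed name only
    ET.SubElement(simple, "String").text = lyrics
    return ET.tostring(tags, encoding="unicode")


if __name__ == "__main__":
    print(lyrics_tag_xml("First line\nSecond line"))
```

Stored this way the lyrics are plain text with no timing, which is enough for
an editor but not for a player that wants to show them in real time.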

Regards. Santiago

----- Original Message ----- 
From: "Steve Lhomme" <slhomme at matroska.org>
To: "Discussion about the current and future development of Matroska" 
<matroska-devel at lists.matroska.org>
Sent: Saturday, February 05, 2011 1:59 PM
Subject: Re: [Matroska-devel] Hi, question about the MKV tags


On Thu, Feb 3, 2011 at 1:34 PM, Santiago Jimeno <sjimeno at ya.com> wrote:
> If we want Matroska Tags to be successful (so far they have had very little
> success), we should think about all aspects: file reading, tag writing when
> the file is created, and ease of editing. And we should give practical
> solutions to all of them without favoring any, since the end users will be
> the ones who impose their preferences. But at present nobody takes care of
> the last two aspects.

This is precisely why the case of Tags is given particular care in
the "ordering guideline". In an ideal world, tags should always be at
the front. After all ID3v2 is always at the front and I haven't seen
many people complain. Sometimes when you edit such files you have to
rewrite the whole file. This is OK for audio files, but terrible for
video files. So at least we provide the option, emphasizing that
whenever possible they should be at the front, because it's easier for
streaming or scanning files.

> To take care of the last two aspects I offered these three proposals: 1-
> Appropriate Void space behind SeekHead. 2- Tags always at the end. 3-
> Possibility to include pictures inside Tags, besides the possibility that
> they are also included in Attachments, depending on the file target or file
> use. These three proposals don't modify the global structure of the
> container. Only the Order Guidelines on block locations are modified. How
> can this be done? It's very easy, and I see from your answers that you are
> already beginning to think about it.

I will add #1 soon to the ordering guideline.
#2 is not a good option
#3 still thinking about it.

> If they stay as they are now, I have already described the current
> situation: they will always end up at the end, with big, empty and useless
> spaces in the middle. Respecting the container rules as far as possible, I
> have no choice but to always put Tags and Pictures at the end when I write
> editing code. The best thing would then be for the Order Guidelines to
> specify definitive positions for both blocks from the moment of file
> creation, and if they formed a single block at the end it would be better
> still.

You have a good (negative) point here. Even when they are at the end,
if attachments and tags are modified at the same time, you still need
to update the positions in the SeekHead at the front of the file. So
whether they are at the front or the back, you still need the ability
to edit the SeekHead, which is the most "complex" part when having to
move tags from the front to the back.

> Whatever the team decides, I think it's necessary to think thoroughly about
> the above and about the Order Guidelines locations. Maybe other Matroska
> programmers could also give an opinion on this.

I agree.

> A less important question: where do we put LYRICS? Is it possible to add an
> official SimpleTag with this TagName?

Lyrics should be a subtitle track IMO. But there should be a way to
tell that a particular subtitle track is for lyrics (or karaoke). I'll
have to check the specs if there's already something like that.

Steve

> Regards. Santiago
>
> ----- Original Message ----- From: "Steve Lhomme" <slhomme at matroska.org>
> To: "Discussion about the current and future development of Matroska"
> <matroska-devel at lists.matroska.org>
> Sent: Thursday, February 03, 2011 7:27 AM
> Subject: Re: [Matroska-devel] Hi, question about the MKV tags
>
>
> On Wed, Feb 2, 2011 at 11:23 PM, Santiago Jimeno <sjimeno at ya.com> wrote:
>>
>> Just one clarification:
>>
>> Steve said: "In the system you propose, adding one byte in the existing
>> tags would make a big empty void in the file"
>> Not if they are already at the end of the file: one block is substituted
>> for another and the file only grows by 1 byte, without empty spaces,
>> because there is no other block at the end (a simple file append).
>
> Yes, but again: your reason to put it at the end is for editing. My
> reason to put it at the front is for better reading. For example
> seeking in HTTP is slow. It's much better if everything is at the
> front.
>
>> Regarding the picture structure in Attachments, anyone can see that it's
>> almost identical to ID3 APIC. Here is a comparison of both structures.
>>
>> ID3            MATROSKA
>> ----           FileName
>> Mime type      FileMimeType
>> Picture type   ----
>> Description    FileDescription
>> Picture data   FileData
>>
>> Lastly, as I said in my message, I only make suggestions; the decisions
>> belong to you. For now I have to work according to the current rules of
>> the container. I will try to finish the editor when I can (I work on it in
>> my free time).
>>
>> Keep me informed of your opinion if you are able to install the program
>> and test it
>
> Not yet. I will try but I don't know when.
>
>> Regards. Santiago
>>
>> ----- Original Message ----- From: "Steve Lhomme" <slhomme at matroska.org>
>> To: "Discussion about the current and future development of Matroska"
>> <matroska-devel at lists.matroska.org>
>> Sent: Wednesday, February 02, 2011 9:09 PM
>> Subject: Re: [Matroska-devel] Hi, question about the MKV tags
>>
>>
>> On Wed, Feb 2, 2011 at 3:27 AM, Santiago Jimeno <sjimeno at ya.com> wrote:
>>>
>>> Thank you for accepting my suggestion regarding the Void space behind
>>> SeekHead.
>>>
>>> Regarding the installation error, see this post:
>>> http://blog.colinmackay.net/archive/2007/06/21/36.aspx
>>>
>>> As for cover art, including it in audio files was, as you know, an idea
>>> of ID3, which integrated it into the Tags block as binary data. All tag
>>> systems adapted this idea as best they could, generally transcribing the
>>> same ID3 APIC format almost literally.
>>
>> But none of them had proper file attachment support. That's why they
>> used this hack.
>>
>>> All the current systems (ID3, iTunes-MP4, WMA-ASF, APE, VORBIS) integrate
>>> cover art inside the Tags block. This is fundamentally for efficiency
>>> when writing and editing. There are only two exceptions: FLAC and
>>> MATROSKA. Sometimes WMA, if there is a large picture, puts it outside the
>>> Tags block. But this is due to a space problem caused by a design error
>>> (there is only an Int16 to specify Data Length, so the maximum size for
>>> each picture is 65535 bytes).
>>
>> I understand that cover art may be considered as a metadata.
>>
>>> FLAC works with the Vorbis Comments tag system. Previously it included
>>> cover art in a separate PICTURE block, since Vorbis had not decided where
>>> to include it. When Vorbis decided to include cover art, it put it inside
>>> the Tags block. Now FLAC can write cover art in two places: in the old
>>> place and inside Vorbis Comments (??). Of course, FLAC won't change this
>>> either, as Martin Leese of the FLAC team told me.
>>>
>>> Steve said "I still don't know what's wrong with attachments as they are
>>> now."
>>> There is nothing wrong with Attachments as they are. It's a good idea
>>> for streaming. But for all other cases it's more efficient and more
>>> rational to include them in the Tags block. If all tag systems do it this
>>> way, there must be a reason. I can only add my suggestions from my
>>> experience working with tags. Maybe if you try to write code to edit tags
>>> and cover art you will agree with me. As for block locations, it's also
>>> inefficient to
>>
>> The thing is, cover art is usually for the entire segment, whereas in
>> Matroska tags are often not for the whole segment. That's why I think
>> it's good that it's easy and straightforward to be able to read cover
>> art without having to support all the tag system. Attachments are flat
>> and very easy to parse. Tags have different levels and meanings
>> depending on some values. So anything could be easily misunderstood as
>> cover art if not implemented properly.
>>
>>> locate them in the middle of the file, knowing that when they are edited
>>> they will be placed at the end of the file. When Attachments and Tags in
>>> the cover_art.mkv file are edited and moved to the end of the file, this
>>> leaves an excessive Void space of approx. 416000 bytes (very possibly
>>> useless). On the other hand, the normal case is not that the pictures are
>>> already inserted and we don't need to change them; the normal case is
>>> that there are none and we have to add them.
>>
>> That's actually a good point to separate the cover art in attachment
>> and tags that are usually edited. In the system you propose, adding
>> one byte in the existing tags would make a big empty void in the file.
>> That's ugly. Like I said, cover art are rarely edited. They are either
>> added or removed, but rarely modified. And as the "ordering guideline"
>> suggests, there are many good reasons why all these metadata should be
>> at the front whenever possible.
>>
>>> Steve said: "Adding binary in tags should be avoided, because it's not
>>> implied how to interpret it. For pictures it could be JPEG, PNG, SVG,
>>> etc. Attachments have a MIME type for that, tags don't. You can notice in
>>> the specs the very few formats with binary values are those with only one
>>> possible format."
>>> The ID3 APIC format, the same one also adopted by Matroska in
>>> Attachments, includes all the data needed to interpret the binary data
>>> that follows, and it can perfectly well be put inside the Tags block.
>>> When all tags are declared as UTF-8 strings, as in the Vorbis case, what
>>> they have done with cover art is to encode the whole package as Base64,
>>> which is compatible with UTF-8.
>>
>> We're certainly not going to use Base64 when our format is good at
>> handling binary data. ID3 is badly designed and basically a hack over
>> something missing in MP3 and then other formats. There's no reason to
>> copy its flaws.
>>
>>> Lastly, as in the FLAC case, I can only make suggestions. The decisions
>>> and the future of Tags depend on the Matroska team, if one wants the Tags
>>> system to be something more than a project. I'm collaborating to improve
>>> the development of Matroska Tags with my 'almost' completed editor (I
>>> think it's the only one at the moment). But as you say at the beginning
>>> of your message: "So there is just one system to deal with everything"
>>
>> Well, there have been many, many "only one" moments in the past about tag
>> changes. I think we have reached a good consensus on everything that
>> is possible. Maybe adding a MIME type next to binary data in tags
>> could be a good move. But the recommended/default cover art should be
>> in attachments in my opinion. That also makes it easier to make tools
>> to scan for "files" inside matroska files if they are all in the same
>> place.
>>
>> --
>> Steve Lhomme
>> Matroska association Chairman
>> _______________________________________________
>> Matroska-devel mailing list
>> Matroska-devel at lists.matroska.org
>> http://lists.matroska.org/cgi-bin/mailman/listinfo/matroska-devel
>> Read Matroska-Devel on GMane:
>> http://dir.gmane.org/gmane.comp.multimedia.matroska.devel
>>
>>
>
>
>
> --
> Steve Lhomme
> Matroska association Chairman



-- 
Steve Lhomme
Matroska association Chairman