[Matroska-devel] muxed usf

Christian HJ Wiesner chris at matroska.org
Sat Jan 24 08:51:58 CET 2004


 Hi,

sorry to bring this up again guys, but i feel we have to do something 
here, or all the hard work invested in USF may be in vain. I heard 
through the grapevine that at least some of the guys making nice USF 
editors are planning to convert them to output SSA instead, as USF is 
still not usable in MKV or OGM.

Toff invested a lot of time in the USF specs, and i feel there is not 
much left to do before they are finished and USF is working in MKV. As 
it seems impossible to motivate somebody to implement USF muxing with 
EBML, i vote for muxing it the 'normal' way, meaning we put all the XML 
data into a matroska block, with a timestamp and a duration.
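
To make the 'normal' way concrete, here is a minimal sketch in Python of how a muxer could cut a USF file into one payload per subtitle. It assumes each <subtitle> carries "start" and "stop" attributes in hh:mm:ss.mmm form; the function names and the (timestamp, duration, payload) tuple are only illustrative, not an agreed block layout.

```python
import xml.etree.ElementTree as ET


def parse_time_ms(text):
    """Convert 'hh:mm:ss.mmm' to integer milliseconds (format is an assumption)."""
    hours, minutes, seconds = text.split(":")
    whole, _, frac = seconds.partition(".")
    millis = int((frac + "000")[:3])  # pad or truncate the fraction to 3 digits
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 1000 + millis


def usf_to_blocks(usf_xml):
    """Return one (timestamp_ms, duration_ms, xml_bytes) tuple per <subtitle>."""
    root = ET.fromstring(usf_xml)
    blocks = []
    for sub in root.iter("subtitle"):
        start = parse_time_ms(sub.get("start"))
        stop = parse_time_ms(sub.get("stop"))
        # The raw XML of the element becomes the block payload.
        payload = ET.tostring(sub, encoding="unicode").strip().encode("utf-8")
        blocks.append((start, stop - start, payload))
    return blocks
```

The header, styles and metadata are not handled here; they would have to travel somewhere else, for example in the codec private data.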

unmei, could you think about talking to jcsston about using his new MKV 
muxer plugin ( or maybe even spyder's modified MILK version ) and adding 
direct USF in MKV muxing to your great editor ? Once we have this done, 
it should be quite easy for Mosu to read those streams in mkvmerge, so 
we can mux them with video.

Please guys, let's get this moving, or we risk making ourselves look 
like complete idiots. USF has been around for more than a year now, 
and there are *5!* editors for it, and some of them should even be 
portable to Linux i heard !

Christian

unmei wrote:

> some random thoughts and approaches for muxing usf into matroska. 
> please keep in mind i have not much experience of what is hard to 
> implement or takes much cpu power. also i am not implementing mkv, so 
> i don't know the internals -> that's why it's just a bunch of thoughts 
> :) (RFC)
>
>------------------------------------------------------------------------
>
>unmei 2003-12-11
>
>usf in matroska (EBML)
>--------------------------
>TAGS
>*all tags are translated into EBML IDs
>**tags that are unavoidable get the lowest ID integers
>**tags occurring most often get as low an ID integer as possible
>**tags that require much processing get high IDs
>**as we can assume we have all the most basic and skeleton tags today, new tags fall into the category "seldom used" or "high processing" and it is perfectly OK to assign them a high ID, which is unavoidable anyway as we assigned the low IDs already.
>**also i think most new elements in USF will not be tags, but attributes
>**hardware players support a subset of IDs starting at the lowest and going up to some specific ID. When they encounter a higher ID, they know it's a rather special feature and can safely skip the entire entry, or ignore the ID and process only its content as if the high ID did not exist. The same applies to playback filters.
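
A rough sketch of the "supported up to some ID" idea, in Python. The tag order and the numeric IDs below are hypothetical, picked only for illustration; nothing here is an assigned EBML ID. A node is modelled as a (tag, children) tuple, and children are either strings or further nodes.

```python
# Hypothetical ordering: position in the list + 1 is the tag's ID.
TAG_ORDER = ["subtitle", "text", "br", "b", "i", "u", "font", "karaoke", "k"]
TAG_IDS = {name: index + 1 for index, name in enumerate(TAG_ORDER)}


def simplify(node, max_id, unwrap=True):
    """Reduce a tree to what a player supporting IDs <= max_id can handle.

    unwrap=True  : drop the unsupported tag but keep its content
    unwrap=False : skip the unsupported entry entirely
    """
    if isinstance(node, str):
        return [node]
    tag, children = node
    kept = []
    for child in children:
        kept.extend(simplify(child, max_id, unwrap))
    # Unknown tags are treated as "higher than anything supported".
    if TAG_IDS.get(tag, float("inf")) <= max_id:
        return [(tag, kept)]
    return kept if unwrap else []
```

With max_id=7 a player would keep <b> but not <karaoke>: either the karaoke text is shown as plain text (unwrap=True) or the whole karaoke entry is dropped (unwrap=False).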
>
>EMBED
>**Fonts embedded in XML-USF should be muxed into the MKV file as attachment files (and not stored in an <embed> in EBML-USF).
>**Pictures (embedded in XML-USF or external to XML-USF) should be inlined, base64 encoded. This may cause a picture to be stored multiple times, but it avoids the need to either keep all embeds loaded in RAM during the entire playback or seek to the embed during playback.
>**The RAM-cached method is possibly faster, and if the additional RAM used is no concern it would be optimal, since the embeds can be decoded before playback even starts (avoiding a CPU peak when encountering a picture). But embedding without caching (read: seeking to the embed section and decoding WHILE playback is in progress) is most likely going to kill playability.
>**with the first picture method, there is no <embed> in EBML-USF; with the second picture method we do need an <embed>. Consequently the question of which method to use for pictures must be decided before EBML-USF is implemented: since <embed> is a root-level tag, we need to know whether we need to assign an ID for it (assigning one now is safer than having to assign a higher ID later).
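
To show what the first picture method costs, a small Python sketch. The dict layout ("picture" holding a file name, "picture_b64" holding the inlined data) is made up for this example and is not part of USF.

```python
import base64


def inline_pictures(subtitles, files):
    """Replace picture references with inline base64 data.

    subtitles: list of dicts, optionally with a 'picture' file name
    files:     mapping of file name -> raw bytes
    """
    inlined = []
    for sub in subtitles:
        sub = dict(sub)  # work on a copy, leave the input untouched
        name = sub.pop("picture", None)
        if name is not None:
            # A picture used by N subtitles ends up stored N times.
            sub["picture_b64"] = base64.b64encode(files[name]).decode("ascii")
        inlined.append(sub)
    return inlined
```

Every subtitle then carries everything it needs, so the player neither caches nor seeks, at the price of repeating the data for each subtitle that shows the same picture.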
>
>OTHER POINTS
>*are subtitle streams laced (interleaved with audio and video streams)? I guess not... anyway EBML-USF should use the same method as used for other subtitle formats.
>*a muxer must sort the <subtitle> elements by ascending start time. USF allows any order, but this is surely not optimal for playback (u96 does order on save if you tell it to, but it's not a must).
>*XML-USF files containing multiple <subtitles> sections are treated like multiple subtitle files (each gets a stream, with metadata, styles... duplicated).
>*If the subtitle streams are laced, a different approach would be to interleave the <subtitles> by ascending start time as well; however, this way the playback filter must determine for each <subtitle> whether it belongs to a "stream" that has to be displayed -- eek. If subtitles are not laced, it's preferable to simply order "per-stream".
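
The "per-stream" ordering could look roughly like this in Python. It is only a sketch: it assumes <subtitles> sections sit directly under the root, that every <subtitle> has a zero-padded "start" attribute in hh:mm:ss.mmm form (so plain string comparison gives time order), and split_and_sort is a made-up name.

```python
import copy
import xml.etree.ElementTree as ET


def split_and_sort(usf_xml):
    """Return one root element per <subtitles> section, sorted by start time."""
    root = ET.fromstring(usf_xml)
    # Everything that is not a <subtitles> section (metadata, styles, ...)
    # gets duplicated into every resulting stream.
    shared = [child for child in root if child.tag != "subtitles"]
    streams = []
    for section in root.findall("subtitles"):
        stream = ET.Element(root.tag, dict(root.attrib))
        for child in shared:
            stream.append(copy.deepcopy(child))
        ordered = copy.deepcopy(section)
        entries = sorted(ordered.findall("subtitle"),
                         key=lambda sub: sub.get("start"))
        for entry in entries:
            ordered.remove(entry)
        ordered.extend(entries)  # re-append in ascending start order
        stream.append(ordered)
        streams.append(stream)
    return streams
```

Children of <subtitles> other than <subtitle> keep their place at the front of the section; only the <subtitle> entries are reordered.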
>
>
>PROPOSED TAG ID ORDER 
>top=low ID, bottom=high ID, 
>for shortness, some are on the same line: left is lower than right
>
>VERSION 1 : grouped
>
>USFSubtitles and its attributes -> "codec private", not assigning an ID
>
>subtitles
>subtitle
>text, br, b, i, u, font
>karaoke, k
>picture
>
>styles
>style
>fontstyle, position
>
>metadata
>title, language, languageext, date, comment
>author
>name, email, url, task
>
>effects
>effect
>keyframes
>keyframe
>
>embedded
>b64file
>
>shapes
>shape
>point, polygon, polyline, csg-union, csg-diff, csg-inter, bspline
>
>[new tags]
>
>
>
>VERSION 2 : by nesting level
>
>subtitles
>styles
>metadata
>effects
>embedded
>shapes
>
>subtitle
>style
>effect
>b64file
>shape
>
>text, karaoke, picture
>fontstyle, position
>title, language, languageext, date, comment, author
>keyframes
>
>br, b, i, u, font, k
>name, email, url, task
>keyframe
>
>point, polygon, polyline, csg-union, csg-diff, csg-inter, bspline
>
>[new tags]
>
>
>
>VERSION 3 : root-level, then frequency above everything
>
>subtitles, styles, metadata, effects, embedded, shapes
>
>subtitle
>text
>br, font, b, i, u
>karaoke, k
>style
>fontstyle, position
>
>picture
>b64file
>effect, keyframes, keyframe
>
>title, language, languageext, date, comment, author
>name, email, url, task
>
>shape
>point, polygon, polyline, csg-union, csg-diff, csg-inter, bspline
> 
>-----------------------------------
>personally i prefer version 3, this is the one where i really tried to make a tight order. Versions 1 and 2 are rather drafts to start with in case version 3 is widely rejected :)
>
>i've put the root-level on top in version 3 because i think it would be too confusing to encounter a root-level element whose id was 30 or higher. A parser should in any case know what the root-level elements mean in order to decide whether the content can be thrown away as a whole.
>
>don't be confused by embedded, b64file, point, polygon.... they are not in the specs ;)
>  
>
>------------------------------------------------------------------------
>