[Matroska-devel] S_DVBSUB

Dan Haddix dan6992 at hotmail.com
Fri Feb 18 23:19:41 CET 2011

> Date: Fri, 18 Feb 2011 14:08:52 +0100
> Subject: Re: [Matroska-devel] S_DVBSUB
> From: slhomme at matroska.org
> To: dan6992 at hotmail.com
> CC: matroska-devel at lists.matroska.org
> On Thu, Feb 17, 2011 at 10:26 PM, Dan Haddix <dan6992 at hotmail.com> wrote:
> >> On Wed, Feb 16, 2011 at 10:57 PM, Dan Haddix <dan6992 at hotmail.com> wrote:
> > The PID in TS is basically a track ID. Each TS packet contains a PID as part
> > of its header so that the demuxer knows which stream the packet belongs to.
> > The Program Map Table, aka PMT, is a special packet inserted several times
> > per second which contains a map of these PIDs and information about the
> > streams they point to. Just like MKV, a TS file can hold pretty much any
> > audio/video format, even if it's unknown to the demuxing application. It
> > does this by using a unique descriptor tag in the PMT and then putting any
> > additional information needed to properly parse the format into the
> > descriptor buffer, which is basically akin to the CodecPrivate buffer in
> > MKV. If the demuxer understands the descriptor tag then it parses the
> > associated packets, if not then it ignores them. DVB subtitles are stored in
> > a sort of unique way. If there are multiple subtitle tracks which share data
> > then they are stored in the same PID, but each demuxed data payload contains
> > another header, specific to the DVB subtitle format, which contains a page
> > ID value. This page ID value tells the decoder which sub stream the packet
> > belongs to and the decoder decides whether or not to display it based on the
> > user selection. If the subtitle streams do not share data then they are
> > simply stored as separate PIDs. In the vast majority of situations each
> > subtitle stream is stored in a unique PID and the data stored in the
> > descriptor buffer is mostly unnecessary. However on some occasions there
> > will be two tracks attached to a single PID. This typically happens when
> > there is a normal subtitle track, which only displays spoken words, and
> > another for the hearing impaired which also describes ambient noises like
> > phones ringing, dogs barking, etc... They do this to save space, since the
> > majority of the data is shared (i.e. the spoken words) and only a small
> > subset of the data is specific to the hearing impaired track. The decoder
> > knows which track is selected and displays all packets designated as
> > specific to that track or shared amongst all tracks, the rest are simply
> > ignored.
> Thanks for the detailed description. At least what I read before about
> the format and my interpretation wasn't wrong :)
> > Now just to be clear I'm not proposing we store any part of the TS packet,
> > or the PES packet, in the MKV container. I'm only suggesting that we store
> > the data payload of the PES packets, which is the DVB subtitle data, in the
> > MKV chunk. I then suggest that we store the data from the TS PMT descriptor
> > buffer, which describes any sub-streams, in the CodecPrivate portion of the
> > track. This way the playing app can simply pass the CodecPrivate portion of
> > the track along to the decoder so that it knows if this particular track
> > contains multiple sub-streams or not. The demuxer itself does not need to
> > know about, or care about, what is stored in the CodecPrivate portion of the
> > track. It is up to the decoder or playing app to parse that and display
> > the track selection to the user as necessary. And if it's missing or the
> > playing app fails to parse it then the decoder will still default to
> > displaying all common packets and simply ignore those intended for a
> > specific sub-stream.
> But you are missing the part when the user selects which stream to
> play. Maybe in the TS world it's less an issue as the streams
> constantly change. But this is not how Matroska and most other
> formats work. You read a header, one time, and you know all the streams you
> have and so you can tell the user which are possible.
> While supporting your proposal is fine for single stream PIDs, it is a
> lot more complicated when streams are combined. Just the startup would
> be changed a lot, you would not be able to tell how many tracks there
> are until you have parsed the CodecPrivate. This is not the case
> nowadays. Then you'd have to have special access to the codec to tell
> it to use one stream or the other, bypassing the usual track
> selection. It may work in some players, but I expect it to be close to
> impossible in DirectShow. Especially since filters have a tendency to
> have proprietary APIs for the various novelties they add and usually not
> made public to favor their own player. I don't think there is
> currently any standard way to select only a part of a stream directly
> in a codec (it's usually the job of the parser to do the data
> filtering).
> > Here's the deal... I work as a developer for VideoReDo, a relatively popular
> > video editing application which is specifically designed to edit TV
> > programs. I have already added support for DVB subtitles in MKV using the
> > method I've outlined and it works really well. I've also looked at the
> > source code for VLC and I believe it will be relatively simple to add
> > support for this method there as well. (I actually already wrote the code,
> > but I can't get VLC to build so I haven't been able to test it yet) So
> > basically if you go with my suggestion then there will be at least one app
> > to create these files and one to play them back immediately. Using the
> > TrackOperation method would require changes to the MKV muxers/demuxer in
> > both applications, which I'm not capable of making, and would also require
> > special processing of the DVB subtitle packets themselves during
> > reading/writing which would require additional work to handle in VideoReDo
> > and I'm not sure when/if I'd be able to add support.
> Yes, but having one reader and one writer doesn't prove it's possible
> to handle everywhere. How do you handle the stream selection in VLC
> when 2 sub-streams exist ? I'd be curious to know if what you propose
> would work in GStreamer or Perian and I have serious doubts about
> DirectShow. Of course a proprietary hack is always possible.
> > I know that the whole sub-stream portion of DVB subs is not congruent with
> > how things are normally done in MKV, but this is a unique case where you'll
> > be supporting an established format in a way that applications designed to
> > handle it will already be set up to understand. In fact I'd argue that using
> > the TrackOperation method is actually worse since it will require the
> > demuxer to recognize the format and recombine the sub-streams back into the
> > established DVB subtitle data format. Whereas doing it my way would put
> > that burden on the decoder, which is most likely already designed to handle
> > it.
> But it doesn't break any design of how players usually handle stream
> selection. It is easy on the demuxer side to recombine data to make a
> virtual track. In fact the outside world of the container doesn't even
> need to know the track is a virtual one. It's only internal cuisine.
> And of course that solution is not tied to a single codec (DVBSUB). So
> once you support it, it works for everything. If another combined
> codec comes, you don't have to support yet another codec oddity.
> As for reusing DVBSUB support that's already existing. Adding a fake
> payload at the front with a Page ID is trivial too. We already use
> something similar for header stripping (again, transparent outside of
> the container). Except in this case it doesn't need to be put inside
> the file. It's only specific to how the decoder works.
> -- 
> Steve Lhomme
> Matroska association Chairman

I understand that you want to express every "track" in the file at the container level so that programs can know which tracks are available by simply parsing the MKV header. However, I'd argue that the sub-streams in a DVB subtitle stream are not really separate "tracks". Each packet carries something more like a flag that tells the decoder to display or ignore it depending on a filter. If the decoder had no knowledge of the sub-streams, it would still simply display all the common frames and ignore all frames that belonged to a sub-stream.
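
To make the "flag plus filter" idea concrete, here is a minimal sketch of that filtering in Python. It assumes the segment layout as I understand it from ETSI EN 300 743 (sync byte 0x0F, 8-bit segment type, 16-bit page ID, 16-bit segment length, then the segment data), and it assumes the two leading PES data field bytes have already been stripped so the payload starts at the first segment. The function names are made up for illustration and are not taken from any existing decoder.

```python
import struct
from typing import Iterator, List, Tuple

SYNC_BYTE = 0x0F  # every subtitling segment starts with this byte

Segment = Tuple[int, int, bytes]  # (segment_type, page_id, segment_data)


def iter_segments(payload: bytes) -> Iterator[Segment]:
    """Walk the payload and yield one tuple per subtitling segment.

    Iteration stops at the first byte that is not a sync byte, which
    includes the 0xFF end-of-data marker if it is present.
    """
    pos = 0
    while pos + 6 <= len(payload) and payload[pos] == SYNC_BYTE:
        seg_type, page_id, length = struct.unpack_from(">BHH", payload, pos + 1)
        start = pos + 6
        yield seg_type, page_id, payload[start:start + length]
        pos = start + length


def select_segments(payload: bytes, composition_page_id: int,
                    ancillary_page_id: int) -> List[Segment]:
    """Keep only segments for the selected sub-stream or the shared page."""
    wanted = {composition_page_id, ancillary_page_id}
    return [seg for seg in iter_segments(payload) if seg[1] in wanted]
```

A decoder that knows nothing about the selection could pass the shared page ID for both arguments, in which case only the common segments survive the filter.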

Another problem I have with separating the packets into multiple MKV "tracks" is that the sub-streams have no meaning without the master stream. So if an application were to remove the master track, but not the sub-stream track, then the sub-stream track would basically be useless. There is a reason these sub-streams are coupled together into a single PID in the TS format: they are not really independent streams.
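
The coupling shows up in the PMT descriptor data itself. The sketch below parses a subtitling descriptor under the layout I understand from ETSI EN 300 468: tag 0x59, a length byte, then 8-byte entries holding a 3-byte language code, a subtitling type, a composition page ID and an ancillary page ID. The class and function names are hypothetical, and the sample values in the usage note are invented.

```python
import struct
from typing import List, NamedTuple

SUBTITLING_DESCRIPTOR_TAG = 0x59


class SubtitleEntry(NamedTuple):
    language: str
    subtitling_type: int
    composition_page_id: int  # pages specific to this sub-stream
    ancillary_page_id: int    # pages shared with other sub-streams


def parse_subtitling_descriptor(desc: bytes) -> List[SubtitleEntry]:
    """Parse a full descriptor (tag, length, body) into its entries."""
    if len(desc) < 2 or desc[0] != SUBTITLING_DESCRIPTOR_TAG:
        raise ValueError("not a subtitling descriptor")
    body = desc[2:2 + desc[1]]
    entries = []
    # Each entry is exactly 8 bytes; a trailing partial entry is ignored.
    for off in range(0, len(body) - 7, 8):
        lang, sub_type, comp, anc = struct.unpack_from(">3sBHH", body, off)
        entries.append(SubtitleEntry(lang.decode("latin-1"), sub_type, comp, anc))
    return entries
```

Two entries on one PID that report different composition page IDs but the same ancillary page ID would be the shared case described above: neither sub-stream is complete without the ancillary page.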

If you really insist on having separate tracks for every sub-stream then I suggest we store a copy of all common packets in both tracks, rather than using the TrackOperation method. This will create some redundant data, but at least each stream will be self-contained and can be added to or removed from the MKV file without affecting the other streams. Plus, doing this would require no special modification to the muxers/demuxers currently in use. It would only require the creating application to copy the common packets to all tracks and flip a few bits on the sub-stream packets so that they would then appear as if they were common packets for that track. To the reading application they would simply appear as separate PIDs with no sub-streams.
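
A rough sketch of what the creating application would do per output track, in Python. It assumes the same segment layout as ETSI EN 300 743 (sync byte 0x0F, segment type, 16-bit page ID, 16-bit length, data), a payload that starts at the first segment, and that "flipping a few bits" amounts to rewriting the page ID field so every kept segment ends up on one page. The function name and the choice of target page ID are illustrative assumptions, not what VideoReDo actually does.

```python
import struct

SYNC_BYTE = 0x0F  # every subtitling segment starts with this byte


def flatten_track(payload: bytes, composition_page_id: int,
                  ancillary_page_id: int, new_page_id: int) -> bytes:
    """Build a self-contained payload for one sub-stream.

    Segments on the sub-stream's own page or on the shared page are
    copied with their page ID rewritten to new_page_id; segments for
    other sub-streams are dropped. A trailing 0xFF marker is not copied.
    """
    wanted = {composition_page_id, ancillary_page_id}
    out = bytearray()
    pos = 0
    while pos + 6 <= len(payload) and payload[pos] == SYNC_BYTE:
        seg_type, page_id, length = struct.unpack_from(">BHH", payload, pos + 1)
        end = pos + 6 + length
        if end > len(payload):
            break  # truncated segment: stop rather than emit bad data
        if page_id in wanted:
            out.append(SYNC_BYTE)
            out += struct.pack(">BHH", seg_type, new_page_id, length)
            out += payload[pos + 6:end]
        pos = end
    return bytes(out)
```

Calling this once per sub-stream on the same input yields one payload per MKV track, each carrying its own copy of the shared segments, which is where the redundant data comes from.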
