[Matroska-devel] PGS subtitle questions

John Stebbins stebbins at jetheaddev.com
Mon Nov 5 15:06:43 CET 2012

On 11/05/2012 02:38 PM, John Stebbins wrote:
> On 11/05/2012 01:53 PM, Moritz Bunkus wrote:
>> Hey,
>> On Mon, Nov 5, 2012 at 1:35 PM, John Stebbins <stebbins at jetheaddev.com> wrote:
>>> In order to not invalidate existing streams, you could make the new
>>> element (or codec private data) optional and only necessary when scaling
>>> of PGS by the playback software is desired.
>> Even that might make existing files invalid. Also it's dangerous; you
>> cannot extract that codec private information properly. Also no other
>> software expects such data.
> I'm surprised that the Matroska specification doesn't say something to
> the effect that unrecognized EBML IDs should be ignored.  You would
> think this would be required for an extensible format. I would hope the
> majority of players are smart enough to skip unrecognized data since the
> only alternative is to choke and die. But I've seen plenty of hardware
> players that do exactly that when they see something unexpected.  So I
> understand your point.
>>> I have no preference for whether it is a new element or codec private
>>> data.  But since there is already a format for vobsub that has the
>>> necessary information, would it be a good idea to reuse that format?
>> I don't think so, no. It's a text-based format. That's bad for at
>> least two reasons:
>> - it's easy to get the format wrong (is it "custom colors"? or
>> "custom_colors"? or "custom colours"?)
>> - it's difficult to parse (or at least it requires yet another parser
>> on the reader's side)
>> Another reason it's a bad idea (at least in my opinion) is that there
>> are tons of potential details inside those .idx files that might serve
>> no purpose with PGS subs: palette information, color, alignment,
>> fading... Note that I don't know much about PGS specs, so this might
>> be inaccurate.
>> Putting the information into Matroska elements makes them
>> a) official
>> b) easy to parse (they're simply two or more additional elements; I'm
>> thinking about keeping offset as well as size)
>> c) easier to create for existing tools, too, I guess
> I thought you might say this and I have no objections at all. 
> I would also like to suggest that whatever is added for this, it should
> encompass cropping of the original image as well.  This is something the
> current vobsub idx does not handle and results in improper scaling of
> vobsubs by some players since they scale when they should really just be
> repositioning the sub.  They assume that an image that went from 720x480
> to 720x386 was scaled rather than cropped to remove letterboxing.
> So what I envision is something that contains original width and height
> plus the amount cropped (top, bottom, left, and right) from the original
> image (i.e. measured in original image pixels).

I just did some checking to see if all this is really needed.  I figured
someone *must* have given this at least a little thought previously.  It
turns out there is a "video descriptor" required in the PGS subtitle
data that defines the width and height "of the associated HDMV stream". 
So my request for original width and height is unnecessary.
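
As a rough illustration of reading that descriptor, here is a minimal
Python sketch. It is not taken from any PGS specification text in this
thread. It assumes the commonly described layout in which each segment
is a 1-byte type, a 2-byte big-endian payload size, then the payload,
and in which the presentation composition segment (assumed type 0x16)
begins with the video width and height as two big-endian 16-bit
integers. The function name read_video_descriptor is hypothetical.

```python
import struct

# Assumed type code for the presentation composition segment (PCS).
PCS_SEGMENT_TYPE = 0x16


def read_video_descriptor(block: bytes):
    """Return (width, height) from the first PCS in a block, else None.

    Assumed layout per segment: type (1 byte), payload size (2 bytes,
    big-endian), payload. The PCS payload is assumed to start with the
    video width and height as big-endian 16-bit integers.
    """
    pos = 0
    while pos + 3 <= len(block):
        seg_type, seg_size = struct.unpack_from(">BH", block, pos)
        payload = pos + 3
        if (seg_type == PCS_SEGMENT_TYPE and seg_size >= 4
                and payload + 4 <= len(block)):
            return struct.unpack_from(">HH", block, payload)
        # Skip over this segment's payload to the next segment header.
        pos = payload + seg_size
    return None
```

A reader could call this on the raw subtitle block data and fall back to
the video track dimensions when it returns None.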

Now I'm rethinking this a little.  Cropping information would still be
helpful as a hint to a player about how to position the subtitles.
Also, original pixel aspect ratio would be helpful in cases where the
video has been anamorphically scaled during transcoding.  I looked and
found nothing about pixel aspects in the PGS data.
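
To make the cropping hint concrete, here is a small Python sketch of how
a player might reposition (rather than scale) a subtitle rectangle once
it knows the crop amounts. The Crop type and the reposition function are
hypothetical names, not part of Matroska or PGS, and the clamping rule
is only one possible policy for subtitles authored inside a removed
letterbox bar. Pixel aspect ratio is not handled here.

```python
from dataclasses import dataclass


@dataclass
class Crop:
    """Pixels removed from each edge, measured in original image pixels."""
    top: int = 0
    bottom: int = 0
    left: int = 0
    right: int = 0


def reposition(x, y, w, h, orig_w, orig_h, crop):
    """Map a subtitle rectangle's origin from original-frame coordinates
    to cropped-frame coordinates, without scaling the subtitle."""
    frame_w = orig_w - crop.left - crop.right
    frame_h = orig_h - crop.top - crop.bottom
    # Shift by the amount removed from the left and top edges.
    new_x = x - crop.left
    new_y = y - crop.top
    # Clamp so the rectangle stays inside the cropped frame where it fits.
    new_x = max(0, min(new_x, frame_w - w))
    new_y = max(0, min(new_y, frame_h - h))
    return new_x, new_y
```

For the 720x480 to 720x386 case mentioned earlier, with 47 lines removed
from the top and bottom, reposition(100, 400, 500, 40, 720, 480,
Crop(top=47, bottom=47)) gives (100, 346): the subtitle is moved up into
the visible frame and keeps its original size.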
