[Matroska-devel] Storage of WebVTT subtitles in Matroska

Moritz Bunkus via Matroska-devel matroska-devel at lists.matroska.org
Fri Apr 1 20:19:58 CEST 2016


Hey,

> Why not put the identifier and style inside the block addition keeping
> the same formalism?
…
> or even better (imo) swap line 1 and 2 since what interests the player
> is the style and not the id and the notes.
…
> This would really allow a player to interpret the whole WebVTT stuff
> as srt without too much effort and then add in a second time the
> support of style

Valid points.

Another update to the proposal:

--start-----------------------------------------------------
(A) CodecID: S_TEXT/WEBVTT

(B) Matroska CodecPrivate: This element contains all global blocks
    before the first subtitle entry. This starts at the "WEBVTT" file
    identification marker but excludes the optional byte order mark.

(C) Non-global WebVTT blocks (e.g. "NOTE") before a WebVTT Cue Text are
    stored in Matroska's BlockAddition element together with the
    Matroska Block containing the WebVTT Cue Text these blocks precede
    (see below for the actual format).

(D) Matroska Blocks: Each WebVTT Cue Text is stored directly in the
    Matroska Block.

    A muxer must change all WebVTT Cue Timestamps present within the Cue
    Text to be relative to the Matroska Block's timestamp.

    The Cue's start timestamp is used as the Matroska Block's timestamp.

    The difference between the Cue's end timestamp and its start
    timestamp is used as the Matroska Block's duration.

(E) Matroska BlockAdditions: each Matroska Block may be accompanied by
    one BlockAdditions element. Its format is as follows:

    The first line contains the WebVTT Cue Text's optional Cue Settings
    List followed by one line feed character (U+000A). The Cue
    Settings List may be empty in which case the line consists of the
    line feed character only.

    The second line contains the WebVTT Cue Text's optional Cue
    Identifier followed by one line feed character (U+000A). The line
    may be empty indicating that there was no Cue Identifier in the
    source file in which case the line consists of the line feed
    character only.

    The third and all following lines contain all WebVTT Comment Blocks
    that precede the current WebVTT Cue Block. These may be absent.

    If there is no Matroska BlockAddition element stored together with
    the Matroska Block then all three components (Cue Settings List, Cue
    Identifier, Cue Comments) must be assumed to be absent.
--end-------------------------------------------------------
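
To make the BlockAddition layout in (E) concrete, here is a minimal Python
sketch of how a muxer might serialize the three components and how a demuxer
might split them again. The function names build_block_addition and
parse_block_addition are hypothetical, and the sketch assumes the payload is
UTF-8 text; neither is part of the proposal.

```python
from typing import Optional, Tuple


def build_block_addition(settings: str = "", identifier: str = "",
                         comments: str = "") -> Optional[bytes]:
    """Serialize the three components as described in (E).

    Returns None when all three are absent, meaning that no BlockAddition
    needs to be written for this Block.
    """
    if not (settings or identifier or comments):
        return None
    # Line 1: Cue Settings List + LF, line 2: Cue Identifier + LF,
    # line 3 and following: the preceding comment blocks (may be empty).
    return f"{settings}\n{identifier}\n{comments}".encode("utf-8")


def parse_block_addition(payload: Optional[bytes]) -> Tuple[str, str, str]:
    """Split a payload into (settings, identifier, comments).

    A missing BlockAddition (None) means all three components are absent.
    """
    if payload is None:
        return ("", "", "")
    parts = payload.decode("utf-8").split("\n", 2)
    # Assumption: tolerate payloads with fewer than two line feeds.
    parts += [""] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])
```

For a Cue that only has the identifier "hello", build_block_addition returns
an empty first line followed by the identifier line.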

Rationale for the changes:

(A) Consistency: most text subtitle formats in Matroska use S_TEXT/….

(B) Again keeping as much data as possible. In WebVTT the file signature
    may be followed by additional data. So let's just keep that data
    intact. It doesn't cost much, and a demuxer could feed CodecPrivate
    directly into a WebVTT parser.

(C), (D) and (E) have been changed according to Denis' proposal to store
the non-text components in BlockAdditions elements.

Example. WebVTT source file:

--start-----------------------------------------------------
WEBVTT with text after the signature

STYLE
::cue {
  background-image: linear-gradient(to bottom, dimgray, lightgray);
  color: papayawhip;
}
/* Style blocks cannot use blank lines nor "dash dash greater than" */

NOTE comment blocks can be used between style blocks.

STYLE
::cue(b) {
  color: peachpuff;
}

REGION
id:bill
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%
scroll:up

NOTE
Notes always span a whole block and can cover multiple
lines. Like this one.
An empty line ends the block.

hello
00:00:00.000 --> 00:00:10.000
Example entry 1: Hello <b>world</b>.

NOTE style blocks cannot appear after the first cue.

00:00:25.000 --> 00:00:35.000
Example entry 2: Another entry.
This one has multiple lines.

00:01:03.000 --> 00:01:06.500 position:90% align:right size:35%
Example entry 3: That stuff to the right of the timestamps are cue settings.

00:03:10.000 --> 00:03:20.000
Example entry 4: Entries can even include timestamps.
For example:<00:03:15.000>This becomes visible five seconds
after the first part.
--end-------------------------------------------------------

Resulting CodecPrivate element:

--start-----------------------------------------------------
WEBVTT with text after the signature

STYLE
::cue {
  background-image: linear-gradient(to bottom, dimgray, lightgray);
  color: papayawhip;
}
/* Style blocks cannot use blank lines nor "dash dash greater than" */

NOTE comment blocks can be used between style blocks.

STYLE
::cue(b) {
  color: peachpuff;
}

REGION
id:bill
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%
scroll:up

NOTE
Notes always span a whole block and can cover multiple
lines. Like this one.
An empty line ends the block.
--end-------------------------------------------------------
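
A muxer could derive this CodecPrivate content roughly as in the following
Python sketch. The function name extract_codec_private is hypothetical. The
sketch assumes that the first block containing "-->" is the first cue, and it
normalizes line endings and the blank lines between blocks, which a real
muxer may prefer to keep byte for byte.

```python
import re


def extract_codec_private(vtt_text: str) -> str:
    """Return everything before the first cue block, without the BOM."""
    # (B) excludes the optional byte order mark.
    if vtt_text.startswith("\ufeff"):
        vtt_text = vtt_text[1:]
    # Assumption: normalize CRLF and CR line endings to LF.
    text = vtt_text.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    # Blocks are separated by one or more blank lines.
    blocks = [b for b in re.split(r"\n{2,}", text) if b.strip()]
    header_blocks = []
    for block in blocks:
        # Header, STYLE, REGION and NOTE blocks never contain "-->",
        # so the first block that does is treated as the first cue.
        if "-->" in block:
            break
        header_blocks.append(block)
    return "\n\n".join(header_blocks)
```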

Example Cue 1: timestamp 00:00:00.000, duration 00:00:10.000, Block's
content:

--start-----------------------------------------------------
Example entry 1: Hello <b>world</b>.
--end-------------------------------------------------------

BlockAddition's content:

--start-----------------------------------------------------

hello
--end-------------------------------------------------------

Example Cue 2: timestamp 00:00:25.000, duration 00:00:10.000, Block's
content:

--start-----------------------------------------------------
Example entry 2: Another entry.
This one has multiple lines.
--end-------------------------------------------------------

BlockAddition's content:

--start-----------------------------------------------------


NOTE style blocks cannot appear after the first cue.
--end-------------------------------------------------------

Example Cue 3: timestamp 00:01:03.000, duration 00:00:03.500, Block's
content:

--start-----------------------------------------------------
Example entry 3: That stuff to the right of the timestamps are cue settings.
--end-------------------------------------------------------

BlockAddition's content:

--start-----------------------------------------------------
position:90% align:right size:35%

--end-------------------------------------------------------
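
The timestamp, duration and Cue Settings List used in these examples all come
from the cue's timing line. As an illustration, the following Python sketch
splits such a line into the values a muxer needs. The function name
parse_timing_line is hypothetical, all values are in milliseconds, and the
regular expression is a simplification of the WebVTT grammar.

```python
import re
from typing import Tuple

# Hours are optional in WebVTT timestamps (mm:ss.ttt or hh:mm:ss.ttt).
_TIMING = re.compile(
    r"^\s*(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})(?:[ \t]+(.*))?$"
)


def _to_ms(hours, minutes, seconds, millis) -> int:
    """Convert timestamp components (hours may be None) to milliseconds."""
    return ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_timing_line(line: str) -> Tuple[int, int, str]:
    """Return (block_timestamp_ms, block_duration_ms, cue_settings)."""
    match = _TIMING.match(line)
    if match is None:
        raise ValueError(f"not a WebVTT timing line: {line!r}")
    groups = match.groups()
    start = _to_ms(*groups[0:4])
    end = _to_ms(*groups[4:8])
    # (D): start becomes the Block's timestamp, end - start its duration.
    # (E): the settings go into the first line of the BlockAddition.
    return start, end - start, (groups[8] or "").strip()
```

Applied to the timing line of example Cue 3, this yields a timestamp of
63000 ms, a duration of 3500 ms and the settings string shown above.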

Example Cue 4: timestamp 00:03:10.000, duration 00:00:10.000, Block's content:

--start-----------------------------------------------------
Example entry 4: Entries can even include timestamps.
For example:<00:00:05.000>This becomes visible five seconds
after the first part.
--end-------------------------------------------------------

This Block does not need a BlockAddition as the Cue did not contain an
Identifier, nor a Settings List, and it wasn't preceded by Comment
blocks.
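
The rewrite of the inline timestamp in example Cue 4 (00:03:15.000 becoming
00:00:05.000) can be sketched in Python as follows. The function name
rebase_cue_text is hypothetical, and the Block's timestamp is assumed to be
given in milliseconds. Timestamps that lie before the Block's timestamp are
left untouched here, which is a choice of this sketch and not something the
proposal specifies.

```python
import re

# Matches WebVTT Cue Timestamps inside Cue Text, e.g. <00:03:15.000>.
_CUE_TIMESTAMP = re.compile(r"<(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})>")


def _format_ms(total_ms: int) -> str:
    """Format milliseconds as hh:mm:ss.ttt."""
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def rebase_cue_text(cue_text: str, block_timestamp_ms: int) -> str:
    """Make all Cue Timestamps in cue_text relative to the Block's timestamp."""
    def replace(match: "re.Match[str]") -> str:
        hours, minutes, seconds, millis = match.groups()
        absolute = ((int(hours or 0) * 60 + int(minutes)) * 60
                    + int(seconds)) * 1000 + int(millis)
        relative = absolute - block_timestamp_ms
        if relative < 0:
            # Assumption: keep invalid (too early) timestamps unchanged.
            return match.group(0)
        return f"<{_format_ms(relative)}>"
    return _CUE_TIMESTAMP.sub(replace, cue_text)
```

A demuxer that wants to reproduce the original file would apply the inverse,
adding the Block's timestamp back to each Cue Timestamp.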

Kind regards,
mosu