[matroska-devel] Re: [Fwd: Re: Hosting of MPC]

Frank Klemm pfk at fuchs.offl.uni-jena.de
Sat Jan 25 22:51:23 CET 2003


On Sat, Jan 25, 2003 at 05:04:14PM +0100, Steve Lhomme wrote:
> >
> > ------------------------
> >
> > SamplingFrequency	4	[B5]	-	>0	-	float		Sampling frequency in Hz.
> > Channels		4	[9F]	-	not 0	2	u-integer	Numbers of channels in the track.
> >
> > ------------------------
> >
> > I would change this to
> >
> > ------------------------
> > SamplingFrequency	4	[B5]	-	>0	8000.	float		Sampling frequency in Hz.
> > Channels		4	[9F]	-	not 0	1	u-integer	Numbers of channels in the track.
> > ------------------------
> >
> > 1. I do not think much of such implicit values: you save a few bytes
> >     in a multi-megabyte file and buy yourself predictable trouble.
> >     I have stumbled into this kind of thing often enough myself,
> >     most recently yesterday.
> >
> > 2. Saving a few bytes can be of interest for data streams with
> >     a very low data rate and quality. Those are then mostly
> >     mono (and not stereo) with an 8 kHz sampling frequency.
> 
> Sorry, my German is not good enough to understand this, and I have no 
> offline translator ;)
> As I said on the EBML website (http://ebml.sourceforge.net/) EBML is 
> very verbose (as is XML) so it is convenient to have good default values 
> so that the general-case-file will not take too much space. And I think 
> the general case for digital audio is still 44100/stereo.
>
Using an implicit channel count and an implicit sampling frequency saves some
bytes (I think it's 9 bytes). Such implicit values often cause trouble, so it's
not worth saving these 9 bytes.
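
A rough sketch of where the 9 bytes come from, assuming each of the two
elements is stored as a 1-byte ID ([B5] and [9F] as listed in the quoted
table), a 1-byte size field, and its payload (a 4-byte float and a 1-byte
unsigned integer). The helper name element_bytes is made up for this
illustration:

```python
import struct

def element_bytes(element_id: bytes, payload: bytes) -> bytes:
    """Encode one small element as ID + 1-byte size field + payload.

    Assumption: the payload is shorter than 127 bytes, so the size
    fits into a single byte with the length marker bit 0x80 set.
    """
    assert len(payload) < 127
    return element_id + bytes([0x80 | len(payload)]) + payload

# SamplingFrequency [B5] stored explicitly as a 4-byte big-endian float
sampling_frequency = element_bytes(b"\xB5", struct.pack(">f", 44100.0))
# Channels [9F] stored explicitly as a 1-byte unsigned integer
channels = element_bytes(b"\x9F", bytes([2]))

total = len(sampling_frequency) + len(channels)
print(len(sampling_frequency), len(channels), total)  # 6 3 9
```
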

The only exception is very low bitrate/low quality streams, where 9 bytes
are not astronomically smaller than the whole stream.
In such cases it may make sense to use implicit values, but not when
storing CD tracks in transparent quality on multi-gigabyte hard disks.

These very low bitrate/low quality streams are typically 8 kHz (and not
44.1 kHz or 48 kHz or 96 kHz or whatever) and monophonic (and not
2 channel stereo or 5.1 channel surround).

Candidates are, for instance, LP10, CELP and Speex.
At 2400 bps...4800 bps, 72 bits are the data for 30...15 ms.
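
The arithmetic can be checked with a few lines: 9 bytes are 72 bits, and
dividing by the bitrate gives the amount of audio the same bits would carry.
The 128000 bps line is only an assumed comparison value for a "normal" music
stream, it is not taken from the discussion:

```python
SAVED_BITS = 9 * 8  # the 9 bytes discussed above

def milliseconds_of_audio(bits: int, bitrate_bps: int) -> float:
    """Playing time in ms that `bits` of payload cover at `bitrate_bps`."""
    return 1000.0 * bits / bitrate_bps

for bitrate in (2400, 4800, 128000):
    print(bitrate, "bps:", milliseconds_of_audio(SAVED_BITS, bitrate), "ms")
# 2400 bps: 30.0 ms
# 4800 bps: 15.0 ms
# 128000 bps: 0.5625 ms
```
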


> I don't know many people using 8000/mono... Now of course that would save
> space in the most space limited case. Which approach is better ?
>
I would say the most often used low bitrate configuration.
AFAIK this is 1 channel/8000 Hz.

Ogg Vorbis has run some tests with 1 channel/6000 Hz, but I don't think this
helps much to preserve "quality" at 3 kbps, and it is not very
common either.


> >
> > =============================================================================
> >
> > ------------------------------------------------------------------------
> > PixelAspectRatio	4	[41][52]	-	>0	1.333333	float	Pixel aspect ratio of
> > the pixels.
> > ------------------------------------------------------------------------
> >
> > [Frank]This example is misleading. There is no movie format that uses a
> > pixel aspect ratio of 4:3. Make it disappear as quickly as possible,
> > otherwise there will be so many faulty implementations that in the end
> > it is easier to change the standard than to fix the
> > bugs.[/Frank]
> >
> > This example is not good, there is no such thing as a 4:3 AR. I'd remove
> > that or errors will be the result.
> 
> Uh ? So we should put 1.00000 ?
>
I think 1.000000000 would be a useful default.
For normal TV there are too many different pixel ARs, and for low bitrate
video I think the AR is often 1.000000000, because computer monitor
resolutions and fractions of them are used. TV resolutions and fractions of
them are also in use, but there is no "natural" setting, only too many candidates.

I think 1.00000000 is a good value. The other values for the 4 DVD formats
should be mentioned, because you cannot find them directly on the internet,
so I think it is useful to list the right values.


> > [Frank]I already sent these values once; this is the second attempt,
> > there will not be a
> > third:[/Frank]
> >
> > I sent these values already, I won't send them another time :
> >
> > [Frank]        horizontal:vertical
> > HDTV:            1:1
> > PAL:            16:15
> > PAL anamorph:   64:45
> > NTSC:            8:9
> > NTSC anamorph:  32:27
> 
> Is that the pixel aspect-ratio or the display aspect-ratio? 
>
Pixel aspect ratio.


> I think the most logical for a computer format is 1.0000 PAR. 
>
It is logical, but seldom used ;-)

PAL:                16:15
PAL anamorph:       64:45
NTSC:                8:9
NTSC anamorph:      32:27
Computer 1280x1024: 16:15
HDTV and other Computer formats except CGA, EGA and VGA text and relatives:
                     1:1


> What do the experts think ?
>
You can find image ARs and resolutions on the internet, in books, etc.
The rest is simple math.
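
A minimal sketch of that math, using exact fractions: the pixel AR is the
image (display) AR divided by the ratio of the pixel counts. The frame sizes
and image ARs in the table below (720x576 and 720x480 frames shown at 4:3 or
16:9, 1280x1024 on a 4:3 monitor, 1280x720 at 16:9) are assumptions for the
illustration; the function name is made up:

```python
from fractions import Fraction

def pixel_aspect_ratio(image_ar: Fraction, width: int, height: int) -> Fraction:
    """Pixel AR (horizontal:vertical) = image AR / (width / height)."""
    return image_ar / Fraction(width, height)

# name: (assumed image AR, assumed horizontal pixel count, vertical pixel count)
formats = {
    "PAL":                (Fraction(4, 3),  720,  576),
    "PAL anamorph":       (Fraction(16, 9), 720,  576),
    "NTSC":               (Fraction(4, 3),  720,  480),
    "NTSC anamorph":      (Fraction(16, 9), 720,  480),
    "Computer 1280x1024": (Fraction(4, 3),  1280, 1024),
    "HDTV 1280x720":      (Fraction(16, 9), 1280, 720),
}

pixel_ars = {name: pixel_aspect_ratio(*spec) for name, spec in formats.items()}
for name, par in pixel_ars.items():
    print(f"{name:20s} {par.numerator}:{par.denominator}")
# PAL                  16:15
# PAL anamorph         64:45
# NTSC                 8:9
# NTSC anamorph        32:27
# Computer 1280x1024   16:15
# HDTV 1280x720        1:1
```
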

> 
> > I would store the ratio as two coprime integers.
> > Floating point numbers are, first, less precise and, second, they invite
> > the so-called floating point smearing disease. One stores approximately
> > correct values and leaves it to the programmer to make this mess
> > mostly invisible with some heuristic code.[/Frank]
> >
> > I would store the values as two integer numbers. Floats are less
> > precise and often handled incorrectly.
> 
> Well, what would be the difference with the Height and Width of the 
> movie ? One would be the encoding and the other the display ?
> 
> > [Frank]If, for example, "NTSC anamorph" is stored, which is approximately
> > 1:1.18518518518518518518
> > then you will find different values in the tag:
> >
> >    1.2
> >    1.18
> >    1.19
> >    1.185
> >    1.1852
> >    1.185185185
> >
> > And the decoder then has to guess again what is meant.
> > If, on the other hand, { 32, 27 } has to be there, then everything else
> > is unambiguously WRONG. No floating point smearing disease, and nobody
> > who has to curse because his AR of exactly 1:1.2 gets tweaked to
> > 1:1.185185185, which makes the software useless for his measurements.
> > Frank Klemm[/Frank]
> >
> > E.g., these are the values you will find for the NTSC anamorph standard
> > if you specify floats. In the end the decoder will have to make a 
> > guess again ...
> 
> 
> So can we go for the encoding Width/Height and the display Width/Height ?
> 
You can store:

[Proposal 1]

  - horizontal pixel count [u-integer]	1) 
  - vertical   pixel count [u-integer]	1)
  - aspect ratio of the pixel as a:b, where a and b are nonzero [u-integer]s

    Examples:  hor_res =  720, vert_res =  576, pixel_AR = {16,15}
               hor_res = 1280, vert_res =  720, pixel_AR = {1,1}
               hor_res = 1920, vert_res = 1080, pixel_AR = {1,1}
               hor_res =  352, vert_res =  288, pixel_AR = {16,15}

    disadvantage: "unusual" numbers for the AR
    advantage:    cropping an image does not affect the pixel_AR

    1) please don't use the term "horizontal resolution", because it isn't a
       resolution in the physical sense. Such sloppiness in word choice is
       one reason why standards are often so difficult to read.

[Proposal 2]

  - horizontal pixel count [u-integer]
  - vertical   pixel count [u-integer]
  - aspect ratio of the display as a:b, where a and b are nonzero [u-integer]s

    Examples:  hor_res =  720, vert_res =  576, display_AR = {4,3}
               hor_res = 1280, vert_res =  720, display_AR = {16,9}
               hor_res = 1920, vert_res = 1080, display_AR = {16,9}
               hor_res =  352, vert_res =  288, display_AR = {176,135}

    advantage:    "usual" number for the AR
    disadvantage: cropping an image affects display_AR, and in
                  practice this is nearly never corrected.
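
The two proposals carry the same information: display_AR equals pixel_AR
times hor_res/vert_res. The sketch below checks this for the four examples
above and shows the cropping effect. The crop from 720 to 704 columns is an
assumed example, and the function name is made up:

```python
from fractions import Fraction

def display_aspect_ratio(hor_res: int, vert_res: int, pixel_ar: Fraction) -> Fraction:
    """display_AR = pixel_AR * (horizontal pixel count / vertical pixel count)."""
    return Fraction(hor_res, vert_res) * pixel_ar

# (hor_res, vert_res, pixel_AR) taken from the examples of Proposal 1
proposal_1 = [
    (720,  576,  Fraction(16, 15)),
    (1280, 720,  Fraction(1, 1)),
    (1920, 1080, Fraction(1, 1)),
    (352,  288,  Fraction(16, 15)),
]
display_ars = [display_aspect_ratio(*entry) for entry in proposal_1]
print(display_ars)
# [Fraction(4, 3), Fraction(16, 9), Fraction(16, 9), Fraction(176, 135)]

# Cropping 720x576 to 704x576 (assumed crop): the pixels keep their shape,
# so pixel_AR stays 16:15, but a stored display_AR of 4:3 is now wrong.
cropped = display_aspect_ratio(704, 576, Fraction(16, 15))
print(cropped)  # 176/135
```
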
                  
[Remark 3]
    
   both ARs can be stored in two different ways:

   - stored as float [usually 4 bytes are more than enough]

     + you can calculate and store the AR with an error of less than 3*10^-8.
     - you seldom find correct values in floating point fields; I expect
       more incorrect values when the AR is stored as an FP value.

   - stored as 2 u-integers (size of each is half the size of the item)
     - storing a calculated value is more difficult: you must search for two
       integers a and b that approximate AR=a/b very well.
     + it is easier to enforce correct ARs, because you can say what is right
       and what is very likely wrong.
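
A small sketch of both storage variants for 32:27, under the assumption of a
4-byte item (one big-endian float, or two 16-bit unsigned integers). The
search for a and b uses Fraction.limit_denominator from the Python standard
library; the bound of 1000 for the denominator is an arbitrary choice:

```python
import struct
from fractions import Fraction

exact = Fraction(32, 27)

# Variant 1: 4-byte float. The round trip is close, but not exact.
stored = struct.unpack(">f", struct.pack(">f", 32 / 27))[0]
relative_error = abs(Fraction(stored) - exact) / exact
print(float(relative_error))  # small, but not zero

# Hand-typed float values like the ones listed earlier in this thread:
# six different numbers, none of them equal to 32/27.
hand_typed = ["1.2", "1.18", "1.19", "1.185", "1.1852", "1.185185185"]
distinct = {Fraction(text) for text in hand_typed}
print(len(distinct), exact in distinct)  # 6 False

# Variant 2: two u-integers, each half the size of the 4-byte item.
packed = struct.pack(">HH", exact.numerator, exact.denominator)
a, b = struct.unpack(">HH", packed)
print(len(packed), a, b)  # 4 32 27

# Storing a calculated value as integers: search a and b with a/b ~ AR.
recovered = Fraction(stored).limit_denominator(1000)
print(recovered)  # 32/27
```

With the integer pair a decoder can compare against {32, 27} exactly; with
floats it has to decide how close is close enough.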


Hope this info is enough to make a wise decision.

-- 
Frank Klemm

http://matroska.org