[Matroska-devel] Would more StereoMode values be useful or needed?

Yann Renard yann.renard-mailing-lists at tiscali.fr
Mon Jan 8 16:03:15 CET 2007

David Duffy wrote:
> Currently StereoMode supports mono, right, left, and both.
> Should there be either another element or more values to designate over/under vs. side by side vs. alternating frames (and of course which comes first, left or right) when the setting is "both eyes"?
> Would you consider that information to be useless to a reader since perhaps the video codec(s) should be responsible...?
> I can see a use for designating what the stereoscopic format is beyond just "both eyes" because then the following would be possible:
>   - demux could specify two video out pins and send alternating frames on alternating pins just as if there had been two streams one for left and one for right (I know, the user "shouldn't" encode that way perhaps but it would be doable and there could be reasons for doing it).
>   - in systems where the video decoder could be "notified" of the format it would be advantageous.  I can't speak for VFW off hand but I could certainly use it in the hardware player stuff I'm working on (currently I'm encoding as side by side left/right for efficiency and assuming that I will receive left/right but the option to do over/under or right/left and know which it was for sure without making assumptions would be nice).
>   - it is more efficient for encoding/decoding to have combined frames (either over/under or side by side) so as not to have to create two decoder instances.  So app writers could use whatever the latest and greatest video codec is (or whichever they want) without needing any special 3D support in the codec itself; and then because they know from the Matroska file that the frames are a certain orientation they would know how to interpret the decoded frame to get the desired 3D effect.
> I think it might be "cleaner" to have an optional (sub) element to designate the frame orientation for combined "both eyes" frames but it would be easier to just add more values to StereoMode.
> What do you all think?
> Thanks,
> David

Dear David and mailing list, first of all, happy new year everyone!

I am not sure this email will answer your initial question, but I
hope it will help the discussion in some way. I have been working on 3D
visualisation for several years and have had the chance to test both
software and hardware 3D visualisation tools. Concerning what you call
stereo mode, I suggest expanding the concept of stereoscopy to relief
perception in a visualisation context.

Let me explain: relief visualisation consists in generating a picture
for each eye, with the correct perspective, so that the viewer perceives
depth. Stereoscopy consists in generating *two* pictures, one for
the left eye and one for the right eye. Given these two pictures, the
problem is how to get the left picture to the left eye, and to that eye
only, and likewise for the right picture and eye. With common hardware,
two approaches are usual:
- left/right or top/bottom positioning of the pictures, used with
overlapped polarized pictures and glasses, or with a head-mounted device
- alternating left then right pictures with shutter glasses (this needs
quad buffering, but that is now "common" on PCs).
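
To make these two approaches concrete, here is a minimal Python sketch
of how a reader could recover the left and right pictures once it knows
the layout. The names Packing, split_packed_frame and split_alternating
are only illustrative assumptions, not Matroska elements or values, and
a frame is modelled as a plain list of pixel rows.

```python
from enum import Enum


class Packing(Enum):
    """Hypothetical layout labels; these are not Matroska StereoMode values."""
    SIDE_BY_SIDE = "side_by_side"
    TOP_BOTTOM = "top_bottom"


def split_packed_frame(frame, packing, left_first=True):
    """Split one decoded frame (a list of pixel rows) into (left, right).

    With an odd width or height, the extra column or row stays in the
    second half.
    """
    if not frame or not frame[0]:
        raise ValueError("frame must contain at least one pixel")
    if packing is Packing.SIDE_BY_SIDE:
        half = len(frame[0]) // 2
        first = [row[:half] for row in frame]
        second = [row[half:] for row in frame]
    elif packing is Packing.TOP_BOTTOM:
        half = len(frame) // 2
        first, second = frame[:half], frame[half:]
    else:
        raise ValueError(f"unknown packing: {packing!r}")
    # The caller states which eye comes first; nothing is guessed here.
    return (first, second) if left_first else (second, first)


def split_alternating(frames, left_first=True):
    """Split a sequence of alternating frames into (left, right) sequences."""
    first, second = list(frames[0::2]), list(frames[1::2])
    return (first, second) if left_first else (second, first)
```

The point of the sketch is that both the layout and the eye order have
to come from somewhere: without them the reader can only assume.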

Some more recent techniques also exist to output these two pictures,
such as autostereoscopy <http://en.wikipedia.org/wiki/Autostereoscopy>.
In brief, it is stereoscopy without glasses. The display is combined
with some kind of filter so that visualisation cones are projected
towards the user. The user then positions himself so that the left eye
is in a left picture cone and the right eye is in a right picture cone.
The filter can be based on lenticular lenses or a parallax barrier, for
example. Such stereoscopic displays have the advantage of needing no
glasses, but the user has to be placed correctly relative to the
projection surface. For 3D interactive visualisation, this is the state
of the art... But this technology is already applied to fixed posters
and should be easily applied to videos, and later to 3D interactive
visualisation, with more precision. The idea is to generate more
viewpoints so that the user can move further while his eyes stay in a
correct pair of cones/pictures. For printed posters, I have seen some
64-viewpoint photos that looked really impressive! Of course, the
viewpoints are interlaced in a specific way, and this is no longer a
matter of top/bottom or left/right projection!
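
As a rough illustration of what "interlaced" can mean here, the sketch
below interleaves N viewpoints column by column. This is a deliberate
simplification and an assumption on my side: real lenticular or
parallax-barrier panels use their own, often sub-pixel and slanted,
patterns. The function name interlace_views is hypothetical, and each
view is again a plain list of pixel rows.

```python
def interlace_views(views):
    """Interleave equally sized views so that column x comes from view x % N.

    Simplified model only: actual autostereoscopic displays define their
    own interlacing pattern.
    """
    if not views or not views[0] or not views[0][0]:
        raise ValueError("need at least one non-empty view")
    height, width = len(views[0]), len(views[0][0])
    for view in views:
        if len(view) != height or any(len(row) != width for row in view):
            raise ValueError("all views must have the same dimensions")
    count = len(views)
    # Each output pixel keeps its own (x, y) position but is taken from
    # the view selected by its column index.
    return [
        [views[x % count][y][x] for x in range(width)]
        for y in range(height)
    ]
```

With two views this reduces to alternating columns, and with 64 views
each viewpoint only contributes one column out of 64.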

OK, now to come back to the initial topic: with such a visualisation
method, you can see that relief perception of a video may involve
multiple streams (I mean really more than two). Matroska could take this
into account, but I don't know exactly how ;) Would this mean storing
the different locations of the viewpoints in a specified 3D space?
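
To give an idea of what storing viewpoint locations might look like,
here is a purely hypothetical Python sketch. None of these names exist
in Matroska: Viewpoint, its fields and adjacent_pairs are assumptions
made up for illustration, with one viewpoint per video track and
positions expressed in some shared 3D coordinate system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewpoint:
    """Hypothetical per-track metadata: where the camera for a track sits."""
    track_number: int
    x: float  # horizontal position in an assumed shared coordinate system
    y: float
    z: float


def adjacent_pairs(viewpoints):
    """Return (left_track, right_track) pairs of horizontally adjacent views."""
    ordered = sorted(viewpoints, key=lambda v: v.x)
    return [(a.track_number, b.track_number)
            for a, b in zip(ordered, ordered[1:])]
```

A player could then choose, for a given viewer position, which pair of
tracks to send to the left and right eyes.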

Hope this helps a bit. Sorry for my English; it was hard to explain all
this in English, but I did my best! No time now to add pointers to
reference websites...

Best regards
Yann Renard
