From musafir_86 at yahoo.com Tue May 2 06:51:41 2006
From: musafir_86 at yahoo.com (MusafirKelana_86)
Date: Tue, 2 May 2006 05:51:41 +0100 (BST)
Subject: [Matroska-devel] Matroska without Thumbnail Support.
In-Reply-To: <4453A00E.8090903@free.fr>
Message-ID: <20060502045141.11808.qmail@web52812.mail.yahoo.com>

-Sorry for the late reply; I haven't been on the Net for a while. Yes, I'm
using the installer in silent mode (part of another package made by me).

-But even in interactive mode, with the thumbnail (TN) option turned off
(in the settings dialog), the problem persists. As I noted before, this
only happens *AFTER* Haali Splitter included TN support for MKV. Maybe some
registry entries conflict with or override the entries I made, or maybe the
filter (.ax) or some .dll makes a specific call that causes this - I have
no idea.

-Well, maybe somebody can provide the source code to me ;-)

Thanks.

MusafirKelana_86.

Steve Lhomme wrote:
MusafirKelana_86 wrote:
> Hello everyone,
>
> -Can I get the latest binary *WITHOUT* thumbnail support for Matroska?
>
> -Previously, I've used thumbnail support (& other support like
> WMP-playlist/burnlist, etc) for OGG/OGM,MP4,MKV using AVI-related
> settings in the registry (copy related keys, CLSIDs). It worked okay
> until Haali Splitter included thumbnail support - now my method no
> longer works. Only OGG/OGM & MP4 still behave the same.
>
> -So, can somebody compile the latest code *WITHOUT* thumbnail support,
> and send it to me?

I thought Haali's installer had the option to disable thumbnails. Are you
using the installer in silent mode ?

Steve
--
robUx4 on blog

From haessije at eps.e-i.com Wed May 3 16:08:00 2006
From: haessije at eps.e-i.com (HAESSIG Jean-Christophe)
Date: Wed, 3 May 2006 16:08:00 +0200
Subject: [Matroska-devel] EBML Namespaces
Message-ID: <2684397F36DC8849A9BF842433B784360215D81C@GZI-VM01.cm-cic.fr>

> > Examples with 3 namespaces :
> > 0 : EBML
> > 10 : NSX
> > 11 : NSY
>
> Would you have a variable length size or use 00, 01, 10, 11 ?

No, it's really 0, 10 and 11, which is variable size.

> The number of namespaces in the file is written in the EBML
> which is mandatory to read. So it's known beforehand how many
> bits will be needed to read the namespace. And therefore I

It's not that simple. The best place to declare namespaces is, for sure,
the EBML header, but there will be other places where namespaces can be
defined, e.g. for local switching purposes.

> think it's better to have a fixed length, it will use less
> space. The impact on backward compatibility

Really, I don't think a fixed length will be smaller. Fixed-width codes
are themselves valid prefix codes, so the best variable-width code is at
worst as good as a fixed-width one.

> Now as we're probably heading for this compatibility issue

Maybe I shouldn't have emphasized that aspect so much... In fact, before
speaking about modifying anything we should discuss what things MUST stay,
what SHOULD be the same, what MAY change, etc.

So far, I believe the EBML header for EBML files using namespaces must be
compatible with the current EBML spec, that is :
* The first four bytes are [1A][45][DF][A3]
* The header must include required elements, as per the spec
* It is allowed to add new sub-elements in the header, since
  they are ignored.

This allows old EBML parsers to read the header and see they are obsolete
(using the EBMLVersion element) so they can warn the user their software
has to be upgraded. Moreover, it *might* be possible, with some little
additions, to make old files valid namespace-aware files.
This could be very useful for namespace-aware parsers, since no special
code would be required to handle old files.

> (only for files including more than 1 namespace) I'd also
> like to introduce the EbmlMaster bit in the ID header. So
> that it's possible for an EBML parser to parse the whole tree
> without knowing anything about the semantic.

This is a must-have. With namespaces, we will regularly have files
containing data from both known and unknown namespaces. Currently an
unknown element can't be parsed securely, because it is hard to know if
its contents are raw data or sub-elements. This could lead to the loss of
valuable elements hidden in unparseable data.

However, even if I thought using one bit to set the EbmlMaster property on
an element was a good idea, it's not sufficient : some elements are
allowed to have an undetermined size (size=all 1's). Matroska deals with
this using the semantic linked to the elements because most elements can
appear only at a given depth. This is specific to Matroska and not
generally applicable because formats may want to have elements available
at any level. These elements would occasionally be reported at the wrong
level.

Using an 'end-of-children' marker solves this problem. When it encounters
this marker, which can be a standard EBML element, the parser knows that
the parent is closed and that the next element is at the upper level.

I also insist on pulling in 'mixed elements', that is, elements with both
sub-elements and data, to avoid the replication of the XML mistake that
allows attributes. It would be very useful, if not essential, to all
extension formats that annotate their parent elements and need to be able
to insert data inside any element. This includes namespace declarators (in
my proposal).

Also, an 'end-of-children' tag would be useful for mixed elements and to
detect data-only (unparseable) elements.
Elements that contain sub-elements and binary data would include this
element as their last child, if necessary; and elements that contain only
binary data would include it as their only child, again only if necessary.

In original EBML, there is nothing to say "raw data starts here" so it
must be possible to omit this tag in most cases.

I already posted a word about this some time ago. I noticed that trying to
interpret the data contained in EBML elements without knowing whether it
was data or sub-elements was very successful. Raw data has a good chance
to be interpreted as EBML elements that are way too big to fit in their
parent element, and can therefore be identified as raw data. Only the data
sets that would yield false positives should be prepended with the
end-of-children element.

However, using such a heuristic to determine the type of the contents of
elements may incur computing overhead, but I believe it will not be too
high.

> Yes, there is no point at a level X to use different
> namespaces in random order as their semantic is orthogonal.
> So the elements will probably end up being grouped by
> namespace. Given that it means Atamido's proposition for a
> new special tag might be enough ! But it should be a Class-A

I expect various usage patterns for namespaces. One of those is embedding
of sub-documents, where vocabularies are locally grouped. Another usage is
annotation, where elements from another vocabulary are scattered around
the master document and included into elements from other namespaces,
which may not be understood by the annotating program.

> tag to avoid overhead. In this case it would be 3 octets for
> each added namespace: the ID, the size of the namespace
> (could be more than one if that element is an EbmlMaster),
> the namespace value. Another option would be to use a new

I do not exactly see how this should work.
From what I can understand, there will be some elements in the header to
link NS-keys (the URLs in XML namespaces) and NS-values (the value used to
refer to the namespace). Then, inside the EBML content one would use some
sort of wrapper tag to enclose all the elements which belong to another
namespace. But the value of the namespace we want to refer to must be
specified somewhere ? So there should be a second 3-byte element to set
it ? Or is there no wrapper and all the elements following the
NS-switching element are affected ?

Anyway, for annotation purposes the proposal of a 'wrapper ns-switching'
element works but is highly sub-optimal. For each lonely element from some
foreign namespace one should include 5 extra bytes (at least) in the
wrapper case (1 byte wrapper ID, 1 byte wrapper size, 1 byte ns-switch ID,
1 byte ns value size, 1 byte ns value), or 6 bytes in the switch-tag case
(3 bytes to set the NS to the correct value before the element and 3 bytes
after the element to reset the namespace to the previous value).

Moreover, the NS-related tag(s) should still be global and therefore
cannot be used by applications. To avoid changing namespaces too often,
the EBML private tags (void, signature, etc...) should be left global. In
my proposal I intended to give EBML its own namespace providing total
isolation from other vocabularies, which I felt was cleaner.

> Seeking anywhere in the file (at the EBML level) is still a
> problem (in all cases proposed) as we are unable to recover
> the complete namespace context. It could only work for 0 or 1
> namespace in the chain.

The next design phase for a namespace-aware EBML is a generic seeking and
indexing infrastructure. It is orthogonal to the namespace concern, yet it
is essential to make EBML files editable without being required to
understand all vocabularies used in one file.

> Another problem if we don't have the notion of default
> namespace is to seek in matroska (semantic level) because
> that means the level 0 would need to be contained in the
> default namespace, and therefore would need to be prepended
> with that new ID. That means it's not backward compatible at all.

Are you referring to the problem of knowing which namespace rules are to
be applied at the root ? I don't really understand this.

In the realm of scoping rules for namespaces, I've always had an XMLish
approach, since I think namespaces in XML are done quite well. In XML the
namespace is a property of the element object, and as such is included
into the object. So are local namespace declarations that must be applied
to a particular tag (xmlns:ns="").

Without focusing on the particular syntactic rules to write namespaces in
the EBML elements, here are some ideas for my proposal :

* Namespace declarations for an element can only be found in
  the first child of that element, mainly for performance
  purposes.
* Namespace declarations can be done in the following elements :
  - A specific NS-declaration-holder element
  - The EBMLHead element (practical since it is the first
    element below the root and therefore can be used to set
    the namespace rules for the root).
* Namespace rules are inherited by children from their parent
  element, but children can override the rules from their
  parent with the appropriate declarations.
* An empty NS-declaration-holder overrides the inherited rules
  and discards the usage of namespaces, therefore allowing
  only EBML elements (other ones are up to the host app).
* To retain compatibility with the original EBML header,
  namespace-aware files include an empty NS-declaration-holder
  in their header, therefore making the header pure EBML
  without namespaces.

These are only preliminary ideas, and my thinking is not totally finished.
Especially there is a slight theoretical problem in these five rules (will
you find it ?
;) and I might need the already allocated 0x80 Class-ID...

> Now chaining namespaces may not be so clean. Imagine you have
> a namespace for "comments" and a namespace for "signature".
> You can put comments anywhere in your file (discarded in
> matroska) and you can put signature anywhere in the file. But
> what if you add a comment in a signature or a signature in a
> comment ? How to interpret "signature::comment" or
> "comment::signature" IDs ? In that case we only need to use
> "comment" or "signature"... So is namespace chaining a
> feature we want ? Or we always revert to the last seen
> namespace ? (in which case seeking becomes easy)

I think that the interpretation of the elements should be left to the
various applications. Whether or not element X means the same thing when
it is included in element A or element B is of no interest to the EBML
parser IMO. Likewise, elements in NS A can contain elements from NS B, and
we should not enforce that "A::B" has any special meaning. The processing
should be something like :
1 the parser finds element X from NS A -> inform application A
  that an X element was found.
  then the app asks EBML to parse the contents of this element
2 the parser finds element Y from NS B -> inform application B
  that a Y element was found. From now on it is up to app B to
  decide what to do with this element. It can either :
  * do some processing regardless of the context in which the Y
    element was found
  * inspect the parent elements of element Y, and take
    specific action depending on the parent elements.

How the meaning of an element within other elements is handled may however
affect the DTD language.

> Yes, I've been dreaming about a DB that would be EBML based
> or a file system. And given custom attributes are present in
> modern file systems it could be an option. Seeking in a file
> system would also be very important.

I also thought of an EBML-based filesystem. It could become a very good
extensible, general-purpose FS.

JC

From haessije at eps.e-i.com Thu May 4 09:44:57 2006
From: haessije at eps.e-i.com (HAESSIG Jean-Christophe)
Date: Thu, 4 May 2006 09:44:57 +0200
Subject: [Matroska-devel] EBML Namespaces
Message-ID: <2684397F36DC8849A9BF842433B784360215D8A3@GZI-VM01.cm-cic.fr>

Earlier, I wrote :
>> Now instead of using another byte (or more) instead of
>> splitting the IDs we could reuse bits in the data length. It
>> would be no more backward compatible than using bits in the IDs
>> but there would be more room for improvement (as the size is
>> known to be encoded in different byte sizes).
>
> Yes, this is more or less what I've thought for class-IDs. But
> unlike class-ids, the size field is vital to parse the file
> (a 'dumb' parser that doesn't interpret the meaning of the
> elements wouldn't be affected by a change in the IDs but would
> stop working if the encoding of sizes changed).

This was plain stupid. I totally overlooked that idea, which is indeed a
very good one. First, I mentioned dumb parsers, but they are worthless, so
there is no point supporting them; moreover, as EBML parsers they should
stick to the spec and understand the basic EBML headers. Second, as I
pointed out in my last post, clever rules about NS declarations can
preserve the compatibility of the header (to make it look like a plain old
EBML header, so old parsers can read it).

Also, as Steve noted, it is far easier to grow the length of the size
field, and we can take as many bits as we need, without worries. Thanks to
this free space, we can encode extra information in the elements (aside
from namespaces) in a very flexible way. I especially think of Steve's
EBMLMaster bit.

My original proposal made choosing Class-IDs a pain because the format
writer wouldn't know exactly how many bytes the ID would be coded on (the
namespaces in use could modify it) and it messed a little with reserved
class IDs.

I've got very precise ideas on how the things could work now.
I'll come up later with a more detailed post.

JC

From steve.lhomme at free.fr Sat May 6 14:00:27 2006
From: steve.lhomme at free.fr (Steve Lhomme)
Date: Sat, 06 May 2006 14:00:27 +0200
Subject: [Matroska-devel] EBML Namespaces
In-Reply-To: <2684397F36DC8849A9BF842433B784360215D81C@GZI-VM01.cm-cic.fr>
References: <2684397F36DC8849A9BF842433B784360215D81C@GZI-VM01.cm-cic.fr>
Message-ID: <445C8FDB.4060301@free.fr>

HAESSIG Jean-Christophe wrote:
>> The number of namespaces in the file is written in the EBML
>> which is mandatory to read. So it's known beforehand how many
>> bits will be needed to read the namespace. And therefore I
>
> It's not that simple, the best place is, for sure, to declare
> namespaces in the EBML header, but there will be other places
> where namespaces can be defined e.g. for local switching
> purposes.

I don't think it would be OK to put elements from namespaces that were not
declared before in the file. That would be quite a PITA for the parser if
it had to look for a handler of namespace XYZ during parsing. This would
better be done during the init of the EBML file/stream reading, that is,
when declared in the header. That also means we can find out about false
(damaged) namespace headers.

Now with Atamido's solution or yours, it's possible to embed a namespace
declaration within the stream itself. But I'm not sure it's necessary or a
good thing.

> So far, I believe the EBML header for EBML files using
> namespaces must be compatible with the current EBML spec,
> that is :
> * The four first bytes are [1A][45][DF][A3]
> * The header must include required elements, as per the spec
> * It is allowed to add new sub-elements in the header, since
>   they are ignored.

Right now no parser would ignore extra bits put in the EBML header. The
specs were quite clear on how to interpret each bit. There was no bit
left...
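
To make the header rules quoted above concrete, here is a minimal Python
sketch of a lenient header reader. It is not taken from libebml or any
existing parser. It checks the four magic bytes, walks the header's
sub-elements, skips any ID it does not recognise, and compares
EBMLReadVersion with the version the reader supports. The IDs 0x4286
(EBMLVersion) and 0x42F7 (EBMLReadVersion) come from the EBML spec; the
function names, the 0x4AAA "new" sub-element and the sample bytes are
assumptions made up for illustration.

```python
EBML_MAGIC = bytes([0x1A, 0x45, 0xDF, 0xA3])  # ID of the EBML header element
ID_EBML_VERSION = 0x4286
ID_EBML_READ_VERSION = 0x42F7


def read_vint(buf, pos, keep_marker):
    """Read one variable-length integer at buf[pos]; return (value, new_pos)."""
    first = buf[pos]
    length = 9 - first.bit_length()  # leading zero bits + 1
    if length > 8 or pos + length > len(buf):
        raise ValueError("truncated or oversized variable-length integer")
    value = int.from_bytes(buf[pos:pos + length], "big")
    if not keep_marker:  # sizes drop the length marker bit, IDs keep it
        value &= (1 << (7 * length)) - 1
    return value, pos + length


def parse_header(data):
    """Return (EBMLVersion, EBMLReadVersion), ignoring unknown sub-elements."""
    if data[:4] != EBML_MAGIC:
        raise ValueError("not an EBML file")
    size, pos = read_vint(data, 4, keep_marker=False)
    end = pos + size
    if end > len(data):
        raise ValueError("truncated EBML header")
    found = {}
    while pos < end:
        elem_id, pos = read_vint(data, pos, keep_marker=True)
        elem_size, pos = read_vint(data, pos, keep_marker=False)
        if elem_id in (ID_EBML_VERSION, ID_EBML_READ_VERSION):
            found[elem_id] = int.from_bytes(data[pos:pos + elem_size], "big")
        pos += elem_size  # every other sub-element is skipped
    # Both elements default to 1 when absent.
    return found.get(ID_EBML_VERSION, 1), found.get(ID_EBML_READ_VERSION, 1)


def can_read(data, supported_read_version=1):
    """True when the file only needs a read version this reader supports."""
    _, read_version = parse_header(data)
    return read_version <= supported_read_version


# EBMLVersion 2, EBMLReadVersion 1, plus a hypothetical sub-element 0x4AAA.
sample_v2 = EBML_MAGIC + bytes([0x8D,
                                0x42, 0x86, 0x81, 0x02,
                                0x42, 0xF7, 0x81, 0x01,
                                0x4A, 0xAA, 0x82, 0x01, 0x02])
# Same header, but it demands EBMLReadVersion 3.
sample_v3 = EBML_MAGIC + bytes([0x88,
                                0x42, 0x86, 0x81, 0x03,
                                0x42, 0xF7, 0x81, 0x03])
```

With these samples a version-1 reader would accept `sample_v2` and refuse
`sample_v3`, which is the behaviour the EBMLVersion/EBMLReadVersion pair
is meant to allow.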

> This allows old ebml parsers to read the header and see they
> are obsolete (using the EBMLVersion element) so they can warn
> the user their software has to be upgraded. Moreover, it

Now that's something that can be considered. I don't think the current
parsers check that version number (some are already checking the
DocTypeVersion though) but they can be declared /invalid/ if they don't.
That would allow /legal/ extensions to appear and force an update of old
parsers (all open source AFAIK).

> *might* be possible, with some little additions, to make old
> files valid namespace-aware files. This could be very useful
> for namespace-aware parsers, since no special code would be
> required to handle old files.

Yes. On the other hand Atamido's proposal (a global EBML element that
includes a namespace declaration and the related elements) would not break
backward compatibility. So while EBMLVersion would be increased,
EBMLReadVersion could remain the same.

>> (only for files including more than 1 namespace) I'd also
>> like to introduce the EbmlMaster bit in the ID header. So
>> that it's possible for an EBML parser to parse the whole tree
>> without knowing anything about the semantic.
>
> This is a must-have. With namespaces, we will regularly have
> files containing data from both known and unknown namespaces.
> Currently an unknown element can't be parsed securely, because
> it is hard to know if its contents are raw data or
> sub-elements. This could lead to the loss of valuable
> elements hidden in unparseable data.

Unfortunately, as said above, adding bits anywhere in the element header
would be backward incompatible. So maybe we should improve EBML in 2
versions. That would be EBML 1.1 (Atamido's extension) and EBML 2 (your
proposal). In EBML version IDs that would be EBMLVersion 2 with
EBMLReadVersion = 1 and EBMLVersion 3 (or more to leave improvement for
version 2) with EBMLReadVersion = 3.

This way we can proceed in 2 phases.
Do our best in EBMLVersion 2 and then add the necessary backward
incompatible changes in the other version based on EBMLVersion 2. That
will leave us some time to gradually add namespace support to current
parsers.

I still think using a bit for EBMLMaster would be a very nice addition, so
that would be scheduled for phase 2.

*note for all readers*: As we might introduce profound changes to the EBML
layer, it might be a good idea to do it at the Matroska level too. We
might define Matroska2 or MatroskaNSed with changes, removals and
additions here and there. We might also move each orthogonal part of the
format into different namespaces. For example seeking would be part of its
own namespace (or maybe at the EBML level), chapters too, attachments too
(making room for the mkr archive format).

> However, even if I thought using one bit to set the EbmlMaster
> property on an element was a good idea, it's not sufficient :
> some elements are allowed to have an undetermined size
> (size=all 1's). Matroska deals with this using the semantic
> linked to the elements because most elements can appear only
> at a given depth. This is specific to Matroska and not
> generally applicable because formats may want to have
> elements available at any level. These elements would
> occasionally be reported at the wrong level.

Yes, I hadn't thought about the 'undefined' size of EBML elements. This
feature was defined for streaming, where you don't know the size of the
Matroska Segment or Cluster in advance when writing to the stream. Right
now it's handled by the semantic. But it's true that it's not a clean
design from the EBML point of view. And if we find another cleaner
solution that would be great.

> Using an 'end-of-children' marker solves this problem. When it
> encounters this marker, which can be a standard EBML element,
> the parser knows that the parent is closed and that the next
> element is at the upper level.
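
The idea in the quoted paragraph can be sketched as follows. This is only
an illustration under stated assumptions, not an agreed design: the
end-of-children ID 0xEF is invented, the `MASTER_IDS` set stands in for
the proposed EbmlMaster bit, and all IDs and sizes are restricted to one
byte to keep the sketch short. An unknown size is the one-byte size 0xFF
(all value bits set).

```python
ID_END_OF_CHILDREN = 0xEF   # hypothetical marker ID, not allocated anywhere
MASTER_IDS = {0xA0, 0xA1}   # hypothetical; stands in for an EbmlMaster bit
UNKNOWN_SIZE = 0xFF         # one-byte size with every value bit set


def parse(buf, pos=0, end=None, open_ended=False):
    """Parse elements in buf[pos:end]; return (children, new_pos).

    Each child is (id, bytes) for data elements or (id, list) for masters.
    """
    if end is None:
        end = len(buf)
    children = []
    while pos < end:
        elem_id, size_byte = buf[pos], buf[pos + 1]
        pos += 2
        if elem_id == ID_END_OF_CHILDREN:
            if not open_ended:
                raise ValueError("end-of-children marker in a sized parent")
            # The parent is closed; the caller continues one level up.
            return children, pos + (size_byte & 0x7F)
        if elem_id in MASTER_IDS:
            if size_byte == UNKNOWN_SIZE:
                sub, pos = parse(buf, pos, end, open_ended=True)
            else:
                sub, pos = parse(buf, pos, pos + (size_byte & 0x7F))
            children.append((elem_id, sub))
        else:
            if size_byte == UNKNOWN_SIZE:
                raise ValueError("only masters may have an unknown size")
            size = size_byte & 0x7F
            children.append((elem_id, buf[pos:pos + size]))
            pos += size
    return children, pos


stream = bytes([
    0xA0, 0xFF,              # master A0, unknown size
    0x81, 0x81, 0x01,        #   data element 0x81
    0xA1, 0xFF,              #   master A1, unknown size
    0x82, 0x81, 0x02,        #     data element 0x82
    0xEF, 0x80,              #   end of A1
    0x83, 0x81, 0x03,        #   data element 0x83, back inside A0
    0xEF, 0x80,              # end of A0
    0x84, 0x81, 0x04,        # top-level data element 0x84
])
tree, consumed = parse(stream)
```

Here element 0x83 ends up as a child of A0 and 0x84 at the top level
without the parser knowing at which depth either ID is allowed, which is
the point of the marker.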

The 'end-of-children' (global EBML element) would only work with the
EBMLMaster bit, this way we can keep count of successive levels and mark
them as finished when needed. This element would only make sense for
'undefined sized' elements as the others already know their own size.

> I also insist on pulling in 'mixed elements', that is,
> elements with both sub-elements and data, to avoid the
> replication of the XML mistake that allows attributes. It
> would be very useful, if not essential to all extension
> formats that annotate their parent elements and need to
> be able to insert data inside any element. This includes
> namespace declarators (in my proposal).

I don't really get the point here. If you need extra data in an element,
you can add a child element. It can also be considered as an attribute.

> Also, An 'end-of-children' tag would be useful for mixed
> elements and to detect data-only (unparseable) elements.
> Elements that contain sub-elements and binary data would
> include this element as their last child, if necessary;
> and elements that contained only binary data would
> include it as their only child, again only if necessary.

As said above, it only makes sense for elements without a known size.
Otherwise the size implies the end.

> In original EBML, there is nothing to say "raw data starts
> here" so it must be possible to omit this tag in most cases.

In EBML it's very clear where the data are: [ID][size][data]
[ID] and [size] have known sizes when they are parsed. And as said above
each bit is already interpreted in a specific way.

> I already posted a word about this sometimes ago. I noticed
> that trying to interpret the data contained in EBML elements
> without knowing whether it was data or sub-elements was very
> successful. Raw data has a good chance to be interpreted as
> EBML elements that are way too big to fit in their parent
> element, and can therefore be identified as raw data.
> Only the data sets that would yield false positives should be
> prepended with the end-of-children element.
>
> However, using such a heuristic to determine the type of
> the contents of elements may incur computing overhead,
> but I believe it will not be too high.

I don't see the goal of this. You can't predict how to parse a file given
the data it contains. That would be a very bad design.

>> tag to avoid overhead. In this case it would be 3 octets for
>> each added namespace: the ID, the size of the namespace
>> (could be more than one if that element is an EbmlMaster),
>> the namespace value. Another option would be to use a new
>
> I do not exactly see how this should work. From what I can
> understand, there will be some elements in the header to link
> NS-keys (the URLs in XML namespaces) and NS-values (the value
> used to refer to the namespace). Then, inside the EBML content
> one would use some sort of wrapper tag to enclose all the
> elements which belong to another namespace. But the value
> of the namespace we want to refer to must be specified
> somewhere ? So there should be a second 3-byte element to
> set it ? Or is there no wrapper and all the elements
> following the NS-switching element are affected ?

In the EBML header you have something like:

EBML (Master)
* EBMLNamespaces (Master)
** EBMLNamespace (Master)
*** EBMLNamespaceID (String)
*** EBMLNamespaceValue (integer)

Then in the file you would have :
- EBMLNamespaceSwitch (Master)
-* EBMLNamespaceSwitchValue (integer)
-*

That's [ID][size][ID][size][integer][...]
So 5 bytes minimum.

If we define a new EBML type for Namespaces it would become :
- EBMLNamespaceSwitch (Namespace)
-*

The Namespace type would be like a Master but with the namespace value
(set in the EBML header) before the other elements.
[ID][size][value][...]
So 3 bytes minimum.

This new type could also include some of the features we want for
MatroskaNS.
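
The two layouts above can be compared byte for byte with a short Python
sketch. It only illustrates the arithmetic: the IDs 0xEA and 0xEB are
placeholders chosen for this example (no ID has been allocated for
EBMLNamespaceSwitch or EBMLNamespaceSwitchValue), the helper names are
made up, and IDs, sizes and the namespace value are all assumed to fit in
one byte.

```python
# Placeholder one-byte IDs, assumed for illustration only.
ID_NS_SWITCH = 0xEA        # stands for EBMLNamespaceSwitch
ID_NS_SWITCH_VALUE = 0xEB  # stands for EBMLNamespaceSwitchValue


def size_byte(n):
    """Encode a size below 127 as a one-byte EBML size."""
    if not 0 <= n < 127:
        raise ValueError("size does not fit in one byte")
    return bytes([0x80 | n])


def wrap_as_master(ns_value, payload):
    """[ID][size][ID][size][integer][...]: switch is a plain Master."""
    value_elem = bytes([ID_NS_SWITCH_VALUE]) + size_byte(1) + bytes([ns_value])
    body = value_elem + payload
    return bytes([ID_NS_SWITCH]) + size_byte(len(body)) + body


def wrap_as_namespace_type(ns_value, payload):
    """[ID][size][value][...]: the value sits directly before the children."""
    body = bytes([ns_value]) + payload
    return bytes([ID_NS_SWITCH]) + size_byte(len(body)) + body


# One foreign element (hypothetical ID 0x90) carrying a single data byte.
payload = bytes([0x90, 0x81, 0x2A])

as_master = wrap_as_master(1, payload)
as_ns_type = wrap_as_namespace_type(1, payload)

overhead_master = len(as_master) - len(payload)
overhead_ns_type = len(as_ns_type) - len(payload)
```

For this 3-byte payload the Master form costs 5 extra bytes and the
Namespace-type form 3, matching the minimums given above. Larger payloads
or namespace values would need longer size fields and push both numbers
up.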

> Anyway, for annotation purposes the proposal of a 'wrapper
> ns-switching' element works but is highly sub-optimal.
> For each lonely element from some foreign namespace one
> should include 5 extra bytes (at least) in the wrapper case
> (1 byte wrapper ID, 1 byte wrapper size, 1 byte
> ns-switch ID, 1 byte ns value size, 1 byte ns value), or
> 6 bytes in the switch-tag case (3 bytes to set the NS to
> the correct value before the element and 3 bytes after the
> element to reset the namespace to the previous value).

I'm sure that adding a byte for *all* elements will always be bigger than
adding 3 bytes here and there. I don't think having a lot of namespaces
mixed in one place of the format would occur a lot. If you're thinking
about annotation, then it should be at the expense of a big overhead. And
for such a purpose I think it's OK.

> Moreover, the NS-related tag(s) should still be global and
> therefore cannot be used by applications. To be able not to
> change namespaces too often, the EBML private tags (void,
> signature, etc...) should be left global. In my proposal I
> intended to give EBML its own namespace providing total
> isolation from other vocabularies, which I felt cleaner.

That's very tricky IMO. We still need some EBML global elements. And
losing some space for something that is mandatory anyway is not good IMO.
I think that both EBML and the DocType shouldn't require special namespace
handling. But if we have MatroskaNS and EBML v3 very incompatible with the
past, it may be a solution. We'd have to test different scenarios and see
which one gives the best result. Given that for MatroskaNS we might change
IDs, we could have a rule that fits your 'extra' bit usage scheme.

>> Seeking anywhere in the file (at the EBML level) is still a
>> problem (in all cases proposed) as we are unable to recover
>> the complete namespace context. It could only work for 0 or 1
>> namespace in the chain.
>
> The next design phase for a namespace-aware EBML is a generic
> seeking and indexing infrastructure. It is orthogonal to the
> namespace concern, yet it is essential to make EBML files
> editable without being required to understand all vocabularies
> used in one file.

Yes, that would definitely go in EBML v3. And that will probably be the
trickiest part of all.

>> Another problem if we don't have the notion of default
>> namespace is to seek in matroska (semantic level) because
>> that means the level 0 would need to be contained in the
>> default namespace, and therefore would need to be prepended
>> with that new ID. That means it's not backward compatible at all.
>
> Are you referring to the problem of knowing which namespace
> rules are to be applied at the root ? I don't really understand
> this.

Yes, or any element that can be seeked to directly (Level 0 and Level 1
elements in matroska). If the ID must be interpreted with data outside the
ID (as it is now) then seeking might not work.

> In the realm of scoping rules for namespaces, I've always had
> an XMLish approach, since I think namespaces in XML are done
> quite well. In XML the namespace is a property of the element
> object, and as such is included into the object. So are local
> namespace declarations that must be applied to a particular
> tag (xmlns:ns="").

If it's not too verbose (i.e. we make sure all IDs in MatroskaNS leave room
for 2 or 3 bits in the ID) that would be fine.

> Without focusing on the particular syntactic rules to write
> namespaces in the EBML elements, here are some ideas for my
> proposal :
>
> * Namespace declarations for an element can only be found into
>   the first child of that element, mainly for performance
>   purposes.

That's too complicated. Especially if the first child has a different
namespace than the parent...

> * Namespace declarations can be done in the following elements :
>   - A specific NS-declaration-holder element

That would be for EBML v2.

>   - The EBMLHead element (practical since it is the first
>     element below the root and therefore can be used to set
>     the namespace rules for the root).

That would be for EBML v2 too.

> * Namespace rules are inherited by children from their parent
>   element, but children can override the rules from their
>   parent with the appropriate declarations.

Sounds logical.

> * An empty NS-declaration-holder overrides the inherited rules
>   and discards the usage of namespaces, therefore allowing
>   only EBML elements (other ones are up to the host app).

I don't get that one.

> * To retain compatibility with the original EBML header,
>   namespace-aware files include an empty NS-declaration-holder
>   in their header, therefore making the header pure EBML
>   without namespaces.

I don't think this is necessary. The EBML version would provide that info.

> These are only preliminary ideas, and my thinking is not
> totally finished. Especially there is a slight theoretical
> problem in these five rules (will you find it ? ;) and I
> might need the already allocated 0x80 Class-ID...

I give up...

>> Now chaining namespaces may not be so clean. Imagine you have
>> a namespace for "comments" and a namespace for "signature".
>> You can put comments anywhere in your file (discarded in
>> matroska) and you can put signature anywhere in the file. But
>> what if you add a comment in a signature or a signature in a
>> comment ? How to interpret "signature::comment" or
>> "comment::signature" IDs ? In that case we only need to use
>> "comment" or "signature"... So is namespace chaining a
>> feature we want ? Or we always revert to the last seen
>> namespace ? (in which case seeking becomes easy)
>
> I think that the interpretation of the elements should be left
> to the various applications. Whether or not element X means the
> same thing when it is included in element A or element B is of
> no interest to the EBML parser IMO.
> Likewise, elements in NS A
> can contain elements from NS B, and we should not enforce that
> "A::B" has any special meaning. The processing should be
> something like :
> 1 the parser finds element X from NS A -> inform application A
>   that an X element was found.
>   then the app asks EBML to parse the contents of this element
> 2 the parser finds element Y from NS B -> inform application B
>   that an Y element was found. From now on it is up to app B to
>   decide what to do with this element. It can either :
>   * do some processing regardless of the context in which the Y
>     element was found
>   * inspect the parent elements from element Y, and take
>     specific action depending on the parent elements.

Actually I'm not sure how C++ does it, but for A::B::C::D the lookup could
try to find namespace A::B::C::, then if not found B::C::, and if not
found C::. That could be a logical solution and would allow overriding
cascading namespaces when needed.

> How the meaning of an element within other elements is handled
> may however affect the DTD language.
>
>> Yes, I've been dreaming about a DB that would be EBML based
>> or a file system. And given custom attributes are present in
>> modern file systems it could be an option. Seeking in a file
>> system would also be very important.
>
> I also thought of an EBML-based filesystem. It could become a
> very good extensible, general-purpose FS.

Yes, that would be nice. But designing a FS is a very specific and hard
task (look at how long it takes to make ReiserFS good). And the main
constraint is I/O speed, which might not be compatible with EBML. And
(de)fragmentation is probably the trickiest part... I'm still dreaming of
a filesystem where it would be possible to add content in the middle of a
file without rewriting the file from that point. But that would need an
API in the OS, which doesn't exist so far...
Steve -- robUx4 on blog From paul at msn.com Sun May 7 19:34:14 2006 From: paul at msn.com (Paul Bryson) Date: Sun, 07 May 2006 12:34:14 -0500 Subject: [Matroska-devel] Re: EBML Namespaces In-Reply-To: <445C8FDB.4060301@free.fr> References: <2684397F36DC8849A9BF842433B784360215D81C@GZI-VM01.cm-cic.fr> <445C8FDB.4060301@free.fr> Message-ID: Steve Lhomme wrote: > Yes. On the other hand Atamido's proposal (a global EBML element that > includes a namespace declaration the the related elements) would not > break backward compatibility. So while EBMLVersion would be increase, > EBMLReadVersion could remain the same. I prefer my solution as it doesn't introduce any major changes, but at the same time you have to acknowledge that there is significant overhead for any low bitrate data that you want to add in another namespace. IE, if you just want to add information in another namespace, it will require at least 16 bytes to do. That isn't much unless you just need to store a single byte. So if you are storing a single byte every few KB of data, it adds a lot to the overhead. > Unfortunately, as said above, adding bits anywhere in the element header > would be backward incompatible. So maybe she should improve EBML in 2 > versions. That would be EBML 1.1 (Atamido's extension) and EBML 2 (your > proposal). In EBML version IDs that would be EBMLVersion 2 with > EBMLReadVersion = 1 and EBMLVersion 3 (or more to leave improvement for > version 2) with EBMLReadVersion = 3. > > This way we can proceed in 2 phases. Make our best in EBMLVersion 2 and > then add the necessary backward incompatible changes in the other > version based on EBMLVersion 2. That will leave us some time to > gradually update parsers for namespaces in current parsers. I would think that it would be easier in the long run to introduce just a single major change than to introduce multiple steps like that. But, I'm not the one that has to code it. 
Atamido From steve.lhomme at free.fr Tue May 9 09:48:59 2006 From: steve.lhomme at free.fr (Steve Lhomme) Date: Tue, 09 May 2006 09:48:59 +0200 Subject: [Matroska-devel] Re: EBML Namespaces In-Reply-To: References: <2684397F36DC8849A9BF842433B784360215D81C@GZI-VM01.cm-cic.fr> <445C8FDB.4060301@free.fr> Message-ID: <4460496B.2070401@free.fr> Paul Bryson wrote: > Steve Lhomme wrote: >> Yes. On the other hand Atamido's proposal (a global EBML element that >> includes a namespace declaration the the related elements) would not >> break backward compatibility. So while EBMLVersion would be increase, >> EBMLReadVersion could remain the same. > > I prefer my solution as it doesn't introduce any major changes, but at > the same time you have to acknowledge that there is significant overhead > for any low bitrate data that you want to add in another namespace. IE, > if you just want to add information in another namespace, it will > require at least 16 bytes to do. That isn't much unless you just need > to store a single byte. So if you are storing a single byte every few > KB of data, it adds a lot to the overhead. Yes, that's why as a temporary version it's OK. But in the long term it's not. >> Unfortunately, as said above, adding bits anywhere in the element >> header would be backward incompatible. So maybe she should improve >> EBML in 2 versions. That would be EBML 1.1 (Atamido's extension) and >> EBML 2 (your proposal). In EBML version IDs that would be EBMLVersion >> 2 with EBMLReadVersion = 1 and EBMLVersion 3 (or more to leave >> improvement for version 2) with EBMLReadVersion = 3. >> >> This way we can proceed in 2 phases. Make our best in EBMLVersion 2 >> and then add the necessary backward incompatible changes in the other >> version based on EBMLVersion 2. That will leave us some time to >> gradually update parsers for namespaces in current parsers. 
> > I would think that it would be easier in the long run to introduce just > a single major change than to introduce multiple steps like that. But, > I'm not the one that has to code it. Yeah, we'll have to check if phase 1 is really necessary... I think it's good to have a middle step without breaking anything. But in the future that option might be useless and therefore shouldn't make it in the specs in the first place. As always, I'm just throwing my ideas and see what they turn into :) Steve -- robUx4 on blog From haessije at eps.e-i.com Tue May 9 14:36:07 2006 From: haessije at eps.e-i.com (HAESSIG Jean-Christophe) Date: Tue, 9 May 2006 14:36:07 +0200 Subject: [Matroska-devel] EBML Namespaces Message-ID: <2684397F36DC8849A9BF842433B784360215DC78@GZI-VM01.cm-cic.fr> > I've got very precise ideas on how the things could work now. This discussion is becoming more and more interesting. As I said before, I recently changed my mind about what the best place to store namespace information would be. Therefore, I would like to thank Steve and Atamido for participating to the debate and sharing their good ideas. Idea sharing enables us to find even better ideas, which I enjoy very much. As I said, I'll expand on how I see the namespace feature in the size descriptors, but before I'll answer to Steve's last post. > I don't think it would be OK to put elements from namespaces that were > not declared before in the file. That would be quite a PITA for the > parser if it would have to look for handler of namespace XYZ during I don't get for what reason this would be a problem. Perhaps you have a tricky case in mind ? To me, either the reading app understands a number of namespaces and doesn't try do do anything with the elements from unknown namespaces but ignore them. Or, the app is some kind of generic interpreter which dynamically loads libraries during the parsing. If reactivity is important, the loading can even be done in a separate thread. > parsing. 
This would better be done during the init of the EBML > file/stream reading. That is when declared in the header. That also > means we can find out about false (damaged) namespace headers. Is this really important ? I think if one is concerned with file integrity, he should use the generic CRC abilities of EBML. > Right now no parser would ignore extra bits put in the EBML header. The > specs were quite clear on how to intrepret each bits. There was no bit > left... Well, I tried to add some extra unspecified elements in the header of an mkv video, and the file still played well... With VLC, at least. > DocTypeVersion though) but they can be declared /invalid/ if they don't. Agreed. > Yes. On the other hand Atamido's proposal (a global EBML element that > includes a namespace declaration the the related elements) would not > break backward compatibility. So while EBMLVersion would be increase, > EBMLReadVersion could remain the same. Good point. > Unfortunately, as said above, adding bits anywhere in the element header > would be backward incompatible. So maybe she should improve EBML in 2 > versions. That would be EBML 1.1 (Atamido's extension) and EBML 2 (your > proposal). In EBML version IDs that would be EBMLVersion 2 with > EBMLReadVersion = 1 and EBMLVersion 3 (or more to leave improvement for > version 2) with EBMLReadVersion = 3. Your proposal of having files of a newer version readable by old parsers is interesting, and I think it can be done. However, I disagree if this means releasing 2 incompatible versions of EBML (1.1 and 2.0) using different mechanisms. If we had loads of file formats using EBML, I think I would go for it (but not happily) but it is not the case. The only format that we know of is Matroska, and if in the end the latest version is incompatible with v1 parsers there is no real benefit to release an intermediate version. > I don't really get the point here. If you need extra data in an element, > you can add a child element. 
It can also be considered as an attribute. My point is : imagine you have an integer element [ID][Size] For a more precise example : [9A][81][80] (Class Id 9A, Size 1 byte, value 0) Then imagine you need to attach a comment (e.g. "john wrote this") to this integer element. This would logically look like : [Value-ID][Size] [Comment-ID][Size] This feature is needed to support what I call "annotation". My original idea was to add something that would tell the parser that the comment element was the last child of the value element and therefore, that the raw integer data starts just after it. Now, some sort of wrapper could be used, as you suggest (this is what I understood when you said "If you need extra data in an element, you can add a child element". The example will become : [Value-ID][Size] [Comment-ID][Size] [Wrapper-ID][Size] This is quite clean, but I need to discuss it further because * It implies changes on the percieved structure of the file * It is somewhat related to Atamido's proposal 1. Changes to the perception of structure Using some wrapper element makes the integer value a child of the wrapper and the wrapper a child of the "value" element. This *has to* be handled by the EBML layer and to be invisible to the user of EBML APIs. The wrapper must remain a technical element that will not be visible and all its contents should be seen as children of the "value" element. As a side note, I want to add that there is a similar issue with the [EC] void element. Should this element be considered to be part of the data tree or just a technical placeholder not visible to higher level parsers ? 2. Link with Atamido's proposal As you also noted, an interesting point in this proposal is its "shielding" effect for contained data. If added in a v1 Matroska file, an element with an unknown class-id will be totally skipped, thus preserving the contained data from being wrongly interpreted. 
Aside from the use described before the wrapper can also be used to wrap children elements with v1-incompatible headers. It would then become possible to include them in a v1 file with the interesting effect of protecting them from unwanted reading by a v1 app. V2 readers, which would understand the wrapper as being a wrapper ;) would be able to properly parse them and apply extra namespace-related rules. As explained before, the elements contained in the wrapper should be seen as children of the parent element of the wrapper. This mechanism can be used in V2 files labelled as readable whith a V1 parser. It can't be used for annotation in these files. However, it can be used for that purpose in the wrapped and opaque parts of these files and in pure V2 files (only readable with a V2 parser). However, pure V2 files will have no real use of the wrapper element outside annotation. > I'm sure that adding a byte for *all* elements will always be bigger > than adding 3 bytes here and there. I don't think having a lot of > namespaces mixed in a place of the format would occur a lot. If you're > thinking about annotation, then it should be at the expense of a big > overhead. And for such a purpose I think it's OK. And what if the two objectives can be met simultaneously, i.e. being able to associate a specific namespace to every single element (e.g. regularly interleaving elements from 2 or 3 namespaces) AND only adding one byte here and there ? > > Moreover, the NS-related tag(s) should still be global and > > therefore cannot be used by applications. To be able not to > > change namespaces too often, the EBML private tags (void, > > signature, etc...) should be left global. In my proposal I > > intended to give EBML its own namespace providing total > > isolation from other vocabularies, which I felt cleaner. > > That's very tricky IMO. We still need some EBML global elements. And In my proposal EBML has its namespace, and no element is global. 
(except maybe the reserved FF, 7FFF, 3FFFFF, 1FFFFFFF... ) > losing some space for something that is mandatory anyway is not good > IMO. I think that both EBML and the DocType shouldn't require special > namespace handling. But if we have MatroskaNS and EBML v3 very I treat elements from the EBML namespace and elements from other namespaces equally. But there will be *no* backward-incompatible syntactic modifications to how EBML elements are represented and especially *no* incompatible modification to the header. This may sound weird, but here is the basic idea : * you have an EBML header : [1a][45][df][a3]<93> (19 bytes long) * the namespace is encoded in the value of the size header (this excludes the bits used to set the length of the VINT, that is the '0010011' binary string) * in this example, say the namespace for EBML is expressed as the bit string '00' According to these rules, that valid v1 EBML header is in the EBML namespace, by just doing nothing ;). Elements in other namespaces can use other codes, which will break compatibility with v1 parsers, since the decoded size will be wrong, but this is harmless, since no v1 parser would ever attempt to read such an element, either because the EBMLReadVersion would be set accordingly or the "shielding" elements would be used. The only plausible objection is : what if my header is 51 bytes long and I still need to encode the EBML NS with '00' ? You just need to encode the size on two bytes, like this : [40][33], which is, according to the spec, totally legal (and successfully tested). > > Are you referring to the problem of knowing which namespace > > rules are to be applied at the root ? I don't really understand > > this. > > Yes or any element that can be seeked to directly (Level 0 and Level 1 > elements in matroska). If the ID must be intrepreted with data outside > the ID (as it is now) then seeking might not work. We can not do much for Matroska v1. 
All programs that would like to add data with the mechanisms I described earlier would have to know how to handle Matroska, at least to maintain the existing cue heads. For an hypothetic MatroskaNS and for all formats interested in indexing and seeking, the new generic infrastructure will need to include for each indexed element : * an application-dependent key (which can be a time offset, coordinates, etc...) * a pointer or a file offset to the element * data to rebuild the context at the element (level and namespaces) > > Without focusing on the particular syntactic rules to write > > namespaces in the EBML elements, here are some ideas for my > > proposal : > > > > * Namespace declarations for an element can only be found into > > the first child of that element, mainly for performance > > purposes. > > That's too complicated. Especially if the first child has another > namespace as the parent... Well, this is no argument, it's an opinion. My opinon is that it isn't complicated enough to restrain from doing it ;). In fact, when examining an element, it is only needed to check the two first contained data bytes to see if there might be a local NS declaration. If not, processing stops there. If these bytes match a NS declaration, one would need to check if the element is from the EBML NS, which should take 8 bytes to read maximum. Next, if the two conditions are met, they should be recursively applied until no NS declaration is found and the NS declaration should be processed. Yes, silly file writers can recursively embed useless NS declarations, but as file sizes are limited, NS declaration depths are too. > > * An empty NS-declaration-holder overrides the inherited rules > > and discards the usage of namespaces, therefore allowing > > only EBML elements (other ones are up to the host app). > > I don't get that one. This just tells that no namespace-related computing should be done in the element. 
This element and its children can therefore only belong to the EBML namespace.

> > * To retain compatibility with the original EBML header,
> >   namespace-aware files include an empty NS-declaration-holder
> >   in their header, therefore making the header pure EBML
> >   without namespaces.
>
> I don't think this is necessary. The EBML version would provide that info.

Yes, if the default elements implied by v2 parsers are different from v1, this can be omitted.

> > These are only preliminary ideas, and my thinking is not
> > totally finished. Especially there is a slight theoretical
> > problem in these five rules (will you find it ? ;) and I
> > might need the already allocated 0x80 Class-ID...
>
> I give up ("ma langue au chat")...

The problem was the status of namespaces inside a namespace declaration element. Elements inside that element would obey namespace rules yet to be defined, which raises a chicken-and-egg problem, or "namespace no man's land" ;).

> Yes, that would be nice. but designing a FS is a very specific and hard
> time (look at how long it takes to make ReiserFS good). And the main
> constraints are I/O speed which might not be compatible with EBML. And
> (de)fragmentation is probably the trickiest part...

I/O speed is mainly achieved thanks to block alignment IMO. Of course it would be silly to design a FS where the block structure of disk devices wouldn't be taken into account. This is good to keep in mind for the future seeking infrastructure. It should allow offsets to be expressed as block counts. Block alignment requires placeholders. The [EC] element just fits in here. However, this element is at least 2 bytes long, and a placeholder for 1 byte only might be needed. This could be solved by using the reserved [FF].

Now, let's see in detail how I intend to use the size headers.

From the beginning, I thought it should be possible to express the namespace associated with an element in less than 1 byte.
I figured out that we could have something more generic than namespaces. I will later refer to this feature as "Short Property". The data that falls into that category are namespaces and the Master flag, but nothing prevents the addition of other properties.

To define which Short Properties are available in an element, the following elements are defined :

  Element-name   Valid-parents   Class-ID    Element-type
  SP             * Any *         [53][50]    sub-elements
  SPItem         EBML,SP         [C9]        sub-elements

The SP element can contain one declaration OR any number of SPItem elements. The SPItem element can only contain one declaration.

A declaration includes an element that contains the bit string inserted in the leftmost bits of the size header. This string is a key that identifies the SP declaration.

  Element-name   Valid-parents   Class-ID    Element-type
  SPCode         SP,SPItem       [C0]        Bit-image

The Bit-image type represents an arbitrary-length bit string. Each bit is encoded as a two-bit entry; for example, the string 0b001 is encoded 0b01011100 (0x5C). The first bit of the entry is the real value, the second bit is a flag that indicates whether the entry stands for a real bit (1) or is padding (0) to meet the required byte alignment.

The declaration also includes data that will be associated with that SP. Namespaces and the Master flag belong to this kind of data:

  Element-name   Valid-parents   Class-ID    Element-type
  EBMLNS         SP,SPItem       [B5]        string
  IsMaster       SP,SPItem       [AC]        void
  NotMaster      SP,SPItem       [DC]        void

For example, if a declaration includes the two elements:

  [C0]<81>|DC|                     (0b101)
  [B5]<91>"example-namespace"

and the following element is in the scope of this declaration:

  [CD]<DC>"some-element"

it would be seen as belonging to the "example-namespace" NS.

  The size header (0xDC) is 0b11011100
                              LNNNVVVV

The namespace (bits over NNN) is 101. The processing to get the size of the element is resumed on the remaining bits (1100), which translates to 12.
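As an illustration of the two encodings described above, here is a minimal Python sketch. It is not taken from any EBML specification or library: the function names encode_bit_image, decode_bit_image and split_size_header are made up for this example, and it assumes that the Short Property code sits in the leftmost value bits of the size header, right after the VINT length marker, as in the 0xDC example.

```python
def encode_bit_image(bits: str) -> bytes:
    """Encode a string of '0'/'1' characters as a Bit-image payload.

    Each bit becomes a two-bit entry: (value, 1). Unused entries in the
    last byte are padding entries (0, 0).
    """
    if any(b not in "01" for b in bits):
        raise ValueError("bits must contain only '0' and '1'")
    entries = [(int(b) << 1) | 1 for b in bits]
    entries += [0] * (-len(entries) % 4)      # pad to a whole number of bytes
    out = bytearray()
    for i in range(0, len(entries), 4):
        a, b, c, d = entries[i:i + 4]
        out.append((a << 6) | (b << 4) | (c << 2) | d)
    return bytes(out)


def decode_bit_image(data: bytes) -> str:
    """Decode a Bit-image payload back into a string of '0'/'1' characters."""
    bits = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            entry = (byte >> shift) & 0b11
            if entry & 0b01:                  # flag set: this is a real bit
                bits.append(str(entry >> 1))
    return "".join(bits)


def split_size_header(header: bytes, code_len: int):
    """Split a size header into (Short Property code, element size).

    The VINT length is read from the position of the first set bit of the
    first byte. The marker is dropped, the next code_len bits are returned
    as the code, and the remaining bits are the size.
    """
    first = header[0]
    length, mask = 1, 0x80
    while length <= 8 and not (first & mask):
        mask >>= 1
        length += 1
    if length > 8 or len(header) < length:
        raise ValueError("invalid or truncated size header")
    value_bits = 7 * length
    value = int.from_bytes(header[:length], "big") & ((1 << value_bits) - 1)
    size_bits = value_bits - code_len
    if size_bits < 0:
        raise ValueError("code is longer than the size header")
    code = format(value >> size_bits, f"0{code_len}b") if code_len else ""
    return code, value & ((1 << size_bits) - 1)


if __name__ == "__main__":
    print(encode_bit_image("001").hex())      # 5c
    print(decode_bit_image(b"\xdc"))          # 101
    print(split_size_header(b"\xdc", 3))      # ('101', 12)
```

In this sketch the code length is passed in by the caller; a real parser would instead have to match the leading bits against the codes declared in scope, which works only if those codes form a prefix code.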
For completeness, I include the two following elements:

  Element-name   Valid-parents   Class-ID    Element-type
  LastChild      SP,SPItem       [FC]        void
  NotLastChild   SP,SPItem       [8C]        void

These elements were used to mark the last child of an element before I considered the shielding wrapper thing. They are less useful now, but they illustrate the flexibility of the small property feature. We still need a way to end unbounded master elements and the "LastChild" property could be useful here. It could also be used as an end-of-file marker if included in the last child of the root.

Examples :

A : Standard EBML header that can be used with Matroska

  [1A][45][DF][A3]<9B>
    [C9]<82>                                   (1)
      [C0]<80>                                 (2)
    [42][86]<81>|82|
    [42][F7]<81>|81|
    [42][82]<88>"matroska"
    [42][87]<81>|81|

Comments:
1. This is a namespace declaration item. There can be more than one in
   the EBML element. This declaration will set the namespace rules for
   the root and all its children, the EBML element being one of them.
2. The declared code is empty. This disables namespaces in the described
   scope, so it is useless to specify any of the other elements
   (B5,AC,DC...)

B : EBML header for a file using namespaces. The header remains understandable for v1 parsers.

  [1A][45][DF][A3]
    [53][50]<82>                               (1)
      [C0]<80>                                 (2)
    [C9]                                       (3)
      [C0]<81>|50|                             (4)
      [B5]<9E>"ExtensibleBinaryMarkupLanguage"
    [C9]<99>
      [C0]<81>|74|                             (5)
      [B5]<92>"SomeOtherNamespace"
      [AC]<80>
    [C9]<99>
      [C0]<81>|7C|                             (6)
      [B5]<92>"SomeOtherNamespace"
      [DC]<80>
    [42][86]<81>|82|
    [42][F7]<81>|8x|                           (7)
    [42][82]<88>"abcdefgh"
    [42][87]<81>|81|

Comments:
1. This element sets the namespace rules for the EBML element.
2. Here, namespaces are disabled.
3. This element contains one namespace declaration. It applies to the
   root and all its children. However, these rules do not apply in the
   header because they are overridden by a more specific declaration (1).
4. The bit string '00' will be used (coded 01010000). It will represent
   the ns "ExtensibleBinaryMarkupLanguage".
5. The bit string '010' will be used (coded 01110100). It will represent
   the ns "SomeOtherNamespace". This code must be used for master
   elements.
6. The bit string '011' will be used (coded 01111100). It will represent
   the ns "SomeOtherNamespace" for non-master elements.
7. The header is readable by v1 parsers; the content of the file may or
   may not be.

C : Same header, but namespaces are activated inside the header

  [1A][45][DF][A3]<4078>                       (1)
    [C9]<4023>                                 (2)
      [C0]<81>|50|
      [B5]<9E>"ExtensibleBinaryMarkupLanguage"
    [C9]<99>                                   (3)
      [C0]<81>|74|
      [B5]<92>"SomeOtherNamespace"
      [AC]<80>
    [C9]<99>
      [C0]<81>|7C|
      [B5]<92>"SomeOtherNamespace"
      [DC]<80>
    [42][86]<81>|82|
    [42][F7]<81>|8x|
    [42][82]<88>"abcdefgh"
    [42][87]<81>|81|

Comments:
1. According to the following rules, the size header must belong to the
   EBML NS. Encoding the size as F8 (11111000) does not work, because
   the leftmost bits of the value must be '00'. Therefore, the size is
   4078 (0100000001111000), which is correct and still valid v1 EBML.
2. This declaration must also use a wider size.
3. This declaration is OK on 1 byte.

With this syntax, elements from other namespaces could be used inside the header, but these elements should be included in a wrapper for the header to remain readable for v1 parsers.

I'm still not 100% happy with the C0 element. Fiddling with bit strings is OK, but the probability of using elements from the same namespace as the parent element is high, therefore the codes should be adapted to be shorter, but this would get quite complicated ;)

JC

From haessije at eps.e-i.com Tue May 9 14:48:44 2006
From: haessije at eps.e-i.com (HAESSIG Jean-Christophe)
Date: Tue, 9 May 2006 14:48:44 +0200
Subject: [Matroska-devel] EBML Namespaces : my mistake
Message-ID: <2684397F36DC8849A9BF842433B784360215DC80@GZI-VM01.cm-cic.fr>

I just saw that I VINT-encoded the integer values, which is incorrect. Please replace these bits with the correct ones while you read.
Thanks,
JC

From kurtnoise at free.fr Thu May 11 10:12:28 2006
From: kurtnoise at free.fr (Kurtnoise)
Date: Thu, 11 May 2006 10:12:28 +0200
Subject: [Matroska-devel] Matroska QuickTime Component
Message-ID: <4462F1EC.7010201@free.fr>

Hi guys,

Just to tell you that there is now a Matroska QT component developed on SF by David Conrad : https://sourceforge.net/projects/matroskaqt/ . This is currently available via SVN.

But only for Mac OS for the moment...maybe a Windows port could be great ? What do you think about that ?

++
Kurtnoise

From unmei at matroska.org Thu May 11 18:42:44 2006
From: unmei at matroska.org (unmei)
Date: Thu, 11 May 2006 18:42:44 +0200
Subject: [Matroska-devel] Re: Matroska QuickTime Component
In-Reply-To: <4462F1EC.7010201@free.fr>
References: <4462F1EC.7010201@free.fr>
Message-ID:

Any additional Matroska-enabled application is of course good news. Esp. as I assume QuickTime on Mac has a large user/fanbase (on Windows however, IMO, one must be quite masochistic to use it - don't blame me, you asked about thoughts ;)

Kurtnoise wrote:
>
> But only for Mac OS for the moment...maybe a Windows port could be great
> ? What do you think about that ?
>

From mikokong at gmail.com Tue May 16 03:00:34 2006
From: mikokong at gmail.com (miko kong)
Date: Tue, 16 May 2006 09:00:34 +0800
Subject: [Matroska-devel] I want to continue the JEBML's writing function.
Message-ID:

I found that JEBML's writing function "addFrame" is not implemented; I want to continue it.

From chris at matroska.org Tue May 16 19:37:54 2006
From: chris at matroska.org (Christian HJ Wiesner)
Date: Tue, 16 May 2006 19:37:54 +0200
Subject: [Matroska-devel] I want to continue the JEBML's writing function.
In-Reply-To:
References:
Message-ID: <446A0DF2.3090309@matroska.org>

miko kong wrote:
> I found JEBML's writing function "addFrame" is not implemented, I want
> to continue.

Hi Miko,

you are welcome.
Just send the patches in, we will gladly upload them to our SVN. A bit lateron, we can also discuss about giving you direct writing access to our SVN, if we know each other a little bit better. Regards Christian matroska project admin From yann.renard.mailing-lists at tiscali.fr Wed May 17 15:14:01 2006 From: yann.renard.mailing-lists at tiscali.fr (Yann Renard) Date: Wed, 17 May 2006 15:14:01 +0200 Subject: [Matroska-devel] Using matroska as generic container Message-ID: <446B2199.2000508@tiscali.fr> Hello everyone, I'm new to matroska and my question may sound quite uncommon, so please apologize if I'm totally out of the topic. I'm currently working on a project that needs to manage multiple streams. The different streams will have to be muxed/demuxed sometimes, that's pretty sure. However, those streams have nothing related to audio nor video. Their content may be anything, ranging from discrete dated events to continuous dated data stream. My question is could and should matroska be used as a container for arbitrary streams ? Would you think this is a good idea ? The reason why I'm thinking to matroska is because it's known to be robust and very efficient as far as I know, and quite open to evolutions and maybe exotic use cases like this one. Thank you all for your information, Best regards, Yann Renard From mike at po.cs.msu.su Wed May 17 21:07:31 2006 From: mike at po.cs.msu.su (Mike Matsnev) Date: Wed, 17 May 2006 23:07:31 +0400 Subject: [Matroska-devel] Using matroska as generic container In-Reply-To: <446B2199.2000508@tiscali.fr> References: <446B2199.2000508@tiscali.fr> Message-ID: <446B7473.9050503@po.cs.msu.su> Yann Renard wrote: > My question is could and should matroska be used as a container for > arbitrary streams ? Would you think this is a good idea ? While matroska was designed mostly for A/V use, nothing prevents you from defining your own track types and codec IDs. You can use it for most timestamped data if it's what your application needs. 
From steve.lhomme at free.fr Thu May 18 09:10:38 2006 From: steve.lhomme at free.fr (Steve Lhomme) Date: Thu, 18 May 2006 09:10:38 +0200 Subject: [Matroska-devel] Using matroska as generic container In-Reply-To: <446B2199.2000508@tiscali.fr> References: <446B2199.2000508@tiscali.fr> Message-ID: <446C1DEE.8090308@free.fr> Yann Renard wrote: > Hello everyone, > > I'm new to matroska and my question may sound quite uncommon, so please > apologize if I'm totally out of the topic. > > I'm currently working on a project that needs to manage multiple > streams. The different streams will have to be muxed/demuxed sometimes, > that's pretty sure. However, those streams have nothing related to audio > nor video. Their content may be anything, ranging from discrete dated > events to continuous dated data stream. As long as you are using streams with timestamps it's fine. > My question is could and should matroska be used as a container for > arbitrary streams ? Would you think this is a good idea ? Yes and yes. It has always been our goal to allow anything to be put in Matroska, even scientific data that can have timestamps up to a nanosecond. > The reason why I'm thinking to matroska is because it's known to be > robust and very efficient as far as I know, and quite open to evolutions > and maybe exotic use cases like this one. Exactly. To do this I suggest you to have a look at mkvtoolnix to mux/demux your streams. Feel free to ask here or Mosu for tips about the code. For playback (if you need that too) you can ask Haali a copy of his DShow filter or extend the MKV demuxer in VLC. The latter uses libmatroska as mkvtoolnix so you'll be familiar with the code. 
Steve From yann.renard.mailing-lists at tiscali.fr Thu May 18 11:10:42 2006 From: yann.renard.mailing-lists at tiscali.fr (Yann Renard) Date: Thu, 18 May 2006 11:10:42 +0200 Subject: [Matroska-devel] Using matroska as generic container In-Reply-To: <446C1DEE.8090308@free.fr> References: <446B2199.2000508@tiscali.fr> <446C1DEE.8090308@free.fr> Message-ID: <446C3A12.3060909@tiscali.fr> Steve Lhomme wrote: > Yann Renard wrote: >> Hello everyone, >> >> I'm new to matroska and my question may sound quite uncommon, so >> please apologize if I'm totally out of the topic. >> >> I'm currently working on a project that needs to manage multiple >> streams. The different streams will have to be muxed/demuxed >> sometimes, that's pretty sure. However, those streams have nothing >> related to audio nor video. Their content may be anything, ranging >> from discrete dated events to continuous dated data stream. > > As long as you are using streams with timestamps it's fine. > >> My question is could and should matroska be used as a container for >> arbitrary streams ? Would you think this is a good idea ? > > Yes and yes. It has always been our goal to allow anything to be put in > Matroska, even scientific data that can have timestamps up to a nanosecond. > >> The reason why I'm thinking to matroska is because it's known to be >> robust and very efficient as far as I know, and quite open to >> evolutions and maybe exotic use cases like this one. > > Exactly. > > To do this I suggest you to have a look at mkvtoolnix to mux/demux your > streams. Feel free to ask here or Mosu for tips about the code. > > For playback (if you need that too) you can ask Haali a copy of his > DShow filter or extend the MKV demuxer in VLC. The latter uses > libmatroska as mkvtoolnix so you'll be familiar with the code. > > Steve Thank you very much for those infos !! 
Best regards, Yann From zeploum at gmail.com Fri May 19 12:44:35 2006 From: zeploum at gmail.com (Lionel Dricot) Date: Fri, 19 May 2006 12:44:35 +0200 Subject: [Matroska-devel] Multi-segments sample file ? In-Reply-To: <442196FF.6030308@po.cs.msu.su> References: <16a711710603220520n71a90726teb2d7347c0671b47@mail.gmail.com> <44215203.4080502@hrz.tu-chemnitz.de> <16a711710603221018l5f128e4cw8d4bbb9aae47c882@mail.gmail.com> <442196FF.6030308@po.cs.msu.su> Message-ID: <16a711710605190344y7424730dt3d1f4a704f3b8044@mail.gmail.com> > > But, even if I had the right to share it, a 4Go file is somewhat > > difficult to share. Has nobody a smaller file that he can send me ? > > > > I don't have Windows, so I cannot make my own file. It seems that > > tools for multi-segments files are windows only ATM. > No. They are produced by the same mkvmerge. But playback is indeed > supported on windows only. > Can someone point me on a documentation telling how to make a simple multi-segment file using mmg ? And is this possible to split an actual mutli-segment file using mmg ? I think it would be great if this support can be added to GStreamer. From insomniacdarling at yahoo.com Sun May 21 22:40:56 2006 From: insomniacdarling at yahoo.com (insomniac) Date: Sun, 21 May 2006 20:40:56 +0000 (UTC) Subject: [Matroska-devel] moving/renaming avis hindered by haali bundle? Message-ID: each time i right-click an avi to rename/move it i get 100% cpu usage by explorer.exe and i need to wait for like 20-30 sec. for it to calm down to perform the required action . sometimes it doesn't calm down at all and i need to restart the machine to get access to the file and still i have to wait a good number of seconds before explorer allows me to do anything with the file. it looks kind of like that notorious problem with shmedia.dll that explorer would load against an avi if a user watches it and then it becomes impossible to do anything with this file for a long long time. 
while i was using the stock windows avi splitter i would just disable shmedia in registry and that solved all of my problems. now that i'm using haali's filters same prob is back again. i've used them from the very beginning (end of 2004) and didn't experience this issue until about half a year ago (maybe some explorer integration features were added). and although i have all such explorer related features disabled in config dialog i still cannot get rid of this problem. it doesn't occur with haali bundle uninstalled. could you help Mike? thanks a lot in advance. From mike at po.cs.msu.su Mon May 22 01:58:53 2006 From: mike at po.cs.msu.su (Mike Matsnev) Date: Mon, 22 May 2006 03:58:53 +0400 Subject: [Matroska-devel] moving/renaming avis hindered by haali bundle? In-Reply-To: References: Message-ID: <4470FEBD.60200@po.cs.msu.su> insomniac wrote: > each time i right-click an avi to rename/move it i get 100% cpu usage by > explorer.exe and i need to wait for like 20-30 sec. for it to calm down to > perform the required action . sometimes it doesn't calm down at all and i need > to restart the machine to get access to the file and still i have to wait a > good number of seconds before explorer allows me to do anything with the file. > > it looks kind of like that notorious problem with shmedia.dll that explorer > would load against an avi if a user watches it and then it becomes impossible > to do anything with this file for a long long time. while i was using the > stock windows avi splitter i would just disable shmedia in registry and that > solved all of my problems. > > now that i'm using haali's filters same prob is back again. i've used them > from the very beginning (end of 2004) and didn't experience this issue until > about half a year ago (maybe some explorer integration features were added). > and although i have all such explorer related features disabled in config > dialog i still cannot get rid of this problem. 
> > it doesn't occur with haali bundle uninstalled. could you help Mike? thanks a > lot in advance. Did you enable thumbnail extraction in the splitter settings? It's off by default and it's the only thing that can cause such a delay. You can also uncheck 'Enable shell extension' when installing. From liaokai.cn at gmail.com Fri May 26 12:00:00 2006 From: liaokai.cn at gmail.com (Liao Carl) Date: Fri, 26 May 2006 18:00:00 +0800 Subject: [Matroska-devel] MKV GlobalTimecode() return messy data in VLC Message-ID: <8e2c5420605260300k67abe9c9l659275d242df3a17@mail.gmail.com> Hi All, I am using libebml-0.7.7 and libmatroska-0.8.0 under VLC-0.8.4a. The target platform is a MIPS Au1200 board. It cannot play back .MKV files properly. It seems that my MIPS version of KaxBlock::GlobalTimecode()/KaxSimpleBlock::GlobalTimecode() in VLC's mkv.cpp returns messy data, such as: [00000338] mkv demuxer: GlobalTimecode = 0x0 [00000338] mkv demuxer: GlobalTimecode = 0x0 [00000338] mkv demuxer: GlobalTimecode = 0x5f5e10000 [00000338] mkv demuxer: GlobalTimecode = 0x1f78a4000 [00000338] mkv demuxer: GlobalTimecode = 0x3fe56c000 [00000338] mkv demuxer: GlobalTimecode = 0xfffffffc2f700000 [00000338] mkv demuxer: GlobalTimecode = 0xfffffffca9820000 [00000338] mkv demuxer: GlobalTimecode = 0xfffffff8ab2b4000 [00000338] mkv demuxer: GlobalTimecode = 0xfffffffab1f7c000 [00000338] mkv demuxer: GlobalTimecode = 0x29f724240 By contrast, I can play back the same .MKV file properly under RedHat WS 4.0 using the same versions of libebml/libmatroska/VLC, where it dumps the following: [00000189] mkv demuxer: GlobalTimecode = 0x0 [00000189] mkv demuxer: GlobalTimecode = 0x0 [00000189] mkv demuxer: GlobalTimecode = 0x5f5e100 [00000189] mkv demuxer: GlobalTimecode = 0x1f78a40 [00000189] mkv demuxer: GlobalTimecode = 0x3fe56c0 [00000189] mkv demuxer: GlobalTimecode = 0xb71b000 [00000189] mkv demuxer: GlobalTimecode = 0xbebc200 [00000189] mkv demuxer: GlobalTimecode = 0x7ed6b40 [00000189] mkv demuxer: GlobalTimecode =
0x9f437c0 [00000189] mkv demuxer: GlobalTimecode = 0x11e1a300 I wonder whether there are special hardware dependencies for libebml to run under a MIPS platform. It seems that the GlobalTimecode is left-shifted by 8 bits and sometimes messed up under MIPS. My libebml CXX flags are: -O2 -Wall -Wno-unknown-pragmas -ansi -fno-gnu-keywords -Wshadow. And my libmatroska CXX flags are: -O2 -Wall -Wno-unknown-pragmas -ansi -fno-gnu-keywords -Wshadow -D_GNU_SOURCE Could you do me a favor and give me some hints? Thanks a lot! Carl From chris at matroska.org Fri May 26 21:52:52 2006 From: chris at matroska.org (Christian HJ Wiesner) Date: Fri, 26 May 2006 21:52:52 +0200 Subject: [Matroska-devel] FAAD2 / CoreAAC bug Message-ID: <44775C94.3080004@matroska.org> http://www.hydrogenaudio.org/forums/index.php?showtopic=44970&hl=mka+mkv+matroska From seelie at faireal.net Sat May 27 01:32:20 2006 From: seelie at faireal.net (Liisachan) Date: Fri, 26 May 2006 23:32:20 +0000 Subject: [Matroska-devel] FAAD2 / CoreAAC bug In-Reply-To: <44775C94.3080004@matroska.org> References: <44775C94.3080004@matroska.org> Message-ID: <44779004.50506@faireal.net> Christian HJ Wiesner wrote: > http://www.hydrogenaudio.org/forums/index.php?showtopic=44970&hl=mka+mkv+matroska > > _______________________________________________ > Matroska-devel mailing list > Matroska-devel at lists.matroska.org > http://lists.matroska.org/cgi-bin/mailman/listinfo/matroska-devel > Read Matroska-Devel on GMane: > http://dir.gmane.org/gmane.comp.multimedia.matroska.devel > Celtic_druid and I have good reasons to believe sbr_qmf.c revision 1.29 is buggy (can't decode some HE-AAC properly) and 1.28 is a good one. The situation is still unclear and hypothetical but the best guess so far is: menno tried to make it faster by changing 32-point DCT-IV + DST-IV to two DCT-IVs (simpler - more optimizable), but he forgot to adjust the signs. Just changing k -> 31-k is not enough. Signs should have been properly modified.
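For what it's worth, the general identity behind that last point can be checked numerically. The sketch below is plain Python and is not the FAAD2 code: the function names are made up for illustration, and the actual change between sbr_qmf.c 1.28 and 1.29 is not reproduced here. It only shows that a DST-IV can be computed with a DCT-IV if the output index is reversed (k -> N-1-k) and, in addition, the sign of every odd-numbered input sample is flipped. Reversing the index alone gives a different result.

```python
import math

N = 32  # transform size mentioned in the thread


def dct_iv(x):
    """Direct O(N^2) DCT-IV: X[k] = sum_n x[n] * cos(pi/N * (n+1/2) * (k+1/2))."""
    size = len(x)
    return [sum(x[n] * math.cos(math.pi / size * (n + 0.5) * (k + 0.5))
                for n in range(size))
            for k in range(size)]


def dst_iv(x):
    """Direct O(N^2) DST-IV: same as above with sin instead of cos."""
    size = len(x)
    return [sum(x[n] * math.sin(math.pi / size * (n + 0.5) * (k + 0.5))
                for n in range(size))
            for k in range(size)]


def dst_iv_via_dct_iv(x):
    """DST-IV built from a DCT-IV (hypothetical helper, illustration only).

    Flip the sign of odd-indexed inputs, run the DCT-IV,
    then reverse the output order (k -> N-1-k).
    """
    flipped = [-v if n % 2 else v for n, v in enumerate(x)]
    return dct_iv(flipped)[::-1]


def dst_iv_index_only(x):
    """Incomplete conversion: output index reversed, signs left untouched."""
    return dct_iv(x)[::-1]


def max_abs_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


# Arbitrary test signal with non-zero odd-indexed samples.
samples = [math.sin(0.3 * n) + 0.1 * n for n in range(N)]
reference = dst_iv(samples)

err_with_signs = max_abs_diff(reference, dst_iv_via_dct_iv(samples))
err_index_only = max_abs_diff(reference, dst_iv_index_only(samples))
```

With this test signal, err_with_signs stays at rounding-error level while err_index_only does not, which is consistent with the guess that an index change without the matching sign change would break decoding. Whether that is what actually happened in revision 1.29 is not established here.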
(I don't know what I'm talking about; I'm just repeating what I was told by someone who knows better.) There were also compiler-side problems in: http://ffdshow.faireal.net/mirror/Misc/CoreAAC-1.2.0.575.exe http://ffdshow.faireal.net/mirror/Misc/CoreAAC-1.2.0.575-rev.exe This is the best version so far, with the compiler-side problems fixed and the SBR code reverted to 1.28. http://ffdshow.faireal.net/mirror/Misc/CoreAAC-1.2.0.575-rev2.exe From jcsston at jory.info Sun May 28 17:43:23 2006 From: jcsston at jory.info (Jory Stone) Date: Sun, 28 May 2006 10:43:23 -0500 Subject: [Matroska-devel] IRC Down? Message-ID: <4479C51B.8030503@jory.info> Is anyone else having trouble connecting to the corecodec irc server? Thanks, Jory From chris at wiesneronline.net Sun May 28 23:54:22 2006 From: chris at wiesneronline.net (Christian HJ Wiesner) Date: Sun, 28 May 2006 23:54:22 +0200 Subject: [Matroska-devel] IRC Down? In-Reply-To: <4479C51B.8030503@jory.info> References: <4479C51B.8030503@jory.info> Message-ID: <447A1C0E.5040502@wiesneronline.net> Jory Stone wrote: >Is anyone else having trouble connecting to the corecodec irc server? >Thanks, >Jory > I am connected just fine, like everybody else ! Are you trying to connect to irc.corecodec.com or irc.corecodec.org ? Christian From paul at msn.com Mon May 29 04:22:06 2006 From: paul at msn.com (Paul Bryson) Date: Sun, 28 May 2006 21:22:06 -0500 Subject: [Matroska-devel] Re: IRC Down? In-Reply-To: <4479C51B.8030503@jory.info> References: <4479C51B.8030503@jory.info> Message-ID: Jory Stone wrote: > Is anyone else having trouble connecting to the corecodec irc server? Use the ever popular commo.corecodec-irc.net