[Matroska-devel] EBML specification component for review - Variable size integer

Steve Lhomme slhomme at matroska.org
Sat May 2 17:06:32 CEST 2015


Also, the special case of infinite/unknown size (all x bits set to 1) is not
mentioned.
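
To make that special case concrete, here is a rough sketch of how a reader could detect it. The function name `is_unknown_size` is made up for illustration, and the sketch assumes the marker bit sits in the first byte (widths 1 to 8); it is not taken from any existing EBML library.

```python
def is_unknown_size(vint: bytes) -> bool:
    """Return True if every value bit (the x bits) of a size VINT is 1.

    Hypothetical helper; assumes the VINT_MARKER is in the first byte.
    """
    if not vint or vint[0] == 0:
        raise ValueError("no VINT_MARKER in the first byte")
    # Leading zero bits of the first byte, plus one, give the width in bytes.
    width = 9 - vint[0].bit_length()
    if len(vint) != width:
        raise ValueError("buffer length does not match the encoded width")
    data_bits = 7 * width  # each byte of width contributes 7 value bits
    mask = (1 << data_bits) - 1
    value = int.from_bytes(vint, "big") & mask  # drop width zeroes and marker
    return value == mask
```

With this, `is_unknown_size(b"\xff")` and `is_unknown_size(b"\x7f\xff")` both report the reserved pattern, while an ordinary size such as `b"\x81"` does not.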

On Sat, May 2, 2015 at 5:00 PM, Steve Lhomme <slhomme at matroska.org> wrote:

>
>
> On Thu, Apr 30, 2015 at 5:56 AM, Erik Piil <piil.erik at gmail.com> wrote:
>
>> This discussion relates to the “Variable size integer” portion of the
>> earlier EBML RFC Draft for revision/incorporation into the final EBML
>> specification.
>>
>>
>> From the RFC Draft:
>>
>>
>> Variable size integer
>>
>>
>> For both element ID and size descriptor EBML uses a variable size
>> integer, coded according to a schema similar to that of UTF-8 [UTF-8]
>> encoding. The variable size integer begins with zero or more zero bits to
>> define the width of the integer. Zero zeroes means a width of one byte, one
>> zero a width of two bytes etc. The zeroes are followed by a marker of one
>> set bit and then follows the actual integer data. The integer data consists
>> of alignment data and tail data. The alignment data together with the width
>> descriptor and the marker makes up one or more complete bytes. The tail
>> data is as many bytes as there were zeroes in the width descriptor, i.e.
>> width-1.
>>
>
> Although I know EBML, I don't fully understand the wording. It's very
> convoluted. From what I understand "alignment" is the 0 padding between the
> "width marker" and the "integer value". You may rename "integer data" to
> "integer value" by the way.
>
>
>> VINT = VINT_WIDTH VINT_MARKER VINT_DATA
>>
>> VINT_WIDTH = *%b0
>>
>> VINT_MARKER = %b1
>>
>> VINT_DATA = VINT_ALIGNMENT VINT_TAIL
>>
>> VINT_ALIGNMENT = *BIT
>>
>> VINT_TAIL = *BYTE
>>
>>
>> An alternate way of expressing this is the following definition, where
>> the width is the number of levels of expansion.
>>
>>
>> VINT = ( %b0 VINT 7BIT ) / ( %b1 7BIT )
>>
>>
>> Here are some examples of the encoding of integers of width 1 to 4. The
>> x's represent bits where the actual integer value would be stored.
>>
>>
>> Width  Size  Representation
>> 1      2^7   1xxx xxxx
>> 2      2^14  01xx xxxx xxxx xxxx
>> 3      2^21  001x xxxx xxxx xxxx xxxx xxxx
>> 4      2^28  0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
>>
>>
>> Any thoughts are most appreciated.
>>
>>
> The 0 padding should follow the 1 to show that the actual value is in the
> xxx, even though 00010 has the same value as 10 anyway. So the other
> alternative is not to mention padding at all; it's self-explanatory. But
> for a full spec that may not be good enough.
>
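
For what it's worth, the padding point can be illustrated with a small decoder sketch. The name `decode_vint` is hypothetical, and the sketch assumes the marker bit falls in the first byte (widths 1 to 8). It reads the width from the leading zero bits, then masks off the width zeroes and the marker so that only the x bits remain.

```python
from typing import Tuple


def decode_vint(buf: bytes) -> Tuple[int, int]:
    """Decode one VINT from the start of buf and return (width, value).

    Hypothetical helper; assumes the VINT_MARKER is in the first byte.
    """
    if not buf or buf[0] == 0:
        raise ValueError("no VINT_MARKER in the first byte")
    # Zero leading zeroes means width 1, one zero means width 2, and so on.
    width = 9 - buf[0].bit_length()
    if len(buf) < width:
        raise ValueError("truncated VINT")
    raw = int.from_bytes(buf[:width], "big")
    # Keep only the 7 * width value bits; this clears the marker bit.
    value = raw & ((1 << (7 * width)) - 1)
    return width, value
```

The same value 2 can be stored at several widths: `b"\x82"` (1000 0010), `b"\x40\x02"` and `b"\x10\x00\x00\x02"` all decode to 2, with widths 1, 2 and 4. The zero bits between the marker and the value are just leading zeroes of the integer, so a decoder needs no separate notion of padding.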



-- 
Steve Lhomme
Matroska association Chairman