[Matroska-devel] EBML specification component for review - Variable size integer

Steve Lhomme slhomme at matroska.org
Sat May 2 17:00:37 CEST 2015


On Thu, Apr 30, 2015 at 5:56 AM, Erik Piil <piil.erik at gmail.com> wrote:

> This discussion relates to the “Variable size integer” portion of the
> earlier EBML RFC Draft for revision/incorporation into the final EBML
> specification.
>
>
> From the RFC Draft:
>
>
> Variable size integer
>
>
> For both element ID and size descriptor EBML uses a variable size integer,
> coded according to a schema similar to that of UTF-8 [UTF-8] encoding. The
> variable size integer begins with zero or more zero bits to define the
> width of the integer. Zero zeroes means a width of one byte, one zero a
> width of two bytes etc. The zeroes are followed by a marker of one set bit
> and then follows the actual integer data. The integer data consists of
> alignment data and tail data. The alignment data together with the width
> descriptor and the marker makes up one or more complete bytes. The tail
> data is as many bytes as there were zeroes in the width descriptor, i.e.
> width-1.
>

Although I know EBML, I don't fully understand the wording. It's very
convoluted. From what I understand, "alignment" is the 0 padding between the
"width marker" and the "integer value". You may rename "integer data" to
"integer value", by the way.
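
To make that reading concrete, here is a minimal Python sketch of a decoder.
It is only an illustration of the draft text as I understand it, not
normative: the name decode_vint is made up for this example, and it assumes
a width of at most 8 bytes so that the marker always sits in the first byte.

```python
def decode_vint(data: bytes) -> tuple:
    """Return (width, value) for the variable size integer at the start of data.

    Hypothetical helper for illustration; assumes the width is 1 to 8 bytes.
    """
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first == 0:
        # No marker bit in the first byte: width > 8, outside this sketch.
        raise ValueError("widths above 8 bytes are not handled in this sketch")

    # VINT_WIDTH: count the zero bits that precede the marker.
    width = 1
    mask = 0x80
    while not first & mask:
        width += 1
        mask >>= 1

    if len(data) < width:
        raise ValueError("truncated input")

    # Drop the width bits and the marker; what remains in the first byte
    # is the high part of the integer value (padding zeros included).
    value = first & (mask - 1)

    # The tail is width-1 further bytes, most significant first.
    for byte in data[1:width]:
        value = (value << 8) | byte
    return width, value
```

For example, decode_vint(bytes([0x82])) and decode_vint(bytes([0x40, 0x02]))
give widths 1 and 2 respectively, with the value 2 in both cases.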


> VINT = VINT_WIDTH VINT_MARKER VINT_DATA
>
> VINT_WIDTH = *%b0
>
> VINT_MARKER = %b1
>
> VINT_DATA = VINT_ALIGNMENT VINT_TAIL
>
> VINT_ALIGNMENT = *BIT
>
> VINT_TAIL = *BYTE
>
>
> An alternate way of expressing this is the following definition, where the
> width is the number of levels of expansion.
>
>
> VINT = ( %b0 VINT 7BIT ) / ( %b1 7BIT )
>
>
> Here are some examples of the encoding of integers of width 1 to 4. The
> x's represent the bits where the actual integer value is stored.
>
>
> Width  Size  Representation
> 1      2^7   1xxx xxxx
> 2      2^14  01xx xxxx xxxx xxxx
> 3      2^21  001x xxxx xxxx xxxx xxxx xxxx
> 4      2^28  0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
>
>
> Any thoughts are most appreciated.
>

The 0 padding should follow the 1 to show that the actual value is in the
xxx, even though 00010 has the same value as 10 anyway. So the other
alternative is not to mention padding at all. It's self-explanatory. But
for a full spec that may not be good enough.
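
As an illustration of the padding point, here is a small Python sketch of an
encoder that takes the width as a parameter. The name encode_vint and the
limit of 8 bytes are assumptions made for this example, not something taken
from the draft.

```python
def encode_vint(value: int, width: int) -> bytes:
    """Encode value as a variable size integer of exactly width bytes.

    Hypothetical helper for illustration; assumes 1 <= width <= 8.
    """
    if not 1 <= width <= 8:
        raise ValueError("this sketch only handles widths of 1 to 8 bytes")
    if not 0 <= value < (1 << (7 * width)):
        raise ValueError("value does not fit in 7 * width bits")

    # width-1 zero bits, then the marker bit, leave 7 * width bits for the
    # value, so the marker sits at bit position 7 * width.
    marker = 1 << (7 * width)

    # Any bits between the marker and the significant bits of the value
    # stay zero: that is the padding, and it does not change the value.
    return (marker | value).to_bytes(width, "big")
```

With this sketch, encode_vint(2, 1) gives the single byte 0x82 and
encode_vint(2, 2) gives 0x40 0x02: the same value 2, with the second form
carrying extra zero padding after the marker.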