[Matroska-devel] SimpleBlock payload header size: 4 or 6 bytes?

Daniel Neugebauer mailinglists at energiequant.de
Mon Dec 30 19:56:38 CET 2013


I wrote my own Matroska encoder from scratch following the official
specifications and ran into an interpretation issue with the SimpleBlock
header.

The specification says: (see
http://www.matroska.org/technical/specs/index.html#simpleblock_structure )

"Size = 1 + (1-8) + 4 + (4 + (4)) octets. So from 6 to 21 octets."

It's obvious that bytes 0..3 are mandatory, but what about 4..5? The
table says "Lace (when lacing bit is set)", yet the above calculation
implies that those bytes are mandatory. If lacing information can be
omitted when no lacing is used, I only count 4 bytes for the SimpleBlock
header.
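
A minimal sketch of the arithmetic behind the two readings. The constant
names are made up for illustration; the terms of the first two sums are
taken verbatim from the quoted formula, and the last sum is only my
reading of the table (1-byte track number, 2-byte timecode, 1 flags
byte, no lace bytes).

```cpp
// Terms copied from the quoted formula: 1 + (1-8) + 4 + (4 + (4)).
// The optional (4 + (4)) part is left out for the minimum.
constexpr int kSpecMinOctets = 1 + 1 + 4;
constexpr int kSpecMaxOctets = 1 + 8 + 4 + (4 + 4);

// My reading of the table when no lacing is used (an assumption):
// track number (1 byte for small numbers) + timecode (2) + flags (1).
constexpr int kHeaderNoLacing = 1 + 2 + 1;

static_assert(kSpecMinOctets == 6, "lower bound stated by the spec");
static_assert(kSpecMaxOctets == 21, "upper bound stated by the spec");
static_assert(kHeaderNoLacing == 4, "header size my encoder writes");
```

The 2-byte difference between the formula's minimum and my count is
exactly the size of bytes 4..5 in question.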

I discovered this because FFmpeg appears to expect a 4-byte header, not
6 bytes, and spat out error messages about invalid PCM audio packet
sizes (2 bytes instead of 4 bytes for 16-bit stereo). When I changed my
bytes 4..5 from 0x00 0x00 to 0x00 0b10000000 (valid EBML encoding for 0
lace length) I got crackling sound on playback, so FFmpeg indeed took
the lacing header for payload data. I don't see any errors when playing
back with FFmpeg if I remove the lacing header. Which interpretation of
the specs is correct, or how is the above size calculation to be
understood?
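
For reference, a minimal sketch of the EBML variable-length integer
encoding I am assuming when I call 0b10000000 an encoding of 0. The
function name encodeVint is hypothetical, and the sketch always picks
the shortest length while skipping the all-ones value of each length.

```cpp
#include <cstdint>
#include <vector>

// Encodes value as an EBML-style variable-length integer (1..8 bytes).
// An n-byte encoding has a marker bit at position 7*n followed by 7*n
// value bits. Returns an empty vector if the value does not fit.
std::vector<std::uint8_t> encodeVint(std::uint64_t value) {
    for (unsigned n = 1; n <= 8; ++n) {
        // Largest value of this length is skipped (all value bits set).
        const std::uint64_t limit = (std::uint64_t{1} << (7 * n)) - 1;
        if (value < limit) {
            const std::uint64_t marked = value | (std::uint64_t{1} << (7 * n));
            std::vector<std::uint8_t> out(n);
            for (unsigned i = 0; i < n; ++i) {
                // Emit the n bytes in big-endian order.
                out[i] = static_cast<std::uint8_t>(marked >> (8 * (n - 1 - i)));
            }
            return out;
        }
    }
    return {};
}
```

With this, encodeVint(0) yields the single byte 0x80, which is the
second of the two bytes I wrote at offsets 4..5.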

Who misread the specification, me or FFmpeg? Should the SimpleBlock
header on the EBML payload be 4 or 6 bytes long? If it's supposed to be
6 bytes long, what should that header look like? Or did someone combine
the EBML node header with the SimpleBlock payload header structure (one
byte element sequence + 1..8 bytes for coding the payload length)? In
that case, the above calculation starts to make sense, but the offsets
presented in the table do not, as they would always need to be shifted
by at least 2 bytes, which isn't mentioned anywhere.

It would be great if you could clear up my confusion and maybe update
the specification to make it clearer what that header should look like.

My code in C++: (EBMLTreeNode is used to wrap EBML structures)

void MatroskaEncoder::writeSimpleBlock(TimedPacket *timedPacket,
unsigned int timecodeRelativeToCluster, unsigned char trackNumber) {
    // SimpleBlock needs prefixed header
    unsigned long long len = timedPacket->dataLength + 4;
    unsigned char *out = new unsigned char[len];
    memcpy(out + 4, timedPacket->data, timedPacket->dataLength);

    // see:
    // track number encoded like EBML data size, thus prefixed MSB 1 for
    // 7-bit values
    out[0] = 0b10000000 | (trackNumber & 0b01111111);
    // timecode upper byte in BE
    out[1] = (unsigned char) ((timecodeRelativeToCluster >> 8) & 0xFF);
    // timecode lower byte in BE
    out[2] = (unsigned char)  (timecodeRelativeToCluster       & 0xFF);
    out[3] = 0b10000001; // keyframe, not invisible, no lacing, discardable
    //out[4] = 0x00; // no frames in lace
    //out[5] = 0b10000000; // lace size (EBML-encoded 0 because we use none)

    EBMLTreeNode *node = new
    node->setBinaryContent(out, len);
    unsigned char* simpleBlockContent = node->getOuterContent();
    fwrite(simpleBlockContent, node->getOuterSize(), 1, fh);
    delete simpleBlockContent;

    delete node; // also deletes out
}
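
The same 4-byte layout can be sketched without EBMLTreeNode or manual
memory management. This is only an illustration of the "no lace bytes"
reading, not a confirmed interpretation of the spec. The function name
buildSimpleBlockPayload is hypothetical, the track number is limited to
values that fit a 1-byte encoding, and the timecode is treated as a
signed 16-bit value.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Builds the SimpleBlock payload (header + frame data) without lacing.
// The EBML element ID and size around it are not written here.
std::vector<std::uint8_t> buildSimpleBlockPayload(
        std::uint8_t trackNumber,
        std::int16_t timecodeRelativeToCluster,
        std::uint8_t flags,
        const std::vector<std::uint8_t>& frame) {
    // Assumption: only track numbers that fit a 1-byte encoding.
    if (trackNumber > 126) {
        throw std::invalid_argument("track number needs more than 1 byte");
    }
    std::vector<std::uint8_t> out;
    out.reserve(4 + frame.size());
    // Byte 0: track number with the length marker bit set.
    out.push_back(static_cast<std::uint8_t>(0x80 | trackNumber));
    // Bytes 1..2: timecode relative to the cluster, big-endian.
    const auto tc = static_cast<std::uint16_t>(timecodeRelativeToCluster);
    out.push_back(static_cast<std::uint8_t>(tc >> 8));
    out.push_back(static_cast<std::uint8_t>(tc & 0xFF));
    // Byte 3: flags, e.g. 0b10000001 as in my code above.
    out.push_back(flags);
    // Frame data follows directly, with no lace bytes in between.
    out.insert(out.end(), frame.begin(), frame.end());
    return out;
}
```

For a 4-byte PCM frame this produces 8 bytes in total, with the audio
data starting at offset 4.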

