[matroska-devel] Re: File ID / Track ID

Steve Lhomme steve.lhomme at free.fr
Mon Jan 20 14:45:11 CET 2003


In reply to ChristianHJW <christian.hj.wiesner at web.de>:
> Now, as we all know, it happens pretty often that there is more than
> one version of a file floating around on the internet: people mux
> different subtitles with it, remove subtitles, or will add covers and
> lyrics once matroska is available to do so. In any case, those mods
> would result in a new file ID number ( correct ? ), while the basic
> content ( = the main video stream ) would be exactly the same.

No, the current FileID is not based on the content anymore. It is just an ID
that should be made as unique as possible (i.e. as random and as large as
possible).

SegmentUID 2 [73][A4] - - - binary A unique ID to identify the current segment
between many others (128 bits).
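
A minimal sketch of producing such an ID, assuming Python's standard `secrets`
module as the random source. The function name `new_segment_uid` is hypothetical
and not part of any matroska library; the point is only that the value is
random and does not depend on the file content.

```python
import secrets

SEGMENT_UID_BYTES = 16  # 128 bits, as in the SegmentUID description above


def new_segment_uid() -> bytes:
    """Return 16 random bytes; nothing here is derived from the content."""
    return secrets.token_bytes(SEGMENT_UID_BYTES)


if __name__ == "__main__":
    # Remuxing the same video with other subtitles would simply draw a new value.
    print(new_segment_uid().hex())
```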

There is also an optional element for signing the content of an element
(anything from a Segment down to a Cluster), and that one is based on the
content.

SignatureSlot 1+ [1B][53][86][67] * - - sub-elements Contain signature of some
(coming) elements in the stream.

So, a Track could be signed this way. But note that the current system is not
fully defined. It uses a public key architecture but doesn't define who owns the
private key.
 
> A matroska ID would be best used if there were a central, independent
> database server hosting all ID numbers of VALID ( = no fakes ) files, such
> that a user can compare the ID indicated by the p2p app with the one
> given in the database for the specific movie he wants to have. If the 2
> IDs match he can download the file with ( almost ) no risk, unless the
> user faking the file did more than just rename it and also edited the ID
> number using spyder's or Pamel's XML/EBML tools. While this is of course
> well possible, it will certainly help to reduce the number of fakes
> floating around, as today the people faking files simply have to rename
> them ( every fool can do that ).
> 
> Now, the big problem I see with that is that pretty fast there will be a
> huge number of valid IDs for a certain movie, given the number of mods
> that are done based on a certain video stream ( different languages,
> subtitles, etc. ). This brought me to the idea of introducing another ID
> number, calculated specifically for the first video stream in each file,
> thus allowing us to identify the video stream itself AND NOT the file.

The bigger problem I see is: who has the authority to say whether a movie with
ID XYZ (a signature over some content, yet to be defined) is valid or not? The
guy making the fake? :(

> This ID should be calculated on the frames themselves, and not on the block
> headers, but of course only for a certain number of frames ( 10, 20,
> whatever ). To make sure the trailer at the beginning of the file ( the
> same may be used for many movies ) is not used for this ID, the MD5 of
> the COMPLETE blocks following block number 200 ( i.e. 201, 202, 203,
> 204, 205, etc. .. 210 ) should be calculated and stored in the
> track header as a unique identifier for this particular video stream. If
> the file is shorter than 200 blocks ( = frames ) then simply the last 10
> blocks can be used.

A working system should not have fixed values like this but should work in all
cases.
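
To make the objection concrete, the quoted proposal can be sketched roughly as
below, with the fixed values turned into parameters. This is only an
illustration: `video_stream_id`, `skip` and `count` are hypothetical names, the
frames are assumed to be available as raw payloads without block headers, and
the fallback is applied whenever fewer than `skip + count` frames exist (the
proposal only says "shorter than 200 blocks").

```python
import hashlib
from typing import Sequence


def video_stream_id(frames: Sequence[bytes], skip: int = 200, count: int = 10) -> str:
    """MD5 over the `count` frame payloads that follow the first `skip` frames.

    Falls back to the last `count` frames when the stream is too short
    (assumption: "too short" means fewer than skip + count frames).
    """
    if len(frames) >= skip + count:
        window = frames[skip:skip + count]  # frames 201..210 with the defaults
    else:
        window = frames[-count:]
    digest = hashlib.md5()
    for payload in window:
        digest.update(payload)
    return digest.hexdigest()
```

With these defaults, cutting the first 100 frames of a 300-frame stream moves a
different set of frames into the window, so the ID changes even though the
remaining video is untouched.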
 
> Of course, it is absolutely clear that when doing so a VALID video
> stream ( no fake ) will get a new ID number when a user decides to cut
> away the first 100 frames or so. This doesn't hurt much, the only problem

If the original file is split into separate segments every 100 frames, the user
could split a movie into many "signed" files. But the P2P system would have to
treat matroska in a special way, handling each segment and each signed element
separately. I don't see such a thing happening anytime soon (both peers have to
agree on the way to treat matroska files in such a system!).
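
A rough sketch of the per-segment idea, under several assumptions: a plain
SHA-256 digest stands in for the real signature (the SignatureSlot system uses
public keys and is not fully defined), frames are available as raw payloads,
and `segment_digests` and `frames_per_segment` are hypothetical names.

```python
import hashlib
from typing import List, Sequence


def segment_digests(frames: Sequence[bytes], frames_per_segment: int = 100) -> List[str]:
    """Return one digest per run of `frames_per_segment` consecutive frames."""
    digests = []
    for start in range(0, len(frames), frames_per_segment):
        h = hashlib.sha256()
        for payload in frames[start:start + frames_per_segment]:
            h.update(payload)
        digests.append(h.hexdigest())
    return digests
```

If a cut falls exactly on a segment boundary, the digests of the remaining
segments are unchanged, so those segments could still be matched individually.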
http://matroska.org


