[Matroska-devel] matroska and process/history metadata tags

Dave Rice dave at avpreserve.com
Wed Jun 8 17:29:49 CEST 2011

Hi all,
	I've been working on using metadata tags to document a file's creation process in more detail (especially in environments where the source material is analog tape). I uploaded a sample MKV that uses these kinds of tags here: http://www.avpreserve.com/pres_metadata_sample_20110608.mkv. Official tags are in ALL CAPS, and unofficial tags for documenting creation-process metadata are lowercase. For now I added initial_source_timecode as a metadata tag so that frames within the MKV can be related to frames of the source analog tape, since there doesn't seem to be a better way to do this. Any feedback on the arrangement of the metadata tags or on this type of use of Matroska is appreciated.
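
As a rough illustration of how initial_source_timecode could be used, the sketch below maps a frame index in the MKV back to a timecode on the source tape. It assumes non-drop-frame timecode in HH:MM:SS:FF form and an integer frame rate (25 fps here). The function names are hypothetical and are not part of any Matroska tool.

```python
def parse_timecode(timecode: str, fps: int) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF timecode to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is out of range for {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def format_timecode(frame_count: int, fps: int) -> str:
    """Convert a frame count back to HH:MM:SS:FF, wrapping at 24 hours."""
    frame_count %= 24 * 3600 * fps
    total_seconds, frames = divmod(frame_count, fps)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def source_timecode(initial_source_timecode: str, frame_index: int, fps: int = 25) -> str:
    """Timecode on the source tape for the given zero-based frame of the MKV."""
    start = parse_timecode(initial_source_timecode, fps)
    return format_timecode(start + frame_index, fps)


# Hypothetical tag value: the tape started at one hour.
print(source_timecode("01:00:00:00", 1500))
```

Drop-frame timecode (as used with 29.97 fps NTSC material) would need extra handling that this sketch leaves out.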

Best Regards,

Dave Rice

On May 23, 2011, at 10:51 AM, Dave Rice wrote:

> Hi all,
> I'm working on a project that is considering the use of Matroska within a digital preservation environment. One of the goals is to make the files highly self-descriptive, noting their relationships to source material (in this case analog tapes digitized to Matroska files), technical aspects of the source material itself, and metadata about the process that created the Matroska file (which hardware and software were used for digitization, etc.).
> This metadata is useful when using the content. For instance, it would allow information on the source object, its condition, and the setup of the digitization process itself to be documented within the Matroska file, similar to how Broadcast Wave Format uses codingHistory.
> To help enable use of Matroska within video preservation and archival environments, we'd like to propose additional Matroska tags for the following two purposes:
> - Source Object Metadata: allowing more granular details about the source tape (beyond ORIGINAL_MEDIA_TYPE)
> - Process Metadata: allowing a description of the digitization process (especially relevant to working with non-file-based source material).
> To describe the source material, it seems the best use of current nested tags is:
> ORIGINAL_MEDIA_TYPE - to say "CD" or "Betacam"
> ORIGINAL_MEDIA_TYPE/BARCODE - to provide the barcode of the source media
> ORIGINAL_MEDIA_TYPE/LABEL - to document the labeling of the source media
> ORIGINAL_MEDIA_TYPE/DATE_RECORDED - date original source media was recorded, if known
> ORIGINAL_MEDIA_TYPE/MANUFACTURER - manufacturer of source media format
> Since the quality or condition of the source tape affects the resulting file and provides context, these fields could be added:
> ORIGINAL_MEDIA_TYPE/QUALITY - for qualitative statements about the conditions of the source media
> ORIGINAL_MEDIA_TYPE/QUALITY/TYPE (example: Physical, Picture, Audio, RF Level, Drop Out Activity)
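
The nested source-object tags above could be written out roughly as follows. This is only a sketch: it assumes the Tags/Tag/Targets/Simple/Name/String element layout of the XML tag files that mkvmerge reads, the QUALITY and QUALITY/TYPE names are the unofficial ones proposed in this message, and every value is a made-up example.

```python
import xml.etree.ElementTree as ET


def simple(parent, name, value=None):
    """Append a <Simple> element with a <Name> and optional <String>; return it."""
    node = ET.SubElement(parent, "Simple")
    ET.SubElement(node, "Name").text = name
    if value is not None:
        ET.SubElement(node, "String").text = value
    return node


def build_source_tags():
    """Build the nested ORIGINAL_MEDIA_TYPE structure as an XML tree."""
    tags = ET.Element("Tags")
    tag = ET.SubElement(tags, "Tag")
    # Empty Targets: no UIDs given, assumed here to mean "the whole file".
    ET.SubElement(tag, "Targets")
    media = simple(tag, "ORIGINAL_MEDIA_TYPE", "Betacam")
    simple(media, "BARCODE", "0000000000")          # hypothetical value
    simple(media, "LABEL", "Example tape label")    # hypothetical value
    simple(media, "DATE_RECORDED", "1989")          # hypothetical value
    simple(media, "MANUFACTURER", "Sony")           # hypothetical value
    quality = simple(media, "QUALITY", "Minor exterior damage")
    simple(quality, "TYPE", "Physical")
    return tags


print(ET.tostring(build_source_tags(), encoding="unicode"))
```

Nesting a Simple element inside another Simple element is what expresses the slash-separated paths used in the list above.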
> To document the process that created the file we started with a combination of the 'environment' and 'event' metadata structures from PREMIS (http://www.loc.gov/standards/premis/).
> EVENT="clean, transfer, signal process, edit, physical restoration, etc"
> EVENT/SOFTWARE - name of software used
> EVENT/SOFTWARE/ROLE - role of software, e.g. capture, edit, driver, operating system, etc
> EVENT/SOFTWARE/VERSION - version number of software
> EVENT/HARDWARE - name of hardware used, e.g. playback deck
> EVENT/HARDWARE/ROLE - role of hardware, e.g. analog-to-digital conversion, playback, processor, etc
> EVENT/HARDWARE/MANUFACTURER - manufacturer of hardware
> EVENT/HARDWARE/SERIAL_NO - identifier of hardware
> EVENT/TRANSFER_METHOD - protocol used to transfer audiovisual data, e.g. composite, component, ieee 1394
> EVENT/ACTIONS - description of actions performed within event, e.g. trim silence, normalize
> EVENT/COMMENTS - notes about the event, such as noted errors or atypical outcomes
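
To make the EVENT nesting concrete, here is a small sketch that models one event as a tree and prints it back in the slash-separated notation used above. The TagNode class and flatten function are hypothetical helpers, and the values are borrowed from the examples elsewhere in this message rather than from a real transfer.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TagNode:
    """One tag with a value and any number of nested child tags."""
    name: str
    value: str = ""
    children: List["TagNode"] = field(default_factory=list)


def flatten(node: TagNode, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (path, value) pairs, parent first, in document order."""
    path = f"{prefix}/{node.name}" if prefix else node.name
    yield path, node.value
    for child in node.children:
        yield from flatten(child, path)


# Illustrative event: one transfer using a playback deck and a capture card.
transfer = TagNode("EVENT", "transfer", [
    TagNode("SOFTWARE", "Media Express", [TagNode("ROLE", "capture")]),
    TagNode("HARDWARE", "SVO-5800", [
        TagNode("ROLE", "playback"),
        TagNode("MANUFACTURER", "Sony"),
        TagNode("SERIAL_NO", "Xyz1234"),
    ]),
    TagNode("HARDWARE", "Decklink Studio", [
        TagNode("ROLE", "analog-to-digital conversion"),
        TagNode("MANUFACTURER", "Blackmagic Design"),
    ]),
    TagNode("TRANSFER_METHOD", "component"),
])

for path, value in flatten(transfer):
    print(f"{path} = {value}")
```

Because each piece of hardware is its own subtree, an event can carry several HARDWARE or SOFTWARE entries without their ROLE or MANUFACTURER values getting mixed up, which is the main thing the nested layout buys over a flat list.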
> Another approach is to list more fields without the nested structure.
> SOURCE OBJECT CONDITION (Comments on the condition of the source and any preparation for playback of the source tape, if relevant): Tape had minor exterior damage. Tape was cleaned with an XYZ machine.
> DATE DIGITIZED (Date and time of the file creation process. Enter in ISO 8601 format): 2011-01-13T19:20:30-05:00
> DIGITIZATION EVENTS (Comments about the digitization process including notable time-based events, if relevant): Damage to the source tape prevented digitization of the first ~10 minutes of content
> ENCODED BY (Name of the technician or organization responsible for the encoding and file creation process): Vendor ABC, Inc.
> CAPTURE DEVICE SETTINGS (Describe the settings or configuration of the capture device)
> PLAYBACK DECK SETTINGS (Describe settings of the playback device)
> PLAYBACK DEVICE MANUFACTURER (Name of the company that manufactured the tape player used to play the tape): Sony
> PLAYBACK DEVICE MODEL (Model name and number of the tape player used to play the tape): SVO-5800
> PLAYBACK DEVICE SERIALNO (Serial number of the tape player used to play the tape): Xyz1234
> CAPTURE DEVICE MANUFACTURER (Name of the company that manufactured the A/D card used to capture the digital file): Blackmagic Design
> CAPTURE DEVICE MODEL (Model name and number of the A/D card used to capture the digital file): Decklink Studio
> CAPTURE DEVICE SERIALNO (Serial number of the A/D card used to capture the digital file): Abc1324
> CAPTURE DEVICE SOFTWARE (Software used to operate the A/D card): Media Express
> OPERATING SYSTEM (Name and version of the operating system running the computer used to scan and/or process the image): Ubuntu 10.10
> PLAYBACK SIGNAL PATH (Protocols used to transfer audiovisual data between the playback deck and the capture device): Component
> PROCESSING ACTIONS (Description of any actions performed during processing, such as trimming silence)
> SOURCE OBJECT TRACK CONFIGURATION (Describe the intended audio track presentation of the source object): Four track mono
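
For the DATE DIGITIZED field, a value in the ISO 8601 form shown above can be produced and read back with a standard library, for example as in this sketch. The date_digitized function is a hypothetical helper, and the fixed -05:00 offset is simply the one from the example value.

```python
from datetime import datetime, timedelta, timezone


def date_digitized(moment: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 with a UTC offset."""
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("DATE DIGITIZED needs a timezone-aware datetime")
    return moment.isoformat(timespec="seconds")


# Fixed offset of -05:00, matching the example value in the list above.
offset = timezone(timedelta(hours=-5))
stamp = date_digitized(datetime(2011, 1, 13, 19, 20, 30, tzinfo=offset))
print(stamp)

# The same string can be parsed back without losing the offset.
parsed = datetime.fromisoformat(stamp)
print(parsed.utcoffset())
```

Rejecting datetimes without an offset keeps the tag unambiguous when files move between institutions in different time zones.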
> We're sending these notes along for the consideration of the Matroska-devel group. If this initial draft and the scope of the tags used here look okay, then we can document them more formally with definitions and links to external authorities of these terms.
> Best Regards,
> Dave Rice
> avpreserve.com