[Matroska-devel] [Cellar] using ISO 639-3 language codes in Matroska

Jerome Martinez jerome at mediaarea.net
Wed Jan 13 08:40:36 CET 2016


On 13/01/2016 01:15, Dave Rice wrote:
>> On Jan 12, 2016, at 8:28 AM, Moritz Bunkus <moritz at bunkus.org> wrote:
> […]
>
>> Problem is I don't know the best way to do this. I see three possible
>> avenues each with their own sets of pros and cons, and I'd like some
>> feedback in order to turn this into a proper proposal:
>>
>> 1. Change the specs so that all language elements use 639-3 codes
>>
>> 2. Introduce new elements on the same level as the existing language
>>    elements that determine the standard the corresponding language
>>    element uses defaulting to 639-2 if missing
>>
>> 3. Introduce new elements on the same level as the existing language
>>    elements that contain a 639-3 code
> I’d vote for #2 or #1, with a preference for #2. If option #1 is chosen, I’d suggest that the language elements may use 639-2 OR 639-3. I’d also suggest that the Matroska specification adopt an externally managed language authority (639-3 is the most obvious choice)

RFC 5646 extends ISO 639-3, so it is no less obvious a choice.
Additionally, even if it is not used in practice, the Matroska spec
already has a kind of RFC 5646 style (country codes), so it may be
relevant.
What are the arguments against using 639-3 "extended" with RFC 5646?
It would be less complex; the definition could be:
Language codes can be either the three-letter ISO-639-3 form (like "fre"
for French), or such a language code followed by a dash and a country
code for regional variants of a language (like "fre-ca" for Canadian
French). Country codes are the same as used for internet domains.
("bibliographic ISO-639-2" replaced by "ISO-639-3" in the current spec)
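
To make the shape of that definition concrete, here is a rough sketch of
a syntax check. It is only an illustration, not something from the spec:
the name parse_language is hypothetical, and it assumes lowercase ASCII
letters and two-letter country codes in the style of country-code
internet domains. It checks the form only, not whether the code is
actually registered in ISO 639-3.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: three lowercase letters, optionally followed by "-" and a
# two-letter country code. Registry membership is deliberately not checked.
_LANGUAGE_RE = re.compile(r"([a-z]{3})(?:-([a-z]{2}))?")


class LanguageTag(NamedTuple):
    language: str           # three-letter language code
    country: Optional[str]  # two-letter country code, or None if absent


def parse_language(value: str) -> LanguageTag:
    """Split a language element value into (language, country)."""
    match = _LANGUAGE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a three-letter code with optional country: {value!r}")
    return LanguageTag(match.group(1), match.group(2))
```

With this check, "fra" and "fra-ca" are accepted, while an RFC 5646 tag
such as "zh-HK" is rejected because its language part has two letters.
Note that "fre" is the ISO 639-2 bibliographic code for French; ISO 639-3
lists French as "fra".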

>   and not labor to extend language identification beyond the adopted standard(s). For instance, if there is a noticed deficiency in ISO 639-3, I think that the deficiency is most likely of shared interest to other projects and that addressing it would be better directed to http://www-01.sil.org/iso639-3/.

I provided one example (zh-HK), but I guess there are more, and
change requests are a long process.
We may need a "workaround" in the meantime.
We could say that ISO 639-3 SHOULD be used, and that RFC 5646 is used
only if the up-to-date ISO 639-3 does not fit the need.
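
A minimal sketch of how a muxer might apply that rule, under stated
assumptions: choose_language_value is a hypothetical helper, and the
caller is assumed to pass None when no ISO 639-3 code fits the content.
How "does not fit" is decided is left open here.

```python
from typing import Optional, Tuple


def choose_language_value(iso639_3: Optional[str], rfc5646: str) -> Tuple[str, str]:
    """Return (standard, value), preferring ISO 639-3 when a code fits."""
    if iso639_3 is not None:
        # A fitting ISO 639-3 code exists, so it SHOULD be used.
        return ("ISO 639-3", iso639_3)
    # Workaround path: fall back to the RFC 5646 tag.
    return ("RFC 5646", rfc5646)
```

For the zh-HK case, choose_language_value(None, "zh-HK") falls back to
the RFC 5646 tag, while choose_language_value("fra", "fr") keeps the
ISO 639-3 code.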

Jérôme

