ICANN/GNSO GNSO Email List Archives

[ga]


<<< Chronological Index >>>    <<< Thread Index >>>

[ga] Cary Karp's proposal for aggregated language tables

  • To: ga@xxxxxxxxxxxxxx
  • Subject: [ga] Cary Karp's proposal for aggregated language tables
  • From: Danny Younger <dannyyounger@xxxxxxxxx>
  • Date: Mon, 19 Dec 2005 06:01:32 -0800 (PST)
  • Sender: owner-ga@xxxxxxxxxxxxxx

A generally implementable statement about the IDN
requirements for a given language can only be made by
an agency that has detailed familiarity with that
language's orthography and with the IDNA protocols,
and that would be recognized without question as
an authoritative source of information about
the way the two interrelate. Agencies responsible for
both ccTLDs and gTLDs are equally likely to be able to
make knowledgeable statements about the requirements
of the languages in which they conduct their daily
business.  

There is, however, a greater likelihood that a gTLD
would need to accommodate registration in languages
that are distant both geographically and
linguistically from its base of operation. Obtaining
requisite assistance with such languages will rarely
be a matter of insurmountable difficulty. If, however,
the results were to be placed in the IANA Registry for
IDN Character Tables, their credibility would more
likely be questioned than would be the case with
equivalent tables contributed by ccTLDs with obvious
association with those languages.

The IANA registry has clear utility as a platform for
sharing the language expertise possessed by all of the
TLD registries, both supporting the development of
individual registry statements and providing a
credible means for their publication. Its potential in
this regard is, however, developing at what may not be
an adequate rate. 

In the absence of ideally suitable reference material in
the central repository, gTLDs are crafting their own
character tables as the needs of their target
communities require. Even if all such tables are
equally sound, they may differ in point of detail,
thus increasing the risk of general confusion. It is
perhaps in recognition of this that some tables in
actual use have not been placed in the registry.

However realistic it may be to expect the ccTLDs in
countries that share major language concerns to draft
unified character tables collaboratively, any such
action would obviate the need for some separate
gTLD action. In any case, a ccTLD normally has clearer
language nexus and can thus more readily make
generally authoritative statements about such things
as IDN character tables.

Anything that can be done to elicit more extensive
ccTLD involvement in the development of the IANA
registry would therefore be likely to hasten the
development of IDN. One factor that may be braking
this, and which is also directly relevant to gTLD
participation, is the apparent lack of provision in
the registry for tables based on language groups. The
ICANN Guidelines for the Implementation of
Internationalized Domain Names allow a TLD registry
to "associate each registered internationalized domain
name with one language or set of languages."

Since languages outnumber countries by about thirty to
one, national domains will commonly need to
accommodate more than one language. The lexical base
for many languages also includes a substantial number
of words borrowed from other languages that are
represented using characters external to the
repertoire native to the receiving language. The
degree of such overlap in languages used by adjacent
communities can be significant, and many potential
sources of confusion in their respective IDN support
can be revealed -- and thus more easily avoided -- by
including them in a single aggregated character table.

This highlights the need to take extreme care in
recognizing the difference between language and
script. Strict focus on the first of these factors is
fundamental to the policies relating to certain
language groups. In other situations, a shared script
may be both a natural and satisfactory basis for
establishing a homogeneous language set. The
alternative of maintaining separate tables for each
language belonging to such a group, and indicating the
associations between them via some external device
(another table?), can conceal rather than reveal
details that require particular note.

Danish, Norwegian and Swedish provide one example of
languages that might beneficially be presented in an
aggregated table. All three are extremely closely
related and use almost identical alphabets. The
differences between them are of potential relevance to
IDN policies and would readily be revealed in a shared
table. This reasoning could purposefully be extended
to a shared table, for example, for all of the
languages used in the EU member states that are
represented using the Latin Unicode tables.
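
As a rough illustration of how a shared table could
surface such differences, the sketch below merges three
per-language repertoires into one aggregated table and
lists the rows that are not common to the whole group.
The repertoires are an assumption made for the example:
they contain only the basic Latin letters plus the
extra letters of each national alphabet, and are not
the tables any registry has published. The function
names and the language tags are likewise hypothetical.

```python
import string

# Illustrative repertoires only (an assumption for this sketch): a-z plus
# the additional letters of each national alphabet. Real registry tables
# may permit more or fewer code points than these.
BASE = set(string.ascii_lowercase)
TABLES = {
    "da": BASE | {"\u00e6", "\u00f8", "\u00e5"},  # ae-ligature, o-slash, a-ring
    "no": BASE | {"\u00e6", "\u00f8", "\u00e5"},  # same three letters as Danish
    "sv": BASE | {"\u00e5", "\u00e4", "\u00f6"},  # a-ring, a-diaeresis, o-diaeresis
}


def aggregate(tables):
    """Map each character to the sorted list of languages that permit it."""
    merged = {}
    for lang, chars in tables.items():
        for ch in chars:
            merged.setdefault(ch, []).append(lang)
    return {ch: sorted(langs) for ch, langs in merged.items()}


def differences(merged, tables):
    """Keep only the rows that are not shared by every language in the group."""
    all_langs = sorted(tables)
    return {ch: langs for ch, langs in merged.items() if langs != all_langs}


merged = aggregate(TABLES)
diff = differences(merged, TABLES)

# Print the rows where the three languages diverge, ordered by code point.
for ch in sorted(diff):
    print(f"U+{ord(ch):04X}  {ch}  {','.join(diff[ch])}")
```

With separate per-language tables, the same comparison
would require an external step to line the tables up;
in the aggregated form each divergent character is a
single row that names the languages permitting it.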

Perhaps someone more familiar with the policies
underlying the IANA registry could comment on its
extensibility as sketched above, explaining either why
this is not a path along which we can proceed, or how
we might pick up the pace at which we can move onward.
http://forum.icann.org/lists/idn-discuss/msg00001.html



