From: John C Klensin <klensin@xxxxxxx>
Date: Sat 20 Jan 2007 14:17:14 GMT+01:00
To: Liz Williams <liz.williams@xxxxxxxxx>, "Vinton G. Cerf"
<vint@xxxxxxxxxx>, Alejandro Pisanty <apisan@xxxxxxxxxxxxxxxx>,
Subject: FWD: Re: [registrars] Request for Reserved Names WG
Just FYI. Please feel free to distribute further as you think
appropriate. Since ICANN technical staff has still, as far as I
know, not changed the "we just discard mail" feature, I don't
even know if this reached the registrar list.
From: John C Klensin <klensin@xxxxxxx>
Date: Sat 20 Jan 2007 14:11:11 GMT+01:00
Cc: Bruce Tonkin <Bruce.Tonkin@xxxxxxxxxxxxxxxxxx>, Registrars
Constituency <registrars@xxxxxxxxxxxxxx>, "Gomes, Chuck"
Subject: Re: [registrars] Request for Reserved Names WG Volunteers
--On Friday, 19 January, 2007 20:57 -0500 brunner@xxxxxxxxxxx
Your note on the subject of single octet labels was forwarded
to the ICANN RC mailing list, where I saw it.
When Donald, Bill and I wrote 2929 we also considered this in
the context of IANA Considerations for DNS Resource Records,
the RR NAME included.
That is certainly consistent with my recollection of discussions
at the time.
At the time, there were no ICANN considerations, and no IDN,
so the whole mistake of Unicode hadn't yet happened.
I no longer have the correspondence that circulated between
Donald and me in particular, but we had some discussion related
to the issue, circa 1999/2000, which is observable in the lack
of overspecification for the values of octets used to
construct labels (Section 3.3), and the care we took to
document the RR CLASS (Section 3.2).
If memory serves, Donald wanted to nail down the US-ASCII
values for each octet, and prevent any octet sequence other
than the iso3166 and c/n/o sequences from ever being available
I think I made the case for future extension that might use
values other than .-0-9A-Za-z, and that labels might be made
from sequences not in the two-octet (iso3166) or three-octet
(c/n/o et al) string spaces.
As you may recall from discussions about RFC2181 a few years
earlier, my personal position has been that the "recommended"
text about contents and structure of labels in 1035 should be
read as a requirement for RR types specified in 1034/1035 and as
more fuzzy only for new Classes and possibly new RR types within
Class=IN that specify different rules. That probably puts me,
again personally, closer to Donald than to you, but, as you
know, I've always been very conservative on this issue, and...
Going back much further, when 1123 relaxed the rule against
leading digits, which was originally instituted to prevent
confusion with host numbers (later IP addresses), there was some
discussion about whether to explicitly continue to ban all-digit
labels.
Those of us who don't like look-ahead when it can be avoided
would have preferred to retain the prohibition; others felt that
looking ahead for four putative labels to see if any non-digit
characters appeared was reasonable and that the prohibition was
not needed. I can't recall whether there was actually consensus
around the latter or whether we just gave up and omitted an
explicit rule, but 1123 did end up with an aside about numeric
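
The look-ahead described above can be illustrated with a minimal
sketch. It is not the text of 1123 or any particular resolver's
logic; the function name is hypothetical, and the sketch only checks
the "four all-digit labels" shape without range-checking the values.

```python
def looks_like_dotted_quad(name: str) -> bool:
    """Return True if `name` is four dot-separated, all-digit parts.

    Hypothetical helper: it scans every putative label before deciding,
    which is the look-ahead the discussion refers to. It does not check
    that each part is in the range 0-255.
    """
    parts = name.split(".")
    # An empty part (e.g. from "1..2.3") fails isdigit() and is rejected.
    return len(parts) == 4 and all(p.isascii() and p.isdigit() for p in parts)


if __name__ == "__main__":
    for candidate in ("10.0.0.1", "10.0.0.example", "3com.com"):
        print(candidate, looks_like_dotted_quad(candidate))
```

With a ban on all-digit labels, a parser could classify a name from
its last label alone; without it, the whole string has to be scanned
as above.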
When 1591 was finally written to make fairly long-term practice
explicit and more easily available, we discussed TLD naming and
whether to include specific rules there. At that time, Jon
considered the odds of adding new TLDs, other than those in the
3166 set, to be very low and, if I recall, didn't want to
clutter up the text with contingencies, but there was general
(1) there would be no single-letter TLD names
(2) there would be no TLD names with numeric components
(unless ISO introduced them into the 3166 space) and
certainly no TLDs starting in digits.
(3) all (additional) two-letter TLD names would be
derived from the primary 3166 list, with no further
nonsense about "certainly going to be approved",
(4) any additional gTLDs would have three-character names
(5) four-character gTLD names were to be exclusively
"ARPA". I suppose that, in principle, something else
connected to a distinct network, might have been created
with a four-character name, but we certainly didn't
spend time talking about it. By "another network", I'm
not referring to topology or connectivity or protocols
(as you will recall, ".uucp" and, for that matter,
".csnet" and ".bitnet", were in active use at the time,
but as notation within those networks, not in the DNS
root), but something really different... I guess I'm
having visions of ".MARS" for an infrastructure domain
for a network on that planet.
(6) there were to have been _no_ TLD names with five or
These kinds of firm naming rules aid users and application
software in detecting nonsense, producing good presentation
forms and error messages, and preventing repeated queries for
the impossible. Whether those benefits are sufficiently
important to overcome the desire to commercialize one
seemingly-attractive string or another is a balance that ICANN
needs to figure out how to resolve, preferably after considering
the needs of the widest range of materially concerned parties
possible rather than only short or medium-term commercialization
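
As a rough illustration of how application software might have
encoded rules (1) through (5) listed above, here is a minimal sketch.
The function name and the caller-supplied set of 3166 codes are
assumptions made for the example. Rule (6) is cut off in the text, so
the sketch deliberately returns no verdict for longer labels rather
than guessing.

```python
from typing import Optional, Set


def plausible_tld(label: str, iso3166_alpha2: Set[str]) -> Optional[bool]:
    """Check a candidate TLD against informal rules (1)-(5).

    Hypothetical helper. `iso3166_alpha2` is a caller-supplied set of
    lower-case two-letter codes; no real list is embedded here.
    Returns None for labels of five or more characters, because the
    corresponding rule is truncated in the text.
    """
    tld = label.lower()
    # Rule (2): no numeric components. This sketch also requires ASCII letters.
    if not (tld.isascii() and tld.isalpha()):
        return False
    n = len(tld)
    if n == 1:
        return False                  # rule (1): no single-letter TLDs
    if n == 2:
        return tld in iso3166_alpha2  # rule (3): two letters come from 3166
    if n == 3:
        return True                   # rule (4): gTLDs have three characters
    if n == 4:
        return tld == "arpa"          # rule (5): four characters means ARPA
    return None                       # rule (6): not recoverable from the text


if __name__ == "__main__":
    sample_codes = {"us", "mx", "au"}  # illustrative subset only
    for candidate in ("a", "US", "zz", "com", "arpa", "info", "c0m"):
        print(candidate, plausible_tld(candidate, sample_codes))
```

Software that hard-coded a check of this kind would reject a
four-character name other than "arpa", which is the sort of
accessibility problem the rules were expected to avoid.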
Of course ICANN blew the last two rules off in 2000, with no
discussion at all. Suggestions that ICANN warn applicants that
there was applications software that incorporated those rules
and hence might make their domains inaccessible were rejected by
Which begs the question of what John's intent was, or whether
I then, or ever, have had the smallest part of a clue.
Of course, at that point in time, we weren't trying to see
characters through the blinders of font fanatics, so phishing
was limited to il1 and o0O bits of nonsense which policy, at
either the registry, or the registrar or the zone file
publishers, could have (and at times did) interposed upon.
Exactly. As has been pointed out many times, the
"confusability" problem introduced by IDNs is, at one level, no
different from the ones we have had, in principle, with ASCII
since the dawn of the hostname table. However, addition of
hundreds or thousands of new "opportunities" with IDNs arguably
changes the problem from one about which those involved needed
to exercise moderate caution to one that no user is able to
fully understand and protect against, making an extremely strong
case for some precautions in the registration/zone-entry process
lest we essentially put referential integrity at the mercy of
the phishers and other Bad Guys.
And, as has also been noted multiple times (albeit without much
effective success), IDNs --especially any ideas or plans about
IDNs at the top level-- also require that we go back and
reinterpret and adjust all sorts of rules and conventions that
assumed ASCII, LDH, labels. The two most obvious examples
involve the questions of what goes into domain registration
("whois") records and how they are indexed and whether such
"reserved name" rules as the restriction against
single-character names apply to the presentation form or to what
is stored in the DNS. The latter question is what makes the
single-character prohibition particularly important at this time:
if the motivation for that prohibition really had been an idea
about future expansion, then IDNs would be irrelevant since the
form stored in the DNS is always more than four characters long.
On the other hand, since the prohibition was imposed because of
human factors and retrieval considerations, it should probably
apply to the presentation ("native character", "Unicode") form
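
The gap between the presentation form and the stored form can be seen
in a small sketch. It assumes Python's built-in "idna" codec, which
implements the 2003 IDNA rules; the helper name is hypothetical, and
the example only demonstrates the length difference, not any
registry's policy.

```python
def stored_form(label: str) -> bytes:
    """Return the ASCII form of a single label as it would sit in the DNS.

    Hypothetical helper built on Python's built-in "idna" codec.
    """
    return label.encode("idna")


if __name__ == "__main__":
    presentation = "\u00e9"  # one native character: e with acute accent
    stored = stored_form(presentation)
    # One character on screen, but the stored label carries the "xn--"
    # prefix and is therefore always longer than four characters.
    print(len(presentation), len(stored), stored.startswith(b"xn--"))
    # A plain ASCII label is stored unchanged.
    print(stored_form("a"))
```

A length rule applied to the stored form would therefore never catch
a single-character IDN label, while the same rule applied to the
presentation form would.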