Friday, October 30, 2009

finding the non-Latin URL or IDN

Here is the news - ICANN has approved Internationalized Domain Names (IDNs). According to this video released today, this could be considered the most significant change to the internet in - did he say 40 years???

Here is the much-proclaimed upside. People worldwide, and especially children, will be able to access the internet in their own script for the first time. This is an amazing surprise to me, since most scripts are input on the computer via the Latin alphabet in the first place. Try this Input Method Editor for Japanese. Input a meaningless string of Latin consonants and vowels to get a sense of how it works. You are typing Latin letters, don't you think, in order to create text in Japanese.

But here is the downside. Phishing. How will you know that these two sequences are made up of different codepoints? code and cοde - two separate sequences now available for domain names. If you don't think these two are composed of different codepoints, try putting each of them into Google and compare the results. Funny - they look the same.
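If you would rather check than squint, here is a minimal Python sketch that lists the codepoints in each string. It assumes the look-alike uses the Greek small letter omicron (U+03BF) in place of the Latin o, which is only my guess at the character involved. The helper names codepoints and to_ascii are my own, and to_ascii relies on Python's built-in "idna" codec, which follows the older IDNA 2003 rules and may differ from what a given registry or browser does.

```python
import unicodedata


def codepoints(text):
    """Return (character, 'U+XXXX', Unicode name) for each character in text."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


def to_ascii(label):
    """Encode one domain label with Python's built-in IDNA (2003) codec."""
    return label.encode("idna").decode("ascii")


latin = "code"           # four Latin letters
lookalike = "c\u03bfde"  # assumption: second letter is Greek omicron (U+03BF)

for word in (latin, lookalike):
    # ascii() keeps the output readable even on terminals without Unicode support
    print(ascii(word), [cp for _, cp, _ in codepoints(word)], to_ascii(word))

print(latin == lookalike)  # False: the strings differ even though they look alike
```

The all-Latin label passes through unchanged, while the mixed one comes out as an ASCII form starting with xn--, which is one way to spot that something other than plain Latin letters is in there.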

One thing you do not have to worry about is how you will input the required domain name, even if you are a monolingual speaker who only dabbles in foreign scripts. You can use an online Input Method Editor, or IME. Here are a few -

Russian and others

In a matter of minutes, I was able to recreate this Chinese word 龍 with the appropriate search results.

In any case, there are dozens of these online input doohickeys, so you don't really have to worry. However, do read the comments under this post and think about it. I don't know whether it is a good thing or not.

1 comment:

Ramesh said...

A good place to understand how URLs are processed by browsers is this:

Google > Browser Security Handbook.

Read the Browser Security Handbook, part 1, since it deals with how URLs are processed.

This blog deals with online security:

Google Online Security Blog.