[PunyCode]中文網址取得中文 SubDomain, Domain

當使用中文網址時,例如:小喵.姍舞之間凝聚.net。如何取得中文的子網域名稱(小喵),與網域名稱(姍舞之間凝聚.net),這篇會應用到中文網址的編碼 PunyCode,讓我們看下去

緣起

小喵會寫這篇,是因為一個網友在藍色小舖中詢問了使用中文網址的問題
http://www.blueshop.com.tw/board/FUM20041006161839LRJ/BRD20160104215734ALP.html
小喵於是動手來測試看看,把過程記錄下來,提供大家參考

PunyCode

中文網址在運作時,實際上會由瀏覽器轉碼為PunyCode,然後再進行後續連係,有關PunyCode的部分,可以參考一下維基百科上的內容
https://zh.wikipedia.org/wiki/Punycode
例如:
中文網址:小喵.姍舞之間凝聚.net
PunyCode:xn--i2rr1m.xn--9iqw0ge4i7u5br3b6z0c.net

中文網址測試環境建立

要測試這樣的問題,首先要在自己的機器上建立一個可以使用中文網址的環境。最簡單的方式就是透過 hosts 來處理,檔案位置在【c:\windows\System32\drivers\etc】,將準備的網址填上,指向 127.0.0.1 ,這樣就能夠在本機IIS測試時,用指定的中文網址來替代 localhost。填上需要填 PunyCode轉換後的網址,填上中文會沒有作用。

127.0.0.1       localhost
127.0.0.1	xn--i2rr1m.xn--9iqw0ge4i7u5br3b6z0c.net

PunyCode轉換工具程式

環境有了,接下來就來撰寫,由於需要進行 PunyCode 與 中文 的轉換,所以小喵必須準備一段轉換的程式碼,將他寫成一個類別,未來可以放到專案中,由其他需要的人引用。小喵參考網路上一篇討論,裡面有些小錯誤(Domain轉換的部分,他少處理非中文的部分),小喵順手將相關問題補好。

新增一個類別,命名為:UtilPunyCode,相關程式如下:

Imports Microsoft.VisualBasic
Imports System.Collections.Generic
Imports System.Text
Imports System.Runtime.InteropServices


''' <summary>
''' PunyCode轉換
''' 參考來源:http://bbs.csdn.net/topics/390110788
''' </summary>
''' <remarks></remarks>
Public Class UtilPunyCode
    Shared TMIN As Integer = 1
    Shared TMAX As Integer = 26
    Shared BASE As Integer = 36
    Shared INITIAL_N As Integer = 128
    Shared INITIAL_BIAS As Integer = 72
    Shared DAMP As Integer = 700
    Shared SKEW As Integer = 38
    Shared DELIMITER As Char = "-"c

    Public Function EncodingDomain(domain As String) As String
        domain = domain.Replace("。", ".")
        Dim domainArray As String() = domain.Split(New String() {"."}, StringSplitOptions.None)
        Dim result As String = ""
        For Each item As String In domainArray
            If item = "" Then
                result += "."
                Continue For
            End If
            '檢查是否有無中文
            If ChkENC(item) Then
                '無中文,只有英數_-
                result += item & "."
            Else
                '有中文
                result += "xn--" & Encode(item) & "."
            End If

        Next
        If result(result.Length - 1) = "."c Then
            result = result.Substring(0, result.Length - 1)
        End If
        Return result
    End Function

    Public Function DecodingDomain(domain As String) As String
        domain = domain.Replace("。", ".")
        Dim domainArray As String() = domain.Split(New String() {"."}, StringSplitOptions.None)
        Dim result As String = ""
        For Each item As String In domainArray
            If item = "" Then
                result += "."
                Continue For
            End If
            Dim item2 As String = item
            If item2.Length > 4 AndAlso item2.Substring(0, 4) = "xn--" Then
                result += Decode(item2.Substring(4, item2.Length - 4)) + "."
            Else
                result &= item2 & "."
            End If
        Next
        If result(result.Length - 1) = "." Then
            result = result.Substring(0, result.Length - 1)
        End If
        Return result
    End Function
    Public Function Encode(inputStr As String) As String
        Dim n As Integer = INITIAL_N
        Dim delta As Integer = 0
        Dim bias As Integer = INITIAL_BIAS
        Dim output As New StringBuilder()
        ' Copy all basic code points to the output
        Dim b As Integer = 0
        For i As Integer = 0 To inputStr.Length - 1
            Dim c As Char = inputStr(i)
            If isBasic(c) Then
                output.Append(c)
                b += 1
            End If
        Next
        ' Append delimiter
        If b > 0 Then
            output.Append(DELIMITER)
        End If
        Dim h As Integer = b
        While h < inputStr.Length
            Dim m As Integer = Integer.MaxValue
            ' Find the minimum code point >= n
            For i As Integer = 0 To inputStr.Length - 1
                Dim c As Integer = AscW(inputStr.Substring(i, 1))
                If c >= n AndAlso c < m Then
                    m = c
                End If
            Next
            If m - n > (Integer.MaxValue - delta) \ (h + 1) Then
                Throw New Exception("OVERFLOW")
            End If
            delta = delta + (m - n) * (h + 1)
            n = m
            For j As Integer = 0 To inputStr.Length - 1
                Dim c As Integer = AscW(inputStr.Substring(j, 1))
                If c < n Then
                    delta += 1
                    If 0 = delta Then
                        Throw New Exception("OVERFLOW")
                    End If
                End If
                If c = n Then
                    Dim q As Integer = delta
                    Dim k As Integer = BASE
                    While True
                        Dim t As Integer
                        If k <= bias Then
                            t = TMIN
                        ElseIf k >= bias + TMAX Then
                            t = TMAX
                        Else
                            t = k - bias
                        End If
                        If q < t Then
                            Exit While
                        End If
                        output.Append(ChrW(digit2codepoint(t + (q - t) Mod (BASE - t))))
                        q = (q - t) \ (BASE - t)
                        k += BASE
                    End While
                    output.Append(ChrW(digit2codepoint(q)))
                    bias = adapt(delta, h + 1, h = b)
                    delta = 0
                    h += 1
                End If
            Next
            delta += 1
            n += 1
        End While
        Return output.ToString()
    End Function
    Public Function Decode(input As String) As String

        Dim n As Integer = INITIAL_N
        Dim i As Integer = 0
        Dim bias As Integer = INITIAL_BIAS
        Dim output As New StringBuilder()
        Dim d As Integer = input.LastIndexOf(DELIMITER)

        If d > 0 Then
            For j As Integer = 0 To d - 1
                Dim c As Char = input(j)
                If Not isBasic(c) Then
                    Throw New Exception("BAD_INPUT")
                End If
                output.Append(c)
            Next
            d += 1
        Else
            d = 0
        End If
        While d < input.Length
            Dim oldi As Integer = i
            Dim w As Integer = 1
            Dim k As Integer = BASE
            While True
                If d = input.Length Then
                    Throw New Exception("BAD_INPUT")
                End If

                Dim c As Integer = AscW(input.Substring(d, 1))
                d = d + 1
                Dim digit As Integer = codepoint2digit(c)
                If digit > (Integer.MaxValue - i) \ w Then
                    Throw New Exception("OVERFLOW")
                End If
                i = i + digit * w
                Dim t As Integer
                If k <= bias Then
                    t = TMIN
                ElseIf k >= bias + TMAX Then
                    t = TMAX
                Else
                    t = k - bias
                End If
                If digit < t Then
                    Exit While
                End If
                w = w * (BASE - t)
                k += BASE
            End While
            bias = adapt(i - oldi, output.Length + 1, oldi = 0)
            If i \ (output.Length + 1) > Integer.MaxValue - n Then
                Throw New Exception("OVERFLOW")
            End If
            n = n + i \ (output.Length + 1)
            i = i Mod (output.Length + 1)
            output.Insert(i, ChrW(n))
            i += 1
        End While
        Return output.ToString()
    End Function

    Private Function adapt(delta As Integer, numpoints As Integer, first As Boolean) As Integer
        If first Then
            delta = delta \ DAMP
        Else
            delta = delta \ 2
        End If
        delta = delta + (delta \ numpoints)
        Dim k As Integer = 0
        While delta > ((BASE - TMIN) * TMAX) \ 2
            delta = delta \ (BASE - TMIN)
            k = k + BASE
        End While
        Return k + ((BASE - TMIN + 1) * delta) \ (delta + SKEW)
    End Function
    Private Function isBasic(c As Char) As Boolean
        Return AscW(c) < &H80
    End Function
    Private Function digit2codepoint(d As Integer) As Integer
        If d < 26 Then
            ' 0..25 : 'a'..'z'
            Return d + Asc("a"c)
        ElseIf d < 36 Then
            ' 26..35 : '0'..'9';
            Return d - 26 + Asc("0"c)
        Else
            Throw New Exception("BAD_INPUT")
        End If
    End Function
    Private Function codepoint2digit(c As Integer) As Integer
        If c - Asc("0"c) < 10 Then
            ' '0'..'9' : 26..35
            Return c - Asc("0"c) + 26
        ElseIf c - Asc("a"c) < 26 Then
            ' 'a'..'z' : 0..25
            Return c - Asc("a"c)
        Else
            Throw New Exception("BAD_INPUT")
        End If
    End Function

    Private Function ChkENC(item As String) As Boolean
        Dim Rc As Boolean = False
        Dim Reg As New Regex("^[a-zA-Z0-9_-]+$")
        Rc = Reg.IsMatch(item)
        Return Rc
    End Function



End Class

測試畫面:

小喵準備一個WebForm的畫面來進行測試,安排一個Button,還有幾個Label來顯式結果,aspx的內容如下:

<asp:Button ID="btnProcessUrl" runat="server" Text="處理" /><br />
網址:<asp:Label ID="lblURL" runat="server" Text=""></asp:Label>
<br />
網站:<asp:Label ID="lblSite" runat="server" Text=""></asp:Label>
<br />
子網域:<asp:Label ID="lblSubDomain" runat="server" Text=""></asp:Label><br />
網域:<asp:Label ID="lblDomain" runat="server" Text=""></asp:Label>

接著,就是按鈕Click的程式碼:

Protected Sub btnProcessUrl_Click(sender As Object, e As EventArgs) Handles btnProcessUrl.Click
	Me.lblURL.Text = Request.Url.ToString
	Dim Host As String = Request.ServerVariables("HTTP_HOST")
	Dim oUPC As New UtilPunyCode
	Dim ChtHost As String = oUPC.DecodingDomain(Host)
	Me.lblSite.Text = ChtHost

	Dim sArr() As String = ChtHost.Split(".")
	Dim SubDomain As String = sArr(0)
	Dim Domain As String = Mid(ChtHost.Replace(SubDomain, ""), 2, Len(ChtHost.Replace(SubDomain, "")) - 1)

	Me.lblSubDomain.Text = SubDomain
	Me.lblDomain.Text = Domain
End Sub

執行結果如下:

末記

透過這個範例的實做中,小喵從過程中學到兩個東西,一個是中文網址與 PunyCode,另一個是如何從本機上建立中文網址的測試環境,提供給大家參考

^_^

 


參考來源:
PunyCode轉換:http://bbs.csdn.net/topics/390110788

 

 


以下是簽名:


Microsoft MVP
Visual Studio and Development Technologies
(2005~2019/6)