Rewriting Hyperlink Paths in Web Pages

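包仔 has recently been playing with fetching web pages as streams, and one step requires identifying the hyperlinks within a site. These links are written in different ways: some use absolute paths and others use relative paths. Here is how 包仔 converts them all into full URLs.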

transPath.aspx.cs

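    protected void Button1_Click(object sender, EventArgs e)
    {
        // TextBox1 holds the link to convert; TextBox2 is passed as FullPath,
        // the full URL of the page the link was found on.
        string linkPath = this.TextBox1.Text.Trim();
        if (linkPath != "")
        {
            string baseURL = "http://www.test.com.tw";

            Response.Write(TransLink(baseURL, linkPath, this.TextBox2.Text.Trim()));
        }
    }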


    // Requires "using System.Text.RegularExpressions;" at the top of the file.
    // baseURL:  site root without a trailing slash, e.g. "http://www.test.com.tw"
    // LinkPath: the href value to convert
    // FullPath: full URL of the page the link was found on
    private string TransLink(string baseURL, string LinkPath, string FullPath)
    {
        if (LinkPath.StartsWith("/")) // Root-relative link: prepend the site root
        {
            LinkPath = baseURL + LinkPath;
        }
        else if (LinkPath.StartsWith("../")) // Parent-relative link
        {
            // Strip every leading "../" and append the rest to the site root.
            // Note: this ignores how deep FullPath is, so the result is only exact
            // when the link climbs all the way up to the root.
            LinkPath = baseURL + "/" + Regex.Replace(LinkPath, @"^[./]+/", "");
        }
        else if (LinkPath.IndexOf(":") < 0) // No scheme: relative to the current page's directory
        {
            // Keep everything up to and including the last "/" of FullPath.
            string RootPath = FullPath.Substring(0, FullPath.LastIndexOf("/") + 1);

            LinkPath = RootPath + LinkPath;
        }

        // Anything else already carries a scheme (http:, mailto:, ...) and is returned unchanged.
        return LinkPath;
    }
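
TransLink above sends every "../" link to the site root, however deep the current page is. As an alternative, here is a minimal sketch that hands the work to .NET's built-in System.Uri class, which applies the standard relative-reference rules. The class name LinkResolver and the method Resolve are made up for illustration and are not part of the original code; it also assumes pageUrl is a valid absolute URL.

```csharp
using System;

public static class LinkResolver
{
    // Hypothetical helper: resolves linkPath against the URL of the page
    // it was found on. pageUrl must be an absolute URL.
    public static string Resolve(string pageUrl, string linkPath)
    {
        Uri baseUri = new Uri(pageUrl, UriKind.Absolute);
        Uri resolved;

        // Uri.TryCreate(Uri, string, out Uri) handles "/x", "../x", "x"
        // and already-absolute links in a single call.
        if (Uri.TryCreate(baseUri, linkPath, out resolved))
        {
            return resolved.AbsoluteUri;
        }

        // Leave links that cannot be parsed untouched.
        return linkPath;
    }
}
```

For example, with the page http://www.test.com.tw/news/2010/list.aspx, the link "../detail.aspx" resolves to http://www.test.com.tw/news/detail.aspx, while "/about.aspx" resolves to http://www.test.com.tw/about.aspx.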

Combined with page fetching, this method can be used to scan a site recursively and collect all of its hyperlinks.
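
The recursive scan could look roughly like the sketch below. This is not the author's crawler: SiteLinkScanner, Scan and the fetchHtml delegate are hypothetical names, fetchHtml stands in for whatever code actually downloads a page (returning null when a page cannot be read), link resolution uses System.Uri rather than TransLink, and the href regex is deliberately simplistic.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SiteLinkScanner
{
    // Hypothetical entry point: returns every same-host URL reachable from startUrl.
    public static HashSet<string> Scan(string startUrl, Func<string, string> fetchHtml)
    {
        var visited = new HashSet<string>();
        Uri start = new Uri(startUrl, UriKind.Absolute);
        Visit(start, start.Host, fetchHtml, visited);
        return visited;
    }

    private static void Visit(Uri page, string host,
                              Func<string, string> fetchHtml, HashSet<string> visited)
    {
        // Add returns false for a URL already seen, which stops cyclic links
        // from recursing forever.
        if (!visited.Add(page.AbsoluteUri)) return;

        string html = fetchHtml(page.AbsoluteUri);
        if (html == null) return;

        // Simplified href extraction (fragments after '#' are dropped);
        // a real HTML parser would be more robust.
        foreach (Match m in Regex.Matches(html, "href\\s*=\\s*[\"']([^\"'#]+)",
                                          RegexOptions.IgnoreCase))
        {
            Uri link;
            if (Uri.TryCreate(page, m.Groups[1].Value.Trim(), out link)
                && (link.Scheme == Uri.UriSchemeHttp || link.Scheme == Uri.UriSchemeHttps)
                && string.Equals(link.Host, host, StringComparison.OrdinalIgnoreCase))
            {
                // Only follow links that stay on the same host.
                Visit(link, host, fetchHtml, visited);
            }
        }
    }
}
```

Passing the download step in as a delegate keeps the sketch independent of any particular HTTP API and makes it easy to try against a few in-memory pages. On a very large site the recursion could get deep, so an explicit queue may be a safer choice there.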