Removing Common Forum Hidden Text (VB.NET)

<span style="display:none">% |# m&nbsp;&nbsp;J! x( }" n# L3 @& I: e</span>
<font style="font-size: 0px; color: rgb(255, 255, 255);">1 p6 z8 E8 D8 k7 W</font>

 

Many forums now insert hidden text into posts to deter copying, like the two lines at the top of this post.

I use a regular expression together with Replace to strip it out.

First, add Imports System.Text.RegularExpressions


            ' Remove the leading hidden text (the <span style=...> form)
            Dim ex As New Regex("<span style[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</span>")

            ' Each Match holds exactly one capture (itself), so m.Value is enough
            For Each m As Match In ex.Matches(TextBox1.Text)
                TextBox1.Text = Replace(TextBox1.Text, m.Value, "")
            Next

            ' Remove the trailing hidden text (the <font style=...> form)
            Dim ex2 As New Regex("<font style[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</font>")
            ' Dim ex2 As New Regex("<(?!br|\/?p|b|\/?font|font color)[^>]*>")

            For Each m As Match In ex2.Matches(TextBox1.Text)
                '   Debug.Print(m.Value)
                TextBox1.Text = Replace(TextBox1.Text, m.Value, "")
            Next
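
The same cleanup can probably be done without the loops by letting Regex.Replace do the substitution directly. The following is only a minimal sketch: the module name HiddenTextDemo, the helper name StripHiddenText, the shortened sample string, and the use of RegexOptions.IgnoreCase are my own assumptions, not part of the code above. The two patterns are unchanged.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HiddenTextDemo
    ' Same two patterns as in the post. IgnoreCase is an added assumption
    ' so that tags written as <SPAN ...> or <FONT ...> are matched too.
    Private ReadOnly SpanPattern As New Regex(
        "<span style[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</span>",
        RegexOptions.IgnoreCase)
    Private ReadOnly FontPattern As New Regex(
        "<font style[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</font>",
        RegexOptions.IgnoreCase)

    ' Hypothetical helper: returns html with every match of both patterns removed.
    Function StripHiddenText(html As String) As String
        If String.IsNullOrEmpty(html) Then Return String.Empty
        Dim result As String = SpanPattern.Replace(html, "")
        Return FontPattern.Replace(result, "")
    End Function

    Sub Main()
        ' Shortened, made-up sample in the same shape as the two lines at the top
        Dim sample As String = "<span style=""display:none"">x1 y2</span>Hello" &
            "<font style=""font-size: 0px; color: rgb(255, 255, 255);"">z3</font>"
        Console.WriteLine(StripHiddenText(sample)) ' prints "Hello"
    End Sub
End Module
```

In a form, this would be called as TextBox1.Text = StripHiddenText(TextBox1.Text). Note that, like the original patterns, this removes any span or font element that carries a style attribute, not only the ones that are actually hidden.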


If you spot any mistakes, corrections are welcome.