Stripping the hidden text commonly found on forums (VB.NET)
<span style="display:none">% |# m J! x( }" n# L3 @& I: e</span>
<font style="font-size: 0px; color: rgb(255, 255, 255);">1 p6 z8 E8 D8 k7 W</font>
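Many forums now insert hidden text into posts to discourage copying, like the two lines at the top of this post.
I use a regular expression (Regular Expression) together with Replace to strip it out.
First, add Imports System.Text.RegularExpressions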
' Remove the first kind of hidden text (<span style=...>)
Dim ex As New Regex("<span style[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</span>")
For Each m As Match In ex.Matches(TextBox1.Text)
    ' Each Match exposes one Capture (the whole matched element)
    For Each c As Capture In m.Captures
        TextBox1.Text = Replace(TextBox1.Text, c.Value, "")
    Next
Next
' Remove the second kind of hidden text (<font style=...>)
Dim ex2 As New Regex("<font style[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</font>")
' Dim ex2 As New Regex("<(?!br|\/?p|b|\/?font|font color)[^>]*>")
For Each m As Match In ex2.Matches(TextBox1.Text)
    For Each c As Capture In m.Captures
        ' Debug.Print(c.Value)
        TextBox1.Text = Replace(TextBox1.Text, c.Value, "")
    Next
Next
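One thing to watch: the two patterns above match any <span style=...> or <font style=...> element, whether or not it is actually hidden, so visibly styled text gets removed too. Below is a minimal sketch of a stricter variant, assuming the hidden text is always marked with display:none or font-size: 0px inside a double-quoted style attribute, as in the two sample lines. StripHiddenText and HiddenTextDemo are hypothetical names, not part of the original code. It uses Regex.Replace directly instead of looping over the matches.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HiddenTextDemo
    ' Hypothetical helper: removes <span>/<font> elements whose style attribute
    ' hides them (display:none or a zero font-size). Other styled elements stay.
    Function StripHiddenText(html As String) As String
        ' Assumption: the style value is double-quoted and the element is not nested
        ' inside another element of the same tag name.
        Dim hiddenStyle As String = "(?:display\s*:\s*none|font-size\s*:\s*0(?:px)?\s*(?=[;""]))"
        Dim pattern As String = "<(span|font)\b[^>]*\bstyle\s*=\s*""[^""]*" & hiddenStyle & "[^""]*""[^>]*>.*?</\1>"
        Return Regex.Replace(html, pattern, "", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
    End Function

    Sub Main()
        Dim sample As String = "Hello<span style=""display:none"">% |# m J</span> world<font style=""font-size: 0px; color: rgb(255, 255, 255);"">1 p6 z8</font>!"
        Console.WriteLine(StripHiddenText(sample)) ' prints: Hello world!
    End Sub
End Module
```

In the form from the original post, the call would be TextBox1.Text = StripHiddenText(TextBox1.Text). Forums that hide text some other way (a CSS class, for instance) would need a different pattern.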
If you spot any mistakes, corrections are welcome.