Exporting Table Data from a Web Page to Excel (VB.NET)
http://www.autohotkey.com/docs/commands/Send.htm
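The URL above is the page used for this experiment.
First, save that page to the desktop with "Save As".
The file name is Send.htm.
Change the file extension to .xls.
Open Send.xls.
The tables show up as worksheet cells, yours to do with as you please.
And that's it.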
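The manual rename can also be scripted. The following is only a minimal sketch, assuming the page has already been saved as Send.htm on the desktop; the module name RenameDemo and the helper CopyAsXls are hypothetical names introduced for illustration.

```vbnet
Imports System
Imports System.IO

Module RenameDemo
    'Hypothetical helper: copies an .htm file next to itself with an .xls extension
    'and returns the new path. The original file is left untouched.
    Function CopyAsXls(ByVal htmlPath As String) As String
        Dim xlsPath As String = Path.ChangeExtension(htmlPath, ".xls")
        File.Copy(htmlPath, xlsPath, True) 'True = overwrite an existing copy
        Return xlsPath
    End Function

    Sub Main()
        'Assumes Send.htm was saved to the desktop as described above.
        Dim desktop As String = Environment.GetFolderPath(Environment.SpecialFolder.Desktop)
        Dim source As String = Path.Combine(desktop, "Send.htm")
        If File.Exists(source) Then
            Console.WriteLine("Created " & CopyAsXls(source))
        Else
            Console.WriteLine("Send.htm not found on the desktop")
        End If
    End Sub
End Module
```

Copying instead of renaming keeps the saved .htm file around, so the step can be repeated if the .xls copy gets edited.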
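In VB.NET, the same thing can be done with the source code below.
Clicking Button1 runs the following steps:
first download the page source, then use a regular expression to extract each <table> from its opening tag to its closing tag, and finally save the result to c:\test.xls.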
Imports System.Net
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions
Public Class Form2
    'TextBox1 holds the downloaded page source; TextBox2 holds the extracted <table> markup
    Protected Function GetWebPage(ByVal url As String) As String
        'Download the page and return its source as a string (the response is read as UTF-8)
        Dim myRequest As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        'Using blocks dispose the response and the reader even if an exception is thrown
        Using myResponse As WebResponse = myRequest.GetResponse()
            Using reader As New StreamReader(myResponse.GetResponseStream(), Encoding.UTF8)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        'Regular expression reference: http://blog.stevenlevithan.com/archives/match-innermost-html-element
        '----------- First, read the page source from the URL into TextBox1
        TextBox1.Text = GetWebPage("http://www.autohotkey.com/docs/commands/Send.htm")
        '----------- Next, use the pattern to extract each <table border ...> ... </table> block
        TextBox2.Text = ""
        Dim ex As New Regex("<table border\b[^>]*>(?:[^<]+|<(?!table\b[^>]*>))*?</table>")
        For Each m As Match In ex.Matches(TextBox1.Text)
            'm.Value is the whole matched table, from the opening tag through the closing tag
            Debug.Print(m.Value)
            TextBox2.Text &= m.Value
        Next
        '----------- Finally, save the result as an Excel file
        Dim fileName As String = "c:\test.xls"
        'Disposing the StreamWriter flushes the text and closes the file
        Using sw As New StreamWriter(fileName, False, Encoding.GetEncoding("big5"))
            sw.Write(TextBox2.Text)
        End Using
    End Sub
End Class
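To try the extraction step on its own, without a form or a network connection, the pattern can be applied to a small in-memory string. This is only a sketch: the module name TableExtractDemo, the function ExtractTables and the sample HTML are hypothetical. The pattern also differs from the one above in two ways that are assumptions of this sketch, not part of the original code: [^<]+ is wrapped in an atomic group (?>...) to limit backtracking on pages where a table is never closed, and RegexOptions.IgnoreCase is added so that <TABLE BORDER ...> is matched as well.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TableExtractDemo
    'Hypothetical helper: returns every <table border ...>...</table> block in html
    'that does not contain another <table> inside it (the "innermost" tables).
    Function ExtractTables(ByVal html As String) As List(Of String)
        'Same idea as the pattern above, with an atomic group around [^<]+
        Dim pattern As String = "<table border\b[^>]*>(?:(?>[^<]+)|<(?!table\b[^>]*>))*?</table>"
        Dim result As New List(Of String)
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            result.Add(m.Value)
        Next
        Return result
    End Function

    Sub Main()
        'Made-up sample input: one table with a border attribute, one without
        Dim sample As String = "<p>intro</p>" & _
            "<table border=""1""><tr><td>a</td><td>b</td></tr></table>" & _
            "<table><tr><td>skipped</td></tr></table>"
        For Each t As String In ExtractTables(sample)
            Console.WriteLine(t)
        Next
    End Sub
End Module
```

With this sample, only the first table is printed, because the pattern requires the opening tag to start with "<table border". For a page whose tables have no border attribute, the pattern would need to be loosened accordingly.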
If you spot any mistakes, corrections are welcome.