Python - Web Scraping 11 - Reading Data Inside Table Cells <td></td>

2020-11-17

python
2022-12-03

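This post reads the text of each table cell <td></td>, plus the hyperlink that wraps the image inside a cell and the image's source path.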

Target URL: http://jwlin.github.io/py-scraping-analysis-book/ch2/table/table.html

rows = soup.find('table', 'table').tbody.find_all('tr')
for row in rows:
    all_tds = row.find_all('td')
    # The fourth cell holds an <a> tag that wraps an <img> tag
    if 'href' in all_tds[3].a.attrs:
        href = all_tds[3].a['href']
    else:
        href = None  # this <a> tag has no href attribute
    print(all_tds[0].text, all_tds[1].text, all_tds[2].text, href, all_tds[3].a.img['src'])
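The loop above can be tried without a network connection. The following is only a minimal sketch: `SAMPLE_HTML` is hypothetical markup that is assumed to mirror the structure of the target table (three text cells followed by a cell holding a link around an image), and `extract_rows` is an illustrative helper name, not something from the book. It uses `Tag.get('href')`, which returns `None` when the attribute is missing, as a shorter alternative to the `if`/`else` check.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, assumed to mirror the target table's structure.
SAMPLE_HTML = """
<table class="table">
  <tbody>
    <tr>
      <td>Course A</td><td>Beginner</td><td>1490</td>
      <td><a href="http://example.com/a"><img src="img/a.png"></a></td>
    </tr>
    <tr>
      <td>Course B</td><td>Advanced</td><td>1890</td>
      <td><a><img src="img/b.png"></a></td>
    </tr>
  </tbody>
</table>
"""


def extract_rows(html):
    """Return one (cell1, cell2, cell3, href, img_src) tuple per table row."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for row in soup.find('table', 'table').tbody.find_all('tr'):
        tds = row.find_all('td')
        link = tds[3].a
        # .get() returns None when the <a> tag has no href attribute
        href = link.get('href')
        results.append((tds[0].text, tds[1].text, tds[2].text, href, link.img['src']))
    return results


for record in extract_rows(SAMPLE_HTML):
    print(record)
```

The second sample row deliberately omits `href`, so its fourth field comes back as `None` instead of raising a `KeyError`.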

Full code:

import requests
from bs4 import BeautifulSoup

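def main():
    # Download the page and parse it with Python's built-in HTML parser
    resp = requests.get('http://jwlin.github.io/py-scraping-analysis-book/ch2/table/table.html')
    soup = BeautifulSoup(resp.text, 'html.parser')

    # Each <tr> inside the <tbody> of the table with class "table" is one data row
    rows = soup.find('table', 'table').tbody.find_all('tr')
    for row in rows:
        all_tds = row.find_all('td')
        # Read the link only if the <a> tag actually has an href attribute
        if 'href' in all_tds[3].a.attrs:
            href = all_tds[3].a['href']
        else:
            href = None
        # Print the three text cells, the link, and the image source
        print(all_tds[0].text, all_tds[1].text, all_tds[2].text, href, all_tds[3].a.img['src'])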

if __name__ == '__main__':
    main()
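The `href` and `src` values are printed exactly as they appear in the HTML. If some of them turn out to be relative paths (an assumption, since the post does not show the output), they can be resolved against the page URL with the standard library's `urljoin`. A minimal sketch follows; `to_absolute` is an illustrative helper name and the sample paths are hypothetical.

```python
from urllib.parse import urljoin

PAGE_URL = 'http://jwlin.github.io/py-scraping-analysis-book/ch2/table/table.html'


def to_absolute(page_url, link):
    """Resolve link against page_url; pass None through unchanged."""
    if link is None:
        return None
    return urljoin(page_url, link)


# Hypothetical values standing in for a scraped src and href
print(to_absolute(PAGE_URL, 'img/sample.png'))
print(to_absolute(PAGE_URL, 'http://example.com/course'))
print(to_absolute(PAGE_URL, None))
```

A link that is already absolute is returned unchanged, so the helper can be applied to every scraped value.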

Reference: py-scraping-analysis-book

Yiru@Studio - About Me - 意如
