🐞 Scrapy-based crawlers for Taiwanese news, covering 10 media outlets:
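- 蘋果日報 (Apple Daily)
- 中國時報 (China Times)
- 中央社 (Central News Agency, CNA)
- 華視 (Chinese Television System, CTS)
- 東森新聞雲 (ETtoday)
- 自由時報 (Liberty Times)
- 公視 (Public Television Service, PTS)
- 三立 (SET News)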
- TVBS
- UDN

Quick start:

$ git clone https://github.com/TaiwanStat/Taiwan-news-crawlers.git
$ cd Taiwan-news-crawlers
$ pip install -r requirements.txt
$ scrapy crawl apple -o apple_news.json

Requirements:

- Python 3
- Scrapy 1.3.0

Usage: `scrapy crawl <spider> -o <output_name>`, where `<spider>` is one of:

- apple
- appleRealtime
- china
- cna
- cts
- ettoday
- liberty
- libertyRealtime
- pts
- setn
- tvbs
- udn
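
To crawl every source in one pass, the usage command can be looped over the spider names listed above. The following is a minimal sketch, not part of the project: it assumes `scrapy` is on your `PATH` and that the script is run from the repository root. The helper names `build_command` and `crawl_all`, and the `<spider>_news.json` output naming, are hypothetical.

```python
import subprocess

# Spider names as listed above.
SPIDERS = [
    "apple", "appleRealtime", "china", "cna", "cts", "ettoday",
    "liberty", "libertyRealtime", "pts", "setn", "tvbs", "udn",
]


def build_command(spider, output_name=None):
    """Return the argv list for `scrapy crawl <spider> -o <output_name>`."""
    if spider not in SPIDERS:
        raise ValueError(f"unknown spider: {spider}")
    if output_name is None:
        # Hypothetical naming convention, modeled on `apple_news.json`.
        output_name = f"{spider}_news.json"
    return ["scrapy", "crawl", spider, "-o", output_name]


def crawl_all(spiders=SPIDERS):
    """Run each spider in turn; raises on the first crawl that fails."""
    for spider in spiders:
        subprocess.run(build_command(spider), check=True)
```

Calling `crawl_all()` runs the twelve crawls sequentially, and `crawl_all(["cna", "pts"])` restricts the run to a subset.
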
Each crawled news item has the following fields:

Key | Value |
---|---|
website | the publisher's name |
url | the URL of the original article |
title | the news headline |
content | the full text of the article |
category | the news category |
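
A `.json` file written with `-o` is a JSON array of item objects, so the output can be checked against the fields in the table above. This is a rough sketch under assumptions: whether every spider fills in all five keys is not confirmed here, the helper names `load_items` and `missing_keys` are hypothetical, and the sample record is made up for illustration rather than taken from real crawler output.

```python
import json

# Field names taken from the table above.
EXPECTED_KEYS = ("website", "url", "title", "content", "category")


def load_items(json_text):
    """Parse a Scrapy JSON export (a JSON array of item objects)."""
    items = json.loads(json_text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of news items")
    return items


def missing_keys(item):
    """Return the expected keys that an item lacks, in table order."""
    return [key for key in EXPECTED_KEYS if key not in item]


# Made-up sample records for illustration only; not real crawler output.
sample = json.dumps([
    {
        "website": "example",
        "url": "https://example.com/news/1",
        "title": "Example title",
        "content": "Example body text.",
        "category": "example",
    },
    {"website": "example", "title": "No URL here"},
])

for item in load_items(sample):
    # Report each title alongside any fields the record is missing.
    print(item.get("title"), missing_keys(item))
```

For a real run, read the file produced by the crawl (for example `apple_news.json`) and pass its text to `load_items` instead of `sample`.
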
The MIT License