Commit 985457c

committed
web scraping readme update
1 parent c481019 commit 985457c

1 file changed

+20
-16
lines changed

days/073-076-webscraping/README.md

Lines changed: 20 additions & 16 deletions
@@ -1,32 +1,36 @@
1-
# Days 029-032 Static Site Generators
1+
# Days 073-076 Web Scraping
22

3-
Welcome to Static Site Generators!
3+
Welcome to Web Scraping!
44

5-
Over the next four days we're going to play with Pelican - a Python based Static Site Generator. After we build our base site we're also going to look at deploying the site online using Netlify!
5+
The next four days will be based on BeautifulSoup4 (bs4) and Newspaper3k.
66

77
## Days 1 - 2:
88

9-
We're taking a slightly different approach for this chapter.
9+
The first two days will be spent on bs4.
1010

11-
Given the nature of the videos in this chapter we're asking that over the first two days you simply watch and follow along with the videos.
11+
Day 1: Watch bs4 videos 3 - 5 on the first day and follow along. Try your hand at scraping additional data from the Talk Python Course listing page.
1212

13-
There's no need to dive too far into the coding, just get the site up and running and deployed on Netlify. Take the two days to do this.
14-
15-
Add some extra posts to the site if you have spare time and can't wait until Days 3 - 4.
13+
Day 2: Scrape your own sites! bs4 is extremely simple so you'll be able to take what you learned on Day 1 and apply it to websites of your choice.
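
For the Day 1 and Day 2 scraping exercises, a minimal bs4 sketch might look like the following. The HTML here is a made-up stand-in, not the real Talk Python course listing, and the tag and class names (`div.course`, `span.price`) and the `extract_courses` helper are assumptions for illustration only. Inspect the page you are actually scraping and adjust the selectors to match.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a course listing page.
# The real page's tags and class names will differ.
SAMPLE_HTML = """
<div class="course">
  <h3><a href="/courses/intro">Intro to Python</a></h3>
  <span class="price">$39</span>
</div>
<div class="course">
  <h3><a href="/courses/async">Async Techniques</a></h3>
  <span class="price">$49</span>
</div>
"""


def extract_courses(html):
    """Return one dict (title, url, price) per course block found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    courses = []
    for block in soup.select("div.course"):
        link = block.select_one("h3 a")
        price = block.select_one("span.price")
        courses.append({
            "title": link.get_text(strip=True),
            "url": link["href"],
            # Not every block is guaranteed to carry a price.
            "price": price.get_text(strip=True) if price else None,
        })
    return courses


for course in extract_courses(SAMPLE_HTML):
    print(course)
```

To run this against a live page, fetch the HTML first (for example with `requests.get(url).text`) and pass the result to `extract_courses`.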
1614

1715

1816
## Days 3 - 4:
1917

20-
Play time!
18+
Introducing newspaper3k.
19+
20+
Day 1: Follow along with the remaining videos, 6 - 8, and finish off the course.
21+
22+
After completing the example in the video, practice on other news articles of your choice. Get some practice in.
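
When practising on other articles, a small helper along these lines may be a useful starting point. It is only a sketch: `fetch_article_fields` and `preview` are hypothetical names, and the sample data is invented. The newspaper3k calls used (`Article`, `download()`, `parse()` and the `authors`, `publish_date`, `text` and `top_image` attributes) are the ones the library provides, but the import is deferred so the formatting helper can be tried without a network connection.

```python
from datetime import datetime


def fetch_article_fields(url):
    """Download and parse one article, returning the fields of interest."""
    # Imported lazily so the rest of this module works without newspaper3k.
    from newspaper import Article

    article = Article(url)
    article.download()
    article.parse()
    return {
        "authors": article.authors,
        "publish_date": article.publish_date,
        "text": article.text,
        "top_image": article.top_image,
    }


def preview(fields, max_chars=200):
    """Build a short plain-text preview from a dict of article fields."""
    authors = ", ".join(fields["authors"]) or "Unknown author"
    date = fields["publish_date"]
    date_text = date.strftime("%Y-%m-%d") if date else "Unknown date"
    body = fields["text"].strip()
    if len(body) > max_chars:
        body = body[:max_chars].rstrip() + "..."
    return f"{authors} ({date_text})\n{body}"


# Invented sample data, shaped like the dict fetch_article_fields returns.
SAMPLE_FIELDS = {
    "authors": ["A. Writer", "B. Editor"],
    "publish_date": datetime(2018, 3, 1),
    "text": "x" * 300,
    "top_image": "",
}

print(preview(SAMPLE_FIELDS, max_chars=40))
```

With newspaper3k installed, `preview(fetch_article_fields(some_url))` would give the same kind of summary for a real article.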
23+
24+
Day 2: Head to [http://newspaper-demo.herokuapp.com](http://newspaper-demo.herokuapp.com). This is the site used in the video to demo newspaper3k.
25+
26+
Reproduce this page using the Flask skills you've already learned in this course so far.
2127

22-
Build up your site over the remaining two days. You have the building blocks with static images and pages - now build!
28+
Tips:
2329

24-
- Add posts to bulk out your site.
25-
- Create another few static pages. Eg: Privacy Policy; Terms and Conditions; Related Links
26-
- Experiment with the HTML/CSS.
27-
- Browse to [http://www.pelicanthemes.com/](http://www.pelicanthemes.com/) and pick a new theme! We use [Flex](https://github.com/alexandrevicenzi/Flex/tree/b3bd59002a3e85803332c35702d90e1e19ef39b6) on [PyBites](https://pybit.es).
28-
- Visit the [Pelican docs](http://docs.getpelican.com/en/stable/index.html) and take a look at the [Pelican Themes](http://docs.getpelican.com/en/stable/pelican-themes.html) documentation to install the new theme you've selected.
29-
- Customise and make it your own.
30+
- You're essentially just parsing the newspaper3k page for Authors; Publish Date; Text and Image.
31+
- Decide whether you want the page to reload once the URL is entered or if you want to direct the user to a new page that renders the content.
32+
- Don't fret if you can't get the page returning the exact same info, just do what you can.
33+
- Focus more on the code/content than the HTML/CSS. We don't care how pretty this is.
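
One possible shape for the Flask exercise, following the tips above, is sketched here. It takes the "reload the same page" option rather than redirecting to a new page, and it shows only Authors, Publish Date, Text and Image. The route, the inline template and the `parse_article` helper are assumptions made for this sketch, not the demo site's actual implementation.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline template to keep the sketch in one file; a real app would
# normally use a templates/ directory.
PAGE = """
<form method="post">
  <input name="url" placeholder="Article URL" value="{{ url }}">
  <button type="submit">Parse</button>
</form>
{% if error %}<p>{{ error }}</p>{% endif %}
{% if article %}
  <p>Authors: {{ article.authors | join(", ") }}</p>
  <p>Publish Date: {{ article.publish_date or "Unknown" }}</p>
  {% if article.top_image %}<img src="{{ article.top_image }}" width="300">{% endif %}
  <p>{{ article.text }}</p>
{% endif %}
"""


def parse_article(url):
    """Hypothetical helper: fetch and parse a URL with newspaper3k."""
    from newspaper import Article  # lazy import, only needed on submit

    article = Article(url)
    article.download()
    article.parse()
    return {
        "authors": article.authors,
        "publish_date": article.publish_date,
        "text": article.text,
        "top_image": article.top_image,
    }


@app.route("/", methods=["GET", "POST"])
def index():
    url = request.form.get("url", "").strip()
    article, error = None, None
    if request.method == "POST" and url:
        try:
            article = parse_article(url)
        except Exception as exc:  # broad on purpose: this is only a sketch
            error = f"Could not parse that URL: {exc}"
    # The same template renders both the empty form and the results.
    return render_template_string(PAGE, url=url, article=article, error=error)
```

Start it with `flask run` (or call `app.run(debug=True)`), open the root URL and submit an article link. Redirecting to a separate results page instead would mean adding a second route and template.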
3034

3135
Note: Remember to keep your repository up to date on GitHub to allow Netlify to keep auto-building/updating.
3236
