A simpler, more straightforward version of Hacker News. The project uses BeautifulSoup, a Python web scraping library, to scrape data from the Hacker News website, keep only the most relevant news, and display it in your terminal.
The Hacker News page has many stories, articles, and news items that are not particularly relevant or useful to the community. The page is not arranged to put forward the stories that have the most votes or approval.
The news displayed in your terminal (after you run hackernews.py) is sorted by the number of votes each story has on the site. Only stories with 218 or more votes are displayed. The stories are scraped and collected from the first 5 pages of Hacker News.
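The filter-and-sort step described above can be sketched roughly as follows. This is a minimal illustration, not the code in hackernews.py: the function name `top_stories`, the constant `MIN_VOTES`, and the dictionary keys `title`, `link`, and `votes` are all assumptions made for the example.

```python
# Hypothetical threshold, mirroring the 218-vote cutoff mentioned above.
MIN_VOTES = 218


def top_stories(stories, min_votes=MIN_VOTES):
    """Keep stories with at least min_votes, sorted by votes, highest first."""
    kept = [story for story in stories if story["votes"] >= min_votes]
    return sorted(kept, key=lambda story: story["votes"], reverse=True)


if __name__ == "__main__":
    # Made-up sample data standing in for scraped stories.
    sample = [
        {"title": "Story A", "link": "https://example.com/a", "votes": 120},
        {"title": "Story B", "link": "https://example.com/b", "votes": 218},
        {"title": "Story C", "link": "https://example.com/c", "votes": 640},
    ]
    for story in top_stories(sample):
        print(f'{story["votes"]:>5}  {story["title"]}  {story["link"]}')
```

Keeping the threshold as a parameter makes it easy to experiment with a different cutoff without touching the sorting logic.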
Clone the repo and open the Terminal/CLI in the folder containing hackernews.py. In the terminal, type: python3 hackernews.py
The project code (hackernews.py) is filled with useful comments that can be used to follow the process of grabbing data from the web page.
To understand the flow of control and how the script operates, refer to hackernews(py).md for the working of the script and to BeautifulSoup.md for information on which features/components of the framework the script uses.
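As a rough idea of what parsing with BeautifulSoup and CSS selectors can look like, here is a small sketch that runs against an inline HTML snippet instead of the live site. It is not taken from hackernews.py. The markup in `SAMPLE_HTML`, the selectors (`tr.athing`, `.titleline > a`, `#score_<id>`), and the function name `parse_stories` are assumptions based on how Hacker News pages appeared to be structured at the time of writing, and the real markup may differ or change.

```python
from bs4 import BeautifulSoup

# Made-up snippet imitating the assumed Hacker News markup: a title row
# followed by a row holding the score for that story.
SAMPLE_HTML = """
<table>
  <tr class="athing" id="1">
    <td class="title"><span class="titleline"><a href="https://example.com/a">First story</a></span></td>
  </tr>
  <tr><td class="subtext"><span class="score" id="score_1">250 points</span></td></tr>
  <tr class="athing" id="2">
    <td class="title"><span class="titleline"><a href="https://example.com/b">Second story</a></span></td>
  </tr>
  <tr><td class="subtext"><span class="score" id="score_2">40 points</span></td></tr>
</table>
"""


def parse_stories(html):
    """Return a list of {title, link, votes} dicts found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    stories = []
    for row in soup.select("tr.athing"):
        link = row.select_one(".titleline > a")
        # Assumption: the score element's id is "score_" plus the row id.
        score = soup.select_one(f"#score_{row['id']}")
        if link is None or score is None:
            continue  # skip rows without a title link or a score
        votes = int(score.get_text().split()[0])  # "250 points" -> 250
        stories.append(
            {"title": link.get_text(), "link": link.get("href"), "votes": votes}
        )
    return stories


if __name__ == "__main__":
    for story in parse_stories(SAMPLE_HTML):
        print(story)
```

Rows without a score are skipped rather than treated as zero votes, which keeps the vote parsing from failing on entries that carry no score element.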
Install the requirements by running the following command:
pip install -r requirements.txt
You can either clone or fork this repo onto your device. Make sure all the files are in a single directory/folder (just for safekeeping).
After installation, open your Terminal/CLI in the directory/folder containing the script. Type or copy/paste the following command:
python3 hackernews.py
You can also build the Docker image using the command
docker build . -t hackernews
and then run it using
docker run hackernews
Again, the output will contain the stories with the most votes (>=127). When you look for them on the Hacker News page, they won't be at the top; you'll have to scroll down or sometimes navigate to the 2nd page to find them.
https://news.ycombinator.com/ - Hacker News
https://www.excellarate.com/blogs/web-scraping-introduction-applications-and-best-practices/#:~:text=Web%20scraping%20typically%20extracts%20large,show%20data%20from%20a%20website. - Introduction to Web Scraping and best practices
https://developer.mozilla.org/en-US/docs/Learn/CSS/Building_blocks/Selectors - CSS Selectors
https://pypi.org/project/beautifulsoup4/ - BeautifulSoup Docs.