[site] bbc.co.uk #658

Open
@kdenaeem

Description

I am trying to scrape bbc.co.uk/news/world. I'm hoping to collect 20 to 30 of the articles on the front page of this site.

import newspaper

bbc_papers = newspaper.build("https://www.bbc.co.uk/news/world", number_threads=3)

article_urls = [article.url for article in bbc_papers.articles]
print(article_urls[10])

This always either raises "list index out of range" or returns an empty list []. I'm guessing this is because the request was blocked.
Does anyone know why it won't return any article URLs?
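
For reference, the "list index out of range" error comes from indexing position 10 of a list that holds fewer than 11 items, so it is a symptom of the short or empty result rather than a separate problem. A minimal sketch of a safer way to take the first 20 to 30 URLs, assuming only that the input is a plain list of strings; `first_n_urls` and `describe` are hypothetical helper names, not part of newspaper:

```python
from typing import List


def first_n_urls(urls: List[str], limit: int = 30) -> List[str]:
    """Return up to `limit` unique, non-empty URLs in their original order.

    Slicing never raises IndexError, even when the list is empty.
    """
    seen = set()
    unique = []
    for url in urls:
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    return unique[:limit]


def describe(urls: List[str]) -> str:
    """Summarise the result so an empty list is reported instead of crashing."""
    if not urls:
        return "no articles found (empty list)"
    return f"{len(urls)} article URL(s) found"
```

With the code above this would be called as first_n_urls([article.url for article in bbc_papers.articles]). Separately, newspaper.build accepts memoize_articles=False, which turns off the cache of previously seen articles; that cache can produce an empty list on repeat runs, though whether it is the cause here is not confirmed.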

Labels: help wanted (Extra attention is needed)