
What if there isn't any news after 2017.03? #18

Open
YunBAI-PSL opened this issue Mar 4, 2022 · 4 comments

Comments

@YunBAI-PSL

Dear Author,

Thanks for the nice work. I ran your code and found that it returns no news after 2017.03. I need more recent news, though. How do you handle this?

Many thanks.

@LuChang-CS
Owner

Hi, thank you for your interest.

Did you change the time range setting in the settings/*.cfg files? You may also need to set a larger sleep time, because frequent visits to nytimes from the same IP can trigger their reCAPTCHA verification.
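To illustrate the sleep-time suggestion, here is a minimal sketch of pausing between requests with some random jitter. The names `polite_delay`, `crawl`, and `fetch` are hypothetical and are not taken from this repository; the actual sleep option lives in the settings/*.cfg files, and its key name is not shown in this thread.

```python
import random
import time


def polite_delay(base_seconds, jitter_seconds=0.0, sleep=time.sleep):
    """Sleep for base_seconds plus a random extra of up to jitter_seconds.

    Returns the delay that was used. The `sleep` argument is injectable
    so the function can be exercised without actually waiting.
    """
    delay = base_seconds + random.uniform(0.0, jitter_seconds)
    sleep(delay)
    return delay


def crawl(urls, fetch, base_seconds=5.0, jitter_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing before every request except the first.

    `fetch` is a hypothetical callable standing in for the crawler's
    real download routine.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            polite_delay(base_seconds, jitter_seconds, sleep=sleep)
        results.append(fetch(url))
    return results


# Demo with a fake fetcher and a recording sleep, so nothing is downloaded
# and the example finishes instantly.
recorded_delays = []
pages = crawl(
    ["page-1", "page-2", "page-3"],
    fetch=lambda url: url.upper(),
    sleep=recorded_delays.append,
)
print(pages)
print(len(recorded_delays))
```

The jitter keeps requests from arriving at perfectly regular intervals. Whether that is enough to avoid the reCAPTCHA check is not something this thread confirms, so treat the default values as placeholders to tune.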

@swthinking

Even if you set the date range in the cfg file, no data after 2017 can be crawled.

@ducnva

ducnva commented Apr 6, 2023

Maybe the class name has changed, so the crawler can no longer find all the article links. You can check line 31.

@liyucheng09

Just change line 31 to elements = soup.table.find_all('a').

I just tested it, and it runs without problems.
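For readers who want to see why dropping the class filter helps, here is a dependency-free sketch of the same idea: collect every link inside the first table, whatever class it carries. It uses the standard library's `html.parser` rather than BeautifulSoup, and the sample HTML, `TableLinkParser`, and `table_links` are made up for illustration. They are not taken from the repository or from the real nytimes pages.

```python
from html.parser import HTMLParser


class TableLinkParser(HTMLParser):
    """Collect href values of <a> tags that appear inside the first <table>."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside the first table
        self.done = False    # True once the first table has been closed
        self.links = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "table":
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True


def table_links(html):
    """Return the hrefs of all anchors found in the first table of `html`."""
    parser = TableLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical markup: one anchor has a renamed class, one has no class at all.
SAMPLE_HTML = """
<a href="/nav">navigation link outside the table</a>
<table>
  <tr><td><a class="renamed-class" href="/2022/03/04/story-one.html">Story one</a></td></tr>
  <tr><td><a href="/2022/03/04/story-two.html">Story two</a></td></tr>
</table>
"""

links = table_links(SAMPLE_HTML)
print(links)
```

Unlike `soup.table.find_all('a')`, this sketch skips anchors without an `href` and returns the link strings rather than tag objects. Either way, the selection no longer depends on a class name that the site may rename.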


5 participants