Patterns added using the REST API are not being crawled. Patterns added manually, however, are crawled. #1903

Open
JimHha opened this issue Feb 1, 2018 · 0 comments

JimHha commented Feb 1, 2018

When I add patterns ending in a wildcard to OSS through the REST API, using either curl or the Python requests library, the web crawler does not fetch them. Each request returns a success response, and I have verified afterwards that the patterns are present.
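For reference, here is a minimal sketch of the kind of request involved. It is an illustration, not the exact script used for this report. The host, port, index name (`my_index`), example pattern, and the endpoint path are all assumptions and should be checked against the REST API documentation for your OSS version. The helper `build_pattern_request` is hypothetical; it only builds the request so the URL and body can be inspected before anything is sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumptions (not taken from this report): OSS listens on localhost:9090 and
# exposes web crawler inclusion patterns under the path below.
BASE_URL = "http://localhost:9090"
PATTERN_PATH = "/services/rest/index/{index}/crawler/web/patterns/inclusion"


def build_pattern_request(index, patterns, base_url=BASE_URL):
    """Build (but do not send) a PUT request carrying a JSON array of patterns."""
    url = base_url + PATTERN_PATH.format(index=quote(index, safe=""))
    body = json.dumps(list(patterns)).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index name and wildcard pattern, for illustration only.
req = build_pattern_request("my_index", ["http://www.example.com/*"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to the server. Printing the method, URL and body first makes it easy to compare what the REST call submits with what the interface stores when the same pattern is added manually.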

Occasionally, if I request a manual crawl of a pattern that was added through the REST API and that OSS isn't crawling, the manual crawl triggers the crawler.

Even if I wait long enough for a re-fetch, OSS still doesn't crawl the URLs added through the REST API.

If I add patterns using the REST API, see that they aren't being crawled, and then add a new URL manually, the manually added URL is crawled while the ones added earlier through the REST API still aren't.

However, if I add the patterns manually through the interface and start the crawler, they are fetched, parsed and indexed just fine.

I have been installing from the .deb package and testing on both Debian 8.6 and Debian 9, with no success on either.
