
Crawl4AI Error: This page is not fully supported. #281

Closed
@Olliejp

Description

I was wondering if you could help me with a recurring issue for which I can find no reliable solution. Take this URL as an example: https://www.newcleo.com/. I have tried many combinations of wait_for and various js_code strategies (one such attempt is sketched after the error output below), but I still cannot access the actual page content. I don't see any significant anti-bot measures in Chrome, but I do notice that a .gif loading animation pops up for a split second before the page renders. If I run crawl4ai without a delay, I basically scrape that loading screen rather than the real content. If I add a delay, I see the error shown below my code.

import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(always_by_pass_cache=True, verbose=True) as crawler:
        result = await crawler.arun(
            url="https://www.newcleo.com/",
            magic=True,
            headless=True,
            # delay_before_return_html=5.0  # adding this delay produces the error below
        )

        print(result.markdown)
        return None

if __name__ == "__main__":
    asyncio.run(main())

Crawl4AI Error: This page is not fully supported. Possible reasons:
1. The page may have restrictions that prevent crawling.
2. The page might not be fully loaded.
Suggestions:
- Try calling the crawl function with these parameters: magic=True,
- Set headless=False to visualize what's happening on the page.
If the issue persists, please check the page's structure and any potential anti-crawling measures.
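
For completeness, one of the wait_for combinations I tried looked roughly like this. The '.loader' selector is just a guess at the site's loading-overlay markup, not something I confirmed in its DOM:

import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(always_by_pass_cache=True, verbose=True) as crawler:
        result = await crawler.arun(
            url="https://www.newcleo.com/",
            magic=True,
            headless=True,
            # wait until the (assumed) loading overlay is gone from the DOM
            wait_for="js:() => !document.querySelector('.loader')",
        )
        print(result.markdown)

if __name__ == "__main__":
    asyncio.run(main())

This times out the same way, which is why I suspect the page never reaches the state the crawler expects.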

Thanks for any help!
