Description
crawl4ai version
0.5.0.post4
Expected Behavior
It should be able to scrape the data from the website.
Current Behavior
[ERROR]... × https://www.hifiboehm.de/de/produkt/sonos-sub-4-we... | Error:
× Unexpected error in _crawl_web at line 579 in _crawl_web (.venv/lib/python3.11/site-packages/crawl4ai/async_crawler_strategy.py):
Error: Failed on navigating ACS-GOTO:
Page.goto: Timeout 60000ms exceeded.
Call log:
- navigating to "https://www.hifiboehm.de/de/produkt/sonos-sub-4-weiss", waiting until "domcontentloaded"

Code context:
574     response = await page.goto(
575         url, wait_until=config.wait_until, timeout=config.page_timeout
576     )
577     redirected_url = page.url
578 except Error as e:
579 →   raise RuntimeError(f"Failed on navigating ACS-GOTO:\n{str(e)}")
580
581 await self.execute_hook(
582     "after_goto", page, context=context, url=url, response=response, config=config
583 )
584
Is this reproducible?
Yes
Inputs Causing the Bug
- URL: https://www.hifiboehm.de/de/produkt/sonos-sub-4-weiss
- Settings used:
+ extra_args=["--disable-gpu", "--disable-dev-shm-usage", "--no-sandbox"]
+ headless=True
+ user_agent_mode="random"
+ magic=True
Steps to Reproduce
Code snippets
from crawl4ai import AsyncWebCrawler, BrowserConfig, CacheMode, CrawlerRunConfig


class Crawl4AIAdapter:
    def __init__(self, headless: bool = True, verbose: bool = True):
        # Set up browser configuration with extra args for stability.
        self.browser_config = BrowserConfig(
            headless=headless,
            verbose=verbose,
            extra_args=["--disable-gpu", "--disable-dev-shm-usage", "--no-sandbox"],
        )
        # Use your preferred cache mode (here, BYPASS).
        self.crawl_config = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)
        # Create the crawler with a random Android mobile user agent and magic mode.
        self.crawler = AsyncWebCrawler(
            config=self.browser_config,
            user_agent_mode="random",
            user_agent_generator_config={
                "device_type": "mobile",
                "os_type": "android",
            },
            magic=True,
        )
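To help tell a slow page from a consistently blocked one, the navigation could be retried with a growing timeout budget. The following is only a minimal sketch using the standard library: `retry_navigation` and its `fetch` argument are hypothetical names, not part of crawl4ai, and `fetch` is assumed to be a caller-supplied coroutine that runs one crawl with the given timeout in milliseconds and raises `RuntimeError` when navigation fails.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


async def retry_navigation(
    fetch: Callable[[int], Awaitable[T]],
    timeouts_ms: Sequence[int] = (60_000, 90_000, 120_000),
    pause_s: float = 1.0,
) -> T:
    """Call fetch(timeout_ms) once per entry in timeouts_ms until one succeeds.

    `fetch` is a hypothetical wrapper around a single crawl; it must raise
    RuntimeError on a failed navigation for the retry to trigger.
    """
    last_error: Optional[RuntimeError] = None
    for attempt, timeout_ms in enumerate(timeouts_ms, start=1):
        try:
            return await fetch(timeout_ms)
        except RuntimeError as exc:
            last_error = exc
            print(f"attempt {attempt} failed with a {timeout_ms} ms budget: {exc}")
            # Pause between attempts, but not after the final one.
            if attempt < len(timeouts_ms):
                await asyncio.sleep(pause_s)
    raise RuntimeError(f"all {len(timeouts_ms)} attempts failed") from last_error
```

If every attempt still times out at the larger budgets, the page is more likely being blocked or never reaching "domcontentloaded" than merely loading slowly.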
OS
Ubuntu 22.04
Python version
3.11
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
[ERROR]... × https://www.hifiboehm.de/de/produkt/sonos-sub-4-we... | Error:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ × Unexpected error in _crawl_web at line 579 in _crawl_web (.venv/lib/python3.11/site- │
│ packages/crawl4ai/async_crawler_strategy.py): │
│ Error: Failed on navigating ACS-GOTO: │
│ Page.goto: Timeout 60000ms exceeded. │
│ Call log: │
│ - navigating to "https://www.hifiboehm.de/de/produkt/sonos-sub-4-weiss", waiting until "domcontentloaded" │
│ │
│ │
│ Code context: │
│ 574 response = await page.goto( │
│ 575 url, wait_until=config.wait_until, timeout=config.page_timeout │
│ 576 ) │
│ 577 redirected_url = page.url │
│ 578 except Error as e: │
│ 579 → raise RuntimeError(f"Failed on navigating ACS-GOTO:\n{str(e)}") │
│ 580 │
│ 581 await self.execute_hook( │
│ 582 "after_goto", page, context=context, url=url, response=response, config=config │
│ 583 ) │
│ 584 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘