Description
crawl4ai version
0.7.2
Expected Behavior
In async_crawler_strategy.py, the remove_overlay_elements method loads the remove_overlay_elements.js script and evaluates it, but the injected source code is only wrapped in a try/catch block and never actually called.
Current (non-working) code:
try:
    await page.evaluate(
        f"""
        (() => {{
            try {{
                {remove_overlays_js}  // only injected but not called
                return {{ success: true }};
            }} catch (error) {{
                return {{
                    success: false,
                    error: error.toString(),
                    stack: error.stack
                }};
            }}
        }})()
        """
    )
    await page.wait_for_timeout(500)  # Wait for any animations to complete
Working fix:
try:
    await page.evaluate(
        f"""
        (async () => {{
            try {{
                const removeOverlays = {remove_overlays_js};  // Injected
                await removeOverlays();  // called here
                return {{ success: true }};
            }} catch (error) {{
                return {{
                    success: false,
                    error: error.toString(),
                    stack: error.stack
                }};
            }}
        }})()
        """
    )
    await page.wait_for_timeout(500)  # Wait for any animations to complete
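To make the difference easier to check outside the crawler, here is a minimal sketch that only builds the wrapper string, without Playwright. It assumes, as the fix above implies, that remove_overlay_elements.js contains a single (async) function expression. The helper name `build_overlay_script` and the sample JS are hypothetical and not part of crawl4ai.

```python
def build_overlay_script(remove_overlays_js: str) -> str:
    """Wrap a JS function expression so it is assigned and then invoked.

    Assumption: remove_overlays_js is a single (async) function expression.
    """
    # Drop surrounding whitespace and a trailing semicolon so the source
    # can be used on the right-hand side of a const assignment.
    source = remove_overlays_js.strip().rstrip(";")
    # Doubled braces are literal braces in an f-string; {source} is interpolated.
    return f"""
    (async () => {{
        try {{
            const removeOverlays = {source};
            await removeOverlays();
            return {{ success: true }};
        }} catch (error) {{
            return {{ success: false, error: error.toString(), stack: error.stack }};
        }}
    }})()
    """


# Hypothetical stand-in for the contents of remove_overlay_elements.js.
sample_js = "async () => { document.querySelectorAll('.modal').forEach(el => el.remove()); };"
script = build_overlay_script(sample_js)
print(script)
```

The printed script can be pasted into a browser console to confirm that the function runs. In the non-working version the same source appears as a bare expression statement, so the function is created and then discarded.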
Current Behavior
The remove_overlay_elements flag in CrawlerRunConfig has no effect, because the injected script is never invoked. I found the issue above after inspecting the source code.
I’m surprised this hasn’t been reported before.
Is this reproducible?
Yes
Inputs Causing the Bug
config = CrawlerRunConfig(
    remove_overlay_elements=True,
)
b_config = BrowserConfig(headless=True)
async with AsyncWebCrawler(config=b_config) as crawler:
    results = await crawler.arun_many(urls, config=config)
Steps to Reproduce
Code snippets
OS
macOS
Python version
3.12
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response