user data crawling opens two windows, unable to control correct user browser #236

Closed
@BZBY

Description

I've just tested the latest main branch on both Windows 11 and Ubuntu and ran into an issue. Here’s the test code I used:

```python
import asyncio

from crawl4ai import AsyncWebCrawler


async def main():
    async with AsyncWebCrawler(
            headless=False,  # Set to False to see what is happening
            use_managed_browser=True,
            browser_type="chromium",
    ) as crawler:
        result = await crawler.arun(
            url="https://www.youtube.com/",
            magic=True
        )
        print(result.markdown)

if __name__ == "__main__":
    asyncio.run(main())
```

When running this code, two browser windows open up: one displays Chrome's login screen, and the other loads the URL I specified. All subsequent operations happen in the second browser window, but closing it also causes the first browser window to close. This suggests that the two windows are instances of the same browser. However, when I add user data as follows:

```python
async with AsyncWebCrawler(
        headless=False,
        use_managed_browser=True,
        user_data_dir=r"C:\Users\BZBY\AppData\Local\Google\Chrome\User Data",
        browser_type="chromium",
) as crawler:
    ...  # same crawler.arun(...) call as in the first example
```

The issue becomes apparent. The first window is my real browser instance, but the second window lacks my user data—it only has bookmark information and doesn’t display the user profile icon in the top right corner of Chrome. This means that the second window cannot access sites I’m already logged into, so I have to log in again.

Ideally, I should be able to open a browser with my actual user profile or use a command like:


```bash
start chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\Users\BZBY\AppData\Local\Google\Chrome\User Data"
```

This command opens a browser with remote debugging enabled. I can then attach to it with playwright.chromium.connect_over_cdp(cdp_url) and interact with my existing, already open browser instance.
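For reference, here is a minimal sketch of the attach step, assuming Chrome was already started with the command above and is listening on port 9222. It uses Playwright's async API directly, not Crawl4AI. The helper names `cdp_endpoint` and `attach_and_fetch_title` are made up for illustration.

```python
import asyncio


def cdp_endpoint(host: str = "localhost", port: int = 9222) -> str:
    """Build the HTTP endpoint Chrome exposes for --remote-debugging-port (hypothetical helper)."""
    return f"http://{host}:{port}"


async def attach_and_fetch_title(cdp_url: str, url: str) -> str:
    """Attach to an already running Chrome over CDP and return the title of `url`."""
    # Imported here so the helper above can be used without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(cdp_url)
        # Reuse the browser's existing default context when one is reported,
        # since that is the one backed by the --user-data-dir profile;
        # otherwise fall back to a fresh context.
        if browser.contexts:
            context = browser.contexts[0]
        else:
            context = await browser.new_context()
        page = await context.new_page()
        await page.goto(url)
        title = await page.title()
        await page.close()  # close only the tab opened here, not the browser
        return title


if __name__ == "__main__":
    # Assumes the Chrome instance from the command above is running.
    print(asyncio.run(attach_and_fetch_title(cdp_endpoint(), "https://www.youtube.com/")))
```

With this approach only one window is involved: the tab opens inside the browser I started myself, so it should see the same logged-in sessions as the rest of that profile.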

Screenshot 2024-11-07 10-09-20
