fix: Fix the order in which cookies are saved to the SessionCookies and the handler is executed for PlaywrightCrawler#1163
Merged
Pijukatel merged 3 commits intoApr 23, 2025
Conversation
PlaywrightCrawlerSessionCookies and the handler is executed for PlaywrightCrawler
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR fixes the order of cookie saving in the PlaywrightCrawler to ensure cookies are saved only after the request handler has fully executed.
- Reorders the cookie saving logic in the _navigate function so that cookies are set after the handler processing.
- Adds a new unit test to simulate and verify the correct cookie saving behavior.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/unit/crawlers/_playwright/test_playwright_crawler.py | Introduces a new test ensuring cookies are saved post-handler execution. |
| src/crawlee/crawlers/_playwright/_playwright_crawler.py | Reorders the cookie saving logic to occur after the handler finishes. |
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR fixes the order in which cookies are saved for PlaywrightCrawler by ensuring that session cookies are only set after the handler has fully executed.
- Updated cookie saving logic in _playwright_crawler.py to run after handler processing.
- Enhanced tests in test_playwright_crawler.py to verify that cookies are correctly stored post-handler execution.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/unit/crawlers/_playwright/test_playwright_crawler.py | Updated test code to record sessions and verify correct cookie saving order. |
| src/crawlee/crawlers/_playwright/_playwright_crawler.py | Reordered the cookie saving logic to execute after the response is processed. |
Comments suppressed due to low confidence (2)
tests/unit/crawlers/_playwright/test_playwright_crawler.py:360
- The code assumes sessions_ids has at least three elements, but earlier assertions indicate only two unique sessions. Please verify that this indexing is intentional and update the test to accurately reflect the expected session count.
clean_session_id = sessions_ids[2]
tests/unit/crawlers/_playwright/test_playwright_crawler.py:360
- [nitpick] The identifier 'clean_session_id' may be ambiguous; consider renaming it to a more descriptive name that clarifies its role in indicating the session state after handler processing.
clean_session_id = sessions_ids[2]
Pijukatel
approved these changes
Apr 21, 2025
janbuchar
approved these changes
Apr 23, 2025
Mantisus
added a commit
to Mantisus/crawlee-python
that referenced
this pull request
Apr 24, 2025
… and the handler is executed for `PlaywrightCrawler` (apify#1163) ### Description - For `PlaywrightCrawler`, cookies should only be saved to the session store when the handler is fully executed. This is because the browser may continue to set cookies while the handler is being executed ### Testing - Add a test simulating the installation of a cookie in the browser during the `default_handler` execution process - Update the `test_isolation_cookies` test
Mantisus
added a commit
to Mantisus/crawlee-python
that referenced
this pull request
Apr 24, 2025
… and the handler is executed for `PlaywrightCrawler` (apify#1163) ### Description - For `PlaywrightCrawler`, cookies should only be saved to the session store when the handler is fully executed. This is because the browser may continue to set cookies while the handler is being executed ### Testing - Add a test simulating the installation of a cookie in the browser during the `default_handler` execution process - Update the `test_isolation_cookies` test
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
PlaywrightCrawler, cookies should only be saved to the session store when the handler is fully executed. This is because the browser may continue to set cookies while the handler is being executedTesting
default_handlerexecution processtest_isolation_cookiestest