Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process request headers #38

Merged
merged 10 commits into from
Jan 27, 2022
Merged

Process request headers #38

merged 10 commits into from
Jan 27, 2022

Conversation

elacuesta
Copy link
Member

@elacuesta elacuesta commented Oct 20, 2021

Partially address #36.

Allow to override the processing of request headers via the PLAYWRIGHT_PROCESS_REQUEST_HEADERS setting. For instance, set PLAYWRIGHT_PROCESS_REQUEST_HEADERS=scrapy_playwright.headers.use_playwright_headers to use Playwright headers and ignore the ones coming from Scrapy.

Names are subject to change, I'm open to suggestions.

Tasks:

  • Code
  • Tests
  • Docs

@codecov
Copy link

codecov bot commented Oct 20, 2021

Codecov Report

Merging #38 (16ac2a2) into master (2ca67f3) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 16ac2a2 differs from pull request most recent head 42a1280. Consider uploading reports for the commit 42a1280 to get more accurate results
Impacted file tree graph

@@            Coverage Diff            @@
##            master       #38   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            3         4    +1     
  Lines          185       203   +18     
=========================================
+ Hits           185       203   +18     
Impacted Files Coverage Δ
scrapy_playwright/handler.py 100.00% <100.00%> (ø)
scrapy_playwright/headers.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ca67f3...42a1280. Read the comment docs.

Copy link
Contributor

@Gallaecio Gallaecio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, pending the documentation.

scrapy_playwright/handler.py Outdated Show resolved Hide resolved
Comment on lines +72 to +77
if crawler.settings.get("PLAYWRIGHT_PROCESS_REQUEST_HEADERS"):
self.process_request_headers = load_object(
crawler.settings["PLAYWRIGHT_PROCESS_REQUEST_HEADERS"]
)
else:
self.process_request_headers = use_scrapy_headers
Copy link
Contributor

@Gallaecio Gallaecio Oct 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shame, this could be a 1-liner in Scrapy 2.4+ 🙁

(technically also here, but I’m guessing you don’t want to have load_object parse a string in the default scenario)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I'd like to avoid any (admittedly small) overhead if we already have the object.

This is a nice catch, I was planning on documenting the setting as accepting either paths or functions directly, but didn't remember that was only valid on Scrapy 2.4+. I'll be sure to mention that when I write the docs, thanks!

@elacuesta elacuesta marked this pull request as ready for review January 26, 2022 20:18
@elacuesta elacuesta merged commit f50e7df into master Jan 27, 2022
@elacuesta elacuesta deleted the process-request-headers branch January 27, 2022 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants