setdefaulttimeout is causing issues with pyOpenSSL #62

@ejulio

Description

This issue occurs when running selenium-wire in Scrapy Cloud. It is caused by a call to socket.setdefaulttimeout in sh_scrapy.crawl.

The related pyOpenSSL issue is pyca/pyopenssl#168.

The call that causes it:
https://github.com/scrapinghub/scrapinghub-entrypoint-scrapy/blob/master/sh_scrapy/crawl.py#L27
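To illustrate why one call has such a wide effect: socket.setdefaulttimeout is process-wide and applies to every socket created afterwards, including sockets opened by third-party libraries. A minimal sketch (the helper name new_socket_timeout is only for illustration, and 60 is the value used in the reproduction, not necessarily the one sh_scrapy uses):

```python
import socket


def new_socket_timeout():
    """Create a fresh TCP socket and report the timeout it was born with."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        return s.gettimeout()
    finally:
        s.close()


original = socket.getdefaulttimeout()   # None unless something already changed it
before = new_socket_timeout()           # inherits the current default

socket.setdefaulttimeout(60)            # what an entrypoint might do at import time
after = new_socket_timeout()            # every new socket now has a 60s timeout

socket.setdefaulttimeout(original)      # restore so the demo has no side effects
print(before, after)
```

Per the Python socket documentation, a socket in timeout mode is internally set to non-blocking, which appears to be what pyOpenSSL trips over in pyca/pyopenssl#168.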

A minimal example to reproduce the issue:

Clone https://github.com/pawelmhm/quotesbot/tree/selenium

Working case

cd quotesbot
docker build -t selenium-wire-scrapy-issue .
docker run selenium-wire-scrapy-issue scrapy crawl toscrape-css

It should finish properly, scraping 100 items with no errors.

Now update settings.py and add the following at the top of the file:

import socket
socket.setdefaulttimeout(60)

Then rebuild and rerun the container:

docker build -t selenium-wire-scrapy-issue .
docker run selenium-wire-scrapy-issue scrapy crawl toscrape-css

Now it should fail, scraping only 10 items and showing a net::ERR_SSL_PROTOCOL_ERROR error in the logs.
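One possible workaround, untested on Scrapy Cloud and resting on the assumption that the global default timeout is the only trigger, is to clear the default before selenium-wire starts creating sockets. A sketch (the helper name clear_default_timeout is hypothetical):

```python
import socket


def clear_default_timeout():
    """Reset the process-wide default so new sockets are plain blocking sockets.

    Returns the previous value so the caller can restore it if needed.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(None)
    return previous


socket.setdefaulttimeout(60)            # same change as in settings.py above
previous = clear_default_timeout()      # call this before creating the webdriver
current = socket.getdefaulttimeout()
print(previous, current)
```

The reset would likely have to stay in effect for the lifetime of the driver, since sockets opened later (for example, connections accepted by a proxy) also pick up whatever default is set at that moment. It also removes the timeout for all other code in the process, which may not be acceptable.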
