Proxy IPs are dynamic from an IP provider, but Crawlee cannot support this? #2770

Closed as not planned
@WangShaoyu1

Description

Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/playwright (PlaywrightCrawler)

Issue description

With Node.js v20.16.0 and crawlee 3.12, the proxy IPs are created by an API, and each call to that dynamic IP API returns different IPs. Crawlee does not seem to support this.
I would expect the API to be called whenever a website is scraped, rather than once when the crawler is constructed.

Code sample

import {PlaywrightCrawler, HttpCrawler, ProxyConfiguration, log, Session} from 'crawlee';
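// Configure the crawler
// (`router` and `getProxy()` are defined elsewhere and are not shown in this report)
const crawler = new PlaywrightCrawler({
    requestHandler: router,
    headless: true,
    requestHandlerTimeoutSecs: 200,
    autoscaledPoolOptions: {
        maxConcurrency: 20,
        minConcurrency: 10,
        desiredConcurrencyRatio: 0.5,  // stay close to the target concurrency
        scaleUpStepRatio: 0.15,        // step size when scaling concurrency up
        scaleDownStepRatio: 0.15,      // step size when scaling concurrency down
        autoscaleIntervalSecs: 5       // interval between autoscaling checks, in seconds
    },
    proxyConfiguration: new ProxyConfiguration({
        // getProxy() is awaited once here, so the proxy list is fixed at construction time
        proxyUrls: await getProxy()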
    })
});
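
One possible way to get fresh IPs per request, sketched here under assumptions rather than as a confirmed fix: `ProxyConfiguration` accepts a `newUrlFunction` option, which Crawlee calls when it needs a proxy URL and which may be async. The helper below builds such a function around the provider API. `createNewUrlFunction`, `ttlMs` and `now` are hypothetical names introduced for illustration, and `fetchProxyUrls` stands in for the `getProxy()` from the sample above, assumed to resolve to an array of proxy URL strings.

```javascript
// Sketch only: wraps a provider API call in a function that can be passed as
// `newUrlFunction`. The list is re-fetched once it is older than `ttlMs`
// (an assumed setting; use whatever lifetime your provider gives its IPs).
function createNewUrlFunction(fetchProxyUrls, { ttlMs = 30_000, now = Date.now } = {}) {
    let cached = [];      // most recently fetched proxy URLs
    let expiresAt = 0;    // timestamp after which `cached` is considered stale
    let nextIndex = 0;    // round-robin position within `cached`
    let pending = null;   // in-flight refresh, shared by concurrent callers

    async function refresh() {
        cached = await fetchProxyUrls();
        expiresAt = now() + ttlMs;
        nextIndex = 0;
    }

    // `sessionId` is supplied by the caller; this sketch does not use it.
    return async function newUrlFunction(sessionId) {
        if (cached.length === 0 || now() >= expiresAt) {
            // Start at most one refresh at a time, even with many parallel requests.
            pending ??= refresh().finally(() => { pending = null; });
            await pending;
        }
        if (cached.length === 0) {
            throw new Error('Proxy provider returned no URLs');
        }
        const url = cached[nextIndex % cached.length];
        nextIndex += 1;
        return url;
    };
}

// Possible usage with the crawler above (assumes getProxy() returns string[]):
// const proxyConfiguration = new ProxyConfiguration({
//     newUrlFunction: createNewUrlFunction(getProxy, { ttlMs: 60_000 }),
// });
```

Note that this replaces `proxyUrls` rather than supplementing it; as far as I know the two options are not meant to be combined. Sharing one in-flight refresh matters with `maxConcurrency: 20`, since otherwise every parallel request could hit the provider API at once.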

Package version

crawlee 3.12.0 Nodejs

Node.js version

20.16

Operating system

windows

Apify platform

  • Tick me if you encountered this issue on the Apify platform

I have tested this on the next release

No response

Other context

No response

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working), t-tooling (Issues with this label are in the ownership of the tooling team)
