Closed
Labels
bug: Something isn't working.
t-tooling: Issues with this label are in the ownership of the tooling team.
Description
Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/basic (BasicCrawler)
Issue description
When instantiating multiple crawler instances at once, their useState methods (both on the crawler instance and in the requestHandler context param) will always resolve to the same state.
Judging from the API, this is unexpected: crawler.useState reads as if it should resolve to state internal to that crawler. If sharing state across instances is the intended behavior, it IMO requires better docs.
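To illustrate what seems to be going on, here is a toy model with no Crawlee imports. It is only a sketch under one assumption: that useState is backed by a single fixed key in a store shared by every crawler in the process. I have not confirmed this against the Crawlee source, and the names SharedStore, ToyCrawler, stateKey, and demo are all hypothetical.

```typescript
// Toy model only: NOT Crawlee's implementation. It assumes useState reads a
// single fixed key from a store shared by all crawler instances.
class SharedStore {
    private readonly values = new Map<string, unknown>();

    // Return the value stored under `key`, initializing it on first access.
    getOrInit<T>(key: string, defaultValue: T): T {
        if (!this.values.has(key)) this.values.set(key, defaultValue);
        return this.values.get(key) as T;
    }
}

class ToyCrawler {
    private readonly store: SharedStore;
    private readonly stateKey: string;

    // `stateKey` is a hypothetical knob; the default mimics one global key.
    constructor(store: SharedStore, stateKey: string = 'STATE') {
        this.store = store;
        this.stateKey = stateKey;
    }

    async useState<T>(defaultValue: T): Promise<T> {
        return this.store.getOrInit(this.stateKey, defaultValue);
    }
}

async function demo() {
    const store = new SharedStore();

    // Two instances with the same default key end up with the same object.
    const a = new ToyCrawler(store);
    const b = new ToyCrawler(store);
    (await a.useState<string[]>([])).push('https://example.com');
    (await b.useState<string[]>([])).push('https://example.org');
    const sharedA = await a.useState<string[]>([]);
    const sharedB = await b.useState<string[]>([]);

    // Two instances with distinct keys keep their state apart.
    const c = new ToyCrawler(store, 'STATE_c');
    const d = new ToyCrawler(store, 'STATE_d');
    (await c.useState<string[]>([])).push('https://example.com');
    (await d.useState<string[]>([])).push('https://example.org');

    return {
        sameObject: sharedA === sharedB,
        shared: sharedA,
        isolatedC: await c.useState<string[]>([]),
        isolatedD: await d.useState<string[]>([]),
    };
}
```

In this model, the first pair shares one array containing both URLs, while the second pair stays separate. That second behavior is what I would expect from crawler.useState by default.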
Code sample
import { CheerioCrawler } from '@crawlee/cheerio';

async function main() {
    function createCrawler() {
        return new CheerioCrawler({
            requestHandler: async ({ request, useState }) => {
                const state = await useState<string[]>([]);
                state.push(request.url);
            },
        });
    }

    const [crawler1, crawler2] = [createCrawler(), createCrawler()];

    await crawler1.run(['https://example.com']);
    await crawler2.run(['https://example.org']);

    console.log(crawler1 === crawler2); // false
    console.log(await crawler1.useState() === await crawler2.useState()); // true
    console.log(await crawler1.useState()); // ['https://example.com', 'https://example.org']
}

main();
Package version
3.13.8
Node.js version
Node 22
Operating system
Linux
Apify platform
- Tick me if you encountered this issue on the Apify platform
I have tested this on the next release
No response
Other context
No response