
Teach making requests in a consistent way #950

Open
@honzajavorek

The courses now recommend using a variety of tools to make HTTP requests. Sometimes it's more confusing, sometimes less.

  • The got library seems to be superseded by ky; at least Sindre mentions it in the README.
  • Apify develops got-scraping, which seems to be married to the got library. Wouldn't it make sense to turn got-scraping into something more agnostic to the client library? Could it just prepare the request details, so that any library can attach them to its requests? (A hypothetical sketch of this idea follows the list.)
  • Some guides use axios
  • Some guides mention request and request-promise, which are now both deprecated
  • Meanwhile, Node.js has adopted fetch into its standard library.
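
For illustration only, here is a purely hypothetical sketch of that "agnostic" shape; nothing named prepareScrapingRequest exists in got-scraping today, and the returned details are just placeholders:

```js
// Purely hypothetical: a got-scraping-like helper that only *prepares*
// browser-like request details (headers, their order, TLS/HTTP hints)
// instead of sending the request itself. prepareScrapingRequest does not
// exist in got-scraping today; this stub only illustrates the proposal.
async function prepareScrapingRequest(url) {
  return {
    url,
    headers: {
      'user-agent': 'Mozilla/5.0 (placeholder)',
      'accept-language': 'en-US,en;q=0.9',
    },
  };
}

// The prepared details could then be attached to a request made by any
// HTTP client, for example the built-in fetch:
const { url, headers } = await prepareScrapingRequest('https://example.com');
const response = await fetch(url, { headers });
console.log(response.status);
```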

Using got-scraping in the basic tutorial is probably unnecessary; any HTTP client can be used in the initial lessons. The value of got-scraping should emerge with more complicated use cases.
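
For example, the very first lessons could get away with nothing but the built-in fetch (a minimal sketch; the URL is a placeholder):

```js
// Minimal sketch: the first lessons need nothing beyond the built-in fetch
// (available in Node.js 18+), with no extra dependencies.
const response = await fetch('https://example.com');
const html = await response.text();
console.log(html.length);
```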

But wouldn't it make more sense to skip got-scraping and promote Crawlee right away at that point? Is got-scraping something Apify wants to spend marketing energy on, or is it an implementation detail?

As of now, got-scraping doesn't have a good Python alternative that I know of. There are independent libraries one can use, such as fake-user-agent, which have integrations with scraping frameworks.

Regarding request libraries, the landscape is similarly fragmented in Python, featuring requests, aiohttp, and httpx, each with its own fans and use cases.

I'd like to kick this off as a discussion about what the preferred way should be for the Academy to teach making requests in 2024, in both Node.js and Python.

Labels: enhancement (New feature or request), t-academy (Issues related to Web Scraping and Apify academies)
