Cache getPage result #338

@TobiX

Description

Some extraction functions (starter, namer, etc.) may need access to the page data. Most take the naive approach and simply call self.getPage(url), which triggers another HTTP request and parses the page a second time. This is wasteful.

Check which is the better design:

  • Cache everything getPage does for some time? (Maybe just the current page and be done with it?)
  • Rework the methods which need this data so they receive it as a parameter?
  • Other?
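
For discussion, here is a minimal sketch of the first option in its smallest form ("just the current page"). It is not taken from the existing code: `PageCache`, `get_page` and the injected `fetch` callable are hypothetical names, and `fetch` stands in for whatever getPage currently does (HTTP request plus parsing).

```python
from typing import Callable, Optional, Tuple


class PageCache:
    """Remember only the most recently fetched page (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], object]) -> None:
        # `fetch` is assumed to perform the real HTTP request and parsing.
        self._fetch = fetch
        self._last: Optional[Tuple[str, object]] = None

    def get_page(self, url: str) -> object:
        # Reuse the parsed page if the same URL was requested last time.
        if self._last is not None and self._last[0] == url:
            return self._last[1]
        page = self._fetch(url)
        self._last = (url, page)
        return page


if __name__ == "__main__":
    requests_made = []

    def fake_fetch(url: str) -> dict:
        # Stand-in for the network call, so the sketch runs offline.
        requests_made.append(url)
        return {"url": url}

    cache = PageCache(fake_fetch)
    cache.get_page("http://example.com/comic/1")
    cache.get_page("http://example.com/comic/1")  # served from the cache
    print(len(requests_made))
```

A single-entry cache like this never grows and never serves anything older than the page currently being processed, but it only helps when the repeated calls for one URL happen back to back. The second option (passing the parsed page as a parameter) avoids hidden state entirely at the cost of changing the method signatures.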
