Parent / Children Tasks #268

Open
walmat opened this issue Jan 23, 2019 · 1 comment
Labels
area:task-runner (Related to Nebula's Task Runner package)
priority:low (Issues that are low priority don't need to be solved right away)
type:enhancement (New feature or request)
type:Future Feature (Something to be done in the future)

Comments

@walmat
Owner

walmat commented Jan 23, 2019

Is your feature request related to a problem? Please describe.
Currently, when multiple tasks run against the same site for the same product and variants, each task monitors the same endpoints on its own and sends 3 requests per task to fetch monitor data. This seems pretty taxing on proxies if you ask me, and a little wasteful, since every task ends up pulling the same data.

Describe the solution you'd like
Maybe we create a structure where tasks monitoring the same site for the same product are split into one parent task and any number of child tasks. This way, only the parent task would monitor the site for the product and, once the product is found, propagate the data to its child tasks.
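
A rough sketch of how that grouping could look, written in Python purely for illustration. The `Task` fields, `assign_parents`, `monitor_once`, and the `fetch` callback are all hypothetical names, not part of Nebula's task runner; the sketch also assumes the first task in each (site, product) group is promoted to parent.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Task:
    """Hypothetical task record; the field names are assumptions."""
    task_id: str
    site: str
    product: str
    children: List["Task"] = field(default_factory=list)
    product_data: Optional[dict] = None

    def receive(self, data: dict) -> None:
        # Store the data, then pass it down to every child task.
        self.product_data = data
        for child in self.children:
            child.receive(data)


def assign_parents(tasks: List[Task]) -> List[Task]:
    """Group tasks by (site, product); the first task in each group becomes the parent."""
    groups: Dict[Tuple[str, str], List[Task]] = defaultdict(list)
    for task in tasks:
        groups[(task.site, task.product)].append(task)

    parents = []
    for group in groups.values():
        parent, *children = group
        parent.children = children
        parents.append(parent)
    return parents


def monitor_once(parents: List[Task],
                 fetch: Callable[[str, str], Optional[dict]]) -> int:
    """Run one monitor pass where only parents hit the site.

    Returns the number of fetch calls made, which equals the number of
    distinct (site, product) groups rather than the number of tasks.
    """
    calls = 0
    for parent in parents:
        data = fetch(parent.site, parent.product)
        calls += 1
        if data is not None:  # product found: propagate to the children
            parent.receive(data)
    return calls
```

With three tasks on the same site and product plus one task on a different product, `assign_parents` yields two parents, so one monitor pass makes two fetch calls instead of four. One open question this sketch ignores is what happens when a parent task stops; a child would need to be promoted in its place.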

Describe alternatives you've considered
Dunno yet! Just thought of this.

@walmat walmat added type:enhancement New feature or request type:Future Feature Something to be done in the future area:task-runner Related to Nebula's Task Runner package labels Jan 23, 2019
@walmat walmat added this to the Beta 4 Release milestone Jan 23, 2019
@walmat walmat changed the title from "Parent / Child Tasks" to "Parent / Children Tasks" Jan 23, 2019
@pr1sm
Collaborator

pr1sm commented Jan 23, 2019

I like this! Another alternative to using parent/child tasks is to build a cache of product data we've captured as we're making requests. When we request a site's product data or even variant data for a specific product, we can check our cache first before making the request. This would remove the need for a "parent" task, but would allow us to solve the problem of making too many requests.

The cache itself would have to be stored in a file or in memory on the main thread. Task runners would then request the data through some function (similar to the request function they are using now). If there is a cache hit, we return the data. If there is a cache miss, we make the request using all the same parameters as the request function (specifically the given cookie jar) and update the cache.

The cache would have to have a pretty short "hot" time so we can still get recent data; I'm thinking we use the monitorDelay. This would mean that the first task to miss the cache would keep it hot on any request it makes, while any other requests from other tasks would use the cached data.
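
To make the hit/miss flow concrete, here is a minimal in-memory sketch in Python (for illustration only). `ProductCache`, its `get` method, and the injectable `clock` are hypothetical names, and the sketch assumes `monitor_delay` is expressed in seconds. The caller passes a `fetch` callable that performs the real request with its own parameters (for example its cookie jar), so the cache never needs to know about them.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ProductCache:
    """In-memory cache whose entries stay "hot" for monitor_delay seconds."""

    def __init__(self, monitor_delay: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = monitor_delay
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return cached data for key, calling fetch() only on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: entry is still hot

        # Cache miss (absent or expired): make the real request and
        # refresh the entry so other tasks can reuse the result.
        data = fetch()
        self._entries[key] = (self._clock(), data)
        return data
```

A task would call something like `cache.get((site, product), lambda: request_product(site, product, jar))`, where `request_product` stands in for whatever request function the task already uses. Whichever task misses first pays for the request; every other task asking within the delay gets the cached copy.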

@walmat walmat removed this from the Beta 4 Release milestone Feb 1, 2019
@pr1sm pr1sm added the priority:low Issues that are low priority don't need to be solved right away label Feb 11, 2019