Implement Proper Captcha Token Management #362

pr1sm · 2019-02-25T01:22:32Z

Is your feature request related to a problem? Please describe.
#355 has merged and #359 is currently up, but both fail to address an underlying, related issue: Token expiration management.

This wasn't as big of an issue before #359, but now that one-click captchas can potentially be solved automatically, we can harvest quite a few tokens after the task manager triggers a harvest. The problem that arises is the expiration of harvested tokens. Only a certain number of one-click captchas are available before the system gets suspicious and temporarily disables them.

A very real use case is monitoring a site that requires a captcha on login and captcha tokens for various checkout steps. If the monitor has to run for a long period before we get to a checkout (e.g. running for restocks), the following steps occur:

1. A captcha required at login triggers the manager to "Start harvesting"
2. We easily harvest a one-click captcha for login
3. We move on to monitoring
   i. BUT: we continue to harvest one-clicks until they run out
   ii. Monitoring takes a long time, meaning all the previously harvested one-clicks are invalid
4. We reach the checkout stage and now require tokens
5. The user has to solve challenges since the one-clicks have expired

This situation effectively wastes the pool of easily harvestable captcha tokens because of the gap between the points at which captcha tokens are required.

Describe the solution you'd like
There are a couple of things we need to change to fully address this issue.

Part 1: Create a more efficient (or accurate) method for assigning the harvesting state

We should tighten the "harvesting" state to cover only the parts of a task that need it. In my mind there are two stages in which captcha tokens are required: logging in and checkout. If the harvesting state reflects that, we remove the potential waste of one-click tokens, since we will tell captcha windows to stop harvesting during the potentially long monitor stage.
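A minimal sketch of that idea, assuming a task moves through a fixed set of stages. The stage names and the `harvestSignal` and `signalsFor` helpers are hypothetical and only illustrate the proposal; they are not taken from the codebase.

```javascript
// Hypothetical task stages; the names are illustrative, not the project's identifiers.
const Stages = Object.freeze({
  LOGIN: 'LOGIN',
  MONITOR: 'MONITOR',
  CHECKOUT: 'CHECKOUT',
  DONE: 'DONE',
});

// Assumption: only these two stages ever need captcha tokens.
const CAPTCHA_STAGES = new Set([Stages.LOGIN, Stages.CHECKOUT]);

// Decide what to tell the captcha windows when a task changes stage.
// Returns 'start', 'stop', or null when the harvesting state stays the same.
function harvestSignal(prevStage, nextStage) {
  const wasHarvesting = CAPTCHA_STAGES.has(prevStage);
  const shouldHarvest = CAPTCHA_STAGES.has(nextStage);
  if (!wasHarvesting && shouldHarvest) return 'start';
  if (wasHarvesting && !shouldHarvest) return 'stop';
  return null;
}

// Walk a task through its stages and collect the signals that would be sent.
function signalsFor(stages) {
  const signals = [];
  let prev = null;
  for (const stage of stages) {
    const signal = harvestSignal(prev, stage);
    if (signal) signals.push(`${signal}@${stage}`);
    prev = stage;
  }
  return signals;
}
```

With this mapping, a task that logs in, monitors, and then checks out would emit a stop signal when it enters the monitor stage, so no one-click tokens are harvested while nothing can consume them.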

Part 2: Using the token expiration control in the `CaptchaWindowManager`

Currently we pass the harvested tokens directly to the TaskManager, which has no token expiration logic associated with it. The CaptchaWindowManager itself does have this logic, but because of the bypass, we don't use it at all. Instead of listening directly for the IPC event in the TaskLauncher, the TaskLauncher should ask the CaptchaWindowManager for tokens. This puts the token expiration control to use and prevents errors caused by invalid tokens being used in requests.
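To illustrate the kind of expiration control meant here, the following is a rough sketch of a token store that drops stale tokens before handing one out. The `TokenStore` class and the 110 second lifetime are assumptions for illustration; the actual CaptchaWindowManager logic may look quite different.

```javascript
// Assumed token lifetime; the issue does not state one, so treat it as a placeholder.
const TOKEN_TTL_MS = 110 * 1000;

// Hypothetical store that a manager could own instead of forwarding tokens directly.
class TokenStore {
  constructor(ttlMs = TOKEN_TTL_MS, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock so expiry can be exercised in tests
    this.tokens = []; // oldest first
  }

  // Record a freshly harvested token together with its harvest time.
  add(token) {
    this.tokens.push({ token, harvestedAt: this.now() });
  }

  // Drop every token that is older than the lifetime.
  prune() {
    const cutoff = this.now() - this.ttlMs;
    this.tokens = this.tokens.filter(entry => entry.harvestedAt > cutoff);
  }

  // Return the oldest token that is still valid, or null if none is available.
  take() {
    this.prune();
    const entry = this.tokens.shift();
    return entry ? entry.token : null;
  }
}
```

A launcher that calls `take()` only when a runner actually needs a token never receives one that has outlived the configured lifetime, which is the property the direct IPC path lacks.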

Part 3: Add a Captcha Window Harvesting Timeout

In order to prevent the disabling of one-click captchas, we should implement a timeout for harvesting within the CaptchaWindowManager. This would allow a limited number of captcha tokens to be harvested (maybe 5-7?) before the CaptchaWindowManager stops harvesting. This would give the captcha window some time to rest until the 5-7 tokens have been used or have expired. Some more thought will have to go into this, specifically with regard to having the CaptchaWindowManager trigger a restart of the harvesting.
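One possible shape for this limit, sketched under the assumption that the manager is told whenever a token is harvested and whenever one is used or expires. `HarvestLimiter`, `HarvestState`, and the default cap of 5 are hypothetical names and values chosen for the example.

```javascript
// Hypothetical harvest states for the sketch.
const HarvestState = Object.freeze({ ACTIVE: 'active', SUSPENDED: 'suspended' });

class HarvestLimiter {
  constructor(maxBacklog = 5) {
    this.maxBacklog = maxBacklog; // assumed cap, in line with the "maybe 5-7" estimate
    this.backlog = 0; // tokens harvested but not yet used or expired
    this.state = HarvestState.ACTIVE;
  }

  // Call when a captcha window produces a token.
  onHarvested() {
    this.backlog += 1;
    if (this.backlog >= this.maxBacklog) {
      this.state = HarvestState.SUSPENDED; // let the captcha windows rest
    }
  }

  // Call when a token is consumed by a runner or dropped as expired.
  onReleased() {
    if (this.backlog > 0) this.backlog -= 1;
    if (this.backlog === 0) {
      this.state = HarvestState.ACTIVE; // backlog drained, harvesting may resume
    }
  }
}
```

Resuming only once the backlog is empty, rather than as soon as it drops below the cap, keeps the windows from flipping between states on every consumed token. Whether that is the right restart trigger is the open question raised above.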

Describe alternatives you've considered
With regard to Part 1 (see above), we could also start/stop harvesting each time a token is requested. I think this would work, but from a UX perspective, the user could see that start/stop effect on their captcha windows (the captcha challenges rapidly switching). I think having the 2 sections still offers a solution to the main gap, but prevents the potential stuttering that could occur if we were to start/stop harvesting on each request.


* Reset Captcha Token After Use
  This commit updates the checkout processes to reset the stored captcha token after it has been used. This prevents a token from being used twice.
* Enable Harvesting Suspension
  This commit adds a new suspend method for harvesting captchas that allows a runner to temporarily stop harvesting tokens without fully destroying the token queue. This allows the frontend to better detect when a token is needed.
* Add HarvestStates Enum
  This commit adds an enumeration for harvest states to better handle harvest state transitions.
* Fix Invalid Form Bug
  This commit fixes a bug caused by an invalid form structure when patching the checkout. An empty string passed into the captcha token caused the form structure to be invalid. The change allows an empty string to be passed in without causing an invalid form structure.
* Add better handling for captcha requesters
  This commit refactors the handler for captcha requests now that harvest suspension is a thing. This will prevent multiple start/stops from a single runner interfering with the total semaphore count.
* Enable Auto Stop for harvesting tokens
  This commit updates the captcha window manager to automatically stop after a certain number of tokens have been harvested. This prevents overharvesting of tokens.
* Fix Captcha Reset Bug
  This commit fixes a bug that prevented the captcha from being reset properly. If the captcha window submitted a token and was stopped immediately, the reset logic would call the recaptcha api reset function _while_ the api was in the middle of calling the submit callback. This would cause a bug when resetting and prevent the reset from occurring.
  To fix this, a new _submitting flag is added to track when the captcha window is in the process of submitting a token. If the stop handler is called when submitting, the stop handler skips resetting the challenge. The submit handler now calls the reset method regardless of the start status.
  Further, the reset function has been adjusted to remove the shouldAutoClick parameter. An analysis of the calls to this function shows that we only autoclick if the _started flag is true. Thus the parameter was removed and the _started flag is used alone to determine whether autoclick should be run. Additionally, the logic to hide the captcha form is moved to the reset method since we only hide the form when _started is false.
* Add Async Queue to Frontend
  This commit adds the async queue to the frontend so the captcha window manager can manage a queue and return tokens using promises. The async queue has been updated to include expiration functionality. The expiration functionality takes several parameters:
  - A filter function to determine whether a datum has "expired"
  - An interval time (default 1000ms)
  - Update callback to add logic _after_ the filter has occurred
  - thisArg optional this reference to use with _both_ the filter and update calls
* Use AsyncQueue to manage tokens
  This commit refactors the captcha window manager to use an async queue for tokens. When a maximum number of tokens has been added to the backlog, the token harvesting is suspended. Once all tokens have been used (or expired) the harvesting resumes.
* Source Captcha Tokens CWM
  This commit completes the transition to have the launcher source tokens from the CWM rather than directly from a harvest event. This allows us to take full advantage of the captcha backlog and handle expired tokens automatically.
* Remove unnecessary function
  This commit removes a function that is no longer used.
* Add Better Guard against Canceling Queue Request
  This commit updates the AsyncQueue to add better guarding against calling the `cancel` function for the returned request. Instead of a null reference, an empty function is used by default. Further, when the request resolves, the cancel function is replaced with an empty function to prevent a rejection from being called after the promise has already finished.
* Add New Debug Commands
  This commit updates the frontend to add new debug commands surrounding captcha management:
  - viewRunnerRequests - view how many requests each unique runner has pending
  - viewCwmQueueStats - view the queue and backlog lengths for the captcha queue
  - viewCwmHarvestState - view the current harvest state of the CWM
  Additionally, a bug was fixed with the start and stop harvest debug commands to forward parameters in the correct order. Lastly, some console logs were added for debugging.
* Minor Refactors to Request Cancelling
  This commit updates the launcher to only cancel unfulfilled requests. Further, some minor refactoring is done when sending the token to the launcher. This removes a redundant helper function in place of a one-line insert.
* Fix AsyncQueue Bug
  This commit fixes a bug caused by the async queue treating the wait queue as a stack (using push/pop instead of unshift/pop). This caused incoming tokens to be sent to the wrong runner and have two requests cancelled instead of only one per token.

fixes #362
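The queue behaviour these commits describe (waiters served first-in-first-out via unshift/pop, and a `cancel` function that defaults to an empty function and becomes one again once the request resolves) can be sketched roughly as follows. This is an illustrative reconstruction, not the project's AsyncQueue: the method names `insert` and `next` and the internal field names are assumptions, and the expiration functionality is left out.

```javascript
const noop = () => {};

// Illustrative promise-based queue; not the project's actual AsyncQueue.
class AsyncQueue {
  constructor() {
    this.backlog = []; // items waiting for a consumer
    this.waiters = []; // consumers waiting for an item
  }

  // Hand an item to the longest-waiting consumer, or store it for later.
  insert(item) {
    if (this.waiters.length > 0) {
      // next() adds waiters with unshift(), so pop() removes the oldest one (FIFO).
      this.waiters.pop().resolve(item);
    } else {
      this.backlog.unshift(item);
    }
  }

  // Request the next item. Returns { promise, cancel }.
  next() {
    const request = { promise: null, cancel: noop }; // cancel is never null
    if (this.backlog.length > 0) {
      request.promise = Promise.resolve(this.backlog.pop());
      return request;
    }
    request.promise = new Promise((resolve, reject) => {
      const waiter = {
        resolve: item => {
          request.cancel = noop; // settled: a late cancel must not reject
          resolve(item);
        },
      };
      request.cancel = () => {
        const index = this.waiters.indexOf(waiter);
        if (index !== -1) this.waiters.splice(index, 1);
        request.cancel = noop;
        reject(new Error('request cancelled'));
      };
      this.waiters.unshift(waiter);
    });
    return request;
  }
}
```

Pairing push with pop on the waiter list would serve the newest waiter first, which matches the symptom described in the last commit: tokens delivered to the wrong runner.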

This was referenced Feb 25, 2019

Task Runner EventEmitter to EventEmitter3 #363

Merged

Improve Captcha Autoclick Stability and Consistency #369

Merged

pr1sm mentioned this issue Mar 11, 2019

Refactor Captcha Token Management #382

Merged

10 tasks

walmat closed this as completed in d56fb3d Mar 23, 2019
