(Patreon) New "403 Forbidden" Cloudflare CAPTCHA error with 1.15.3 #1117
Comments
yup, can confirm |
What version of requests and urllib3 are both of you using? There was an update for both of them in the last couple of days. Maybe that causes these problems?
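One way to answer the version question above is to ask Python's package metadata directly. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is made up for illustration and is not part of gallery-dl.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The two libraries asked about in the comment above.
    for name in ("requests", "urllib3"):
        print(name, installed_version(name) or "not installed")
```

Run it with the same interpreter that runs gallery-dl, since a different Python installation may have different package versions.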
There is no other warning/error message and the cookies.txt file actually contains a |
oh, let me see... |
|
yup. can confirm it now totally works. 😅 |
so... since i'm antsy about keeping old versions of the libraries, should i remove this from my daily updates and let it update only when you tell me to... or what? |
Either that, or you create a |
no, it actually solved several problems i was having the last 2 days. i would just like to skip this version, but right now i just set it to not update those 2 packages, might resume when they get updated. |
Everything was as up-to-date as possible, so it would have been
Reverted and now the Cloudflare issue is gone. Will the next update of
Yes, even when the cookies.txt had the |
With the latest dev build being 1.16.0.dev0, is it safe to upgrade the dependencies to the latest versions? Pip has informed me that |
The fix was given to us 12 days ago. |
It should be OK to update everything except The problem lies in the new default behavior when establishing a TLS connection in |
but... why should using a better encryption give us a 403 error?!? 🤔 |
hmmm... apparently in TLS 2.0 the server looks for the client's certificate to continue, i guess "we" are not providing a certificate that works for patreon? |
This is more about whether or not Cloudflare thinks a request comes from a browser controlled by a human being or a bot, and it uses the TLS handshake among other things to determine that. Why Cloudflare believes requests with urllib3 1.26 are from a bot, but not with 1.25 is beyond me, but at least we know what works, just not the why. For example the latest Firefox versions only accept TLS 1.2 and 1.3, as does urllib3 1.26 (bot according to Cloudflare) in contrast to urllib3 1.25, which allows TLS 1.0, 1.1, 1.2, and 1.3 (not a bot). Maybe changing gallery-dl's user agent to some newer browser version is all that is needed to make it work with urllib3 1.26? (It currently uses Firefox 68 as user agent) |
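To make the "accepted TLS versions" idea from the comment above concrete, here is a minimal sketch using only Python's standard `ssl` module. It does not reproduce what urllib3 does internally; the helper name `make_context` is an assumption for illustration. It only shows where a client's minimum TLS version is configured, which in turn affects what the client offers during the handshake.

```python
import ssl


def make_context(minimum: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a client-side context that refuses anything older than `minimum`."""
    context = ssl.create_default_context()
    context.minimum_version = minimum
    return context


if __name__ == "__main__":
    # A context comparable to the "TLS 1.2 and 1.3 only" behavior described above.
    strict = make_context(ssl.TLSVersion.TLSv1_2)
    print("minimum:", strict.minimum_version.name)
    print("maximum:", strict.maximum_version.name)
```

Whether this setting alone explains Cloudflare's decision is unknown, as the comment itself says.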
oooh that's the problem! I'm specifying that i'm chrome 88 as a user agent and giving cookies that come from chrome 88. So if Urllib3 says it's firefox then they see chrome... if you ask me that's sus 🤣👍💖 |
No, urllib3 doesn't say it's a specific browser, gallery-dl by default is saying it is Firefox 68. When you've already changed gallery-dl's user agent and it still doesn't work with 1.26, we can discard my previous assumption ("Maybe changing gallery-dl's user agent ... to make it work with urllib3 1.26") |
hmmm... i have not tried "not changing" it though. |
I'm going to assume that even with the current 1.16.0 version of gallery-dl, that it's still not safe to upgrade to the latest version of urllib3? |
urllib3 has not been updated recently; it has been stuck on version 1.26.2 since Nov. 12 |
Doing a quick update report: Had upgraded Gallery-dl is at the latest release version. |
I believe I have tried everything that has been suggested but still getting the error. Can anyone explain all the versions they are using that makes it work? |
So I'm able to keep gallery-dl up to date with the current version To reduce my errors, I've had to keep the following packages at these versions.
Cloudflare may still occasionally cause errors with individual post URLs, but running a generic "/posts" URL on a user page should get everything without issue. |
Latest report: Every dependency can be upgraded to the latest version except for urllib3 which still causes Cloudflare issues when trying to download either individual or whole user pages from Patreon. |
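The reports up to this point in the thread amount to a simple rule: urllib3 1.25.x worked, 1.26.x triggered the Cloudflare 403. A minimal sketch of that check follows. The function names are hypothetical, and the 1.26 threshold reflects only what was reported here, not a documented boundary.

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn '1.26.2' into (1, 26, 2), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def in_reported_range(version: str) -> bool:
    """True if `version` is 1.26 or newer, the range reported as failing above."""
    return version_tuple(version) >= (1, 26)


if __name__ == "__main__":
    for candidate in ("1.25.11", "1.26.2"):
        print(candidate, in_reported_range(candidate))
```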
thanks! 👍💖 |
I am also having issues with Cloudflare. |
Yes, same here:
I'm not sure if the packaged executable (gallery-dl.exe) uses any of the locally installed Python (perhaps someone can enlighten me on this). In any case I ran this on two machines (same result), one has Python 3.7.0, and one has 3.9.1. -- Given that the -verbose option reports Python 3.7.9, I guess not. |
The packaged executable is fully self-contained, so no, it does not use any locally installed Python or something. |
Still on urllib3 1.25.11 and up to date on gallery-dl with 1.16.5, and there are still large amounts of time where trying to rip posts or users from Patreon results in Cloudflare CAPTCHA 403 Forbiddens. Sometimes it'll work, other times errors. |
- change default user agent to Firefox ESR 78 on Windows 10
- remove 'ciphers' option
So, there have been some changes in this regard and a new Please test this on Patreon user URLs and post URLs with and without this option enabled and let me know if this is more successful than it was before this change. I do want to make |
I've updated urllib3 to 1.26.3 and ran several User (as in the entirety of the Patreon timeline) and Post (as in individual posts) URLs after updating to the latest dev build of gallery-dl. No Cloudflare issues so far, but will keep you updated. For the record, my extractor settings in config have this structure for Patreon:
|
At least so far, trying out the Whatever changed was very recent, as I had just utilized the tool a few days ago. |
i was able to download from patreon fine with the updated urllib3 and "chrome" setting |
Well, since things seem to mostly be resolved, I'm going to close this particular issue. If anything pops up with Cloudflare and the new browser option, me or someone else will start a new ticket. |
iiit's back. apparently i have to re-login and even then it lasts exactly for 1 session. |
? Re-log in? I mean, other than replacing the session_id cookie about once a month, what exactly is it saying to do? I've also not had any errors (so far) when running multiple runs of User or Post URLs. Can you post how your Patreon extractor settings are set up in your config? |
sure thing. General part of the config:
patreon specific:
ending:
|
How strange, you don't plug your session_id cookie directly into your config file? You have it read directly from your cookies.txt file in your postprocessors setting? Try to use the settings I posted for my Patreon extractor, where |
That is cookies from chrome. |
I don't use Firefox either (I used Waterfox, a forked browser). I'm thinking that the use of the browser setting is to emulate how the browser would get past the captcha. Either way, it couldn't hurt to try. |
|
yeh, if i had cookies from microsoft edge (for... masochistic reasons, i guess) i would still need to set it as chrome, since it's a chromium browser. |
Well, either way, I'm sure that whatever browser you're using is moot since I'm sure that, as I said earlier, it's more about emulating browser behavior using those settings, with Chrome or Firefox settings. I'm sure mikf can explain this. I'll reopen the issue since I started this ticket. |
Well, I can only make assumptions. For one, the browser "emulation" isn't particularly good. gallery-dl through requests/urllib3 only uses HTTP/1.1, while all modern browsers use HTTP/2. Chrome also sends a lot of HTTP/2 specific headers ( Then again, both Did you at least try Also, as the others already said, cookie origin shouldn't matter here. |
i reverted to urllib3 1.25.11 this afternoon as a test without re-logging into patreon and re-exporting the cookies. It downloaded fine. Aside from the fact that any posts that were blocked from my subscribed tiers would reset the count of "skip", which ... i do not know if that was the desired effect. But yeah, worked absolutely fine with the urllib3 downgrade. without it i have to re-login and re-export the cookies daily. |
@mikf You probably already know this, but just in case: https://github.com/encode/httpx This might be a viable alternative to move on from the requests/urllib3 combo. I'm not really in a position to judge here, so all I could do so far, given the presence of all these projects on this "social network" for software, was doing a bit of cross-referencing/researching/stalking the contributor page. |
I can confirm that I can still download User and individual Posts from Patreon just fine, with the settings I've posted earlier, with urllib3 still up-to-date. No Cloudflare issues with them at all since putting browser setting in the Patreon extractor. I can't imagine why Butterfly would have to export their cookies every day, I've still only had to re-export them once a month due to them having a short lifespan. |
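For readers following along, here is a minimal sketch of a Patreon config of the kind discussed here: a `session_id` cookie set directly in the config plus the `browser` setting. It assumes gallery-dl's `extractor.patreon.cookies` and `extractor.patreon.browser` options. The cookie value is a placeholder, and this is not any participant's actual configuration.

```python
import json

# Hypothetical example values; replace the placeholder with your own cookie.
config = {
    "extractor": {
        "patreon": {
            # Cookie given inline instead of via a cookies.txt file.
            "cookies": {"session_id": "<your session_id value>"},
            # "firefox" or "chrome", as mentioned in the comments above.
            "browser": "firefox",
        }
    }
}

if __name__ == "__main__":
    # Print JSON that could be merged into a gallery-dl config file.
    print(json.dumps(config, indent=4))
```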
using chrome cookies still does not work. :( |
@Butterfly-Dragon
|
yeah, for now this is what those lines look like in my json |
Have all dependencies and gallery-dl up to date, but have been getting constant 403 errors.
Made sure that session_id cookie was up to date in config, no dice.
Exporting all Patreon cookies into a cookies.txt and updating config to point to it leads to an error which reads "No session_id set", but downloads free posts.
Cloudflare has been blocking for about a whole day at this point, and verbose doesn't really give any useful information.
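When an exported cookies.txt leads to "No session_id set", a quick sanity check is to confirm that the file really contains that cookie for the Patreon domain. A minimal sketch with the standard library's `MozillaCookieJar` follows. The helper name `find_cookie` is hypothetical, and the check says nothing about whether the cookie is still valid on the server side.

```python
from http.cookiejar import MozillaCookieJar


def find_cookie(path: str, name: str, domain_suffix: str):
    """Return the first cookie called `name` whose domain ends with `domain_suffix`."""
    jar = MozillaCookieJar()
    # Keep session and expired cookies so the file contents are inspected as-is.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    for cookie in jar:
        if cookie.name == name and cookie.domain.endswith(domain_suffix):
            return cookie
    return None


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        found = find_cookie(sys.argv[1], "session_id", "patreon.com")
        print("session_id present" if found else "session_id missing")
```

Pass the path to the exported file as the first argument. A "missing" result suggests the export did not include the cookie at all.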