Discussion: Accounts survival rate #175

agrieco · 2024-04-25T00:08:33Z

Over the past month or so I've had issues keeping accounts reliably logged in. I authenticate via username/email and password when I log in, and things tend to work for a while.

I've got a script that runs as a cron job. I've tried reducing how often it runs, etc., but the accounts all get de-authed eventually.

Because my account is following a few private accounts, the idea of just generating new accounts doesn't work.

Has anyone been able to reliably keep accounts 'alive' without re-authing, or written code for automated re-auth?

takabinance · 2024-04-26T00:58:06Z

It seems like it's gotten much harder. Previously I was losing a few accounts/day, and in the past few days I think I went through close to 1,000. 👎

Automated re-auth would be great.

caterpillar1219 · 2024-04-26T08:56:56Z

@takabinance Same for me - looks like I'm not the only one. Since yesterday my accounts constantly get banned, and every time I try to log in again they get banned again. This hardly ever happened before. Are you experiencing the same?

takabinance · 2024-04-26T14:37:53Z

I have not tried to re-login... it's with a large number of accounts. I'm not sure how I would do this. But, yeah, it seems like something changed very recently.

BrokedTV · 2024-04-26T15:29:46Z

Same here

ExHix · 2024-04-26T19:43:29Z

It seems that X has made some changes in recent days and started suspending accounts more frequently. All of my accounts, old and new, were suspended (read only) yesterday.

TMDR · 2024-04-27T21:41:56Z

It's been around a month since X started systematically detecting automated actions and suspending the accounts behind them, even asking for Arkose challenges as often as every day. I wonder if there's a workaround for this, because for now I'm forced to check my account every couple of hours for Arkose challenges so my automation keeps going... (one workaround I know of is buying premium, which I will not do). Any ideas?

caterpillar1219 · 2024-04-27T23:46:46Z

@TMDR I believe we are talking about different issues. The one you mentioned has existed for quite some time, while the issue we are discussing emerged just a few days ago and shows up when dealing with a large number of accounts.

JonasBirk · 2024-04-28T18:28:58Z

Same here.
I have around 3k accounts and used them daily in a rotating system, around 100 per day.
Over 95% are suspended now (read only). Some of them I have not even used yet.
I am wondering how they detect such accounts and what we can do to keep them alive...

takabinance · 2024-04-28T18:57:01Z

Has anyone had luck with re-logging in (twscrape relogin)? I have tried with and without mfa_code and it always generates:

twscrape.accounts_pool:login:162 - Failed to login 'accountname': 429 - {"code":88,"message":"Rate limit exceeded."}

I have tried with multiple accounts and multiple machines, and always get the same result.

takabinance · 2024-04-28T18:58:54Z

Also, has anyone tried with premium accounts? I'd happily pay for 10-20 accounts to avoid this.

ExHix · 2024-04-29T15:36:44Z

A small update about suspended (read only) accounts.
It turns out that this kind of account is literally "read only", with no write permission and limited read permission. These accounts can still do read operations such as searching or loading a user profile, but with a much lower rate limit threshold than a normal account.
According to X's docs, the x-rate-limit-limit, x-rate-limit-remaining and x-rate-limit-reset response headers are supposed to describe the rate limit, but they no longer work in read only mode. That means you can still get rate limited even if x-rate-limit-remaining is not 0, and the account may still be (in reality, very likely is) unavailable after x-rate-limit-reset.
I ran a small test on SearchTimeline with 5 accounts. I could make around 11-13 requests at 10s intervals before an account got limited. I am confident that these accounts' read operations are recoverable, because the accounts I tested only died 2 days ago, but the reset time is uncertain. I will post further findings tomorrow.
One conclusion I can draw so far: if your business only needs read operations, it is theoretically possible to scrape with a fairly large number of read only accounts, but be aware of IP banning. I would also like twscrape to add a feature that tests whether these read only accounts are still available or have recovered, instead of just marking them as inactive.
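A rough sketch of the header point above, in case it helps anyone testing: treat a 429 as authoritative and only fall back to the headers when the status looks fine. The function names are made up for illustration, and it assumes the headers arrive as a plain dict with lowercase keys, which may not match your HTTP client.

```python
def parse_rate_limit(headers):
    """Return (limit, remaining, reset_epoch); any missing or malformed value is None."""
    def to_int(name):
        value = headers.get(name)
        return int(value) if value is not None and value.isdigit() else None

    return (
        to_int("x-rate-limit-limit"),
        to_int("x-rate-limit-remaining"),
        to_int("x-rate-limit-reset"),
    )


def is_limited(status_code, headers):
    """Decide whether the account should be treated as rate limited."""
    # A 429 wins over the headers: in read only mode the headers can
    # still claim remaining quota while the request is being rejected.
    if status_code == 429:
        return True
    _, remaining, _ = parse_rate_limit(headers)
    return remaining == 0


print(is_limited(429, {"x-rate-limit-remaining": "37"}))
print(is_limited(200, {"x-rate-limit-remaining": "5"}))
```

Because the reset header is also unreliable for these accounts, this sketch deliberately does not compute a retry time from x-rate-limit-reset.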

caterpillar1219 · 2024-04-29T16:08:05Z

@ExHix Thanks for your investigation! Are you suggesting that a rate-limited account can still be used for read-only operations? In my experience very few tweets can be loaded with these accounts, but maybe I didn't wait long enough between tests. Looking forward to your further update!

ExHix · 2024-04-29T16:32:16Z

@caterpillar1219 It is theoretically possible, at least for searching, if my guess is correct, but I need more data about rate limit behavior on read only accounts. If anyone else wants to investigate, this is my script to collect request data from X's search API.
Also, if there is existing research on X's read only accounts, please share it!

BrokedTV · 2024-04-29T17:02:59Z

When checking the usage, despite several accounts being added to the db, the same 20 accounts or so get used over and over. Has anyone else noticed it? Wouldn't a better rotation between available accounts help, so that each account does actions less often?

caterpillar1219 · 2024-04-29T19:26:25Z

@BrokedTV There is an accounts pool setting you can modify in the code, to select available accounts either in alphabetical order (default) or at random. I tried the latter, but unfortunately it does not help. My guess is that the rate limit applies not only at the account level, but somehow at the proxy level too.

@ExHix fwiw, the normal account request limit is 49; then the account is locked for 15 min and the limit is refreshed. But looking into those read-only accounts, it seems that when read requests reach 8 or 9, the account goes into an irreversible "Rate Limit Exceeded" state.

takabinance · 2024-04-29T23:07:33Z

> When checking the usage, despite several accounts being added to the db, the same 20 accounts or so get used over and over, has anyone else noticed it? Wouldn't a better rotation between available accounts be better? so each account does actions less often

Add this to your code:

AccountsPool._order_by = "RANDOM()"

The AccountsPool will then randomly select accounts instead of going through by username.
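To illustrate why the ordering matters, here is a standalone sketch using sqlite3. The table layout and helper names are assumptions for demonstration only, not twscrape's actual schema or code; it just shows how ORDER BY username keeps returning the same row while ORDER BY RANDOM() spreads picks across the pool.

```python
import sqlite3


def make_pool(usernames):
    """Build a throwaway in-memory accounts table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (username TEXT PRIMARY KEY, active INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, 1)", [(u,) for u in usernames])
    return conn


def pick_account(conn, order_by):
    """Pick one active account using the given ORDER BY expression."""
    # order_by is interpolated into the SQL, so only pass trusted constants.
    query = f"SELECT username FROM accounts WHERE active = 1 ORDER BY {order_by} LIMIT 1"
    row = conn.execute(query).fetchone()
    return row[0] if row else None


conn = make_pool([f"user{i:02d}" for i in range(50)])
by_name = {pick_account(conn, "username") for _ in range(200)}
by_random = {pick_account(conn, "RANDOM()") for _ in range(200)}
print(len(by_name), "distinct account(s) picked when ordering by username")
print(len(by_random), "distinct account(s) picked when ordering by RANDOM()")
```

The sketch leaves out locking: it never marks an account as busy or rate limited, which a real pool would do between picks.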

xymou · 2024-04-30T08:14:47Z

> @BrokedTV There is an accounts pool setting you can modify in the code, to select available accounts either in alphabetical order (default) or at random. I tried the latter, but unfortunately it does not help. My guess is that the rate limit applies not only at the account level, but somehow at the proxy level too.
>
> @ExHix fwiw, the normal account request limit is 49; then the account is locked for 15 min and the limit is refreshed. But looking into those read-only accounts, it seems that when read requests reach 8 or 9, the account goes into an irreversible "Rate Limit Exceeded" state.

Very useful findings! So does this mean that if we set the limit to 8 or 9, the accounts are less likely to be suspended?

dhl1402 · 2024-04-30T10:47:18Z

> A small update about suspended (read only) accounts.
> It turns out that this kind of account is literally "read only", with no write permission and limited read permission. These accounts can still do read operations such as searching or loading a user profile, but with a much lower rate limit threshold than a normal account.
> According to X's docs, the x-rate-limit-limit, x-rate-limit-remaining and x-rate-limit-reset response headers are supposed to describe the rate limit, but they no longer work in read only mode. That means you can still get rate limited even if x-rate-limit-remaining is not 0, and the account may still be (in reality, very likely is) unavailable after x-rate-limit-reset.
> I ran a small test on SearchTimeline with 5 accounts. I could make around 11-13 requests at 10s intervals before an account got limited. I am confident that these accounts' read operations are recoverable, because the accounts I tested only died 2 days ago, but the reset time is uncertain. I will post further findings tomorrow.
> One conclusion I can draw so far: if your business only needs read operations, it is theoretically possible to scrape with a fairly large number of read only accounts, but be aware of IP banning. I would also like twscrape to add a feature that tests whether these read only accounts are still available or have recovered, instead of just marking them as inactive.

Thank you for your findings. I set the limit to 5 requests per 15 minutes, but unfortunately my accounts still got banned.

ExHix · 2024-04-30T11:43:46Z

> Thank you for your findings. I set the limit to 5 requests per 15 minutes, but unfortunately my accounts still got banned.

It may be because X has implemented more aggressive abnormal-behavior detection, no matter how infrequently you scrape.

ExHix · 2024-05-01T14:41:51Z

> I will post further findings tomorrow. One conclusion I can draw so far: if your business only needs read operations, it is theoretically possible to scrape with a fairly large number of read only accounts, but be aware of IP banning.

Based on two days of test data, my reasonably reliable guess is that the rate limit reset window for each read only account is 24 hours, and for searching, each account gets 11 to 13 requests per window. Of course, this is not enough for a business that needs to scrape constantly, but it is enough for light users. I guess other APIs behave similarly, but I won't dig further because that would be a fairly big project. If anyone is interested, feel free to try it.
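If you want to stay under that budget, a per-account counter is enough. Below is a minimal sketch assuming 11 requests per 24-hour window (the low end of what I measured) and assuming the window starts at the first request, which I have not verified. The class and method names are made up for illustration.

```python
class WindowBudget:
    """Track how many requests each account has spent in its current window."""

    def __init__(self, max_requests=11, window_seconds=24 * 3600):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._windows = {}  # username -> (window_start, requests_used)

    def try_spend(self, username, now):
        """Return True and count the request if the account still has budget."""
        start, used = self._windows.get(username, (now, 0))
        if now - start >= self.window_seconds:
            # The old window has expired, so start a fresh one.
            start, used = now, 0
        if used >= self.max_requests:
            self._windows[username] = (start, used)
            return False
        self._windows[username] = (start, used + 1)
        return True


budget = WindowBudget()
results = [budget.try_spend("account_a", now=0) for _ in range(12)]
print(results.count(True), "requests allowed in the first window")
```

Passing `now` explicitly (for example `time.time()`) keeps the counter easy to test without waiting a day.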

Aprilpl · 2024-05-01T16:27:09Z

> It seems like it's gotten much harder. Previously I was losing a few accounts/day, and in the past few days I think I went through close to 1,000. 👎
>
> Automated re-auth would be great.

@takabinance Hello, how can automated re-auth be achieved? Looking forward to your answer, thank you very much.

caterpillar1219 · 2024-05-01T18:42:06Z

@ExHix Yeah, 11~13 requests is too few. Normal accounts get 50 requests every 15 min.

takabinance · 2024-05-02T22:22:13Z

My current configuration is:

random account selection
low daily volume - around 20-30 searches per day per account
an 8 hour delay if limit has been hit (although this rarely happens with random)
residential rotating proxies

I'm back to losing 1-2% per day over the past few days. Will update in several days to see if this holds.
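For anyone who wants to try the same setup, the selection logic can be sketched roughly like this. It is not twscrape code; the class, its method names, and the default numbers (30 per day, 8-hour cooldown) are just my configuration written out as an illustration, and proxies are left out entirely.

```python
import random


class CooldownPool:
    """Random account selection with a daily cap and a cooldown after a limit hit."""

    def __init__(self, usernames, daily_cap=30, cooldown_seconds=8 * 3600, rng=None):
        self.daily_cap = daily_cap
        self.cooldown_seconds = cooldown_seconds
        self.rng = rng or random.Random()
        self.used_today = {u: 0 for u in usernames}
        self.blocked_until = {u: 0.0 for u in usernames}

    def available(self, now):
        """Accounts that are under the daily cap and not cooling down."""
        return [
            u for u in self.used_today
            if self.used_today[u] < self.daily_cap and self.blocked_until[u] <= now
        ]

    def pick(self, now):
        """Pick a random available account and count one request against it."""
        candidates = self.available(now)
        if not candidates:
            return None
        choice = self.rng.choice(candidates)
        self.used_today[choice] += 1
        return choice

    def report_limit_hit(self, username, now):
        """Park an account for the cooldown period after it hits a limit."""
        self.blocked_until[username] = now + self.cooldown_seconds

    def reset_daily_counts(self):
        """Call once per day to clear the per-account counters."""
        for username in self.used_today:
            self.used_today[username] = 0


pool = CooldownPool(["a", "b", "c"], rng=random.Random(0))
pool.report_limit_hit("a", now=0)
print(pool.available(now=60))
```

The random choice spreads load so that no account is hit repeatedly in a short span, which is probably why the limit is rarely reached.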

takabinance · 2024-05-02T22:25:46Z

> @takabinance Hello, how can automated re-auth be achieved? Looking forward to your answer, thank you very much.

I actually don't even know how to log back in once. I thought it was a simple twscrape relogin command, but this hasn't worked for me, as I get a 429 every time. You can see that 2FA recovery was added to the library (but maybe not documented yet); I haven't dug into it to see how to use it.

codilau · 2024-05-03T08:28:50Z

I saw somewhere else that a header for 'ui_metrics' could have been implemented recently. Twscrape hits the GQL directly hence not scoring high on the UI use - maybe this could be worth investigating?

caterpillar1219 · 2024-05-04T05:02:05Z

@takabinance Thanks for sharing! I also noticed that I have been losing fewer accounts since yesterday. I'm not sure whether it's because of changes I made (e.g. number of accounts, search frequency) or a change on Twitter's side.
When you mention 20-30 searches, is that technically 20-30 searches (1 search could be ~10 requests), or 20-30 requests? Also, may I ask how many accounts you are losing per day? Tens or hundreds?

takabinance · 2024-05-05T18:06:12Z

I haven't lost an account in a few days. I'm now only running around 100 accounts and maybe 40-50 requests per account per day (I only get the first page of search results so for me 1 search = 1 request).

axelblaze88 · 2024-05-07T11:13:39Z

> I haven't lost an account in a few days. I'm now only running around 100 accounts and maybe 40-50 requests per account per day (I only get the first page of search results so for me 1 search = 1 request).

Can I ask how you managed to create so many accounts?

ExHix · 2024-05-07T12:48:19Z

> I haven't lost an account in a few days. I'm now only running around 100 accounts and maybe 40-50 requests per account per day (I only get the first page of search results so for me 1 search = 1 request).
>
> Can I ask how you managed to create so many accounts?

There are websites selling automatically registered accounts, so there is no need to create them manually.

Epikcoder · 2024-05-10T19:26:27Z

@ExHix Any examples?

Epikcoder · 2024-05-11T18:24:20Z

@ExHix any info?

takabinance · 2024-05-11T18:52:29Z

twaccs.com

Epikcoder · 2024-05-12T03:58:42Z

thx

takabinance · 2024-05-24T14:05:12Z

3 weeks passed with no banned accounts, then yesterday all were banned.

akuma0 · 2024-05-31T08:31:58Z

Same here... :(

ErSauravAdhikari · 2024-06-04T03:40:25Z

The bans are getting substantially more frequent.

akuma0 · 2024-06-07T12:09:46Z

It looks like this is the end of Twitter web scrapers...
Is anyone working on this problem?
I can help if someone needs it...

ExHix · 2024-06-12T16:35:53Z

I am very pessimistic about the future availability of Twitter scrapers. The platform is blocking accounts with unprecedented intensity, and I have even seen some users complain that accounts they use normally are being restricted. Firetruck Elon Musk.

nijynot · 2024-06-17T10:45:44Z

Been reading up about X/Twitter scraping in the last few days.
Came upon this thread, and it doesn't look too good at the moment for scrapers.

It's a bit off-topic, but as the cat-and-mouse game between X and scrapers gets more complex and harder to keep up with, I think it makes more and more sense for someone to build a service for this. I'd pay someone a subscription if they solved all these headaches for me.

vladkens · 2024-06-29T18:30:38Z

Hi. I don't have a silver bullet on this issue either. My test accounts have not been banned in recent months (but I rarely use them).

I know from colleagues who do scraping on a regular basis that more accounts have been required in the last couple of weeks.

In v0.13 I updated the endpoint to x.com and the GQL endpoint versions to the latest. Perhaps this will make the situation a little better.

I've pinned this topic so they can participate in the discussion.

Jwom · 2024-07-01T12:11:06Z

Hey guys,
1. According to my latest test results, account suspension seems to be related only to certain parameters in the headers and cookies.

2. Test details:
Status codes for a suspended account: 401, 326, 32
Status code for requesting too fast: 429
My test endpoint: https://x.com/i/api/graphql/TQmyZ_haUqANuyBcFBLkUw/SearchTimeline

3. If an account makes its requests in DOM mode (you can use selenium, DrissionPage, etc.), even for a long time, it will not be suspended no matter how you request; going too fast may only trigger the rate limit.

4. However, if you use direct requests such as plain HTTP, some encrypted parameters are missing (my initial suspicion is the two parameters X-Client-UUID and x-client-transaction-id), and the account will be suspended once the request volume or duration exceeds a certain amount. I set the interval between requests to 15s, 25s, 30s and 60s, and the account was suspended in every case (typically after 1-10h of continuous requests, generally not more than 10h).

5. In addition, I blocked all twitter requests except SearchTimeline with a chrome extension, then continuously requested the SearchTimeline endpoint in DOM mode for 2 days, and so far no account has been suspended.
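The codes in point 2 can be turned into a small classifier, sketched below. One assumption to flag: 326 and 32 look like error codes from the JSON response body rather than HTTP statuses, so the sketch accepts both kinds. Code 88 is taken from the "Rate limit exceeded" log earlier in this thread. The function and constant names are made up for illustration.

```python
SUSPENDED_CODES = {401, 326, 32}   # reported above for suspended accounts
RATE_LIMITED_CODES = {429, 88}     # 429 from above; 88 from the relogin log in this thread


def classify(http_status, error_codes=()):
    """Classify a response as 'suspended', 'rate_limited' or 'other'."""
    # Check the HTTP status and any body error codes together, because it is
    # an assumption here which of the two each reported number belongs to.
    seen = {http_status, *error_codes}
    if seen & SUSPENDED_CODES:
        return "suspended"
    if seen & RATE_LIMITED_CODES:
        return "rate_limited"
    return "other"


print(classify(401))
print(classify(429, [88]))
print(classify(200))
```

Suspension is checked first, so a response carrying both kinds of code is treated as the more serious case.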

takabinance · 2024-07-01T19:04:18Z

@Jwom

> 3. If an account makes its requests in DOM mode (you can use selenium, DrissionPage, etc.), even for a long time, it will not be suspended no matter how you request; going too fast may only trigger the rate limit.

When you say 'dom mode', do you mean through something like selenium and not using twscrape?

Jwom · 2024-07-02T01:23:22Z

> @Jwom
>
> 3. If an account makes its requests in DOM mode (you can use selenium, DrissionPage, etc.), even for a long time, it will not be suspended no matter how you request; going too fast may only trigger the rate limit.
>
> When you say 'dom mode', do you mean through something like selenium and not using twscrape?

Yes, DrissionPage and selenium.
DOM (Document Object Model)

JonasBirk · 2024-07-02T03:08:42Z

I know selenium and how it works. But how can you extract all the tweets seen in a proper, structured format from there? Is there any documentation or repo to get an idea?


Jwom · 2024-07-02T06:17:12Z

> I know selenium and how it works. But how can you extract all the tweets seen in a proper, structured format from there? Is there any documentation or repo to get an idea?

If you use selenium as the crawler, you can extract data using XPath or CSS selectors. If you use DrissionPage, you can turn on its packet capture feature to capture the relevant XHR packets. You can refer to this document: https://drissionpage.cn/whatsnew/4/#%EF%B8%8F-%E6%96%B0%E7%9A%84%E6%8A%93%E5%8C%85%E5%8A%9F%E8%83%BD

Mikkael10 · 2024-07-09T22:53:20Z

Is there anybody here who would be open to offering "twscrape" as a service? @Jwom @takabinance particularly either of you. But open for anyone else, primarily X follower scraping - please reply with a contact method if so.

takabinance · 2024-07-10T02:58:33Z

As @Jwom pointed out, there are different approaches. I figured out a way to do follower/friend tracking at scale without bans... and also realtime tweet tracking for known accounts. But I need twscrape for searching, and I can barely keep my own instance running. :-) I might be able to help, depending on what you need: takabinance at gmail

Mikkael10 · 2024-07-10T08:54:37Z

> As @Jwom pointed out, there are different approaches. I figured out a way to do follower/friend tracking at scale without bans... and also realtime tweet tracking for known accounts. But I need twscrape for searching, and I can barely keep my own instance running. :-) I might be able to help, depending on what you need: takabinance at gmail

Thanks, email sent. I put my GitHub in the email subject; maybe also check your Spam folder just in case :).

brandonroot · 2024-07-10T20:59:45Z

I'm wondering if anyone is seeing this error?

{37 Authorization: Denied by access control: Missing LdapGroup(visibility-admins); Missing LdapGroup(visibility-custom-suspension)} {37 Authorization: Denied by access control: Missing LdapGroup(visibility-admins); Missing LdapGroup(visibility-custom-suspension)}

JonasBirk · 2024-07-11T19:13:14Z

yes, me. Starting from today. The strange thing is the account is not suspended. Does anyone know why and how to solve this?


brandonroot · 2024-07-12T17:08:05Z

Seems to be intermittent and only impacts a smallish number of requests using the search endpoint, but I'm not seeing any today. Maybe Twitter pushed some bad code?

triplotriplo · 2024-07-14T05:35:22Z

I'm getting started with twscrape and, after much effort, reading, and AI assistance (I know very little about coding), I successfully created my own tool based on it. Many thanks!

On my first search query, it used only one account, which resulted in a "Session expired or banned" message. I was able to re-login later. According to the stats, the account made 80 requests to scrape the data I needed. How can I automatically switch accounts after 20-30 requests to avoid hitting limits and continue scraping seamlessly?
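The kind of rotation I have in mind would look something like the sketch below: a generator that hands out an account name for each request and moves to the next account after a fixed number of requests. This is only an illustration of the logic, not a twscrape feature; the function name and the default of 25 requests are assumptions.

```python
from itertools import cycle


def rotate_accounts(usernames, requests_per_account=25):
    """Yield the account to use for each request, switching every N requests."""
    for username in cycle(usernames):
        for _ in range(requests_per_account):
            yield username


picker = rotate_accounts(["acc_a", "acc_b", "acc_c"], requests_per_account=2)
sequence = [next(picker) for _ in range(7)]
print(sequence)
```

After the last account is used up, the generator wraps around to the first one again, so a real version would also need to skip accounts that are banned or cooling down.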

Mikkael10 · 2024-07-17T10:56:20Z

@JonasBirk @Jwom @takabinance

Hey guys (and other guys who read this).

I've been successful in scraping followers, however when scraping Tweet Replies, I never entirely scrape ALL comments. For example, if a specific tweet has 1000 replies, sometimes I may only receive 100 lines in my output.txt.

Has anybody had similar issues, or does anyone know how to circumvent this? Thanks.

alphaleadership · 2024-07-17T15:41:36Z

> DrissionPage

Can you open-source the code?

ymys · 2024-09-08T06:07:40Z

> Is there anybody here who would be open to offering "twscrape" as a service? @Jwom @takabinance particularly either of you. But open for anyone else, primarily X follower scraping - please reply with a contact method if so.

Contact me, we have this service.

Jwom · 2024-09-08T06:08:12Z

Received.

tanfpt · 2024-11-13T09:18:00Z

> Hey guys,
> 1. According to my latest test results, account suspension seems to be related only to certain parameters in the headers and cookies.
>
> 2. Test details: Status codes for a suspended account: 401, 326, 32. Status code for requesting too fast: 429. My test endpoint: https://x.com/i/api/graphql/TQmyZ_haUqANuyBcFBLkUw/SearchTimeline
>
> 3. If an account makes its requests in DOM mode (you can use selenium, DrissionPage, etc.), even for a long time, it will not be suspended no matter how you request; going too fast may only trigger the rate limit.
>
> 4. However, if you use direct requests such as plain HTTP, some encrypted parameters are missing (my initial suspicion is the two parameters X-Client-UUID and x-client-transaction-id), and the account will be suspended once the request volume or duration exceeds a certain amount. I set the interval between requests to 15s, 25s, 30s and 60s, and the account was suspended in every case (typically after 1-10h of continuous requests, generally not more than 10h).
>
> 5. In addition, I blocked all twitter requests except SearchTimeline with a chrome extension, then continuously requested the SearchTimeline endpoint in DOM mode for 2 days, and so far no account has been suspended.

I'm trying to use Selenium to make requests to Twitter, but after a while my account got banned. I wasn't using a proxy. Could that be the reason for the ban?

vladkens pinned this issue Jun 29, 2024

vladkens changed the title ~~reliability?~~ Discussion: Accounts survival rate Jun 29, 2024

caterpillar1219 mentioned this issue Aug 19, 2024

ct0 not in cookies (most likely ip ban) #214

Open
