
[twitter] accounts getting suspended #6020

Open
ForxBase opened this issue Aug 14, 2024 · 40 comments

Comments

@ForxBase

Two accounts got suspended within a matter of days while downloading a few user profiles' media. Is this problem solvable? Is anyone else having this problem?

@NonaSuomi

Been running into the same thing myself: lost three accounts over the last week and change.

I'm planning to let it rest a week or so before trying to make any more accounts, on the suspicion that my IP has been flagged for enhanced scrutiny. I figure it may be a temporary thing, and if I back off for a little I might be okay to try again with a more cautious set of values for sleep and sleep-request.

I also suspect that time of use may be a factor, so I was planning to schedule my script to only run the extractor during local daytime hours, in case the scraper running overnight was tripping some kind of suspicious-activity alert.
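
For reference, such values can be tried per run from the command line without touching the config file; the numbers below are only placeholders, not known-safe settings:

gallery-dl -o "sleep=[10.0,20.0]" -o "sleep-request=[30.0,60.0]" https://x.com/<account>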

@a84r7a3rga76fg

Post your configuration. Switch your IP address. Don't use timeline (it's broken anyway) if you're using it.

@NonaSuomi

Maybe you can elaborate on what you mean here about not using timeline and your rationale? I see you have your own open issue regarding infinite sleep-request behavior that mentions it, but that doesn't really seem relevant to the problem of accounts being banned by Twitter.

And maybe you could also provide some more helpful advice on how to actually get an ISP to issue a new IP? As far as I'm aware most of them use DHCP, so a release+renew at the gateway will just pull the same address, and I have no interest in leaving my modem disconnected for long enough that the lease expires.

At any rate, on the command line I'm just using gallery-dl https://x.com/<account> on the first pass, and gallery-dl -o skip=abort:3 https://x.com/<account> on subsequent runs. Config, in relevant part (other extractor details removed):

{
    "extractor": {
        "archive": "<database>",
        "base-directory": "<directory>",
        "path-extended": true,
        "user-agent": "browser",
        "retries": -1,
        "twitter": {
            "archive": "<twitter_db>",
            "cookies": "<twitter_cookie_file>",
            "filename": "{date:%Y%m%d_%H%M%S}-{tweet_id}-img{num:>02}.{extension}",
            "sleep": [5, 7.5],
            "sleep-request": [30, 35]
        }
    },
    "downloader": {
        "mtime": false
    },
    "output": {
        "shorten": "eaw"
    }
}

@a84r7a3rga76fg

timeline retrieves tweets from random users, and abort doesn't work because it ignores those tweets. The process never ends; I waited days for the process on the input URL to finish. Back when timeline did work, it was always what caused Twitter to impose rate limiting, regardless of what my sleep and sleep-request timings were. Twitter bans you if your IP address is on their list while you are getting rate limited.

@ForxBase
Author

timeline retrieves tweets from random users, and abort doesn't work because it ignores those tweets. The process never ends; I waited days for the process on the input URL to finish. Back when timeline did work, it was always what caused Twitter to impose rate limiting, regardless of what my sleep and sleep-request timings were. Twitter bans you if your IP address is on their list while you are getting rate limited.

I just made a new account over another IP and used that IP to download from one single user. My account got suspended almost immediately after the download finished! I don't know what to do.

@ericsia

ericsia commented Aug 22, 2024

Same here, my account got suspended; Twitter must have noticed this.

@ForxBase
Author

Same here, my account got suspended; Twitter must have noticed this.

Twitter is unusable now, and I can't make a new account for every user I download...

@06000208

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then.

A related issue: #5775

@ForxBase
Author

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then.

A related issue: #5775

Anyone who doesn't get suspended? Is there nothing I can do?

@overattackwatch

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then.
A related issue: #5775

Anyone who doesn't get suspended? Is there nothing I can do?

I just kept appealing but I think they are ignoring me now so
Cheers musk

@ForxBase
Author

ForxBase commented Sep 5, 2024

I've had an account suspended as well. Not sure what my sleep values were, unfortunately, as I increased them since then.
A related issue: #5775

Anyone who doesn't get suspended? Is there nothing I can do?

I just kept appealing but I think they are ignoring me now so Cheers musk

does it work for you now?

@WarmWelcome

Hoping that a fix for this comes out soon. I haven't been able to back up in a while, and artists from Brazil have started to delete their accounts, or already have. Any help would be appreciated. My account got locked but not banned, with sleep at 10-38 and sleep-request at 15-55.

@NonaSuomi

That can't be it, or at least not the whole picture, because the first account I used with GDL, and consequently the first one I lost, was my daily-driver personal account.

@Twi-Hard

Twi-Hard commented Sep 9, 2024

I haven't used my account in weeks now and it's still been downloading 24/7 successfully. Even when I "used" my account I was just browsing art or looking for accounts to download without tweeting or retweeting.

@WarmWelcome

I haven't used my account in weeks now and it's still been downloading 24/7 successfully. Even when I "used" my account I was just browsing art or looking for accounts to download without tweeting or retweeting.

What region/country are you located in? Do you have 2FA activated, with a phone number or something? I'm trying to figure out the link between all of these inconsistent recommendations.

@Twi-Hard

Twi-Hard commented Sep 9, 2024

I'm on the west coast of the USA. I don't have 2FA enabled. I have a phone number on the account, but it's a Google Voice number, which means it's a VoIP number, which I assume is the type of number the bot people use. I have a dynamic IP and have never used a VPN or proxy with this specific account. Before Elon Musk's takeover I filled a 14TB drive with the drive speed being the bottleneck, which means I was going extremely fast. I say that because that surely should have caused some red flags on their end. Unfortunately almost none of those accounts were relevant, so I switched to whitelisting which accounts I download.
I've never used Twitter to tweet, retweet, or like, ever. I've used it to DM people and browse art.
The account is 5 years old. It's also a developer account, but I doubt that matters.

I said the following in a related issue (#5775)

Anyone who doesn't get suspended?

I download from Twitter 24/7 with a low sleep setting. Before I made the time between each request random (using a range), I had to have a much higher sleep between each request. This is using my home IP with the actual Twitter account I use. After I lowered the sleep and made it a range, I've received no rate limiting at all. Before the change I'd get told to wait until a certain time before continuing. This is a little confusing to me because I'm probably making a lot more requests per day than Elon allocates to free accounts. I'm not home, so I can't tell you the settings yet. I actually copied them from somewhere else in gallery-dl's issues.

-o "sleep=[1.5,5]"
-o "sleep-request=[6.0,12.0]"

@WarmWelcome

[quoted Twi-Hard's reply above in full]

I'm guessing that that is a result of having a legacy Twitter developer account... ugh. I might have to give applying for one a shot, but they watch what you do, and I don't even know what I would say in the 250-character application.

@WarmWelcome

I applied for the API thing and immediately got access to the developer section, so apparently it isn't a "wait for approval" type application. Going to give it a try sometime later, but I still have hard API limits.

@ericsia

ericsia commented Sep 10, 2024

I suspect that Twitter does not delete accounts if it seems like a human user uses them regularly. What I suspect is going on is that gallery-dl activity is getting flagged as possible bot activity that needs further investigation. Then, if there is no evidence of typical user activity, such as tweeting, retweeting, etc., it deletes the account. However, if there is evidence of typical user activity, it makes the user do the Arkose challenge. At least, that is what I noticed with my gallery-dl use on my accounts.

I might be wrong, and I do not know for sure. Personally, I have not lost an account yet, but the accounts that I use gallery-dl on are also accounts that I use weekly. Though these days, I always get an Arkose challenge after a gallery-dl run. I also do not know if there are any actions or activities that would get an account deleted that a gallery-dl user might also be doing. So what I am suggesting could still be risky; try it on your account at your own risk.

I think your account will get banned soon as well. After all, we all do the Arkose challenge before getting banned.
Now my new account can't even log in using the cookies option.
[screenshot of the login error]

@Joebugg

Joebugg commented Sep 10, 2024

Wow, they really really want you to use a browser to scrape with? Seems kind of backwards.

So if you use Selenium, what happens? I'm wondering if they're using a JS version of a warrant canary. There's code you can run on the server that expects certain responses (e.g. anti-adblocker methods), for example. Not sure the "order of headers" is too useful now, since browsers all seem to be moving to randomizing the order. Do people get these same bans if they use a userscript method?

I'm actually curious what they're doing that triggers the bans. Of course, they don't want you to know. ;) Watch it be something stupidly simple because they only have to get it right, once.

@docholllidae

docholllidae commented Sep 18, 2024

I have been scraping from Twitter for nearly a year and only lost two accounts, for reasons unrelated to gallery-dl.
Edit: both cookies also come from accounts which I actively use; the cookie I typically download with comes from my more heavily used account.

my config:

{
    "extractor": {
        "base-directory": "X:/My Drive/!pr0n/",
        "archive": "%appdata%/gallery-dl/archive.sqlite3",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "skip": "abort:3",
        "keywords-default": "",

		
        "twitter": {
            "archive": "X:/My Drive/zzTwitter/archive.twitter.sqlite3",
            "parent-directory": "true",
            "skip": "abort:3",
            
            "#cookies": "X:/My Drive/zzTwitter/cookies.twitter.1.txt",
            "cookies": "X:/My Drive/zzTwitter/cookies.twitter.2.txt",

            "sleep": [24.9, 45.2],
            "sleep-request": [23.8, 52.6],
            
            "image-filter": "author is user",
            "logout": true,
            "syndication": true,
            "text-tweets": true,
            "include": ["avatar","background","media","timeline"],
            
            "directory": {
                "count ==0":["zzTwitter","downloads","{author[id]}.{author[name]}","text_tweets"],
                "":         ["zzTwitter","downloads","{author[id]}.{author[name]}","media"]
            },
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
            "avatar": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","avatar"],
                "archive": "",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
            },
            "background": {
                "directory": ["zzTwitter","downloads","{author[id]}.{author[name]}","media","background"],
                "archive": "",
                "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
            },
            
            "metadata": true,
            "postprocessors":[{
                "name": "metadata",
                "event": "post",
                "directory": "metadata",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
            }]            
        }
    }
}

@ForxBase
Author

ForxBase commented Sep 20, 2024

[quoted docholllidae's comment and config above in full]

I tried this from your config, but it didn't work; it doesn't log in ("authentication required").

{
    "extractor": {
        "base-directory": "D:\gallery-dl",
        "path-restrict": "^A-Za-z0-9_.~!-",
        "skip": "abort:3",
        "keywords-default": "",

        "twitter": {
            "parent-directory": "true",
            "skip": "abort:3",

            "cookies-from-browser": "firefox",

            "sleep": [24.9, 45.2],
            "sleep-request": [23.8, 52.6],

            "image-filter": "author is user",
            "logout": true,
            "syndication": true,
            "text-tweets": true,
            "include": ["avatar","background","media","timeline"],

            "directory": {
                "count ==0": ["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
                "":          ["Twitter","downloads","{author[id]}.{author[name]}","media"]
            },
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
            "avatar": {
                "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
                "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
            },
            "background": {
                "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
                "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
            },

            "metadata": true,
            "postprocessors": [{
                "name": "metadata",
                "event": "post",
                "directory": "metadata",
                "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
            }]
        }
    }
}

@ForxBase
Author

[quoted docholllidae's comment and config above in full]

but with these values you use

"sleep": [24.9, 45.2],
"sleep-request": [23.8, 52.6],

it takes forever to download one entire user profile!

@docholllidae

docholllidae commented Sep 20, 2024

I tried this from your config, but it didn't work; it doesn't log in ("authentication required").

{ "extractor": { "base-directory": "D:\gallery-dl", "path-restrict": "^A-Za-z0-9_.~!-", "skip": "abort:3", "keywords-default": "",

    "twitter": {
        "parent-directory": "true",
        "skip": "abort:3",
        
        "cookies-from-browser": "firefox",

        "sleep": [24.9, 45.2],
        "sleep-request": [23.8, 52.6],
        
        "image-filter": "author is user",
        "logout": true,
        "syndication": true,
        "text-tweets": true,
        "include": ["avatar","background","media","timeline"],
        
        "directory": {
            "count ==0":["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
            "":         ["Twitter","downloads","{author[id]}.{author[name]}","media"]
        },
        "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
        "avatar": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
        },
        "background": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
            "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
        },
        
        "metadata": true,
        "postprocessors": [{
            "name": "metadata",
            "event": "post",
            "directory": "metadata",
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
        }]
    }
}

}

Are you using a cookies file? Twitter requires logging in to view most profiles nowadays, so create a cookies file and point to that.

I use a Chrome extension called Open Cookies.txt: install it, then log into Twitter in your desktop browser. Click the extension, and if it requests permission to read your data on Twitter, choose Grant Access. Then choose the Raw Cookies.txt option, highlight everything in the resulting text block, and copy-paste it into a file.
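
Once that file exists, the relevant config entry just points at it; the path below is only an example, and note that JSON paths need forward slashes or doubled backslashes on Windows:

"twitter": {
    "cookies": "C:/Users/<your-user>/cookies-twitter.txt"
}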

but with these values you use

"sleep": [24.9, 45.2],
"sleep-request": [23.8, 52.6],

it takes forever to download one entire user profile!

Yes, it can take some time; I haven't played much with the sleep times, but they can probably go lower without risk of the account being banned.
It takes me about 3 days to re-download almost 500 profiles with the "skip": "abort:3" option set (about 10 minutes per profile), after not running it for a while, so there was more media than normal for me to grab.
I'll run my same inputs again with lower sleep/sleep-request values.
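
In case it helps anyone batching that many profiles: gallery-dl can also read URLs from a plain text file via -i/--input-file, one URL per line (blank lines and lines starting with "#" are ignored, as far as I know). The file name here is just an example:

gallery-dl -i twitter-profiles.txt

where twitter-profiles.txt contains lines like:

# one profile per line
https://x.com/artist_one
https://x.com/artist_two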

@docholllidae

Also, my config for Twitter will download text tweets too and make a JSON file for all tweets.
If you don't want that, you can easily edit those options out of the config.

@overattackwatch

overattackwatch commented Sep 21, 2024

Also, my config for Twitter will download text tweets too and make a JSON file for all tweets. If you don't want that, you can easily edit those options out of the config.

I haven't really used cookies before; am I doing this right? My browser is Opera GX; I can switch if needed, I have other browsers installed.

Code pasted into the .conf file, with twittercookie.txt being a copy-paste of what I got from Raw Cookies.txt:

        },

           "twitter": {
        "parent-directory": "true",
        "skip": "abort:3",
        
        "cookies": "C:\Users\UserProfile\AppData\Roaming\gallery-dl\twittercookie.txt",

        "sleep": [24.9, 45.2],
        "sleep-request": [23.8, 52.6],
        
        "image-filter": "author is user",
        "logout": true,
        "syndication": true,
        "text-tweets": false,
        "include": ["avatar","background","media","timeline"],
        
        "directory": {
            "count ==0":["Twitter","downloads","{author[id]}.{author[name]}","text_tweets"],
            "":         ["Twitter","downloads","{author[id]}.{author[name]}","media"]
        },
        "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}-{num}.{author[name]}_~{content[0:69]}~_~{filename}.{extension}",
        "avatar": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","avatar"],
            "filename": "{date:%Y-%m-%d_%H-%M-%S}_avatar_{author[id]}.{author[name]}~_~{filename}.{extension}"
        },
        "background": {
            "directory": ["Twitter","downloads","{author[id]}.{author[name]}","media","background"],
            "filename": "background_{date:%Y-%m-%d_%H-%M-%S}~_~{filename}.{extension}"
        },
        
        "metadata": true,
        "postprocessors": [{
            "name": "metadata",
            "event": "post",
            "directory": "metadata",
            "filename": "{date:%Y-%m-%d_%H-%M-%S}~_~{tweet_id}.{author[name]}~_~{content[0:69]}.json"
        }]
    }
}

I then do a basic download such as
py -3 -m gallery_dl -D C:\Users\downloadtempname\ https://x.com/tempname

And I get a response of
[twitter][info] Requesting guest token
[twitter][error] AuthorizationError: Login required

@docholllidae

can you run it again but add -v at the end of the command
then paste the verbose output

@overattackwatch

can you run it again but add -v at the end of the command then paste the verbose output

C:\Users\>py -3 -m gallery_dl -D C:\Users\downloadtempname\ https://x.com/tempname -v
[gallery-dl][debug] Version 1.27.1
[gallery-dl][debug] Python 3.12.2 - Windows-11-10.0.22631-SP0
[gallery-dl][debug] requests 2.32.3 - urllib3 2.2.0
[gallery-dl][debug] Configuration Files []
[gallery-dl][debug] Starting DownloadJob for 'https://x.com/tempname'
[twitter][debug] Using TwitterUserExtractor for 'https://x.com/tempname'
[twitter][debug] Using TwitterTimelineExtractor for 'https://x.com/tempname/timeline'
[twitter][info] Requesting guest token
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): api.x.com:443
[urllib3.connectionpool][debug] https://api.x.com:443 "POST /1.1/guest/activate.json HTTP/1.1" 200 63
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): x.com:443
[urllib3.connectionpool][debug] https://x.com:443 "GET /i/api/graphql/k5XapwcSikNsEsILW5FvgA/UserByScreenName?variables=**Listed information like what is set to true or false in the conf** HTTP/1.1" 200 1040
[urllib3.connectionpool][debug] https://x.com:443 "GET /i/api/graphql/tO4LMUYAZbR4T0SqQ85aAw/UserMedia?variables=%7 **Listed information like what is set to true or false in the conf** HTTP/1.1" 404 0
[twitter][debug] API error: 'Unspecified'
[twitter][error] AuthorizationError: Login required

@mikf
Owner

mikf commented Sep 21, 2024

[gallery-dl][debug] Configuration Files []

Your config file is not getting loaded. Make sure it is at one of the locations listed here or by gallery-dl --config-status.

@overattackwatch

[gallery-dl][debug] Configuration Files []

Your config file is not getting loaded. Make sure it is at one of the locations listed here or by gallery-dl --config-status.


C:\Users\Userprofile>gallery-dl --config-status
[config][error] JSONDecodeError when loading 'C:\Users\Userprofile\gallery-dl\config.json': Invalid \escape: line 70 column 23 (char 2234)
C:\Users\Userprofile\AppData\Roaming\gallery-dl\config.json : Not Present
C:\Users\Userprofile\gallery-dl\config.json                 : Invalid JSON
C:\Users\Userprofile\gallery-dl.conf                        : Not Present

C:\Users\Userprofile>py -3 gallery-dl --config-status
C:\Users\Userprofile\AppData\Local\Programs\Python\Python312\python.exe: can't find '__main__' module in 'C:\\Users\\Userprofile\\gallery-dl'

@mikf
Owner

mikf commented Sep 21, 2024

[config][error] JSONDecodeError when loading 'C:\Users\Userprofile\gallery-dl\config.json': Invalid \escape: line 70 column 23 (char 2234)

    "cookies": "C:\Users\JamesD\AppData\Roaming\gallery-dl\twittercookie.txt",

You can't use single backslashes for filesystem paths in a JSON file. You need to either double them \\ or replace them with forward slashes /.

"cookies": "C:\\Users\\JamesD\\AppData\\Roaming\\gallery-dl\\twittercookie.txt",
"cookies": "C:/Users/JamesD/AppData/Roaming/gallery-dl/twittercookie.txt",

@overattackwatch

C:\Users\UserProfile>gallery-dl --config-status
[config][error] JSONDecodeError when loading 'C:\Users\UserProfile\gallery-dl\config.json': Expecting ',' delimiter: line 103 column 2 (char 3653)
C:\Users\UserProfile\AppData\Roaming\gallery-dl\config.json : Not Present
C:\Users\UserProfile\gallery-dl\config.json                 : Invalid JSON
C:\Users\UserProfile\gallery-dl.conf                        : Not Present

@Hrxn
Contributor

Hrxn commented Sep 22, 2024

Your config file is still not valid JSON, therefore it is not loaded/used at all.
Use a site like https://www.jslint.com/ for example to fix your JSON if your editor can't do that.
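
If you'd rather not paste the file into a website, Python's built-in json module can do the same check locally (this is the stock module, nothing gallery-dl-specific); the path here just assumes the config location reported by --config-status:

py -3 -m json.tool "C:\Users\Userprofile\gallery-dl\config.json"

It prints the formatted JSON on success, or the line and column of the first syntax error otherwise.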

@overattackwatch

Your config file is still not valid JSON, therefore it is not loaded/used at all. Use a site like https://www.jslint.com/ for example to fix your JSON if your editor can't do that.

Thanks for the site. I had the JSON ending with

            }]            
}

and fixed it when I changed it to

            }]            
        }
    }
}

@ForxBase
Author

[quoted docholllidae's reply above in full]

Thanks. With these values

"sleep": [24.9, 45.2],
"sleep-request": [23.8, 52.6],

it takes you three days to download 500 profiles? How? I lowered them, didn't get banned yet.

@docholllidae

Thanks. With these values

"sleep": [24.9, 45.2],
"sleep-request": [23.8, 52.6],

it takes you three days to download 500 profiles? How? I lowered them, didn't get banned yet.

I updated to these values:

            "sleep": [12.9, 31.2],
            "sleep-request": [11.8, 35.6],

and got through 400 profiles in about 16 hours.
I'm simply being overly cautious of the Twitter timeouts.
When I first started, with sleep times of something like 5 seconds, I would frequently be forced to prove I'm human. I'm actually surprised it never resulted in a ban, considering how often it happened.

@overattackwatch

and got through 400 profiles in about 16 hours. I'm simply being overly cautious of the Twitter timeouts. When I first started, with sleep times of something like 5 seconds, I would frequently be forced to prove I'm human. I'm actually surprised it never resulted in a ban, considering how often it happened.

Same, I used to get "prove you're human" a lot, now never. It's weird. The account I use to download with is even suspended and it just doesn't care.

@WarmWelcome

[quoted docholllidae's comment and config above in full]

I stole a great deal of this and it works pretty damn well. One question though: what is syndication? I can't find it in the configuration docs here: https://gdl-org.github.io/docs/configuration.html

@mikf
Owner

mikf commented Oct 7, 2024

what is syndication?

It was a workaround to download age-restricted content without a login, back when you could use Twitter as a guest user. 1171911, 92ff99c

@wankio
Contributor

wankio commented Oct 15, 2024

I'm using my main Twitter account; it hasn't been suspended in many years, but the one thing is rate limiting, lol. If you're using a clone account or the like, it's very likely to get suspended.
