Instagram stopped working #1149

mikaljan · 2020-12-01T19:12:29Z

gallery-dl stopped working on Instagram today; I'm getting the following error:

E:\gallery-dl>gallery-dl https://www.instagram.com/migichen_/
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIP9dLAhkn3/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIN3Hhwhtne/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIDVKJshBuM/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH-cjDIh0Tz/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH4-mdcBlAP/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH2itYohHD8/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH0I8u5BWVQ/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CHxLcqxBqfe/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
...

iamleot · 2020-12-01T21:34:16Z

Hello @mikaljan!
Unfortunately, I think this is similar to #1113 (i.e. Instagram has started being more aggressive with users who request several images).

(I've tried downloading the profile here, without authenticating, and it seems to be downloading, but I'm pretty sure I will be blocked soon.)

iamleot · 2020-12-01T21:36:21Z

...and indeed after ~2 minutes or so:

% gallery-dl -v 'https://www.instagram.com/migichen_/'
[gallery-dl][debug] Version 1.15.4
[gallery-dl][debug] Python 3.8.6 - NetBSD-9.99.75-amd64-x86_64-64bit-ELF
[gallery-dl][debug] requests 2.24.0 - urllib3 1.25.11
[gallery-dl][debug] Starting DownloadJob for 'https://www.instagram.com/migichen_/'
[instagram][debug] Using InstagramUserExtractor for 'https://www.instagram.com/migichen_/'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.instagram.com:443
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /migichen_/ HTTP/1.1" 200 49249
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /p/CIP9dLAhkn3/?__a=1 HTTP/1.1" 302 0
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /accounts/login/ HTTP/1.1" 200 12619
[instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIP9dLAhkn3/':  JSONDecodeError: Expecting value: line 1 column 1 (char 0)

aeriessy · 2020-12-02T03:30:02Z

I'm also having the same issue. I used two accounts and one of them is now banned. I used a 10 second delay for sleep and sleep-request which got through maybe 50 files or something before it gave me the error. Before the account was banned, I was able to download in batches of 50 until it gave me the "something is wrong with your account, change your password" or phone verification. After doing that maybe 3 times, that account was banned outright.

Probably going to put this off until this is fixed or figured out.

UnforeseenOcean · 2020-12-02T15:13:33Z

I'm having better luck with setting the sleep time to 15, but that could change at any moment. I did get the Your Account Has Been Temporarily Locked message on my phone after using the cookie and forgetting to set the delay.

Note: If you get this error, your account might be locked. Unlock it and set the delay to something about 10 or 15 seconds longer. Oh and I'd recommend not using your primary account for this!

reallyuniquename · 2020-12-03T10:50:45Z

I think the Instagram extractor needs a slight rewrite. It's inefficient, and with the new Instagram rate limits you get stuck after the first few hundred images at best. I explained that here: #1113 (comment).

Either that, or gallery-dl has to detect IP bans and support proxy lists for quick address rotation.

iamleot · 2020-12-03T12:10:54Z

Mike writes:

I think the Instagram extractor needs a slight rewrite. It's inefficient, and with the new Instagram rate limits you get stuck after the first few hundred images at best. I explained that here: #1113 (comment). Either that, or gallery-dl has to detect IP bans and support proxy lists for quick address rotation.

Can you elaborate further on how to make that more efficient? Based on how it works (and AFAIK it works by scraping Instagram), I think you inevitably need to scroll through the whole timeline.

(#1113, #1122, #1128, #1130, #1149) Rely on the results of GraphQL queries instead of requesting data for each post separately via '/p/<shortcode>/?__a=1'. This might result in some missing metadata, and there might be some issues for '/channel/' and '/saved/' URLs, but at least downloading from the regular post listings should work without issues and without getting users blocked/banned. TODO: reimplement support for stories

reallyuniquename · 2020-12-03T13:40:06Z

needs to scroll all the timeline

Correct, but scrolling means querying the GraphQL endpoint, and that's only about 80 queries per 1000 images. Besides, you could dump the whole timeline once and keep reusing it until you have downloaded every picture.

What Instagram really doesn't like is when you start hammering /p/ABCDEFG123 pages. When the rate limits hit, gallery-dl has to either switch proxies or resume scraping from the last downloaded image on the next run. Neither is properly supported by the extractor; --range and --download-archive do not work with Instagram the way you would expect. gallery-dl starts from the beginning of the timeline every time.

Also, when I look at the log, it seems that the extractor just skips images it fails to download, with no retries or pause. That's... not good.
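To illustrate the "retry with a pause" behavior asked for above, here is a minimal sketch. It is not gallery-dl code: fetch_with_retries, its parameters, and the growing pause are all assumptions made for illustration, and the fetch callable stands in for whatever performs the request.

```python
import time


def fetch_with_retries(fetch, url, retries=3, pause=10.0, sleep=time.sleep):
    """Call fetch(url); on failure wait and try again, up to `retries` extra times.

    `fetch` is a hypothetical callable that raises an exception on failure.
    """
    attempt = 0
    while True:
        try:
            return fetch(url)
        except Exception:
            if attempt >= retries:
                raise  # give up and surface the last error
            attempt += 1
            sleep(pause * attempt)  # wait longer after each failure
```

Catching every exception keeps the sketch short; a real implementation would probably retry only on specific errors such as a failed JSON decode or a redirect to the login page.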

mikf · 2020-12-03T13:42:40Z

Should be fixed with 447488f.

Querying /p/<shortcode>/?__a=1 for each post is what gets one blocked/banned, and I would strongly advise against using gallery-dl versions before 1.16.0 for Instagram, or any other Instagram downloader that does this (which is pretty much all of them, from what I can tell).

The rewrite is still lacking support for stories, and post listings other than the regular one (e.g. instagram.com/instagram) might not work as before, but at least it won't get you banned anymore.

dsblack · 2020-12-03T18:45:27Z

I've been having this problem for weeks, so I'm very happy to see it being addressed.

Right now, this commit isn't in a full release, so I don't get the update yet using the pip install --upgrade method. Do you know when it will be in an official release?

Also, I was afraid instagram might be taking measures to block scripts like this. But even if adding a delay (as some people have tried) helps, their next step might be to detect scripts that hit at repeating intervals -- e.g., every 10 seconds. If it's too exact, I could see them detecting that and blocking you anyway.

One thing I wrote into a homespun crawler (which checks prices for items on a web site) several years ago was an option to randomize the delay. You give it a low bound and a high bound (in seconds), e.g. 1 to 8 or 3 to 15, and each request uses a new random delay within those bounds. That way, you look much more like a human clicking through at random intervals, pausing longer at some images than others. For something like this, maybe you'd even want a different (longer) range for videos than for images.
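A minimal sketch of that randomized delay, assuming a uniform distribution between the two bounds. The names random_delay and paced are made up for illustration and are not gallery-dl options or functions.

```python
import random
import time


def random_delay(low, high, rng=random):
    """Return a delay in seconds drawn uniformly from [low, high]."""
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    return rng.uniform(low, high)


def paced(items, low, high, sleep=time.sleep):
    """Yield items one by one, sleeping a fresh random delay between them."""
    for index, item in enumerate(items):
        if index:  # no pause before the first item
            sleep(random_delay(low, high))
        yield item
```

Usage would look like `for url in paced(urls, 3, 15): download(url)`, where download is whatever performs the request. Whether this actually avoids detection is untested.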

What do you think, would that be a worthwhile option to add?

If you really wanted to make it easier, you could even bundle some of these options together into a "typical" group of settings under a single parameter, maybe -human. I'd definitely still allow for the individual settings, but that could make it easier to get it running successfully.

I'd be tempted to try contributing to the project myself, but I don't really know python.

kattjevfel · 2020-12-03T18:52:13Z

@dsblack

Right now, this commit isn't in a full release, so I don't get the update yet using the pip install --upgrade method. Do you know when it will be in an official release?

As listed in the readme you can do python3 -m pip install --upgrade https://github.com/mikf/gallery-dl/archive/master.tar.gz to get the latest dev version.

UnforeseenOcean · 2020-12-04T10:58:22Z

I can say for certain Instagram is looking for this kind of activity because my account got suspended (but only for the /p/ action):

I will try the new version after the ban is lifted. Can't risk getting banned again!

xibr · 2020-12-06T05:09:46Z

[gallery-dl][error] No suitable extractor found for 'https://www.instagram.com/stories/et2k/2457611747557737659/'

latest dev 1.16.0-dev

phanirithvij · 2020-12-06T10:46:51Z

@xibr #1149 (comment) says

The rewrite is still lacking support for stories, and post listings other than the regular one (e.g. instagram.com/instagram) might not work as before, but at least it won't get you banned anymore.

mikf · 2020-12-07T14:01:59Z

@xibr 2b93515

xibr · 2020-12-07T16:11:22Z

Now it works well with stories. Thanks

xibr · 2020-12-08T04:59:35Z

A question: when I try to download a single Instagram story, all stories are downloaded instead of just that one. Is this expected?

TestPolygon · 2020-12-08T11:00:32Z

Is it possible to download part of the images and save the position, so that the next launch continues from it?
For example, say I have downloaded 1000 of 2000 images. Is it possible to continue from 1001 on the next launch? Currently the program repeats the requests for the first 1000 images that were already downloaded. These requests are performed one after another, without the pauses that downloading would introduce, which leads to the login page (rechecking 1000 posts requires 84 requests in a short time).

rivke41levp656 · 2020-12-09T21:42:00Z

@mikf The fullname filename field returns None on 1.16 for all users as far as I can tell.

reallyuniquename · 2020-12-10T08:28:08Z

@TestPolygon

is it possible to download a part of images and save the position to continue from it on the next launch?

You couldn't with the old extractor, and I don't think you can with the new one, but I haven't checked that yet.

Try it yourself; the options you are looking for are -v --range 1000- and -v --download-archive history.sqlite.

TestPolygon · 2020-12-10T14:12:25Z

The SQLite DB stores only node IDs, so it can only be used to check (if --range exists) whether the node with a certain ID was downloaded or not. By default, the program checks the location where files would be downloaded and compares the expected filename with the names of the files in that directory.

--verbose was useful for debugging. I can say that it is possible to do.

It would require adding, for example, a --session flag.

With this flag, the program would store (in a system file) the current parameters required for requesting the next "list page", together with the associated URL. For example: [{url1: [param1, param2]}, {url2: [param1, param2]}]. It would then use them, if present in this file, to continue downloading from a certain position. (This covers the case where a user has interrupted the download via Ctrl+C, for which the params for requesting the current "list page" need to be stored too, and the case where the API limit was exceeded ("login page") while requesting the next "list page".)

A more complicated format example:
[{ url1: { current: { params: [], fullyDownloaded: false }, next: { params: [] }, date: 1607609281 } }]

For Instagram these are: tracking_token, query_hash and id.
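The proposed session file could be sketched roughly as follows. This is only an illustration of the idea: --session does not exist in gallery-dl, and load_state, save_position and resume_params are hypothetical names. The sketch also assumes a JSON object keyed by URL rather than the list of single-key objects shown above, since that makes lookups simpler.

```python
import json
import time
from pathlib import Path


def load_state(path):
    """Return the saved state dict, or an empty dict if no file exists yet."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_position(path, url, current_params, next_params, fully_downloaded=False):
    """Record the current and next "list page" parameters for one URL."""
    state = load_state(path)
    state[url] = {
        "current": {"params": current_params, "fullyDownloaded": fully_downloaded},
        "next": {"params": next_params},
        "date": int(time.time()),
    }
    Path(path).write_text(json.dumps(state, indent=2), encoding="utf-8")


def resume_params(path, url):
    """Return the parameters to resume with, or None to start from the top."""
    entry = load_state(path).get(url)
    if entry is None:
        return None
    if entry["current"]["fullyDownloaded"]:
        return entry["next"]["params"]  # current page is done, move on
    return entry["current"]["params"]  # interrupted mid-page, redo it
```

After an interruption, resume_params returns the parameters of the page that was in progress; once that page is marked fully downloaded, it returns the parameters of the next page instead.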

@mikf ?

mikaljan · 2020-12-10T19:25:26Z

Hi @mikf,

I tried the latest 1.16.0-dev version, and I would get some successful downloads in the beginning, and after a minute or so everything returns a warning, please check the TXT file I've attached:

instagram_log.txt

mikf · 2020-12-11T13:18:54Z

@mikaljan This output isn't from the latest dev version. The Unable to fetch data from ... logging message was removed in the rewrite (447488f). Check gallery-dl --version to make sure you are actually using 1.16.0-dev.
I'll release a new version with the fix this weekend. You could just wait until then.

@TestPolygon b88c97b adds a way to at least manually input a cursor value and continue downloading from the current position. The cursor tokens are output as debug logging messages or when getting redirected to the login page.

This commit also increases the number of posts requested per GraphQL query from 12 to 50 (the maximum possible). Since the redirect to the login page for users who are not logged in always happens after ~120 requests, regardless of how many posts get fetched or how long the wait time in between is, this should allow more posts to be downloaded.
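A quick back-of-the-envelope calculation puts those numbers in perspective. It assumes the budget of roughly 120 requests is spent entirely on listing queries, which is a simplification (the initial profile page request presumably counts too); the function names are illustrative only.

```python
import math


def requests_needed(posts, page_size):
    """Number of listing requests needed to page through `posts` posts."""
    return math.ceil(posts / page_size)


def posts_within_budget(request_budget, page_size):
    """Upper bound on posts reachable before the request budget runs out."""
    return request_budget * page_size


for page_size in (12, 50):
    print(page_size, requests_needed(1000, page_size),
          posts_within_budget(120, page_size))
# prints:
# 12 84 1440
# 50 20 6000
```

With 12 posts per request, 1000 posts take 84 requests, which matches the figure mentioned earlier in this thread; with 50 per request, the same listing takes 20.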

TestPolygon · 2020-12-11T18:55:35Z

Hm, I used pip install --no-cache-dir --upgrade https://github.com/mikf/gallery-dl/archive/master.tar.gz, but I still get the old behavior ("first":+12, and no prompt "Use '-o cursor=%s' to continue downloading " on the login page event).

Update: use pip uninstall gallery-dl

syntopikon · 2020-12-12T23:56:35Z

I was experiencing this error previously as well, but after upgrading to 1.16.0, I've yet to encounter it (working across several 2k+ mixed albums).

mikf · 2020-12-13T00:55:21Z

As omnicr0n said, v1.16.0 is out, which should at least somewhat mitigate any rate limit problems with Instagram.

@xibr this is expected and worked like that even before the rewrite. If you want to limit the download to only a specific story ID, use --filter "media_id == 'STORY ID'"
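For context, --filter takes a Python expression that is evaluated against each file's metadata, and files for which it is false are skipped. The sketch below imitates that idea in a few lines. It is not gallery-dl's actual implementation; the matches helper and the sample media_id values are made up.

```python
def matches(expression, metadata):
    """Evaluate a filter expression with the metadata keys as variable names."""
    code = compile(expression, "<filter>", "eval")
    # Empty builtins keep the sketch minimal; eval is still unsafe for
    # untrusted input.
    return bool(eval(code, {"__builtins__": {}}, dict(metadata)))


# Hypothetical metadata for two stories of the same user.
files = [{"media_id": "111"}, {"media_id": "222"}]
kept = [f for f in files if matches("media_id == '222'", f)]
print(kept)
# prints: [{'media_id': '222'}]
```

Note that media_id is compared against a quoted string here, mirroring the quoting in the command above.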

@rivke41levp656 Instagram removed those from all owner fields, it seems. This has nothing directly to do with the rewrite from 447488f. The fullname info was still available a month ago, but now the embedded data in user profile pages like https://www.instagram.com/instagram/ only has
"owner":{"id":"25025320","username":"instagram"}

mikf · 2020-12-13T00:58:46Z

@TestPolygon

$ pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz

should work without needing to uninstall.
(I've updated the instructions in the README accordingly)

xibr · 2020-12-13T14:49:35Z

@xibr this is expected and worked like that even before the rewrite. If you want to limit the download to only a specific story ID, use --filter "media_id == 'STORY ID'"

got it, thanks.

left1000 · 2020-12-16T20:11:22Z

So, instagram works, again, yeah! (at least on public follows).

Unfortunately it doesn't work for private accounts (that my account has access to), even having provided instagram with my username/password in the conf file... and I'm fairly sure I did it right because, well, it used to work just fine.

Hrxn · 2020-12-17T13:58:37Z

Does it work if you remove username/password authentication and try it with the exported cookies instead?

mikf · 2020-12-17T15:34:53Z

Forcing a re-login by clearing your cache with gallery-dl --clear-cache and then trying to download from Instagram again might also work.

This was referenced Dec 3, 2020

Instagram is completely broken #1122

Closed

Getting error in Instagram after few pics are downloaded. #1128

Closed

Instagram locked my account #1113

Closed

mikf added the fixed label Dec 3, 2020

mikf pinned this issue Dec 3, 2020

kattjevfel mentioned this issue Dec 4, 2020

JSONDecodeError: Expecting value: line 1 column 1 (char 0) #1155

Closed

mikf added a commit that referenced this issue Dec 5, 2020

[instagram] reimplement support for story highlights (#1149)

76285eb

mikf added a commit that referenced this issue Dec 7, 2020

[instagram] reimplement support for stories (#1149)

2b93515

mikf added a commit that referenced this issue Dec 11, 2020

[instagram] add 'cursor' option (#1149)

b88c97b

To enable at least 'some' way to continue downloading from the middle of a user profile listing.

mikf closed this as completed Dec 13, 2020

mikf unpinned this issue Dec 17, 2020

jl452 mentioned this issue Dec 9, 2021

[Instagram] [story] download story by url (convert story url to filter) #2088

Closed

dl21g5 mentioned this issue Mar 17, 2022

Downloading Instagram profile with 10,000+ posts. What is the optimal approach without getting banned? #2413

Open

AlttiRi mentioned this issue Apr 20, 2022

[question] How would I get cursor, I have tried verbose flag but I dont know what sting from it is cursor and none of them match length of one I got from gallery-dl in other previous operation #2516

Closed
