[deviantart:gallery] Distracted into downloading other galleries by extractor.deviantart.extra="stash" #1387

Closed
rautamiekka opened this issue Mar 17, 2021 · 9 comments

Comments

@rautamiekka
Contributor

Since 1.17.0, which enabled downloading at least embedded sta.sh/galleries, trying to download a specific gallery at first got distracted into downloading others once it was done with the intended gallery (or so it seemed, I didn't fully check). Now the intended gallery gets ignored and some other gallery gets downloaded instead.

It seems to happen whenever extractor.deviantart.extra is not null, since going from "stash" to null fixed it.

Most prominent examples:

My config: https://paste.gg/p/rautamiekka/f63017b52a6b49dfb1a9d385661afcb0

The output with "stash" on the 1st link: https://paste.gg/p/rautamiekka/28538891d13e40d28dba6c93dcf8ae6b

The output with null on the 1st link: https://paste.gg/p/rautamiekka/3189af6a256b40099a175b91ab6e243c

@mikf
Owner

mikf commented Mar 18, 2021

Are you using v1.17.0 or a dev snapshot after 5c32a7b or 83f465f? (Neither log specifies a version)

$ gallery-dl --version
1.17.1-dev

$ gallery-dl -o extra=stash https://www.deviantart.com/whoa-br0/gallery/all
/tmp/deviantart/Whoa-Br0/deviantart_873116635_Umiko Ahagon.jpg
/tmp/deviantart/Whoa-Br0/deviantart_872110825_Ellie and Mia.png

$ gallery-dl -o extra=all https://www.deviantart.com/whoa-br0/gallery/all
/tmp/deviantart/Whoa-Br0/deviantart_873116635_Umiko Ahagon.jpg
/tmp/deviantart/Magical-Icon/deviantart_627706034_Aoba Happy Icon.gif
/tmp/deviantart/Whoa-Br0/deviantart_872110825_Ellie and Mia.png
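The difference between the two runs above can be pictured with a small Python sketch. It is only an illustration of the behavior described in this thread for the dev snapshot, not gallery-dl's actual code: the URL patterns, the sample URLs, and the names `classify` and `urls_to_follow` are all assumptions made up for this example.

```python
import re

# Assumed URL shapes, for illustration only; the real extractor
# patterns in gallery-dl may differ.
STASH_RE = re.compile(r"https?://(?:www\.)?sta\.sh/\w+")
DEVIATION_RE = re.compile(
    r"https?://(?:www\.)?deviantart\.com/[\w-]+/art/[\w-]+")


def classify(url):
    """Label a URL found in a description as 'stash', 'deviation' or 'other'."""
    if STASH_RE.match(url):
        return "stash"
    if DEVIATION_RE.match(url):
        return "deviation"
    return "other"


def urls_to_follow(urls, extra):
    """Hypothetical filter modeled on the two runs above.

    extra="stash" follows only sta.sh links; any other truthy value
    (such as "all") follows sta.sh links and regular deviations;
    a falsy value follows nothing.
    """
    if not extra:
        return []
    allowed = {"stash"} if extra == "stash" else {"stash", "deviation"}
    return [url for url in urls if classify(url) in allowed]


# Made-up example links, not taken from any real post.
found = [
    "https://sta.sh/0abc123def",
    "https://www.deviantart.com/some-user/art/Some-Title-123456789",
    "https://example.org/unrelated",
]
print(urls_to_follow(found, "stash"))
print(urls_to_follow(found, "all"))
```

With "stash" only the sta.sh link is kept, while "all" also keeps the deviation link, which is how a post from a different artist's account can end up in the download.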

Also why do you have the folders option enabled even though you aren't using its provided metadata? That's one reason why it takes forever to fetch the post from Magical-Icon. You might want to restrict it to only gallery downloads, or disable it outright:

"deviantart": {
    "folders": false,
    "gallery": {
        "folders": true
    }
}
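As a rough mental model of why this works, a subcategory block can be thought of as overriding the value set at the category level. The Python sketch below only illustrates that idea under the assumption that the most specific setting wins; `lookup` is a hypothetical helper written for this example and is not gallery-dl's real lookup code.

```python
# Hypothetical sketch of category/subcategory option lookup.
# This is NOT gallery-dl's implementation; the names are made up.

config = {
    "deviantart": {
        "folders": False,
        "gallery": {
            "folders": True,
        },
    },
}


def lookup(conf, category, subcategory, key, default=None):
    """Return the most specific value for `key`, checking the subcategory first."""
    cat_conf = conf.get(category, {})
    sub_conf = cat_conf.get(subcategory, {})
    if isinstance(sub_conf, dict) and key in sub_conf:
        return sub_conf[key]
    if key in cat_conf:
        return cat_conf[key]
    return default


print(lookup(config, "deviantart", "gallery", "folders"))    # True
print(lookup(config, "deviantart", "deviation", "folders"))  # False
```

Under this model, gallery downloads collect folder information while everything else under deviantart skips it.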

@rautamiekka
Contributor Author

rautamiekka commented Mar 18, 2021

Dang, I knew I forgot something, the app version in this case. 1.17.0 off PyPI with pip, which I guess is 5cf593a:

# pip install --upgrade youtube-dl gallery-dl
Requirement already satisfied: youtube-dl in c:\programdata\anaconda3\lib\site-packages (2021.3.3)
Collecting youtube-dl
  Downloading youtube_dl-2021.3.14-py2.py3-none-any.whl (1.9 MB)
     |████████████████████████████████| 1.9 MB 3.2 MB/s
Requirement already satisfied: gallery-dl in c:\programdata\anaconda3\lib\site-packages (1.17.0)
Requirement already satisfied: requests>=2.11.0 in c:\programdata\anaconda3\lib\site-packages (from gallery-dl) (2.25.1)
Requirement already satisfied: idna<3,>=2.5 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.11.0->gallery-dl) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.11.0->gallery-dl) (2020.12.5)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.11.0->gallery-dl) (1.26.3)
Requirement already satisfied: chardet<5,>=3.0.2 in c:\programdata\anaconda3\lib\site-packages (from requests>=2.11.0->gallery-dl) (4.0.0)
Installing collected packages: youtube-dl
  Attempting uninstall: youtube-dl
    Found existing installation: youtube-dl 2021.3.3
    Uninstalling youtube-dl-2021.3.3:
      Successfully uninstalled youtube-dl-2021.3.3
Successfully installed youtube-dl-2021.3.14

# gallery-dl --version
1.17.0

Also why do you have the folders option enabled even though you aren't using its provided metadata? That's one reason why it takes forever to fetch the post from Magical-Icon. You might want to restrict it to only gallery downloads, or disable it outright:

"deviantart": {
    "folders": false,
    "gallery": {
        "folders": true
    }
}

EDIT: Fixed an accidental typo in the following sentence:
I'm not? I compared against my old config, and extractor.deviantart.postprocessors was moved to extractor.postprocessors so that it applies to every extractor.

I don't want the gallery/favs folders to be created but I want the folder metadata saved; according to https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractordeviantartflat the folders are by default not created, and I didn't have that option in the config.

Looks like I misunderstood the sub-category part of https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractor-options if you can do

"deviantart": {
    "folders": false,
    "gallery": {
        "folders": true
    }
}

@mikf
Owner

mikf commented Mar 18, 2021

version in this case. 1.17.0 off PyPi

The changes from #1356 haven't been "officially" released yet, and 1.17.0 only supports true (stash and regular posts) or false/null. Releases are usually every 2 or 3 weeks, and you can use pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz to install from the latest commit (from Installation-Pip)

I don't want the gallery/favs folders to be created but I want the folder metadata saved

Then enabling folders is the right thing to do, but your settings from https://paste.gg/p/rautamiekka/f63017b52a6b49dfb1a9d385661afcb0 don't write any folder metadata to a file, only {tags} and {description}.

@Twi-Hard

Twi-Hard commented Mar 19, 2021

I'm using the dev version but "stash" is still grabbing a ton of random accounts. It's not as bad as when it's set to "true" though.

Edit: I'm trying to check where it happened.

@Twi-Hard

It doesn't seem to be doing it now. I know it was up to date the last time I tried this. And there was a huge difference between "true" and "stash". I'm not sure what happened.

@Twi-Hard

Actually it did download a ton of random accounts, I just didn't wait long enough. I can post logs later.

@Twi-Hard

Twi-Hard commented Mar 19, 2021

I left it running longer than I should have so there's a lot of output to go through.
This is my config for deviantart:

"deviantart": {
      "archive": null,
      "extra": "stash",
      "cookies": "/mnt/main/temp/gallery-dl/cookies/cookies.txt",
      "include": [
        "gallery",
        "journal",
        "scraps"
      ],
      "folders": true,
      "journals": "html",
      "mature": true,
      "metadata": true,
      "original": true,
      "quality": 100,
      "retries": 30,
      "refresh-token": "cache",
      "oauth": {
        "browser": false,
        "cache": true,
        "port": 6414
      },
      "client-id": redacted,
      "client-secret": redacted,
      "base-directory": "$ROOT/archive/by-source/other/deviantart.com/archive/",
      "category-transfer": true,
      "parent-directory": false,
      "directory": [
        "_extra",
        "{author[username]}"
      ],
      "gallery": {
        "category-transfer": true,
        "directory": [
          "{author[username]}"
        ],
        "folders": true,
        "flat": true
      },
      "journal": {
        "category-transfer": true,
        "directory": [
          "{author[username]}",
          "Journal"
        ],
        "flat": true
      },
      "scraps": {
        "category-transfer": true,
        "directory": [
          "{author[username]}",
          "Scraps"
        ],
        "flat": true
      },
      "favorite": {
        "category-transfer": true,
        "directory": [
          "{author[username]}",
          "Favorites"
        ],
        "flat": false
      }
    },

It makes it put all the "extra" stuff in its own folder.
These are the accounts it went through (from the input list): https://termbin.com/fwg5 (37 accounts)
and these are all the accounts downloaded from because of "extra": https://termbin.com/c5rk (734 accounts)

I really don't know how to tell what's what in the log, but from what I've seen the accounts that weren't in the input list seem to show up together:

debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/7DECDEC5-D4E1-C9DF-A553-41E588800C15?username=Artist-squared&offset=744&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 6085
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/7DECDEC5-D4E1-C9DF-A553-41E588800C15?username=Artist-squared&offset=768&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 1836
[deviantart] Collecting folder information for 'KirbyPhelpsPK'
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/folders?username=KirbyPhelpsPK&offset=0&limit=50&mature_content=true HTTP/1.1" 200 136
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/4C116471-B077-4628-41E0-BC578C10A9FF?username=KirbyPhelpsPK&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 7877
[deviantart] Collecting folder information for 'ta-vie-ma-vie'
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/folders?username=ta-vie-ma-vie&offset=0&limit=50&mature_content=true HTTP/1.1" 200 284
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/9B76EE14-47B4-F971-CBC7-A023CCB11E17?username=ta-vie-ma-vie&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 None
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/9B76EE14-47B4-F971-CBC7-A023CCB11E17?username=ta-vie-ma-vie&offset=24&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 None
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/9B76EE14-47B4-F971-CBC7-A023CCB11E17?username=ta-vie-ma-vie&offset=48&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 1136
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/32EB0827-265B-94A2-B53F-5EF1B0D32E22?username=ta-vie-ma-vie&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 2143
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/FE95278A-B5A6-52D6-0584-63CBC00B860A?username=ta-vie-ma-vie&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 2999
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/2598F109-A879-9C20-4271-F1AF133C8EF2?username=ta-vie-ma-vie&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 6678
[deviantart] Collecting folder information for 'zzpopzz'
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/folders?username=zzpopzz&offset=0&limit=50&mature_content=true HTTP/1.1" 200 424
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/CC8CF0CB-D18D-5949-1335-07293D35E2CF?username=zzpopzz&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 1681
debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/67879786-A2FB-C7A0-546A-815E4B1176E8?username=zzpopzz&offset=0&limit=24&mature_content=true&mode=newest HTTP/1.1" 200 None

Here's the full log:
deviantart-log.txt

Edit: My version is 1.17.1-dev
Edit 2: I updated to the current most recent version and it's still doing it
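One way to sift a log like the excerpt above is to pull the `username=` query parameter out of each debug line and compare the result against the input list. The following Python sketch assumes the log format matches the lines quoted in this comment; `count_usernames` and `unexpected_accounts` are hypothetical helper names, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches the username query parameter in debug lines like those above.
USERNAME_RE = re.compile(r"[?&]username=([^&\s\"]+)")


def count_usernames(log_lines):
    """Count how many API requests were made per username."""
    counts = Counter()
    for line in log_lines:
        match = USERNAME_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def unexpected_accounts(counts, input_accounts):
    """Return usernames seen in the log that are not in the input list."""
    wanted = {name.lower() for name in input_accounts}
    return sorted(name for name in counts if name.lower() not in wanted)


sample = [
    'debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/folders'
    '?username=KirbyPhelpsPK&offset=0&limit=50&mature_content=true HTTP/1.1" 200 136',
    "[deviantart] Collecting folder information for 'zzpopzz'",
    'debug: https://www.deviantart.com:443 "GET /api/v1/oauth2/gallery/folders'
    '?username=zzpopzz&offset=0&limit=50&mature_content=true HTTP/1.1" 200 424',
]
counts = count_usernames(sample)
print(unexpected_accounts(counts, ["zzpopzz"]))  # ['KirbyPhelpsPK']
```

To run it on a full log, the lines of the file can be passed in place of `sample`, with the input list read from wherever the account names are kept.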

mikf added a commit that referenced this issue Mar 19, 2021
@mikf
Owner

mikf commented Mar 19, 2021

@Twi-Hard The core of your issue comes from https://www.deviantart.com/adventuretimeclub. (Not on your posted input list, so maybe there's something else really wrong here)
adventuretimeclub is a group, not a regular user, and includes works from 100s of different artists.

"extra": "stash" worked as expected. Your log only shows it spawning new extractors for sta.sh URLs (DeviantartStashExtractor) and none for regular deviations/posts (DeviantartDeviationExtractor).

Speaking of, I've decided to revert the change to extra from 1.17.0 entirely (b0438c8) (edit: "extra": "stash" now has the same effect as "stash": true before 1.17.0 and vice versa, no need to change it back). It caused more confusion and trouble than it was worth. Next time I'll simply use a different option name, e.g. extra_posts, instead of changing an existing option.

@Twi-Hard

https://www.deviantart.com/adventuretimeclub was apparently in the actual input list. The termbin link I posted was the list of folders it created in the download directory. That list shouldn't contain accounts that aren't in the input list, which is why I said it was from the input list.

@mikf mikf closed this as completed Apr 2, 2021