Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata Postprocessor does not output correct filename #865

Closed
GiovanH opened this issue Jul 3, 2020 · 7 comments
Closed

Metadata Postprocessor does not output correct filename #865

GiovanH opened this issue Jul 3, 2020 · 7 comments

Comments

@GiovanH
Copy link
Contributor

GiovanH commented Jul 3, 2020

I'm seeing some cases where the pathfmt.filename for metadata files is blank, leaving stray .json files. Investigation ongoing.

In metadata.py, the filename method is only replaced with self._filename = self._filename_custom when there is a extension-format option set in the postprocessor configuration, even though the method replacement is necessary in all cases.

@GiovanH
Copy link
Contributor Author

GiovanH commented Jul 3, 2020

Okay, so abbd8fb changed behavior that the metadata postprocessor depended on. In order to keep the changes, pathfmt needs more information about whether its supposed to be naming an image or a pseudo file. Putting together a POC fix.

GiovanH added a commit to GiovanH/gallery-dl that referenced this issue Jul 3, 2020
GiovanH added a commit to GiovanH/gallery-dl that referenced this issue Jul 3, 2020
@GiovanH
Copy link
Contributor Author

GiovanH commented Jul 3, 2020

See PR #866 that fixes this bug

@GiovanH
Copy link
Contributor Author

GiovanH commented Jul 3, 2020

Now, there's actually still a problem there, because output filenames get filenames that look like .....json. This is because the json file, by default, is just {filename}.json, but the filename format string ends with .{extension}, but the extension the dummy file gets passed is empty. I'm not sure how to properly fix this yet?

mikf added a commit that referenced this issue Jul 4, 2020
This prevents pathfmt.filename from potentially being empty.
@mikf
Copy link
Owner

mikf commented Jul 9, 2020

Seems like some of the post["creator"]["image_url"] entries used to build a pseudo filename for metadata files don't have a filename extension, which didn't trigger a filename build and caused pathfmt.filename to be empty and your metadata files to be called .json ("" + ".json")

I think d5bfb0b is a simpler fix then your PR, but I might have "broken" something on your end by getting rid of a valid {filename} entry for metadata files. It works fine with the default settings, but let me know if I should change/revert anything.

In metadata.py, the filename method is only replaced with self._filename = self._filename_custom when there is a extension-format option set in the postprocessor configuration, even though the method replacement is necessary in all cases.

Well, no. Why would it be necessary in all cases? <regular filename> + ".json" works OK enough for a default setting, and you can quite easily modify this behavior by setting extension-format, as you said.

(using extension-format would have also "solved" this problem by forcing a filename creation)

@GiovanH
Copy link
Contributor Author

GiovanH commented Jul 10, 2020

Does d5bfb0b create ".metadata" files when this case hits, or does the name formatter end up removing that down the line?

Why would it be necessary in all cases?

I was wrong about "all cases"; none of the files I was testing had extensions, and so this issue happened with every file. That turned out to just be due to the test of tests.

<regular filename> + ".json" works OK enough for a default setting, and you can quite easily modify this behavior by setting extension-format, as you said.

(using extension-format would have also "solved" this problem by forcing a filename creation)

I wasn't familiar with the extension-format option, and looking at now, I think that it would have been a workaround, but it still seems that the current "normal method" of using metadata.extension is buggy. Perhaps metadata.extension-format should be the only option (with a default value), replacing metadata.extension completely, given the significant overlap?

@GiovanH
Copy link
Contributor Author

GiovanH commented Jul 10, 2020

I think I still am tentatively going to advocate for #866 because

  1. Given that some paths are going to have extensions and others aren't, logic like that added in _filename seems useful
  2. Having a flag for whether or not an object is a real file or a pseudofile seems potentially useful -- explicit flags are nicer than behavior based on pseudonames
  3. It bundles slightly more verbose debug statements

@mikf
Copy link
Owner

mikf commented Feb 21, 2021

See #866 (comment)

@mikf mikf closed this as completed Feb 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants