-
Notifications
You must be signed in to change notification settings - Fork 10
Better scene detection #76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
pythonbits/bb.py
Outdated
def prompt_yesno(question, default): | ||
while True: | ||
choices = '[Y/n]' if default else '[y/N]' | ||
choice = input('%s %s ' % (question, choices)) | ||
if not choice: | ||
return default | ||
elif choice.casefold() == 'y': | ||
return True | ||
elif choice.casefold() == 'n': | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go into a separate module
log.error(str(error)) | ||
|
||
path = self['path'] | ||
modified = scene.check_integrity(path, on_error=handle_error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if it thinks it's not a scene release, it queries the user every time?
The CRC check should stay as an option IMO, since it's the "safest". And it had the advantage of not querying the user if it wasn't found. My usecase for it is headless submissions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if it thinks it's not a scene release, it queries the user every time?
No. If it finds nothing, it's not a scene release. This should work all the time unless the release group or title were changed. This is an improvement over the old algorithm that always asked the user unless it found the exact title.
The CRC check should stay as an option IMO
I don't object as long as I can disable it.
tests/test_scene.py
Outdated
|
||
import json | ||
import os | ||
import pickle |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for in fact writing tests, but I cannot accept pickle objects into the repository. This is inherently unsafe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's wrong with pickle? How do you propose we store API responses for testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A pickled object is just that, an object, which can contain almost arbitrary code, i.e. it is a (remote/hidden) code execution vulnerability as you can't even inspect it properly. The API responses could just be stored in plaintext HTML or JSON. If there is HTML then the common structure should of course be factored out.
I'm sorry, but I really don't see the attack vector. We don't pickle arbitrary stuff from
the API, we pickle requests.Response objects that we already trust.
If the attacker controls the API, they would have to find a vulnerability in
requests.Response. And if they can do that, it doesn't matter if malicious objects are
pickled or not.
|
@plotski you as the committer can still hide almost arbitrary code in there, which will get executed if you run tests. It's like a proprietary software blob. This is what I disagree with. |
Oh, right. Good point. I didn't think of that.
I'll work something out without pickles.
|
API responses are now stored as plain text files. Are we good on the other points you raised? |
def _get_basename_noext(path): | ||
# os.path.splitext() removes anything after the rightmost "." which is no good for | ||
# scene release names. | ||
return re.sub(r'\.[a-z0-9]{1,3}$$', '', os.path.basename(path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this instead of something like os.path.splitext(os.path.basename(path))[0]
? I guess because you want to use it for directories too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>>> os.path.splitext(os.path.basename("The.Title.2000.x265-GRP"))
('The.Title.2000', '.x265-GRP')
>> os.path.splitext(os.path.basename("The.Title.2000.x265-GRP"))[0]
('The.Title.2000', '.x265-GRP')
|
if 'release_group' not in guess: | ||
log.debug('No release group found - cannot find scene release: {}', guess) | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now the scene check completely fails when you have a renamed file with the group removed. This is a regression. It's also meant to help people discover scene files that they in fact have but that may have been removed due to some renaming script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you suggest? You can't really search for a release without the group and always asking the user is annoying.
To be honest, I've come to the conclusion that the scene rule is annoying. I've seen so many existing renamed and/or modified scene releases. I'm not sure if I really care about this pull request anymore. Nobody else seems to care about this tag.
pythonbits/scene.py
Outdated
def _check_file(relpath, fspath, info): | ||
""" | ||
Return True if `path` is unmodified or None if not sure | ||
|
||
relpath: Relative path that starts with the release name | ||
fspath: Path that exists in the file system | ||
|
||
It is important that `path` is relative and starts with the release/torrent name. | ||
|
||
Raise SceneError if `path` does not match `info` | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call signature does not match documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
A scene file in a directory with the wrong folder name yields "scene: False". This is another regression, while it would be possible to detect that this is basically equivalent to a renamed scene release, previously it at least queried the user. |
Can you provide an example?
|
See pm on bb.